Computer Vision
- Image(video) -> (sensing device) -> (interpreting device) -> Interpretations
- Automatic understanding of images and video
- Measurement: Computing properties of the 3D world from visual data
- Perception and interpretation: Algorithms and representations that allow a machine to recognize objects, scenes, and people
- Applications
- Faces and digital cameras
- Video-based interfaces
- safety and security
- Navigation, driver safety
- Monitoring pool
- Pedestrian detection
- Surveillance
- Vision for medical and neuroimages
- Challenges: Gap between low-level signals and high-level meaning
- ill-posed problem
- large variation: illumination, object pose, clutter, occlusions, intra-class appearance, viewpoint
- intra-class variation
- context
- complexity
- Progress chart by dataset
- Roberts 1963
- COIL
- MIT-CMU Faces (2000)
- UIUC Cars
- INRIA Pedestrians
- MSRC 21 Objects (2005)
- Caltech-101
- Caltech-256
- PASCAL VOC (2010): established the evaluation framework for computer vision
- Faces in the Wild
- 80M Tiny Images
- ImageNet: large-scale; long-tailed
- Birds-200
- Tasks
- Recognition: General categories
- Large scale recognition
- Recognition in first-person view
- Object detection, instance segmentation
- image captioning
- image generation
- CVPR'19 by the numbers
- Submissions: 5160
- Accepted: 1294
- Registered Attendees: 9227
- Marr’s vision framework
- computational level
- algorithmic/representational level
- implementational/physical level
- Malik’s Perspective: Recognition, Reorganization, Reconstruction
- Important note: In general, computer vision does not work (except in certain situations/conditions)
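
The "ill-posed problem" listed under the challenges can be made concrete with a small sketch. It assumes an idealized pinhole camera with points given in camera coordinates; the function name `project` and the `focal_length` parameter are illustrative choices, not part of these notes. Two 3D points on the same viewing ray land on the same image location, so depth cannot be recovered from a single projection alone.

```python
def project(point_3d, focal_length=1.0):
    """Project a 3D point (X, Y, Z), Z > 0, onto the image plane of an
    idealized pinhole camera (assumed model, for illustration only)."""
    x, y, z = point_3d
    return (focal_length * x / z, focal_length * y / z)


# Two points on the same ray through the camera center:
near = (1.0, 2.0, 4.0)
far = tuple(3.0 * c for c in near)  # three times farther away

p_near = project(near)
p_far = project(far)

# Both points map to the same image coordinates, so the image alone
# does not determine which 3D point produced it.
print(p_near, p_far)
```

This is why measurement tasks usually need extra constraints, such as a second view or prior knowledge about the scene.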