Researchers in Carnegie Mellon University's School of Computer Science have developed a program that lets a computer automatically generate 3-D reconstructions of scenes from a single image. They have found a way to help computers understand the geometric context of outdoor scenes and thus better comprehend what they see. The technique may ultimately find application in vision systems that guide robotic vehicles, monitor security cameras and archive photos.
A composite image of Carnegie Mellon's University Center shows a photograph (top left) and three 3-D reconstructions derived from it.
Using machine learning techniques, Robotics Institute researchers Alexei Efros and Martial Hebert, along with graduate student Derek Hoiem, have taught computers to spot the visual cues that differentiate vertical from horizontal surfaces in photographs of outdoor scenes. They have found that only about three percent of surfaces in a typical photo are at an angle. The researchers have also shown that having a sense of 3-D geometry can help computers identify objects, such as cars and pedestrians, in street scenes.
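The "pop-up" idea behind such reconstructions can be sketched with simple geometry: once regions are labeled as ground or vertical, a pixel's 3-D position follows from its ground-contact point under a pinhole camera model. The sketch below is a hypothetical illustration, not the researchers' actual code; the focal length, camera height and image-row values are made-up parameters.

```python
# Hypothetical pop-up geometry: a pinhole camera at height cam_height
# looks out horizontally. Ground pixels v rows below the horizon project
# from depth Z = f * cam_height / v; a vertical surface inherits the depth
# of its ground-contact row, which fixes the height of every pixel on it.

def ground_depth(v, f=500.0, cam_height=1.6):
    """Depth (meters) of a ground-plane pixel seen v rows below the horizon."""
    if v <= 0:
        raise ValueError("ground pixels must lie below the horizon (v > 0)")
    return f * cam_height / v

def wall_point_height(v, depth, f=500.0, cam_height=1.6):
    """Height above ground of a vertical-surface pixel at a known depth.

    The wall's depth comes from its ground-contact row; pixels higher in
    the image (smaller v) on the same wall sit higher off the ground.
    """
    return cam_height - depth * v / f

if __name__ == "__main__":
    base_row = 100                      # image row where the wall meets the ground
    depth = ground_depth(base_row)      # 500 * 1.6 / 100 = 8.0 m away
    top = wall_point_height(50, depth)  # 1.6 - 8.0 * 50 / 500 = 0.8 m up
    print(depth, top)
```

Folding each labeled region along its ground-contact line in this way is what turns a flat photograph into the cardboard-like 3-D models shown in the reconstructions.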
The researchers trained the computer on 300 images gleaned from Google, showing it numerous examples of vertical and horizontal surfaces. This allowed a machine learning program to develop statistical associations between each orientation and characteristic shapes, shadings and other cues. The program also takes advantage of the constraints of the real world: skies are blue, horizons are horizontal and most objects sit on the ground.
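The kind of statistical association described above can be illustrated with a toy classifier. This is a hypothetical sketch, not the researchers' actual features or learner: it reduces each image region to two made-up cues (mean "blueness" and vertical position in the frame) and uses a nearest-centroid rule in place of the real machine learning program.

```python
# Toy illustration of learning surface orientation from image cues.
# Each region is summarized by two hypothetical features: mean "blueness"
# (skies are blue) and normalized image row (ground tends to sit low in
# the frame). A nearest-centroid rule stands in for the real learner.

from statistics import mean

# Hand-made training data: (blueness 0-1, row 0.0 = top .. 1.0 = bottom)
TRAIN = {
    "sky":        [(0.90, 0.10), (0.80, 0.20), (0.95, 0.05)],
    "vertical":   [(0.30, 0.50), (0.40, 0.45), (0.20, 0.55)],
    "horizontal": [(0.30, 0.90), (0.25, 0.85), (0.35, 0.95)],
}

def centroids(train):
    """Average the feature vectors of each label's training examples."""
    return {label: (mean(f[0] for f in feats), mean(f[1] for f in feats))
            for label, feats in train.items()}

def classify(features, cents):
    """Assign the label whose centroid is nearest in feature space."""
    def dist2(a, b):
        return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2
    return min(cents, key=lambda label: dist2(features, cents[label]))

if __name__ == "__main__":
    cents = centroids(TRAIN)
    print(classify((0.85, 0.10), cents))  # blue region near the top -> sky
    print(classify((0.30, 0.92), cents))  # low, non-blue region -> horizontal
```

The real system learned from far richer cues (texture, shading, vanishing-point statistics) and hundreds of real photographs, but the principle is the same: labeled examples induce statistical regularities that let new regions be assigned an orientation.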
A composite image, showing a photograph and three 3-D reconstructions derived from it, is available at www.cs.cmu.edu/~efros/img/popup.jpg. Animations of the 3-D models can be viewed at www.cs.cmu.edu/~dhoiem/projects/popup/index.html.