Researchers at the Georgia Institute of Technology unveiled a novel video-editing solution this week that automatically sorts and edits raw footage into the most picturesque highlights, producing a vacation reel that could fill anyone with envy.
The new approach is an algorithm, developed by students Daniel Castro and Vinay Bettadapura under the guidance of Professor Irfan Essa, that analyzes video for frames with ideal artistic properties. It first filters by geolocation, then scores composition, symmetry, and color vibrancy to determine what is important or picturesque. The video frames with the highest scores are assembled into a highlight reel.
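In outline, that scoring step amounts to combining several per-frame aesthetic measurements into a single weighted value and keeping the top-ranked frames. The sketch below illustrates the idea; the Frame structure, feature names, and weights are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of weighted aesthetic scoring (hypothetical features
# and weights; the published algorithm's exact measures differ).
from dataclasses import dataclass

@dataclass
class Frame:
    composition: float  # e.g., rule-of-thirds layout score in [0, 1]
    symmetry: float     # left-right symmetry score in [0, 1]
    vibrancy: float     # color saturation/contrast score in [0, 1]

# Relative importance of each artistic property (tunable; see below).
WEIGHTS = {"composition": 0.4, "symmetry": 0.3, "vibrancy": 0.3}

def aesthetic_score(frame: Frame) -> float:
    """Combine the per-property scores into one picturesqueness value."""
    return (WEIGHTS["composition"] * frame.composition
            + WEIGHTS["symmetry"] * frame.symmetry
            + WEIGHTS["vibrancy"] * frame.vibrancy)

def best_frames(frames: list, k: int = 5) -> list:
    """Keep the k highest-scoring frames for the highlight reel."""
    return sorted(frames, key=aesthetic_score, reverse=True)[:k]
```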
Castro and Bettadapura conceived the approach after Bettadapura returned from a two-week vacation, a coast-to-coast drive across the southern United States. He ended the trip with 26.5 hours of footage from a wearable, head-mounted camera and no idea what to do with it.
“The data was essentially useless because there was just too much of it,” said Bettadapura, who completed his Ph.D. in the fall and is now a software engineer at Google. “We liked the idea of being able to automatically generate photo albums from your vacation, algorithmically.”
The algorithm turned the 26.5 hours of video into a 38-second highlight reel in three hours. Because Bettadapura had worn a head-mounted Contour Action Camera that captured GPS data, the algorithm could first filter the footage by geographic location, reducing it to 16 hours. Shot boundary detection then cut it down to 1,724 video shots, about 10.2 hours of video. Finally, the algorithm scored the remaining shots for artistic quality and output the most picturesque content. Processing time varies with the number of computers used.
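The sequence described above is a coarse-to-fine filter: cheap geographic pruning first, then shot segmentation, then the more expensive quality ranking. The sketch below shows that pipeline shape; the function names, histogram-based boundary detector, and thresholds are assumptions for illustration, and the paper's actual detectors and quality model are more involved.

```python
# Illustrative coarse-to-fine filtering pipeline (all names and
# thresholds are assumptions, not the authors' implementation).
from dataclasses import dataclass

@dataclass
class Frame:
    gps: tuple        # (latitude, longitude) from the camera's GPS track
    histogram: tuple  # coarse color histogram for boundary detection
    quality: float    # precomputed aesthetic score in [0, 1]

def near_any(gps, points_of_interest, radius=0.01):
    """Stage 1: keep frames recorded near a geographic point of interest."""
    return any(abs(gps[0] - lat) <= radius and abs(gps[1] - lon) <= radius
               for lat, lon in points_of_interest)

def split_into_shots(frames, threshold=0.5):
    """Stage 2: naive shot-boundary detection via histogram distance."""
    if not frames:
        return []
    shots, current = [], [frames[0]]
    for prev, cur in zip(frames, frames[1:]):
        change = sum(abs(a - b) for a, b in zip(prev.histogram, cur.histogram))
        if change > threshold:  # abrupt visual change starts a new shot
            shots.append(current)
            current = [cur]
        else:
            current.append(cur)
    shots.append(current)
    return shots

def highlight_shots(frames, points_of_interest, k=20):
    """Stage 3: rank the surviving shots by mean aesthetic quality."""
    located = [f for f in frames if near_any(f.gps, points_of_interest)]
    shots = split_into_shots(located)
    return sorted(shots, key=lambda s: sum(f.quality for f in s) / len(s),
                  reverse=True)[:k]
```

Ordering the stages this way means the costly aesthetic analysis only ever runs on footage that has already survived the cheap GPS filter, which is what let the system collapse a full day of video into minutes of candidate material.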
The algorithm can be adapted to user preferences.
“We can tweak the weights in our algorithm based on the user’s aesthetic preferences,” Bettadapura said. “By incorporating facial recognition, we can further adapt the system to generate highlights that include people the user cares about.”
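In terms of the earlier scoring sketch, that tweak is simply a change to the weight table; for example, a user who prizes symmetry over color could re-weight it (values again purely illustrative):

```python
# Hypothetical preference tweak: favor symmetry over color vibrancy.
WEIGHTS = {"composition": 0.3, "symmetry": 0.5, "vibrancy": 0.2}
```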
Castro and Bettadapura presented their findings at WACV 2016, the IEEE Winter Conference on Applications of Computer Vision, on March 7, 2016, in Lake Placid, N.Y. The pair will continue working together to test the algorithm with multiple participants to help generalize the approach, incorporate facial recognition, and develop data visualization techniques that make it easy to browse and search for specific moments. The implications of future successful tests could echo far beyond their initial work at Georgia Tech.
“This research brings together multiple modalities to more efficiently understand large amounts of data,” said Castro, who is completing his Ph.D. in computer science and also working as an intern at Google. “We are trying to optimize how easy it is to understand all of the data we have in an efficient manner because otherwise it would be impossible to do so.”