Facebook is making fast gains in computer vision AI. The Facebook researchers placed second in the 2015 Microsoft Common Objects in Context (COCO) image recognition competition, in which Microsoft Research took first place. This year Microsoft also won another industry competition, ImageNet.
“Teaching computers to be able to detect and differentiate between objects, to train them to understand what the patterns in the pixels mean, is something the Facebook AI Research (FAIR) team has been working on for the last year,” Schroepfer wrote in a Facebook post.
“They’ve made huge progress in a short time. FAIR has been able to create more than a 60 percent improvement in its object detection and segmentation technology in a year.” Those two tasks involve correctly identifying different objects in images and precisely pinpointing their locations within those images.
Facebook, Microsoft, Google, Baidu, and even Apple have all been building up their supplies of artificial intelligence talent in recent years, and these competitions provide opportunities to demonstrate technical progress. But what’s arguably more important for ordinary people who aren’t artificial intelligence researchers is how the research leads to better products and services.
Google and Apple have boasted this year about improvements in the area of speech recognition. This past spring Schroepfer demonstrated how Facebook could make machines answer questions about short snippets of text and identify sports in videos.
Last month, Schroepfer described an early implementation of a mobile app that could take questions about a photo from a blind person and then provide a spoken answer. Today, Schroepfer is more forward-looking, talking about how Facebook’s latest achievements in image recognition could affect everyday Facebook users.
“By enabling computers to recognize objects in photos, it will be easier to search for ‘the picture of the fruit bowl’ without you having to explicitly tag each photo you upload,” Schroepfer wrote.
“It will also help us make sure your feed is filled with the photos you most want to see. People with vision loss will be able to understand what is in a photo their friends share because the system will be able to tell them, regardless of the caption posted with the image. While there is still a long way to go, I’m really excited by the progress the team has made.”