In the future, we will depend on additional interfaces to find information on the Internet, namely by speaking and by taking pictures of the things around us. Groups of computers in the cloud can now understand the words and sentences we dictate into our phones and identify objects in the photos we take.
And Baidu, the second-largest web search provider in the world, with its biggest user base in its home country of China, has been preparing its systems for a time when text will be just another option for searching, and not necessarily the default. “In five years, we think 50 percent of queries will be on speech or images,” Andrew Ng, Baidu’s chief scientist and the head of Baidu Research, said Wednesday during a Gigaom meetup on his area of expertise, deep learning.
A type of artificial intelligence, deep learning involves training systems called artificial neural networks on large amounts of information derived from audio, images, and other inputs, then presenting the trained systems with new information and receiving inferences about it in response.
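The train-then-infer loop described above can be illustrated with a deliberately tiny sketch: a single artificial neuron (logistic regression, the building block that neural networks stack in many layers) learns from made-up labeled examples and is then asked about inputs it has never seen. All data, names, and parameter values here are invented for illustration; real deep-learning systems use vastly larger networks and datasets.

```python
import math
import random

random.seed(0)

# Synthetic training set: "features" near 0.0 are labeled 0,
# features near 1.0 are labeled 1. (Invented toy data.)
train = [(random.gauss(0.0, 0.1), 0) for _ in range(50)] + \
        [(random.gauss(1.0, 0.1), 1) for _ in range(50)]

w, b = 0.0, 0.0   # the neuron's trainable parameters
lr = 0.5          # learning rate

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Training: repeatedly nudge w and b to reduce the prediction
# error on each labeled example (stochastic gradient descent).
for _ in range(200):
    for x, y in train:
        p = sigmoid(w * x + b)
        grad = p - y          # gradient of log loss w.r.t. the logit
        w -= lr * grad * x
        b -= lr * grad

# Inference: present new, unseen inputs and read off the model's guess.
def predict(x):
    return 1 if sigmoid(w * x + b) > 0.5 else 0

print(predict(-0.1), predict(1.1))  # classifies the two new inputs
```

Deep networks chain thousands or millions of such units, which is what lets them handle raw speech and images rather than a single hand-picked feature, but the two phases, training on labeled data and then inferring on new data, are the same.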