Computers are generally useless at holding a conversation; they take things a bit too literally. But Google is teaching computers how to make sense of the vagaries of human speech and text, and it is now opening those algorithms to outside software developers.
The tools released will help programmers build language-based apps and services that are less prone to annoying misunderstandings than many of today’s chatbots. And they should help get developers hooked on the powerful machine-learning techniques Google is honing.
Google’s own mastery of grammar and syntax helps the company deliver more accurate search results, and it will be increasingly important as more of its devices and services come to depend on voice control.
Smartphones based on Google’s software can, of course, already be voice controlled, and the company is widely thought to be developing home devices, similar to Amazon’s Echo, that depend more heavily on voice interaction. So releasing a tool that makes language understanding more accessible makes a lot of strategic sense.
“Most of our users interact with us through language,” says Fernando Pereira, who leads the company’s efforts in natural-language understanding and machine learning. “They ask queries, typed or spoken. And so for us to serve the user well, we have to make our systems understand what users want.”
Google’s SyntaxNet can learn to understand the meaning of words and phrases given their context and common usage. It is built on TensorFlow, the deep-learning framework Google previously released, and it is the most complex and sophisticated component built with TensorFlow to date.
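For a rough sense of what building on TensorFlow involves, here is a minimal sketch, in TensorFlow's 1.x Python API, of the kind of small feed-forward network SyntaxNet is described as using to score parsing decisions. The feature sizes, random data, and training loop are toy placeholders, not Google's actual model:

```python
import numpy as np
import tensorflow as tf

# Toy stand-in for parser features: each example is a fixed-length
# feature vector, and the label is one of a handful of parser actions.
NUM_FEATURES, NUM_ACTIONS, HIDDEN = 32, 4, 64

x = tf.placeholder(tf.float32, [None, NUM_FEATURES])
y = tf.placeholder(tf.int32, [None])

# One hidden layer and a softmax over actions: the same broad shape as
# the feed-forward scoring network, though the real model is larger.
w1 = tf.Variable(tf.random_normal([NUM_FEATURES, HIDDEN], stddev=0.1))
b1 = tf.Variable(tf.zeros([HIDDEN]))
hidden = tf.nn.relu(tf.matmul(x, w1) + b1)
w2 = tf.Variable(tf.random_normal([HIDDEN, NUM_ACTIONS], stddev=0.1))
b2 = tf.Variable(tf.zeros([NUM_ACTIONS]))
logits = tf.matmul(hidden, w2) + b2

loss = tf.reduce_mean(
    tf.nn.sparse_softmax_cross_entropy_with_logits(labels=y, logits=logits))
train_op = tf.train.GradientDescentOptimizer(0.1).minimize(loss)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    feats = np.random.rand(8, NUM_FEATURES).astype(np.float32)
    labels = np.random.randint(NUM_ACTIONS, size=8)
    for _ in range(100):
        sess.run(train_op, {x: feats, y: labels})
```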
Google has also released a pre-trained parser for English, called Parsey McParseface (a spokesperson says the company was having trouble coming up with a name when someone suggested this catchy moniker). Text fed into the parser will automatically be broken into syntactic components such as nouns, verbs, subjects, and objects. This makes it easier for a computer to parse ambiguous queries or commands correctly.
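The released model runs from the SyntaxNet source tree; the project's README shows a demo script that reads sentences on standard input and prints a dependency parse. Here is a hedged sketch of driving that script from Python, assuming a local SyntaxNet checkout (the script path and output shape follow the README; the exact output may differ):

```python
import subprocess

# Assumes a SyntaxNet checkout; the demo script pipes text on standard
# input through the pre-trained Parsey McParseface model.
result = subprocess.run(
    ["syntaxnet/demo.sh"],
    input=b"Google is opening up its parser to developers.\n",
    stdout=subprocess.PIPE,
)
print(result.stdout.decode())

# The script prints an ASCII dependency tree, roughly of this shape
# (the exact tags and labels come from the trained model):
#   opening VBG ROOT
#    +-- Google NNP nsubj
#    +-- is VBZ aux
#    +-- parser NN dobj
#    |   +-- its PRP$ poss
#    +-- to IN prep
#        +-- developers NNS pobj
```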
Google usually attacks problems with vast quantities of data and machine learning, and indeed some other efforts, such as Facebook’s, are trying to train computers to parse language by feeding them large quantities of largely unlabeled data (see “Teaching Machines to Understand Us”).

But Google’s language-understanding project, described in a paper published online, is instead built around human expertise. For more than eight years, professional linguists have been annotating text for the company, and recent progress has come from feeding those annotations into a large deep-learning neural network.
Understanding language is incredibly difficult for computers because language is so often ambiguous. A search query as simple as “Find me cats in hats” may be interpreted as a request for either cats wearing hats or cats sitting in hats. Humans use general knowledge to disambiguate such sentences; Google’s technology relies on machine learning instead. Its deep-learning system, trained on syntactically annotated text, makes a judgment about the most likely grammatical structure of a statement.
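To make the ambiguity concrete, here is an illustrative Python sketch, not Google's code, that represents the two readings as dependency arcs and keeps whichever one a model scores higher. The scores are made-up placeholders standing in for the network's judgment:

```python
# Two candidate dependency analyses of "Find me cats in hats".
# Each arc is (head, dependent, label); the readings differ only in
# where the prepositional phrase "in hats" attaches.
CANDIDATES = {
    "cats wearing hats": [
        ("Find", "me", "iobj"),
        ("Find", "cats", "dobj"),
        ("cats", "in", "prep"),   # "in hats" modifies "cats"
        ("in", "hats", "pobj"),
    ],
    "cats sitting in hats": [
        ("Find", "me", "iobj"),
        ("Find", "cats", "dobj"),
        ("Find", "in", "prep"),   # "in hats" modifies "Find"
        ("in", "hats", "pobj"),
    ],
}

# Made-up scores standing in for the trained network's output; the
# system keeps whichever structure the model judges most likely.
MODEL_SCORES = {"cats wearing hats": 0.8, "cats sitting in hats": 0.2}

best = max(CANDIDATES, key=MODEL_SCORES.get)
print(best, CANDIDATES[best])
```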
Dave Orr, the product manager at Google responsible for finding commercial applications for the company’s research on language understanding, demonstrated the technology for me. He fed several articles from MIT Technology Review into an internal version of the language parser.
It made a couple of trivial errors, for example confusing the word “will” at the start of a sentence with my first name, but it generally annotated sentences with impressive accuracy, identifying syntactic structures that correctly captured the meaning of a headline or lead. “It’s the best parser anyone has created,” Orr says. “We think it’s close to human level.”