Inside Google’s Age of Augmented Humanity: Part 2, Changing the Equation in Machine Translation

When science fiction fans think about language translation, they have two main reference points. One is the Universal Translator, software built into the communicators used by Star Trek crews for simultaneous, two-way translation of alien languages. The other is the Babel fish from Douglas Adams’ The Hitchhiker’s Guide to the Galaxy, which did the same thing from its home in the listener’s auditory canal.

When AltaVista named its Web-based text translation service after the Babel fish in 1997, it was a bit of a stretch: the tool’s translations were often hilariously bad. For a while, in fact, it seemed that the predictions of the Star Trek writers—that the Universal Translator would be invented sometime around the year 2150—might be accurate.

But the once-infant field of machine translation has grown up quite a bit in the last half-decade. It’s been nourished by the same three trends that I wrote about on Monday in the first part of this week’s series about Google’s vision of “augmented humanity.” One is the gradual displacement of rule-based approaches to processing speech and language by statistical, data-driven approaches, which have proved far more effective. Another is the creation of a distributed cloud-computing infrastructure capable of holding the statistical models in active memory and crunching the numbers on a massive scale. The third, and just as important, is the profusion of real-world data for the models to learn from.
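To make “statistical, data-driven” concrete: the canonical formulation behind this generation of systems is the noisy-channel model (Google’s production details are its own, but the idea is representative). Given a foreign sentence f, the system searches for the translation e that maximizes the product of a language model P(e), learned from huge amounts of monolingual text, and a translation model P(f | e), learned from parallel bilingual text:

    \hat{e} = \arg\max_{e} \; P(e) \, P(f \mid e)

The more text both models see, the sharper their probability estimates become, which is why the third trend matters just as much as the first two.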

In machine translation, just as in speech recognition, Google has unique assets in all three of these areas—assets that are allowing it to build a product-development lead that may become more and more difficult for competitors to surmount. Already, the search giant offers a “Google Translate” app that lets an Android user speak to his phone in one language and hear speech-synthesized translations in a range of languages almost instantly. In on-stage previews, Google has been showing off a “conversation mode” version of the app that does the same thing for two people. (Check out Google employees Hugo Barra and Kay Oberbeck carrying out a conversation in English and German in this section of a Google presentation in Berlin last September.)
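A rough mental model of what happens in each turn of such an exchange, sketched in Python with hypothetical placeholder functions (this is not Google’s actual code or API): recognized speech is translated as text, then rendered back into audio.

    # A minimal sketch of the speech-to-speech loop described above.
    # Every function here is a hypothetical placeholder, not a real Google API.

    def recognize(audio: bytes, lang: str) -> str:
        """Automatic speech recognition: spoken audio in, text out."""
        raise NotImplementedError

    def translate(text: str, source: str, target: str) -> str:
        """Statistical machine translation of the recognized text."""
        raise NotImplementedError

    def synthesize(text: str, lang: str) -> bytes:
        """Text-to-speech: render the translated text as audio."""
        raise NotImplementedError

    def conversation_turn(audio: bytes, speaker: str, listener: str) -> bytes:
        # One turn of "conversation mode": chain all three stages.
        text = recognize(audio, speaker)
        translated = translate(text, speaker, listener)
        return synthesize(translated, listener)

In conversation mode the chain runs in both directions, swapping the speaker and listener languages on each turn; any latency in one stage therefore compounds across the whole exchange.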

While still experimental, the conversation app is eerily reminiscent of the fictional Universal Translator. Suddenly, the day seems much closer when anyone with an Internet-connected smartphone will be able to make their way through a foreign city without knowing a word of the local language.

In October, I met with Franz Josef Och, the head of Google’s machine translation research effort behind the Translate app, and learned quite a bit about how Google approaches translation. Och’s long-term vision is similar to that of Michael Cohen, who leads Google’s efforts in speech recognition. Cohen wants to eliminate the speech-text dichotomy as an impediment, so that it’s easier to communicate with and through our mobile devices; Och wants to take away the problem of language incomprehension. “The goal right from the beginning was to say, what can we do to break down the language barrier wherever it appears,” Och says.

This barrier is obviously higher for many Americans than it is for others, present company included—I’m functionally monolingual despite years of Russian, French, and Spanish classes. (“It’s always a shock to Americans,” Google CEO Eric Schmidt quipped during the Berlin presentation, but “people actually don’t all speak English.”) So a Babel fish in my ear—or in my phone, at any rate—would definitely count as a step toward the augmented existence Schmidt describes.

But in the big picture, Google’s machine translation work is really just a subset of its larger effort to make the world’s information “universally accessible and useful.” After all, quite a bit of this information is in languages other than those you or I may understand.

The Magic Is in the Data

Given the importance of language understanding in military affairs, from intelligence-gathering to communicating with local citizens in conflict zones, it isn’t surprising that Och, like Cohen, found his way to Google by way of the U.S. Defense Advanced Research Projects Agency (DARPA). The German native, who did his master’s work in statistical machine translation at the University of Erlangen-Nuremberg and his PhD work at RWTH Aachen University, spent the early 2000s doing DARPA-funded research at USC’s Information Sciences Institute. His work there focused on statistical machine translation.

Author: Wade Roush
