New Speech Recognition Engine Under the Hood at Vlingo; Startup Dumps IBM and Nuance for AT&T

Vlingo, the Cambridge, MA-based startup that makes a suite of speech-to-text applications used by millions of iPhone, BlackBerry, and Nokia mobile device owners, is about to get a brain transplant of sorts. It said today that it will largely abandon a core speech-recognition engine developed by IBM and maintained by Nuance Communications in favor of a system from AT&T Labs in New Jersey.

As part of the shift, says Vlingo CEO Dave Grannan, Vlingo and AT&T have agreed to a long-term strategic alliance. Vlingo’s speech scientists will be able to modify and improve the source code for the AT&T technology, called Watson, while AT&T will take a minority ownership stake in Vlingo. All of Vlingo’s applications will be running on top of the AT&T speech-recognition system by the first quarter of 2010, Grannan says.

Vlingo’s own speech scientists have developed software that exploits information collected from users (the way a Bostonian’s pronunciation of a dictated phrase like “I parked my car” might differ from a New Yorker’s, for example) to build statistical models that help improve speech-recognition accuracy over time. These models provide supplemental input that helps to guide a core speech-recognition engine as it transforms speech sounds into text. Vlingo didn’t build its own core engine; it has long licensed that part of its system from IBM.

The switch from IBM’s engine to AT&T’s is a “best of all worlds” situation for Vlingo, in Grannan’s words. For one thing, he says, the Watson technology simply works better than the IBM recognizer. “Watson is superior on speed and base-level accuracy,” he says. Once the transition is complete, users of Vlingo’s iPhone, BlackBerry, and Nokia apps should notice fewer wrong guesses in the transcriptions of their utterances. Grannan says they’ll also see a few new features, such as automatic punctuation, that Vlingo can now add because it will be able to tinker with Watson’s innards.

But just as important, the switch will help Vlingo disentangle itself from its strained relationship with Nuance.

Burlington, MA-based Nuance (NASDAQ: NUAN) is one of the Boston area’s biggest high-tech firms, and it is the world’s largest specialized provider of speech-related technologies. It offers software for mobile speech recognition that competes directly with Vlingo’s. In June 2008, after losing out to Vlingo on a contract to supply Yahoo with speech-recognition technology for its oneSearch service, Nuance hit Vlingo with a lawsuit alleging that

Author: Wade Roush

Between 2007 and 2014, I was a staff editor for Xconomy in Boston and San Francisco. Since 2008 I've been writing a weekly opinion/review column called VOX: The Voice of Xperience. (From 2008 to 2013 the column was known as World Wide Wade.) I've been writing about science and technology professionally since 1994. Before joining Xconomy in 2007, I was a staff member at MIT’s Technology Review from 2001 to 2006, serving as senior editor, San Francisco bureau chief, and executive editor. Before that, I was the Boston bureau reporter for Science, managing editor of supercomputing publications at NASA Ames Research Center, and Web editor at e-book pioneer NuvoMedia. I have a B.A. in the history of science from Harvard College and a PhD in the history and social study of science and technology from MIT. I've published articles in Science, Technology Review, IEEE Spectrum, Encyclopaedia Britannica, Technology and Culture, Alaska Airlines Magazine, and World Business, and I've been a guest of NPR, CNN, CNBC, NECN, WGBH and the PBS NewsHour. I'm a frequent conference participant and enjoy opportunities to moderate panel discussions and on-stage chats. Twitter: @wroush