Lost In Translation: Ford Teams With Nuance Communications To Master Human Language

“Call John Smith.”

“I wanna call John Smith.”

At first glance, these sentences look pretty similar. But try telling that to the voice recognition technology behind Ford Motor’s SYNC system. You might as well be speaking Greek.

Voice recognition software has come a long way in recent years. Google’s Android platform, for example, allows users to search for information by speaking into their smartphones. But mastering the subtleties of human language remains beyond the reach of even our most sophisticated technology. (Remember how the IBM supercomputer Watson was kicking some serious butt on Jeopardy! until the final round, when it answered “Toronto” to a question about U.S. cities?)

Voice commands are a key component of Ford’s SYNC system; the company, based in Dearborn, MI, promotes SYNC as a safety feature because it allows drivers to complete tasks without taking their hands off the wheel.

With that in mind, Ford recently said it will partner with the appropriately named Nuance Communications, based in Burlington, MA, to develop software that can recognize not only specific words and phrases but also the intent of the person speaking them.

Normally, SYNC relies on what’s called “structured commands.” The company essentially pre-defines the exact phrases drivers must speak in order for the car to execute their wishes.
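To make that concrete, here is a minimal sketch of how an exact-match, structured-command system behaves. The phrases and actions below are invented for illustration; this is not Ford’s actual code.

```python
# Hypothetical sketch of "structured commands": the system only acts
# when an utterance exactly matches a pre-defined phrase.
COMMANDS = {
    "call john smith": lambda: print("Dialing John Smith..."),
    "play tracks": lambda: print("Playing music..."),
}

def handle_utterance(utterance: str) -> None:
    # Exact lookup after trivial normalization; any extra word breaks it.
    action = COMMANDS.get(utterance.strip().lower())
    if action:
        action()
    else:
        print("Command not recognized.")

handle_utterance("Call John Smith")          # works: exact match
handle_utterance("I wanna call John Smith")  # fails: phrasing differs
```

Every stray word breaks the lookup, which is exactly the “I wanna call John Smith” problem from the top of this story.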

There are two problems with this technique. First, there are an awful lot of commands. Drivers today can order their cars to do everything from making phone calls and finding directions to playing music and adjusting the cabin temperature.

Second, Ford can only guess at what people will say, and, more often than not, its guesses don’t match what drivers actually say. For instance, Ford initially programmed SYNC to recognize the command “Play Tracks.” Unless you work in the music business, you probably don’t even know what a track is. A better command would be “Play Songs.”

“You can’t stop someone from saying something,” says Brigitte Richardson, Ford’s lead engineer on its global voice control technology/speech systems.

In other words, Ford can’t force people to adjust their speech to use its commands. People are going to speak how they are going to speak.

Working with Nuance, Ford wants to develop software based on more advanced algorithms called “statistical language models,” which estimate what a driver means from the statistical patterns of everyday speech rather than requiring an exact phrase.
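As a rough illustration of the statistical idea, here is a toy intent classifier (a naive Bayes model over word counts) trained on a tiny invented set of phrasings. The intents, training phrases, and scoring are assumptions for illustration, not Nuance’s actual technology, which operates at a vastly larger scale.

```python
# Toy statistical intent recognition: varied phrasings map to one
# intent via word probabilities instead of exact phrase matching.
from collections import Counter
import math

# Invented, hand-labeled training phrases (illustrative only).
TRAINING = {
    "make_call": ["call john smith", "i wanna call john smith",
                  "please phone john", "dial john smith for me"],
    "play_music": ["play songs", "play tracks", "i want some music",
                   "put on my playlist"],
}

# Per-intent word counts for naive Bayes with add-one smoothing.
counts = {intent: Counter(w for ex in exs for w in ex.split())
          for intent, exs in TRAINING.items()}
vocab = {w for c in counts.values() for w in c}

def classify(utterance: str) -> str:
    words = utterance.lower().split()
    def score(intent: str) -> float:
        total = sum(counts[intent].values())
        # Sum of smoothed log-probabilities of each word under the intent.
        return sum(math.log((counts[intent][w] + 1) / (total + len(vocab)))
                   for w in words)
    return max(counts, key=score)

print(classify("I wanna call John Smith"))  # -> make_call
print(classify("can you play songs"))       # -> play_music
```

Unlike the exact-match lookup above, the classifier tolerates wording it has never seen, because it scores utterances by how probable their words are under each intent.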

Author: Thomas Lee

Thomas Lee came to Xconomy from Internet news startup MedCityNews.com, where he launched its Minnesota Bureau. He previously spent six years as a business reporter with the Star Tribune in Minneapolis. Lee has also written for the St. Louis Post-Dispatch, Seattle Times, and China Daily USA. He has been recognized several times for his work, including the National Press Foundation Fellowship on Alzheimer's disease, the East West Center's Jefferson Fellowship, and the MIT Knight Center Kavli Science Journalism Fellowship on Nanotechnology. Lee is also a former Minnesota chapter president for the Asian American Journalists Association and a former board member with Mu Performing Arts in Minneapolis.