Expect Labs Anticipates a Day when the Computer Is Always Listening

Today’s computers are like distracted middle-school students: you practically have to scream at them to get their attention. To tell a search engine you want a news article, you have to type a few words into a search box on your phone or laptop. To tell Siri or Google Now that you need a map or a phone number, you have to hold down a button first.

But they’re just machines—why should we have to get their attention? Shouldn’t it be the other way around? Why aren’t our devices listening to us all the time, ready to respond the moment we need them?

Well, in fact, they’re starting to do this. Google’s new Moto X phone, which goes on sale this week, contains a low-power chip whose only job is to listen for the phrase “Okay Google Now,” which alerts the device to turn your next words into a command or a search query.
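
In software terms, the trigger is just a pattern match running in a loop. Here is a minimal Python sketch of that control flow, assuming the audio has already been transcribed into a stream of words; the real chip matches acoustic patterns in dedicated low-power hardware, and the function below is my illustration, not Google’s implementation.

    # Sketch of trigger-phrase detection over a pre-transcribed word stream.
    # The real Moto X does this acoustically, in hardware; this only shows
    # the control flow: wait for the phrase, then capture what follows.

    TRIGGER = ("okay", "google", "now")

    def words_after_trigger(word_stream):
        """Return the words spoken after the first occurrence of the trigger."""
        window, command, capturing = [], [], False
        for raw in word_stream:
            word = raw.lower().strip(".,!?")
            if capturing:
                command.append(word)
                continue
            window = (window + [word])[-len(TRIGGER):]  # sliding window
            if tuple(window) == TRIGGER:                # trigger phrase heard
                capturing = True
        return command

    speech = "so anyway okay google now navigate to the nearest cafe".split()
    print(" ".join(words_after_trigger(speech)))  # navigate to the nearest cafe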

But that’s a small step—in a way, it just substitutes a trigger phrase for a button-push. Soon, your phone or laptop may be able to go much farther: tracking everything you say; searching for related personal data or Web resources; and showing the results to you proactively, just in case you’re interested.

At least, that’s the vision at Expect Labs. The San Francisco startup, which is backed by an array of high-profile investors like Samsung, Google Ventures, Telefonica, Intel, Liberty Global, IDG, and Greylock, is pushing a concept it calls “anticipatory computing”—and sooner or later, it’s likely to become part of everyone’s computing experience.

“In just a few years, the search engine on your phone is not going to be waiting around to be asked questions,” says founder and CEO Tim Tuttle. “You want it to pay attention continuously when something is happening in your life, so that it can anticipate your question before you pull your phone out of your pocket.”

Set aside, if you can, the fact that the National Security Agency may also want to pay attention continuously—and that a world where computers can anticipate our every need would, in effect, be a world of total electronic surveillance. That’s a privacy tradeoff that each individual consumer will have to consider carefully, in light of this summer’s revelations about the startling scope of the federal government’s eavesdropping programs.

Expect Labs’ MindMeld app shows results relating to a conversation about restaurants in San Francisco.

The thing you really need to understand about Tuttle and his crew at Expect Labs, who aim to release a showcase mobile app called MindMeld this fall, is that they don’t care so much about whether their software can answer questions or respond to commands, the way “virtual personal assistants” like Siri or Google Now can. Those tasks come down to speech recognition, semantics, and grammar. Most of the computing cycles involved go toward figuring out what the user meant and responding appropriately, not developing a bigger picture of the user’s context.

Anticipatory computing, in Tuttle’s world, is completely different. He says it’s about using signals and data from your devices to construct “a model that represents what is happening in your life,” then taking a more-is-better approach, “proactively searching across all the data sources you care about.” It’s about statistics, relationships, and educated guesses.

“If you have the ability to listen all the time, it dramatically improves usability, because then you can talk to your computer the same way you talk to a person, where you assume they are up to speed on what they’ve heard,” Tuttle says. “That’s what we are building toward.”

The MindMeld app for the iPad and Android tablets will demonstrate the whole idea in the context of teleconferencing. The app works like Skype or any other Voice-over-IP app, except that it’s constantly listening to your side of the conversation in the background and showing a series of appointments, contacts, Web clips, news articles, and other resources related to whatever you’re talking about. Say you’re calling friends to invite them over for dinner, and the conversation veers toward food choices. MindMeld might hear you mention Italian food and show a recipe for fettuccine Alfredo.
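
To see how little machinery the most basic version of that idea requires, here is a toy Python sketch of the listening loop: pick out the salient words in the transcript and match them against an index of resources. The stopword list and content index below are invented for illustration; Expect Labs’ actual engine is presumably far more statistical than this.

    # Toy version of MindMeld-style "listening": count the salient words in
    # the speaker's transcript and look them up in a content index.

    from collections import Counter

    STOPWORDS = {"the", "a", "an", "to", "for", "and", "we", "could", "do", "over", "come"}

    # Hypothetical index mapping topics to resources the app could surface.
    CONTENT_INDEX = {
        "italian": ["Recipe: fettuccine Alfredo", "Map: Italian restaurants nearby"],
        "dinner": ["Calendar: free evenings this week"],
    }

    def suggest(transcript, top_n=2):
        """Rank the speaker's recent words and return matching resources."""
        words = [w.lower().strip(".,!?") for w in transcript.split()]
        counts = Counter(w for w in words if w not in STOPWORDS)
        hits = []
        for word, _ in counts.most_common():
            hits.extend(CONTENT_INDEX.get(word, []))
            if len(hits) >= top_n:
                break
        return hits[:top_n]

    print(suggest("Come over for dinner, we could do Italian, real Italian food"))
    # ['Recipe: fettuccine Alfredo', 'Map: Italian restaurants nearby']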

But as cool as that sounds, Expect doesn’t expect to stay in the app business. MindMeld is designed mainly just to demonstrate the capabilities of the company’s underlying “Anticipatory Computing Engine.” The real show will get started later this year when Expect Labs gives outside programmers access to the APIs, or application programming interfaces, that will let them use the engine to power their own apps. In other words, Expect Labs wants to provide the smarts that make other companies’ software anticipatory, whether that software is being used by smartphone owners, call-center employees, or drivers in connected cars.
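
What those API calls will look like hasn’t been published yet, but a client integration would presumably follow a familiar pattern: ship a fragment of context to the engine, get back ranked suggestions. The sketch below is purely hypothetical; the endpoint, field names, and auth scheme are all invented.

    # Purely hypothetical client for an anticipatory-computing API: post a
    # snippet of context, receive ranked suggestions. Nothing here reflects
    # Expect Labs' real API, which was unpublished at the time of writing.

    import requests

    API_URL = "https://api.example.com/v1/context"  # placeholder, not real

    def fetch_suggestions(snippet, api_key):
        """Send a fragment of conversation; return related documents."""
        resp = requests.post(
            API_URL,
            json={"text": snippet, "max_results": 5},
            headers={"Authorization": "Bearer " + api_key},
            timeout=10,
        )
        resp.raise_for_status()
        return resp.json().get("results", [])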

Expect Labs was formed in 2011 by a team of researchers from MIT, Carnegie Mellon, and Stanford, “most of whom have PhDs in statistical search, natural language understanding, and speech recognition,” according to Tuttle. After getting his own computer-science PhD at MIT, Tuttle came west to found Bang Networks, a content distribution network for real-time data, and then Truveo, a video search engine acquired by AOL in 2006.

The insight that grabbed Tuttle, within a couple of years after he left AOL in 2008, was that “search is becoming conversational, real-time, driven by speech and language as a key input.” Smartphones and tablets were the main drivers of this change. “These devices are with us all the time and have access to all sorts of sensor data, including live audio and video,” Tuttle says he realized. “Those could become the inputs to let an intelligent discovery engine find what you need.”

Tuttle assembled a group of computer geniuses (including former Nexidia researcher Marsal Gavaldà, machine learning expert Simon Handley, and DNAnexus and scalable computing veteran Pete Kocks) and set to work building that engine. In its current form, the Expect platform has three main functions. First, it analyzes the signals coming in from a user’s device: audio, video, GPS, and more. Second, it uses these signals to build a model of what is happening in the user’s life. Third, it proactively searches the data sources the user cares about, surfacing whatever relates to that model.
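
Sketched as code, that three-stage pipeline might look something like the stub below; the class and method names are invented for illustration and aren’t Expect Labs’.

    # Schematic of the three-stage pipeline described above, with each stage
    # stubbed out: analyze signals, update a context model, search proactively.

    class AnticipatoryPipeline:
        def analyze_signals(self, audio_text, location):
            """Stage 1: turn raw device signals into structured observations."""
            return {"words": audio_text.lower().split(), "location": location}

        def update_model(self, observations, model):
            """Stage 2: fold observations into a model of the user's context."""
            model.setdefault("topics", []).extend(observations["words"])
            model["location"] = observations["location"]
            return model

        def proactive_search(self, model, sources):
            """Stage 3: query every source the user cares about for related items."""
            return [hit for source in sources for hit in source(model)]

    pipeline = AnticipatoryPipeline()
    model = pipeline.update_model(
        pipeline.analyze_signals("Italian dinner", "San Francisco"), {}
    )
    hits = pipeline.proactive_search(
        model,
        [lambda m: ["Recipe: fettuccine Alfredo"] if "italian" in m["topics"] else []],
    )
    print(hits)  # ['Recipe: fettuccine Alfredo']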

Author: Wade Roush
