Stephen Wolfram Talks Bing Partnership, Software Strategy, and the Future of Knowledge Computing

needs to pull in—specific types of knowledge, from linguistics [for example]—to how you make servers work well. That was a tremendous exercise in being able to pull the right resources together and get very disparate groups to communicate well.

We have a “who knows what” database—for when we have a question about who in the company knows about mechanical engineering [say]. It has nothing to do with geography. It has to do with what questions you ask of people, and having a company culture where it’s conceivable you might have “random question X.”

X: Let’s talk about the genesis of Wolfram Alpha, which was released in May.

SW: I’d been kind of thinking about what makes knowledge computable for a long time. Like many of these things, the idea is only clear after you’ve built something serious about it. Looking back, I was sort of embarrassed to find things I was doing when I was 12 years old—gathering scientific information and putting it on a typewriter. I’d been thinking about how one makes knowledge systematic for a long time.

At the beginning of the ’80s, when I was starting to work on NKS [A New Kind of Science], I had built a [computing] language called SMP. I was wondering how far you could get formalizing knowledge, and how that relates to AI-ish things. At the time, I thought making all knowledge formal was too hard—that we couldn’t do it.

After finishing NKS [in 2002], I was thinking—you can get complexity from simple rules. Can we make a large swath of human knowledge computable? I got more serious about that. At the beginning, it was really unclear this would be possible. There’s just too much data in the world, too many topics, you can’t understand the linguistics [of queries], you can’t deliver the stuff fast enough.

For the linguistics, we used the NKS system. For years, people were trying to do natural language processing and to make computers understand written text. It turns out to be really hard, but what do you mean by “understand”? For us, we have a very clear target: is this related to something we can compute?
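To make that target concrete: “understanding” in this sense just means a query can be mapped onto something the system knows how to compute. The sketch below is a purely hypothetical illustration of that idea, not Wolfram Alpha’s actual pipeline; every pattern and function name in it is made up.

```python
# Toy illustration (not Wolfram|Alpha's actual code): "understanding" a query
# here just means mapping it onto a computation we know how to perform.
import re

# Hypothetical registry of computable question patterns.
COMPUTABLE_PATTERNS = {
    r"distance from (\w+) to (\w+)": lambda a, b: f"compute great-circle distance({a}, {b})",
    r"boiling point of (\w+)":       lambda x: f"look up property(boiling_point, {x})",
    r"(\d+)\s*\+\s*(\d+)":           lambda a, b: str(int(a) + int(b)),
}

def interpret(query: str):
    """Return a computation plan if the query maps to something computable, else None."""
    for pattern, plan in COMPUTABLE_PATTERNS.items():
        match = re.fullmatch(pattern, query.strip(), re.IGNORECASE)
        if match:
            return plan(*match.groups())
    return None  # not "understood": nothing here to compute

print(interpret("boiling point of water"))  # -> look up property(boiling_point, water)
print(interpret("tell me a story"))         # -> None
```

A query that matches no computable pattern is simply not the system’s problem, which is what makes the target so much narrower than general text understanding.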

X: Can you give more details on how it works? How do you interpret a query?

SW: We’ve had to build our own big edifice of linguistic processing to handle what we want. I wasn’t sure if it was possible. I thought there might be too much ambiguity. You might have to see the person—see if they were dressed in a spacesuit or in surgeon’s garb—to get enough context. As it turns out, it hasn’t been a huge problem. There’s enough sparsity in human expression. By the time someone is asking anything real, you have enough context. The whole thing is full of heuristics. Any sequence [of terms or numbers] could be anything. But if it’s the name of a town with a population of 20, and it’s 6,000 miles away from where the query is being asked, that’s unlikely [to be relevant].
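The town-of-20-people example suggests the general shape of such heuristics: enumerate the candidate interpretations of an ambiguous term, then score them with contextual signals such as the asker’s location and each candidate’s prominence. The following is a hypothetical sketch of that kind of scoring, not Wolfram Alpha’s real algorithm; the weighting formula is invented for illustration.

```python
# Hypothetical sketch of context-based disambiguation, in the spirit of the
# heuristics described above (not Wolfram|Alpha's actual algorithm).
from dataclasses import dataclass
from math import log10

@dataclass
class CityCandidate:
    name: str
    population: int
    distance_km: float  # distance from where the query was asked

def plausibility(c: CityCandidate) -> float:
    # Bigger places are more likely to be meant; tiny, far-away places are discounted.
    prominence = log10(max(c.population, 1))
    nearness = 1.0 / (1.0 + c.distance_km / 1000.0)
    return prominence * nearness

candidates = [
    CityCandidate("Springfield (pop. 20, remote)", 20, 6000.0),
    CityCandidate("Springfield (pop. 150,000, nearby)", 150_000, 300.0),
]

best = max(candidates, key=plausibility)
print(best.name)  # the large nearby city wins; the tiny far-away town is unlikely to be meant
```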

X: Where does Wolfram Alpha get the data with which it computes answers?

SW: The truth is very little of our data comes from the Web. The Web is a great place to know what’s out there, but in the end, for every one of thousands of domains, we’ve gone to the primary data source and gotten the most original, most useful source. One exception to that, I suppose, is what happens with linguistic things. Wikipedia is really useful to us—if we have an entity, a chemical, a movie, what do people actually call this?
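One way to picture that use of Wikipedia is as an alias table: the everyday names people actually use for an entity, pointing at a canonical record whose data came from a primary source. A minimal, hypothetical sketch follows; all names and figures in it are illustrative, not Wolfram Alpha’s real tables.

```python
# Hypothetical sketch: mapping everyday names (the kind Wikipedia reveals people
# actually use) onto canonical entities backed by primary-source data.

ALIASES = {
    "aspirin": "acetylsalicylic acid",
    "salt": "sodium chloride",
    "the big apple": "New York City",
}

CANONICAL_DATA = {
    "acetylsalicylic acid": {"formula": "C9H8O4", "molar_mass_g_mol": 180.16},
    "sodium chloride": {"formula": "NaCl", "molar_mass_g_mol": 58.44},
    "New York City": {"population": 8_800_000},
}

def lookup(surface_name: str):
    """Resolve a colloquial name to its canonical entity, then fetch its data."""
    canonical = ALIASES.get(surface_name.lower(), surface_name)
    return canonical, CANONICAL_DATA.get(canonical)

print(lookup("Aspirin"))  # ('acetylsalicylic acid', {'formula': 'C9H8O4', ...})
```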

The reason that Wolfram Alpha has been at all possible for us is we’re starting with

Author: Gregory T. Huang

Greg is a veteran journalist who has covered a wide range of science, technology, and business. As former editor in chief, he oversaw daily news, features, and events across Xconomy's national network. Before joining Xconomy, he was a features editor at New Scientist magazine, where he edited and wrote articles on physics, technology, and neuroscience. Previously he was senior writer at Technology Review, where he reported on emerging technologies, R&D, and advances in computing, robotics, and applied physics. His writing has also appeared in Wired, Nature, and The Atlantic Monthly’s website. He was named a New York Times professional fellow in 2003. Greg is the co-author of Guanxi (Simon & Schuster, 2006), about Microsoft in China and the global competition for talent and technology. Before becoming a journalist, he did research at MIT’s Artificial Intelligence Lab. He has published 20 papers in scientific journals and conferences and spoken on innovation at Adobe, Amazon, eBay, Google, HP, Microsoft, Yahoo, and other organizations. He has a Master’s and Ph.D. in electrical engineering and computer science from MIT, and a B.S. in electrical engineering from the University of Illinois, Urbana-Champaign.