Today I’m going to tell you about an up-and-coming Silicon Valley startup, Neo Technology, that sells a radical new type of database.
I know, databases are about as exciting to the average technology user as carburetors and doorstops. But before your eyes glaze over and you click on to the next article, let me explain why you should care.
It’s pretty clear that the biggest winners in Silicon Valley in the past decade have been the companies that understood and exploited connections—between Web pages, in the case of Google, or between people, in the cases of Facebook, LinkedIn, and Twitter. To build their empires, all of these companies had to painstakingly develop several new types of databases capable of representing and sorting through such connections. One type is called a graph database. I wrote about an important example, Google’s Knowledge Graph, back in December.
Neo’s database, called Neo4j, is the first commercial, off-the-shelf graph database. Any company can use it; no longer do you have to build your own graph database to take advantage of connected data.
Do I have your attention yet?
“There are two types of data: atomic data about single individuals, and connective data about how various elements are connected,” argues Neo’s co-founder and CEO, Emil Eifrem. “There are a bunch of industries that have only exploited atomic data so far. And what we are seeing—what has played out in several industries—is that when a guy or girl comes along who starts exploiting the connections, it revolutionizes that industry.”
Of course, every startup CEO talks about how his company’s technology is revolutionizing the world. But Eifrem, an uncharacteristically brash Swede, goes even further. He thinks the companies that fail to understand the connections in their data will inevitably be left behind. “Whoever you are, you are eventually going to have to exploit connected data in your industry, or you are going to go out of business, because somebody else will,” he says.
For many applications, analyzing connections at large scale means abandoning the relational database model that has dominated the computer industry for the last 40 years. It’s not that relational databases can’t hold connective data; they’re just not very good at it. Neo4j, by contrast, was designed from the ground up to represent relationships between entities, right down to the way the data is recorded on a disk.
To see the power of a graph database, consider this example. Eifrem says a big social network that he isn’t allowed to name approached his company to ask for a demonstration of Neo4j. It handed the startup a sample dataset representing 1,000 people connected in a network; each person had an average of 50 friends.
The assignment: select any two people at random and find out if they know each other directly, or are connected by a mutual friend, or a friend of a friend, or a friend of a friend of a friend. That’s the kind of thing that sites like LinkedIn or Facebook need to do all the time, by the way. And when they do, they usually need the answer in less than half a second, or users get impatient.
When Neo’s engineers loaded the network data into a standard MySQL relational database and ran the query, it took two whole seconds to get an answer. When they put the data into a Neo4j graph database, the query took 2 milliseconds—a thousandth as long as the old method.
And when they expanded the sample to a million people, the response time was still the same: 2 milliseconds. (The math is complicated, but the fundamental advantage of a graph database is that the data is stored in a way that makes traversing a web of connections lightning-fast, almost regardless of the size of the web.)
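For readers who want to see what that kind of query looks like, here is a minimal sketch in Python. It is not Neo4j's code or its query language; it is a plain in-memory breadth-first search, and the function name `degrees_of_separation`, the `max_hops=4` cutoff, and the six-person network are all made up for illustration. The idea it shows is the one in the parenthetical above: when each person's record points directly at that person's friends, the search only touches the corner of the network near the starting point, so the work depends on how far out you look rather than on how many people are in the database.

```python
from collections import deque


def degrees_of_separation(friends, start, target, max_hops=4):
    """Return the number of hops between two people, or None if they are
    more than max_hops apart. `friends` maps each person to a list of
    direct friends (a hypothetical in-memory stand-in for a graph store)."""
    if start == target:
        return 0
    visited = {start}
    frontier = deque([(start, 0)])  # (person, hops from start)
    while frontier:
        person, hops = frontier.popleft()
        if hops == max_hops:
            continue  # do not look beyond the hop limit
        for friend in friends.get(person, ()):
            if friend == target:
                return hops + 1
            if friend not in visited:
                visited.add(friend)
                frontier.append((friend, hops + 1))
    return None


# A tiny made-up network: a simple chain of six people.
friends = {
    "ana": ["ben"],
    "ben": ["ana", "cho"],
    "cho": ["ben", "dev"],
    "dev": ["cho", "eli"],
    "eli": ["dev", "fay"],
    "fay": ["eli"],
}

print(degrees_of_separation(friends, "ana", "ben"))  # direct friends
print(degrees_of_separation(friends, "ana", "dev"))  # three hops apart
print(degrees_of_separation(friends, "ana", "fay"))  # beyond the limit
```

In a relational database, each additional hop typically means another join against the friendship table, which is where the cost tends to pile up as the hop count grows.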
“Any time you enable things that are not just 10 or 20 percent better, but 1,000 times better, then your entire world changes and you can do completely new things,” says Eifrem. “We ended up winning that customer.”
Graph databases aren’t magical, and in some ways they’re harder to work with than relational databases or other types of so-called “NoSQL” databases. But companies in numerous industries are starting to put the technology to work.
Eifrem says his San Mateo-based startup, which is backed by $24 million in venture funding from Fidelity Growth Partners Europe, Sunstone Capital, and Conor Venture Partners, has paying customers in areas like hardware (Cisco), professional networking (Glassdoor and Viadeo), publishing (Bloomberg), telecommunications (Deutsche Telekom, Telenor, and SFR), office equipment (Pitney Bowes), and content management (Adobe Systems).
“Ten years ago the big Web giants like Facebook, Google, and Twitter had to hire the best and brightest out of Stanford to build tools to process this data,” says Eifrem. “Now you can buy it from us the same way you can buy a relational database from Oracle.”
Eifrem himself is one of Sweden’s best and brightest. He says he taught himself to program as a teenager by building a text-based role-playing game. (He confesses that he thought spending 18 hours a day programming would make him more attractive to “hot Swedish blond chicks.” He didn’t say how that theory worked out.)
Eifrem’s mandatory stint in the Swedish Armed Forces in the late 1990s ended in the middle of the academic year, and to kill time before starting at university he joined a startup called Windh Technologies that was building