distributed safely across different servers, because related data is always pre-joined inside individual documents—just as it is in the real world.
“A classic example from the relational database world is, say we want to store customer orders,” says Merriman. “So you’d have an order table, and a line-item table, and for a given order there might be 10 items in the line-item table. But if you’re doing something more complex, like adding a ‘ship to’ and ‘bill to’ address, all of a sudden you might have 20 or 30 tables representing a single order. When you do a query, you have to go find 30 pieces of information, and if they’re not on a single machine, bringing them together is very hard. But in MongoDB, you would have order documents, with one document per order, and that’s the end. It’s all in one place already.”
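Merriman's order example can be sketched in a few lines of Python. This is only an illustration of the idea, not MongoDB's actual API: the field names, the sample values, and the helper functions `order_total` and `relational_total` are all hypothetical, and plain dictionaries and lists stand in for documents and tables.

```python
# Hypothetical order stored as one self-contained document:
# addresses and line items are embedded, so nothing has to be joined.
order_document = {
    "order_id": 1001,
    "customer": "Ada Example",
    "ship_to": {"street": "1 Main St", "city": "Springfield"},
    "bill_to": {"street": "9 Oak Ave", "city": "Shelbyville"},
    "line_items": [
        {"sku": "A-100", "quantity": 2, "unit_price": 9.50},
        {"sku": "B-200", "quantity": 1, "unit_price": 20.00},
    ],
}


def order_total(order):
    """Sum the line items embedded in a single order document."""
    return sum(item["quantity"] * item["unit_price"] for item in order["line_items"])


# The same order split relational-style into separate "tables".
# Answering a question about the order means matching rows by order_id,
# which gets harder when the tables live on different machines.
orders_table = [{"order_id": 1001, "customer": "Ada Example"}]
line_items_table = [
    {"order_id": 1001, "sku": "A-100", "quantity": 2, "unit_price": 9.50},
    {"order_id": 1001, "sku": "B-200", "quantity": 1, "unit_price": 20.00},
]


def relational_total(order_id):
    """Emulate a join: scan the line-item table for rows matching the order."""
    return sum(
        row["quantity"] * row["unit_price"]
        for row in line_items_table
        if row["order_id"] == order_id
    )


print(order_total(order_document))  # one lookup, everything in one place
print(relational_total(1001))       # same answer, but assembled from two tables
```

Both functions return the same total. The difference is that the document version reads a single record, while the relational version has to gather rows from a second table first.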
MongoDB and its cousins are called NoSQL databases because they don’t necessarily depend on SQL, the structured query language used to retrieve data from MySQL and similar databases. But a better term would be “non-relational” databases, Merriman says, since it’s the self-contained nature of each document that makes them so scalable.
Interestingly, however, Merriman and Horowitz didn’t originally think of 10gen as “the MongoDB company.” When they set out in 2007, they wanted to build a service that hosted other Web companies’ software and provided a complete infrastructure stack; MongoDB was just the database layer in that stack. Today that kind of offering is called a “Platform as a Service” (PaaS), and it’s a niche being exploited successfully by companies like Heroku, Cloud Foundry, Microsoft (with Azure) and Red Hat (with OpenShift). But back in 2007, it was too early—companies weren’t ready to hand over their business-critical functions in this way. “It’s been a slow curve to maturity in PaaS,” says Merriman. “Five years from now, you will see a lot more of that.”
After less than a year, 10gen shifted gears to focus on the data layer of its platform. “We had beta users who were getting really excited about MongoDB,” says Merriman. “They were asking us ‘Can I use this elsewhere, outside the stack?’ We said, ‘Gee, that could make sense—it’s half of what we’re building anyway. So let’s make something that can run anywhere.’”
10gen released the first open-source version of MongoDB at a 2009 NoSQL meetup in San Francisco, organized by hosting provider Rackspace and streaming music startup Last.fm. For Merriman and Horowitz, life morphed into a series of speaking engagements at Ruby, Java, PHP, and Python conferences, where they’d tout MongoDB’s scalability and 10gen’s services. Following a script established by Red Hat, Acquia, and other companies in the open source world, Merriman and Horowitz decided that 10gen’s best strategy was to keep MongoDB free, continue to improve its root code, and try to make money selling services like deployment support, application development consulting, and training.
Merriman says his big worries during this period were the typical ones shared by all tech entrepreneurs: “Will anyone ever use it? Will they like it? Will it be big companies or just small companies? Will they pay us for services? Is our market share good?”
But once a few big outfits like Shutterfly and Craigslist started using MongoDB, word about 10gen began to spread organically among developers. At first, the database mainly appealed to startups where developers “didn’t want to buy these $100,000 Sun boxes” to run their Web applications, but preferred to spread them across lots of cheap servers, Merriman says. And that’s where