From DNA sequencing to high finance to time-wasting games, our ever-growing networks of powerful computers are generating piles of data about virtually every facet of life. That makes it a pretty good time to be a high-level data scientist.
Stephen Purpura should know—he’s one of them. Educated at Harvard and Cornell, the former Microsoftie has seen the giant tech companies stockpile the top people in his field. And that means fewer minds unlocking all the potential of “big data” for the mere mortals of the business world.
Purpura hopes to change that equation with a new company, Context Relevant, that has raised $1.3 million in seed financing from Madrona Venture Group and several angels, including Geoff Entress, Mike McSherry, and Cliff Kushler.
The startup, headquartered in Seattle, is still working on early versions of its product. But broadly, the company is hoping to turn the specialized knowledge of top data scientists into a software product that other companies can use to decode the data piling up on their servers.
“As customers start to try and monetize their big data—deepen customer relationships, find ways to make the right inventory decisions—they need data science help. But they can’t afford to acquire it,” says Purpura, Context Relevant’s co-founder and CEO. “We hire data scientists so that they don’t have to.”
Other companies working in this area include Palantir, which does a lot of work in defense and finance, and Precog, based in Boulder, CO. And New York-based Context Matters helps drug companies make sense of data related to regulatory decisions, insurance reimbursement, and other issues relevant to the pharmaceutical industry.
A software-as-a-service offering for enterprise computing isn’t a revolutionary concept, of course. But as Purpura describes it, the arms race for data scientists has led to an industry that is badly fragmented.
Since high-performing individual data scientists are so valuable and require so much expensive machinery to do their jobs, he says, there tend to be a lot of one-off projects rather than broadly applied solutions.
“The challenge in this area has always been that all the solutions are custom. Microsoft and Google pay a ton of people who are trained like me to build custom solutions,” he says. “It’s literally so bad that every set of keywords in Google has its own algorithm.”
And from the point of view of the scientists—well, there’s just more incentive to work in that system than there is in making a software product that puts the equivalent of a smart data scientist to work for lots of other people.
Other businesses, however, are apparently ready to unlock more of the knowledge that data scientists have to offer. Christian Metcalfe, the startup’s co-founder and product VP, likens the current climate to the early days of Web software, when software and infrastructure providers built a business on creating websites for retailers or manufacturers who had other things to take care of.
“We’re hearing from customers that range from everything from financial services to big Web data to even credit card fraud: They don’t care about owning the scaffolding. They don’t care about owning all the code from the ground up. What they care about is the results,” Metcalfe says. “It doesn’t matter to them whether it was their data scientists or a group of data scientists working on our application. In fact, they prefer the application part because it lets them get up and running faster and more cheaply.”
Context Relevant says the seed funding gives it about a year of runway to bring on more scientists and refine the product through tests with potential customers.
“This is an early phase in experimentation for us,” Purpura says. “If we had taken a Series A, the expectation would be that we would be raising revenues dramatically. We want a little bit less pressure to allow us to feel out and make sure that we’re going after the right things.”