Depending on whom you ask, “big data” is either (a) still the future of technology and business, or (b) the fastest way to kill a serious conversation.
Christopher Gillett understands both ends of the spectrum. On the one hand, the newly hired CTO of Boston startup Shareaholic sees the buzzword as increasingly meaningless. “The term ‘big data’ has really jumped the shark,” he says. “Everyone is hyping that they’re big-data people.”
On the other hand, Gillett (pictured) has built his 25-year career working on things like high-performance software, analytics, and grid computing. From his time at Digital/Compaq and Oracle through his more recent experiences at local startups, he has seen the evolution of technologies and business problems that have come to be known as big data.
Shareaholic, which makes tools for sharing and discovering Web content, is the fifth Boston-area startup Gillett has worked for since 2001 (more on the company below). His tour of duty over the past decade includes some well-known names in local tech lore: Compete, Vertica Systems, Visible Measures, Swoop.
So I asked Gillett to share his thoughts and lessons, looking back at those companies and ahead to his new one (some of it gets a bit tech-y):
—Compete: “We very nearly built Hadoop at Compete. We had constructed a system that decomposed SQL-like queries into small jobs, and then dispatched those jobs to multiple nodes in a grid using a queuing and job management system, merged and sorted results, etc. The semantics of the system very closely resembled MapReduce. If I had a cool looking stuffed elephant maybe I would have been the next Doug Cutting…”
—Vertica: “Working at Vertica was motivating and humbling. It was a real privilege to interact daily with super smart people like Mike Stonebraker, Andy Palmer, and Shilpa Lawande. Working on database tech from the inside out reminds you that ‘big data’ is not just about volume (petabytes and petabytes of it), but also about doing good computer science, algorithm design, etc., in order to efficiently represent and manage all of it. It’s not the size of your data, it’s how well you can get at it.”
—Visible Measures: “We built the original systems used at Visible Measures as the ‘big data’ movement was just getting under way. I went to the first-ever Hadoop Summit in California, and there were perhaps 300 people in attendance. Visible Measures was one of the first companies to be described as ‘Powered by Hadoop,’ and for a time our grid was one of the largest ones used outside of Yahoo and academia. Now there are two summits (east and west) each year, and close to 3,000 people attend each one. Hadoop in 2007 felt like the tip of the iceberg, and it really was.”
—Swoop: “There are very few companies out there who will tell you they’re planning to boil the ocean and then actually do it. Swoop is such a company: big data, and big (really big) brains.” (See this recent Xconomy profile of Swoop.)
—Shareaholic: “We reach over 300 million people every month, and we’re doing it with a state-of-the-art tech stack and a team of talented and motivated engineers. There are big challenges and huge opportunities here, with interesting problems to solve every day. It’s an exciting time in the life of the company, especially as we grow the engineering team and continue to hire great talent.”
Sounds like it’s still early days for Shareaholic, though the company has been around since 2009. CEO Jay Meattle, who worked with Gillett at Compete, says he is focusing on building out the management team and “positioning us to be a long-term sustainable company.”
Shareaholic has about a dozen employees and seems to be gaining momentum among Web publishers using its software. Now the challenge, as always, is how to make money.
The key to that, if you listen to Gillett and Meattle, lies in making sense of all the company’s data—understanding the content of Web pages, recommending other relevant content for readers, and using all that knowledge to drive more traffic for publishers. Part of Gillett’s charge, at least, is to take the startup’s technologies in areas like natural language processing and machine learning and “roll that up in an intelligent way and expose it to a really big user base,” he says.
To sum it all up, I asked Gillett for his overarching lesson about big data—and the pace of startups today.
“Don’t think that anything you know today is relevant tomorrow,” he says. “Technologies that were transformative even 36 months ago are commonplace now.”