Netezza Chief Talks About “Formative” PTC Days, IBM Deal History, and the Future of Big Data

It was frustrating for a lot of companies in that era. The basic economic principles seemed to no longer apply. And, of course, that proved not to be the case over time.

There was an allure for many people in leaving an established enterprise technology company to try their hand at a startup. The bubble burst at exactly the time I went to Endeca. I joined in summer 2001, when we were raising money. Steve Papa had founded Endeca, and they were doing OK, maybe 20 employees. We were closing that round of investment on 9/11. I stayed there for five years. We built the company from the ground up. When I got there, a lot of the technology heavy lifting had been done. They had built the product but they hadn’t quite found the market for it. We had to go and find the market, move the technology into the market, find the value proposition, build a sales force around it, build a marketing engine around it, finish the product for various use cases, and build a company. That was a tremendously fun activity. I joined as the COO, and left as the CEO.

X: So now it’s 2006. What brought you to Netezza?

JB: It was founded by Jit Saxena in 2000 with Foster Hinshaw, who went on to Dataupia. Netezza had built a remarkable technology. It had some great technologists, a lot of real in-depth experience with large-scale, complex enterprise systems, and a very interesting blend of hardware and software—what we call an appliance today. The data warehousing market was very well understood. The existing technologies were very complex, expensive, difficult to deploy, with a very long time to value, and increasingly unwieldy and slow. There was lots of end-user resentment around [business intelligence]. Data warehousing had sort of become a dirty word; it was not something with which you could easily succeed. The competitive landscape was Oracle, Teradata, IBM, and a bunch of little companies. But to this day, our primary competition remains Oracle. And we’ve made a very conscious effort to go after Oracle’s weakness in this particular aspect of database processing: their stuff is very expensive and very slow.

We [at Netezza] were able to demonstrate orders of magnitude—10, 100, 1,000 times—better performance, in a system you could deploy in days, not months or years. You could get value from it quickly, and you could scale.

X: So where did all this “big data” come from, and where is it going?

JB: Big data has become an interesting buzzword in the industry, but what is it, really? Look around at what’s happening in the world. Data collection is becoming ubiquitous. I drove in here from Natick, I went through two toll booths, I made five phone calls—I produced a pretty big data trail on my way here. I logged into the Internet this morning and went to three news websites, all of which I have subscriptions or logins or cookies for. I met with a buddy for breakfast who started a company that has this very cool online coupon technology that’s based on my preferences, who I am, what my purchase history is. It’s everywhere. We’re now talking about appliances that report back to the electric grid. We’re looking at sensors on every vehicle known to man.

A single rack of Netezza is a massively parallel grid of compute and storage nodes, linked in a special way. In 2006, we were in the process of building our first 100-terabyte machine. It was eight refrigerator-size racks, approximately 1,000 compute nodes. I remember thinking nobody is going to buy a 100-terabyte machine. This is crazy, this is huge, right? Wrong. They sold like hotcakes. Catalina Marketing bought the first one. Today, in a TwinFin, our current product, a single rack can hold well over 100 terabytes of data with compression. We have systems running today with north of 1 petabyte.
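To make the idea behind that grid concrete, here is a minimal sketch (in Python, with hypothetical names and toy data) of the shared-nothing, scatter-gather pattern such MPP appliances use: rows are hash-partitioned across nodes, each node filters and partially aggregates only its own slice, and the host merges the small partial results. Netezza's actual appliance pushes this work into specialized hardware close to the disks; this sketch only illustrates why adding nodes scales the workload.

```python
from collections import defaultdict

NUM_NODES = 4  # a real rack has hundreds of compute/storage nodes

def distribute(rows, key, num_nodes=NUM_NODES):
    """Hash-partition rows so each node owns a disjoint slice of the data."""
    nodes = [[] for _ in range(num_nodes)]
    for row in rows:
        nodes[hash(row[key]) % num_nodes].append(row)
    return nodes

def node_scan(local_rows, predicate, group_key, agg_field):
    """Each node filters and partially aggregates its own slice, in parallel."""
    partial = defaultdict(float)
    for row in local_rows:
        if predicate(row):
            partial[row[group_key]] += row[agg_field]
    return partial

def host_merge(partials):
    """The host merges only small partial results, never the raw data."""
    merged = defaultdict(float)
    for partial in partials:
        for key, value in partial.items():
            merged[key] += value
    return dict(merged)

# Toy query: SELECT region, SUM(amount) FROM sales
#            WHERE amount > 50 GROUP BY region
sales = [
    {"region": "east", "amount": 120.0},
    {"region": "west", "amount": 30.0},
    {"region": "east", "amount": 75.0},
    {"region": "west", "amount": 200.0},
]
nodes = distribute(sales, key="region")
partials = [node_scan(n, lambda r: r["amount"] > 50, "region", "amount")
            for n in nodes]
print(host_merge(partials))  # -> {'east': 195.0, 'west': 200.0}
```

Because each node touches only its own slice and ships back only aggregates, doubling the number of nodes roughly halves the scan time; that is the property that let a rack-scale machine handle 100 terabytes and beyond.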

We’re primarily focused on analytics around high-value structured data: transactions, loyalty cards, customer data, credit card and financial data, healthcare records, call detail records in the telco industry. Take IBM’s Watson [the Jeopardy machine]. That’s a workload-optimized system; behind the scenes it’s parsing through massive amounts of unstructured data. This is changing the world.

It’s not just about capturing and managing this data. That’s done. It’s extracting knowledge from these data assets in real time, not just to understand history, but to understand the future. Predicting, and then optimizing based on those predictions, will have the most significant and profound impact on business that we’ve seen from the technology industry. It’s way beyond…

Author: Gregory T. Huang

Greg is a veteran journalist who has covered a wide range of science, technology, and business topics. As former editor in chief, he oversaw daily news, features, and events across Xconomy's national network. Before joining Xconomy, he was a features editor at New Scientist magazine, where he edited and wrote articles on physics, technology, and neuroscience. Previously he was a senior writer at Technology Review, where he reported on emerging technologies, R&D, and advances in computing, robotics, and applied physics. His writing has also appeared in Wired, Nature, and The Atlantic Monthly’s website. He was named a New York Times professional fellow in 2003. Greg is the co-author of Guanxi (Simon & Schuster, 2006), about Microsoft in China and the global competition for talent and technology. Before becoming a journalist, he did research at MIT’s Artificial Intelligence Lab. He has published 20 papers in scientific journals and conferences and has spoken on innovation at Adobe, Amazon, eBay, Google, HP, Microsoft, Yahoo, and other organizations. He has a Master’s and Ph.D. in electrical engineering and computer science from MIT, and a B.S. in electrical engineering from the University of Illinois, Urbana-Champaign.