Data Domain Founder, Kai Li, on EMC Acquisition and the Future of Data Storage

as an independent business unit. I think it’s a good strategy.

X: Tell me about Data Domain’s key technology. Where did it come from?

KL: In IT, there are three big pieces interacting together in data centers. One is servers for information transformation. This is about computing—we can make faster and faster computers every year to help us transform information. Whether it’s mathematical models, simulations, or games, all of those involve transformation; we always need more compute power. The second piece is moving information—this is communication. This industry is quite big, led by Cisco. We need faster, safer ways to move information. The third thing is storage. We have to store and protect data. Our society is so related to data now. Almost all important information is digitized. Think of how we run businesses today. Even each person has lots of data generated from multiple devices and computers. Storing and protecting information are becoming very important.

The data growth rate has been really high, roughly on the curve of Moore’s Law. This poses a lot of issues. As the data keeps growing, we need to update our storage systems. We have to back up data to provide data protection to satisfy users’ requirements. So one thing I was thinking about in early 2001 was to address this issue. The key question is to figure out the most painful things people are doing in data centers.

X: So you got into disaster recovery for data centers?

KL: Yes. Probably the most painful thing we were dealing with was backups. Data centers had been using tape libraries for decades. Many things are bad about using tape. After writing data to tape, you may not be able to read it back reliably. And to satisfy requirements for disaster recovery, you need to ship data to a remote site. After California implemented legislation requiring a company that loses customer records to make that information public, we’ve seen a lot of news about banks and big organizations losing customer data due to tape transportation (moving tape offsite). They need to do this every day. Since humans are handling this, there’s a non-zero probability something will go wrong.

I started thinking about whether we can develop a new kind of product that can replace tape forever for data protection purposes. That’s when we invented deduplication storage systems, with my co-founders. The system we invented is able to achieve lossless compression of roughly 20-to-1. If you can shrink the footprint of data by that much using a disk-based storage system, you can
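To make the idea of deduplication and the "20-to-1" figure concrete, here is a minimal sketch. It is an illustration only and does not describe Data Domain's actual design, which the interview does not detail. It assumes fixed-size 4 KiB chunks and SHA-256 fingerprints, and the function names (dedup_store, restore, dedup_ratio) are hypothetical. Each unique chunk is stored once, and every backup is kept as a list of fingerprints that point to stored chunks.

```python
import hashlib


def dedup_store(backups, chunk_size=4096):
    """Store each unique fixed-size chunk once, keyed by its SHA-256 fingerprint.

    Assumption: fixed-size chunking, chosen only for simplicity.
    """
    store = {}    # fingerprint -> chunk bytes (the only place data is kept)
    recipes = []  # one list of fingerprints per backup, used for restore
    for data in backups:
        recipe = []
        for i in range(0, len(data), chunk_size):
            chunk = data[i:i + chunk_size]
            fp = hashlib.sha256(chunk).digest()
            store.setdefault(fp, chunk)  # keep the first copy, skip duplicates
            recipe.append(fp)
        recipes.append(recipe)
    return store, recipes


def restore(store, recipe):
    """Rebuild a backup byte-for-byte from its fingerprints (lossless)."""
    return b"".join(store[fp] for fp in recipe)


def dedup_ratio(backups, store):
    """Logical bytes written divided by physical bytes actually stored."""
    logical = sum(len(b) for b in backups)
    physical = sum(len(c) for c in store.values())
    return logical / physical


# Hypothetical workload: 20 nightly full backups of the same 32 KiB of data.
base = b"".join(hashlib.sha256(str(i).encode()).digest() for i in range(1024))
backups = [base] * 20

store, recipes = dedup_store(backups)
print(dedup_ratio(backups, store))  # 20.0
```

In this contrived case every night's backup is identical, so the ratio equals the number of backups retained. Real backup streams change a little each day, so the ratio achieved in practice depends on how much data changes and how long backups are kept.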

Author: Gregory T. Huang

Greg is a veteran journalist who has covered a wide range of science, technology, and business. As former editor in chief, he oversaw daily news, features, and events across Xconomy's national network. Before joining Xconomy, he was a features editor at New Scientist magazine, where he edited and wrote articles on physics, technology, and neuroscience. Previously he was senior writer at Technology Review, where he reported on emerging technologies, R&D, and advances in computing, robotics, and applied physics. His writing has also appeared in Wired, Nature, and The Atlantic Monthly’s website. He was named a New York Times professional fellow in 2003. Greg is the co-author of Guanxi (Simon & Schuster, 2006), about Microsoft in China and the global competition for talent and technology. Before becoming a journalist, he did research at MIT’s Artificial Intelligence Lab. He has published 20 papers in scientific journals and conferences and spoken on innovation at Adobe, Amazon, eBay, Google, HP, Microsoft, Yahoo, and other organizations. He has a Master’s and Ph.D. in electrical engineering and computer science from MIT, and a B.S. in electrical engineering from the University of Illinois, Urbana-Champaign.