Amazon’s Cloud Computing Service Sees Opportunity in Genomic Data Overload

this can become a real earnings driver for Amazon. Much sequencing is done at centralized labs around the country that already have invested in expensive servers to store their data in a secured place on campus. So there’s incentive to keep using those tools to get the most value out of them. Some labs are generating so much data from the instruments that they aren’t always sure they have enough bandwidth to transmit it all to Amazon’s servers. The raw experimental data is so precious to a biologist’s career that it can be hard to just send it away to a vendor for safe-keeping, rather than have it under lock and key on campus.

And many researchers struggle with how to analyze, visualize and interpret the data being spit out by the sequencing machines. The software that needs to run on top of Amazon’s storage capacity and databases—the bioinformatics piece of the puzzle—is still a cottage industry with home-made programs, piecemeal open source alternatives, and a lot of researchers still using old-school spreadsheets like Microsoft Excel.

Yet even before companies like DNAnexus achieve major market traction with simple and easy-to-use bioinformatics software, many researchers still feel compelled to store the data in anticipation of the day when it will be easier to sift through. And Amazon isn’t the only company wooing them. Microsoft and Google have their own cloud computing services to offer. At least one competitor, Seattle-based Isilon Systems, is making visible inroads in the life sciences market by selling clustered servers. Isilon now generates 15 to 16 percent of its revenues from life sciences customers, up from 2 percent in early 2008, CEO Sujal Patel said at a recent Xconomy forum. Isilon’s customer roster includes a lot of heavy hitters, like Merck, Genentech, Sanofi-Aventis, Bristol-Myers Squibb, Illumina, Complete Genomics, the Broad Institute of MIT and Harvard, Stanford University, and Johns Hopkins University.

Amazon’s Singh knows this terrain well himself. He got his doctorate in chemistry from Syracuse University, and spent eight years of his career in the biotech industry, including stints at San Diego-based Accelrys and Seattle’s Rosetta Inpharmatics. For the past two years, he’s been working as a business development manager for Amazon Web Services, with a particular emphasis on getting to know what the life sciences market wants from cloud computing.

Amazon has done a number of things to ease the transition for customers to cloud-based storage, Singh says. It has worked to obtain public databases and make them available to researchers. One example last month came from three recently completed pilot projects for the 1,000 Genomes Project. Putting that data out there for researchers, and enabling them to share it, has generated

Author: Luke Timmerman

Luke is an award-winning journalist specializing in life sciences. He has served as national biotechnology editor for Xconomy and national biotechnology reporter for Bloomberg News. Luke got started covering life sciences at The Seattle Times, where he was the lead reporter on an investigation of doctors who leaked confidential information about clinical trials to investors. The story won the Scripps Howard National Journalism Award and several other national prizes. Luke holds a bachelor’s degree in journalism from the University of Wisconsin-Madison, and during the 2005-2006 academic year, he was a Knight Science Journalism Fellow at MIT.