Genomic Advances of the 2000s Will Demand an Informatics Revolution in the 2010s

needed to drive normal biological processes. Chemical modifications of DNA induced by environmental toxins have been shown to influence many of the common human diseases that are of significant public health concern today, such as type 2 diabetes and cancer.

While this finding on its own was not so surprising, the astonishing observation was that these chemical modifications to DNA can be transmitted to subsequent generations, even after exposure to the agents inducing the changes has stopped. Skinner later demonstrated that these types of environmentally induced changes could affect fundamental behaviors like mate selection, pointing to a potentially more rapid evolutionary selection mechanism that does not require mutations in the actual DNA sequence.

A recent related discovery published in Nature by Decode Genetics, an Icelandic company that has helped lead the way in establishing how changes in DNA associate with disease, demonstrated that mutations in the DNA sequence inherited from, say, the mother can have very different consequences for disease risk and progression than the very same mutations inherited from the father.

2. Highly parallel sequencing and genotyping technologies have revolutionized our ability to associate changes in DNA with disease.

The maturation of second-generation, highly parallel DNA sequencing and genotyping technologies, along with the completion of the sequencing of the human genome, has enabled an astonishing wave of discoveries about how the specific forms of DNA inherited from our parents can cause disease or differences in our response to treatments. While hundreds of examples of rare, single-gene mutations in our DNA that cause disease have been discovered over the past 30+ years, finding common changes in DNA that affect our risk of disease turned out to be incredibly difficult. Before this past decade, only a handful of examples of genetic risk factors existed for common human diseases. However, technologies able to fully characterize all of the common DNA variation in the human genome at lower cost have dramatically increased the number of causal genes identified. Scientists have now catalogued nearly a thousand genes in which common DNA changes affect the population risk of more than one hundred different disease-associated phenotypes, including those associated with type 2 diabetes, heart disease, multiple types of cancer, arthritis, Crohn’s disease, schizophrenia, and Alzheimer’s disease, as well as other human traits like height, eye color, and hair color.

While this wave of discovery has been truly impressive, few of the DNA changes were found to alter the function of proteins directly implicated in diseases like Alzheimer’s. In fact, most changes in DNA associated with common human diseases appear to affect the rate at which genes represented in the DNA are transcribed into RNA and then translated into proteins (as opposed to directly affecting the function of the protein). Further, these findings turned out to explain very little of the disease variation in the human population: the DNA variations were associated with disease, but together they accounted for only a small fraction of the overall variation. This has prompted a new search in the life sciences for the “missing heritability” relating to human disease. Given the low percentage of variation explained by common, simple variations in DNA, the hunt is on for other types of variation (including the environmentally induced changes mentioned above) that had not been thought to play a key role in disease, but that may now turn out to account for a significant share of it.
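To make the "missing heritability" gap concrete, here is a minimal sketch of how one might estimate the share of trait variation explained by a set of common variants. It assumes a simple additive model in which a variant with allele frequency p and per-allele effect b (in standard-deviation units of the trait) explains about 2p(1-p)b² of the variance, and that the variants act independently. The allele frequencies, effect sizes, and heritability value below are hypothetical illustrations, not figures from any study discussed here.

```python
def variance_explained(allele_freq, effect_size):
    """Fraction of trait variance explained by one two-allele variant.

    Assumes an additive model, Hardy-Weinberg genotype frequencies,
    and a trait standardized to unit variance.
    """
    return 2.0 * allele_freq * (1.0 - allele_freq) * effect_size ** 2


def total_variance_explained(loci):
    """Sum the contributions of variants assumed to act independently."""
    return sum(variance_explained(p, b) for p, b in loci)


# Hypothetical loci: (allele frequency, effect in SD units per allele).
hypothetical_loci = [(0.30, 0.05), (0.15, 0.08), (0.45, 0.03), (0.10, 0.10)]

# Hypothetical heritability estimate, e.g. from family studies.
assumed_heritability = 0.5

explained = total_variance_explained(hypothetical_loci)
missing = assumed_heritability - explained

print(f"explained by listed variants: {explained:.4%}")
print(f"unaccounted-for heritability: {missing:.4%}")
```

Even with several associated variants, small per-allele effects add up to a tiny fraction of the assumed heritability, which is the pattern described above.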

3. Whole new classes of RNA discovered to be critical to cellular and higher order biological processes.

Emerging from recent genetics research is a greater appreciation that in order to understand and treat disease, we will need to fully characterize the role that whole new classes of non-coding RNA discovered over the last 10 years play in biological processes. While non-coding RNAs

Author: Eric Schadt

Eric Schadt is the director of the Mt. Sinai Institute for Genomics and Multi-Scale Biology in New York, and the chief scientific officer for Pacific Biosciences, a company developing new gene sequencing technologies. He is also a founding member of Sage Bionetworks, an open-access genomics initiative designed to build and support databases and an accessible platform for creating innovative dynamic disease models. Dr. Schadt joined Pacific Biosciences in May 2009 from Rosetta Inpharmatics, a subsidiary of Merck & Co., Inc. in Seattle, where he was Executive Scientific Director of Genetics. Dr. Schadt's work at Rosetta involved the generation and integration of very large-scale sequence variation, molecular profiling, and clinical data in disease populations to construct the molecular networks that define disease states and link molecular biology to physiology in ways that can impact clinical medicine. Dr. Schadt has contributed to a number of discoveries relating to the genetic basis of common human diseases such as diabetes and obesity, which have been widely published in leading scientific journals. His research has provided novel insights into what is needed to master diverse, large-scale data collected on normal and disease populations in order to elucidate the complexity of disease and make more informed decisions in the drug discovery arena. Prior to joining Rosetta, Dr. Schadt was a Senior Research Scientist at Roche Bioscience. He received his B.A. in applied mathematics and computer science from California Polytechnic State University, his M.A. in pure mathematics from UCLA, and his Ph.D. in biomathematics from UCLA.