[Updated: 6:40 am Pacific] Stephen Friend had plenty of doubters last year when he quit a high-powered Big Pharma executive job to start a nonprofit seeking to spark an open-source-style movement for biology.
One year later, the former chief of cancer research at Merck has pulled off another unlikely feat. He’s corralled four world-class biologists who have agreed to pool their raw experimental data and models on the connections between genes, proteins, drugs, and disease states into a public database. This is the stuff scientific careers are made of, and can lead to blockbuster papers in top journals like Science and Nature, which is why it’s usually held close to the vest.
The biologists willing to take this plunge are Atul Butte of Stanford University, Trey Ideker of UC San Diego, Andrea Califano of Columbia University, and Eric Schadt of Pacific Biosciences (and soon to be at UC San Francisco). Getting stars like these on board is a critical step in the quest that Friend embarked on more than a year ago when he founded the nonprofit Sage Bionetworks. Seattle-based Sage, after some iffy early moments, has stitched together a support network that has committed more than $20 million, which comfortably covers two years of operations. Major drugmakers (Pfizer, Merck), government agencies (the National Cancer Institute, the Washington state Life Sciences Discovery Fund), and philanthropies (the CHDI Foundation) have made some of the earliest investments. They have all lined up to support the vision of allowing biologists on a grand scale to pool data and their collective brainpower in open databases. The hope is to better connect the dots between malfunctioning DNA, RNA, and proteins, to see how those things get manifested as symptoms of disease that a doctor observes in the clinic.
It all looks great on a whiteboard, but without real participation by key researchers, it’s hard for something like this to ever move beyond the theoretical stage. With stars like these contributing their data, Friend has said, the ultimate result would be more effective drugs, just as programmers contributing open-source code can create better software. On my most recent visit to Friend’s office in Seattle last week, he said these four biologists are carrying out a potentially groundbreaking experiment for biology analogous to the federal government’s investment in the Arpanet of the ’60s, which gave rise to the Internet of today.
This experiment is “really modeled” after the Arpanet, Friend says. “Four universities decided to send data back and forth. This is an equivalent project. What would happen if four groups at top universities wanted to share their data? We’ve asked what does it really take to do that?” [Updated: the Sage network actually has five sites, since the four additional universities have joined the founding group in Seattle.]
This isn’t the only effort that attempts to get biologists to pool data and experimental models in some kind of consortium, although others tend to be more specific to a certain disease or type of data. The bigger vision is part of what Ideker says enticed him to get serious about Sage.
“After talking with Stephen, and later working together with him, it became quickly apparent that he had the right combination of vision, assembled expertise, and contacts to make this work,” Ideker says. “Although Sage may not be the only game in town, it is likely to move the ball furthest downfield.”
Butte added: “The real open question is whether working with the Sage Federation will be ‘more’ (in some way) than just working with a group of collaborators… in terms of scalability, data, transparency, or something else…it’s just not clear what yet. But of course, it’s worth trying this model out.”
One year into this experiment, Sage is now at the point where the questions aren’t about what it’s trying to do, but how it’s doing it. Friend racked up more than 250,000 miles of travel