Houston—The University of Houston and Rice University library systems are working together to help preserve online federal data, as part of a national project aimed at safeguarding scientific research.
“The ultimate goal is to raise awareness of having access to this information and preserving this information,” says Lisa Spiro, executive director of digital scholarship services at Rice’s Fondren Library.
Houston’s “data rescue” project will be held at the Fondren Library on March 4. About 50 people have signed up to participate so far, Spiro says.
“The library’s on fire,” says Neeraj Tandon, a self-described “data nerd” who will lead a web-scraping training session for volunteers March 4. “This is public knowledge that, instead of sitting in a building, sits on servers a little bit out of our reach.”
Tandon, a psychiatry resident physician who uses data to do research in neuroscience and genetics, says he finds it scary to see reports of government web pages stripped of previously available data. “This is the first time we’ve seen something like that of this scale,” he says. “Before, you thought, it’ll be here there forever. The urgency to save those ‘books’ has definitely gone up considerably.”
Rice’s Spiro says preserving data is not necessarily a new endeavor: libraries have served as designated repositories since the federal government began posting official documents online in 1993. A consortium of university libraries, the Library of Congress, the Internet Archive, and the U.S. Government Publishing Office has conducted “end-of-term” data harvests since 2008, because a new administration typically changes various websites to highlight its own priorities.
With the election of Donald Trump, however, concerns grew that data related to climate change, for example, was especially vulnerable in this new administration. The Libraries Network is a national organization working with Data Refuge, a project being spearheaded by the library system and environmental humanities program at the University of Pennsylvania.
Data Refuge and another organization, the Environmental Data and Governance Initiative, have led pro bono “data rescue” events in about a dozen cities across the U.S. and Canada. These meetups are aimed at attracting programmers and others with technical skills to help preserve data that might be endangered.
On Wednesday, the Internet Archive was named a semifinalist for the 100&Change award, a $100 million grant given by the MacArthur Foundation to help solve a critical problem affecting society. (Other semifinalists include projects to provide virtual access to medical specialists in underserved areas of the U.S., and a Rice proposal to improve newborn survival in Africa, among others.)
Spiro says she and her librarian colleagues at Rice are working with the Libraries Network to determine which agencies the Houston effort will target. At an Austin Data Refuge event last weekend, participants focused on data from the National Oceanic and Atmospheric Administration’s Atlantic Oceanographic and Meteorological Laboratory, among other data sets, according to the group’s Facebook page.
“We in libraries have been participating in data storage and organization and ensuring access to that data for decades,” says Kathy Weimer, head of Rice’s Kelley Center for Government Information, Data, and Geospatial Services. “That’s the whole essence of librarianship.”
Even beyond the current political environment, she adds, “the Web is an inherently unstable medium. We want to ensure the data resources will be around for a long time.”