Zoë Corbyn of Nature magazine
Scientists are failing to make raw data publicly available, even when prompted to do so by journals, says a study published last week in PLoS ONE.
The study of 500 papers from the 50 highest-impact journals reveals wide variation in data-sharing policies and in researchers' adherence to them. The findings come amid a growing push for sharing raw research data -- both to facilitate further research and to better prevent fraud or error.
Twenty-two of the 50 journals surveyed require public sharing of specific raw data as a condition of publication, and another 22 encourage data sharing without binding instruction. Six of the 50 journals give no instruction on data sharing at all.
Looking at the first ten papers published in each journal in 2009, the researchers found that, of the 351 papers covered by some data-sharing policy, only 143 fully adhered to that policy. Neglecting to publish microarray data -- such as those produced in gene-expression studies -- was the most common offense.
"The current state is not optimal," says study leader John Ioannidis, an expert in data reproducibility at Stanford University School of Medicine in California. "Some journals have pretty good policies and some of the papers adhere to these, but there is plenty of room for improvement".
Slow to change
The study also found that researchers rarely volunteer data. Of all 500 papers analyzed, only 47 had their full primary data sets -- rather than just the raw data specifically requested by the journals -- publicly available. None of the papers published in journals without data-sharing policies deposited their full set of raw data online.
The results echo those of a July study that also found data-sharing practices wanting. That study, led by Heather Piwowar, a bioinformatics researcher at the University of British Columbia in Vancouver, examined more than 11,000 gene expression studies published between 2000 and 2009. Of those, the percentage that had raw data available online increased from less than 5% in 2001 to 35% in 2009.
Ioannidis and Piwowar say that more journals should adopt data-sharing policies and ensure that scientists consistently follow the rules. "You need an extra editorial office and maybe more," says Ioannidis.
Piwowar speculates that journal editors shy away from introducing data-sharing policies for fear of deterring submissions. "Journals can get away with not having policies because it is not yet generally regarded as the norm," she says. She urges editors in each field to come together to implement policies simultaneously, as was done with several evolution journals in January 2011 under an NSF-funded initiative called the Joint Data Archiving Policy.
Steven Wiley, director of biomolecular systems at the Pacific Northwest National Laboratory in Richland, Washington, says the current study does not address the question of why scientists might defy data-sharing policies. Sharing data "is time-consuming to do properly, the reward systems aren't there and neither is the stick", he says.
Even if compliance increases, Wiley says that the scientific community will still need to focus on developing standardized formats to make accessing data more efficient and feasible. "Of all the data that are made available, what fraction is actually used by someone else? I bet the majority isn't," he says.
Although the question isn't addressed in either study, that's something Piwowar is hoping to find out.
This article is reproduced with permission from the magazine Nature. The article was first published on September 14, 2011.