François Dongier
PLoS ONE: Empirical Study of Data Sharing by Authors Publishing in PLoS Journals -
"Conclusions We received only one of ten raw data sets requested. This suggests that journal policies requiring data sharing do not lead to authors making their data sets available to independent investigators" - François Dongier from Bookmarklet
I think that around 10% is consistent with previous studies - most journals say they will pull the papers if authors don't comply, but I know of no examples where this has been followed through (either by the people saying the data wasn't made available or by the journal - this is by no means only the journal's fault). Will be watching this one very very closely... - Cameron Neylon
It does seem a rather limited study... would have been good to see some more solid and definite statistics. Would also be helpful to see important correlations... time since publication etc. - Cameron Neylon
What I want to know is, what's PLoS going to do about this? "Policies requiring data sharing" are meaningless without some backup. Maxine told me Nature will take a hand in cases of refusal to share (e.g., but they are rather conservative in their approach. I'd love it if PLoS would put some teeth in their policies, too. - Bill Hooker
Agreed. I think the expectation ought to be that PLoS would pull the papers. It would be good to test Nature's policy as well but there is a sense in which going out looking for scapegoats feels like entrapment - we are all pretty guilty on this. I would struggle to lay hands on data for some papers more than a couple of years old. How do you send a strong message but make it positive as well rather than vindictive? Or is there no other way? - Cameron Neylon
That pretty much covers it. Generally "it's too much work to make it pretty enough for someone else to see" or "it didn't seem important enough" are the main reasons. - Cameron Neylon
Unfortunately it doesn't, Cam -- not even close. Karen lists only the honest (though still inadequate) reasons for "data not shown", which (imo) far more often means "experiment not done, or not done properly". There is no page limit in cyberspace: data not shown should be a thing of the past. If someone won't show me the data, why should I trust them? (Self-servingly, I'd add a grandfather clause. Many of my papers use "dns", because I didn't know any better at the time.) - Bill Hooker
Alright, aside from dishonesty, which I almost typed. And yes, a paper recently out with my name on it has a "data not shown" in it, which I didn't catch until too late. Basically I couldn't persuade the primary author just to dump the raw output into a txt or pdf file and upload it as supplementary. But if anyone wants the data I will make sure they get it. But I think dishonesty - at least serious dishonesty - is a relatively rare problem. The problem of it "not being pretty" is the main one. - Cameron Neylon
I agree that too much work (laziness, etc.) or just not wanting to share (fear of competition) are more important factors than dishonesty. Possibly things will improve when concrete incentives are found that really encourage people to publish their data. Maybe some cultural change will come with projects (such as Wolfram Alpha, Linked Data, OpenGovernment, etc.) demonstrating that curating and exposing the data allows it to be reused in unexpected ways. Ideally, somehow, such reuse would compensate the data "owner" for the curating effort. - François Dongier
Bill - you might also find this presentation by Phil Campbell on NPG's experiences regarding data sharing to be of interest: There are several anecdotes regarding refusals to share data and how these were handled. - Hilary
Hilary, thanks, that's great. - Bill Hooker
Hilary, that's brilliant. Good to have some documentary evidence of what has happened in the past. - Cameron Neylon
An interesting recent example described in this (free online) Nature Editorial: Basic issue was researchers could not provide cell lines because of consent forms used by hospital for patients. Nature published formal correction (as paper not reproducible until more cell lines created, which the authors are doing - after going through hospital ethics committee, re-asking patients to sign different consent form). Joining-up needed right down the line. - Maxine