If you're in an area where there's a lot of low-hanging fruit, so to speak, maybe there's an issue, but even then some repetition is necessary, so scooping is just an imaginary problem which affects the uncreative.
- Mr. Gunn
I know that this will probably not be popular, but I have to disagree. In my experience, the most crucial part is to be creative and get the good idea. If you share those ideas too early, you run the risk that someone uncreative takes your idea, finishes the work faster than you, and takes the glory. In which case it is precisely you, the creative scientist, who has been affected. And by the time you don't get your next grant, the problem is by no means "imaginary" any longer - but your lab might be.
- Lars Juhl Jensen
Lars, I don't disagree with that. Especially if you have collected the data for a specific purpose, e.g. you only publish your raw data for a protein structure after you solve it, and perhaps identify a mechanism of action. My problem is with the paranoid fear around "my data" which seems to be a direct result of the publish or perish model
- Deepak Singh
and you must be reasonable. You can't keep working on a mechanism of action for 3 years and hold on to the coordinates :)
- Deepak Singh
Deepak, we completely agree that there is no excuse for not making data available after publication. But I don't really see that as having to do with being scooped - if you wait until after publication then scooping is obviously not a risk. Am I missing the point here?
- Lars Juhl Jensen
Lars, I see two situations. One, where you collect data for the sake of collecting data (a big genome or something). There, you MUST make it available immediately. There's enough in there for many people
- Deepak Singh
"You can't keep working on a mechanism of action for 3 years and hold on to the coordinates" actually you can, that's what "Deep throat" and all the M15 (5 horsemen) did... they scooped every darn bit of data and retained the same coordinates for over 25 years!
- Peter Dawson
In the other situation too, in an ideal world, I would consider releasing the data. The only reason we hold on to it today is that we have a "first past the post" publishing model, so my previous statement fell into the trap I advocate against. We are trying to fit into an existing system, rather than changing the system. For the case you describe, how long do you hold on to the data? Until you submit? Until it gets published? 6 months?
- Deepak Singh
The attitude I've seen from some scientists towards "scooping" borders on obsession. And that's the part that really makes me worry, because it's all about the publication or finishing a thesis and not about good science anymore.
- Deepak Singh
Peter, you can do it. It's just not the right thing to do
- Deepak Singh
I fully agree that the problem is that science is so focused on publishing today. If it was not a matter of being "first past the post" then there would indeed be no reason to hold onto data. I always try to make data available as soon as the paper is published - but there have been cases where collaborators did not agree to that.
- Lars Juhl Jensen
Lars, in that case, we have little to disagree on :)
- Deepak Singh
I also agree that if the primary purpose of a project is to produce data, then the data should be made available as early as possible.
- Lars Juhl Jensen
I still maintain that scooping is very real, though ;-) (But I hate it too!)
- Lars Juhl Jensen
Another case for using CC licensing on some datasets. Protection against slimy #$(#*&(@#*$&@!)#
- Deepak Singh
I recently spoke to a person from a structural genomics center who opened my eyes to how many enemies you can make by publishing a paper based on "somebody else's" data, even if the release (of coordinates, for example) was a couple of years ago. Such people call it scooping and will make sure everybody in the community knows what a bad person you are (aka a bottom-feeder). It's all crazy - the Nobel Prize washes the brains of otherwise smart people...
- Pawel Szczesny
Well, Pawel, that's exactly why this fetish about scooping is so stupid. Everyone here reading this is not only more creative, but more technically skilled, which is why I think most people here have little fear of someone taking their idea and doing more with it than they could.
- Mr. Gunn
Actually, the use of CC licenses on datasets is a bit problematic. The problem is that CC licenses are based on copyright law and as such they legally only cover "creative works". It is clear that a scientific paper counts as such, it is equally clear that facts of nature are not covered by copyright, but as far as I know it is unclear if data sets qualify as "creative works".
- Lars Juhl Jensen
Pawel, I also cannot comprehend how someone can find it unfair that you use a data set that they themselves released. Thankfully, I have not run into such individuals in the microarray / protein interactions field - otherwise I would have a lot of enemies by now ;-)
- Lars Juhl Jensen
AFAIK the sequencing community has an informal agreement that people hold off on using others' data for six months (NIH-funded projects cannot put sequences "on hold" at NCBI). There's nothing like that among crystallographers. And that's the only reason my idea for a bug tracking service for scientific data was put on hold - I was warned that I shouldn't do it as long as I have an academic affiliation (or the success rate of my colleagues' grant applications is going to decline). Strange, isn't it? But true.
- Pawel Szczesny
Lars ... good point, and actually part of the reason for the "open data" premise. We just need to figure out an appropriate attribution mechanism
- Deepak Singh
I thought crystallographers also had informal "not until published" agreements in place?
- Deepak Singh
Deepak, I think it's rather "not until I squeeze every possible bit out of the coordinates", and it possibly dates back to the DNA story (no other field has such a strong sense of ownership). Anyway, I just mentioned that as an example of what "scooping" means to different people. My guess is that all kinds of different attitudes towards data ownership could be easily addressed by removing manual labor from the data generation pipeline. I dream about virtualized and automated laboratories...
- Pawel Szczesny
The "to be published" remark in released coordinates is extremely abused, which in my opinion renders a "not until published" agreement useless.
- marcin
Pawel, that would be one aspect. The crux of the matter, though, from where I sit, seems to be in the system. Take away the publication pressure and my most hated word loses all meaning.
- Deepak Singh
I sincerely hope what Neil says can become reality one day. It's a rat race, esp. in biology research, and I cannot understand why 2 people cannot come up with the same idea, incommunicado. That said, there is a dark underbelly to this term, exemplified by the Watson & Crick saga. Publication pressure and massive egos don't help science either.
- Aarthy
Deepak, you've just given the ultimate solution, mine was a partial one :)
- Pawel Szczesny
We had a discussion about this with respect to the release of DNA sequences this morning as well, and the suggestion was that even there, which is often given as the example of good practice, these social norms are abused. I'd be interested in Matt Wood's take on that.
- Cameron Neylon
The question is: do we prepare for the 5% of negative cases that will arise in whatever model you choose, or focus on the 95% of honest scientists?
- Deepak Singh
I once talked to a lady from a sequencing lab in the US. People jumping on her data before this informal six-month time span had happened a few times, but I was told that overall they do not worry much (and they write publications as fast as possible). She was more concerned about her employees missing an opportunity for a good paper than about somebody making a huge discovery they would miss.
- Pawel Szczesny
Scooping is just a consequence of the normal step-wise progress in science and the saturation of certain fields of research. To say that it doesn't exist means you are either lucky to work in an open field or have your head buried in the sand. It happens, too often for comfort. Also, to say that only the uncreative worry about scooping is unfair, to say the least. In wet labs, the main limiting factor is often the experimental system (project turnover of 2-3 years), not the researcher's creativity.
- Ricardo Almeida