e.g. what platform do you use in your community? Are you happy with that? How steep was the learning curve for you and your collaborators?
- ReaderMeter
"Dryad accepts data in any format as long as it is associated with a primary publication" I guess this is too restrictive, as it makes the platform unsuitable for hosting research datasets that the Wikimedia Foundation generates but that are not (at least not immediately) the subject of scholarly publications.
- ReaderMeter
"An interesting aspect of Taverna is that workflows can be stored at http://myexperiment.org, and once set up can even be run directly on that website without installing Taverna." Noel do you know how to do that? - we have some Taverna workflows uploaded to MyExperiment - it would be useful to run them on MyExperiment as web services
- Jean-Claude Bradley
I can't figure out if that's true or not now. Maybe I'll amend the text, at least until the slides are sent around.
- Noel O'Boyle
looks like an awesome blog for anyone teaching chemistry - I'll look into doing some experimentation with my organic class with some of these tools and ideas
- Jean-Claude Bradley
Well papers are academic currency. Any way to increase your wealth (utility be damned) :)
- Rajarshi Guha
But three? Within six months? 200% inflation rate is going to kill this system... :)
- Pawel Szczesny
@Pawel... agreed. This is the same paper. It is not uncommon for biology groups to publish the 'tool' separately from the 'science', but this sounds ridiculous... plagiarism it is... actually, all journals I have been reviewing for in chemistry, do not allow results to be published before... I can't believe there are so many angles to this tool that those journals would have allowed it... within 6 months... that means they must have been submitted simultaneously :)
- Egon Willighagen
Actually they're doing themselves a major disservice. By publishing the same thing 3 times they effectively divide their citations by 3, which harms their H-index.
- Paul Gardner
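The effect on the h-index can be checked with a small sketch. The citation counts below are invented purely for illustration and say nothing about the actual authors. The definition used is the standard one: the largest h such that h papers have at least h citations each.

```python
def h_index(citations):
    """Largest h such that at least h papers have >= h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank
        else:
            break
    return h


# Hypothetical established record: 11 papers with 20 citations each.
established = [20] * 11
# One tool paper collecting 30 citations vs. the same 30 split over 3 papers.
established_single = h_index(established + [30])
established_split = h_index(established + [10, 10, 10])

# Hypothetical early-career record: 4 papers with few citations.
early = [12, 9, 7, 5]
early_single = h_index(early + [30])
early_split = h_index(early + [10, 10, 10])

print(established_single, established_split)  # 12 11
print(early_single, early_split)              # 5 6
```

With these made-up numbers, splitting lowers the h-index of the established record but raises it for the early-career one, so whether it is a disservice depends on the rest of the publication list.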
And none of these 3 papers cite Jmol! Or even mention it...
- Egon Willighagen
NAR often includes previously published databases and software.
- Matt Hodgkinson
I've also heard of a rejection to the NAR webserver issue b/c of a Bioinformatics Application Note. But I'm not sure if this is a general policy. Once you're in the NAR db / webserver issue, you can re-submit after 2 years.
- Michael Kuhn
I'll play devil's advocate. Apart from the reaction against CV stuffing is there any good reason not to do multiple publications for a service? If the argument were, for instance, to reach a series of different audiences?
- Cameron Neylon
multiple pubs in multiple venues are fine. But pubs are currently a currency and basis of competition (amongst other things); from this POV, spamming journals with multiple articles devalues the individual articles
- Rajarshi Guha
Agreed but surely it's the author's choice to balance that devaluation against potential value gain of reaching new people? I guess what I find interesting is that people feel that protecting against publication inflation is a bigger concern than getting information out efficiently. Similar case where a piece in PLoS Currents was subsequently published elsewhere and everyone got their...
- Cameron Neylon
I checked the website, and they have an attribution clause... I could not find the attribution requirements, but nothing stops them from asking people to cite all *3* papers...
- Egon Willighagen
@Cameron... I think it's a problem of inflation, and devaluation. 3 papers is simply more rewarding, and everyone not publishing more or less the same thing thrice is effectively punished.
- Egon Willighagen
Perhaps, but is that not a symptom of measuring the wrong thing? If we actually measured re-use (e.g. citations) and three papers meant the number of citations was cut in three for each paper and the total number was the same then we'd be ok right? No devaluation? The problem here is not that it's being published three times but that we value the wrong things (number of papers) in a system that enables (or even encourages) cheating.
- Cameron Neylon
What @egon said. My basis for this argument is that, in principle, multiple pubs in different venues are fine (I'm not sure how different the venues were for this case). And in a world where the nuances (or lack thereof) of these multiple pubs are taken into account, this would be fine. But in the real world, where jobs/grants/promotions are (unfortunately, frustratingly) based on a...
- Rajarshi Guha
@cameron - absolutely! We are measuring the wrong thing. But, that's what we're measuring. So to stay in the race, we (well, not me, it doesn't matter to me much anymore) play the game, whose apparently best strategy is to publish as much as we can. I'm sure that with your and others' efforts this will change one day - but people still want to get their jobs/grants/promotions ...
- Rajarshi Guha
Agreed - and this isn't a case where I'd argue much in their favour. But the thing with PLoS Currents was a bit different but got a very similar response. Interested whether people feel that's as egregious a case.
- Cameron Neylon
is there a link to the PLoS Currents discussion?
- Rajarshi Guha
Not sure if these particular 3 papers are what I usually think of as duplicate papers. The "Acta Crystallographica Section F" one is part of a special issue about the JCSG pipeline, so I think it's reasonable there even if it's duplicating things. And my opinion is that the NAR database/server issues are also a special case - as they provide a resource to the community and often describe websites that have been published elsewhere. In short, not the most straightforward example of duplicate publications.
- Mickey Kosloff
Cameron, if you're playing devil's advocate, don't forget to send an invoice to NPG, because they will profit the most from perceived inflation of papers outside the Nature* ecosystem. :) But let me play the game as well - if we allow for such a marketing strategy, it gives yet another advantage to people who use English natively and have no trouble writing five different stories on the same discovery. Yet another penalty for not being British? Thank you so much, Cameron ;).
- Pawel Szczesny
Mickey, while I agree these are "special cases", not clear duplications, I still don't really get why it's allowed in the first place. When I was reviewing a manuscript for the NAR special issue I asked the authors to improve the service in comparison to the original (published a few months earlier) despite the clear policy on allowing duplicates. Today, I would probably refuse to review for the NAR special issue at all...
- Pawel Szczesny
Incredible how people behave like you expect them to behave in these comments. Very revealing and eye opening.
- pn
"In principle, researchers directly annotating genes they themselves characterized would be more efficient, but this practice has not yet caught on because annotation is time consuming and annotation guidelines are complicated"
- Attila Csordas
Good point about the need for better search and analysis tools
- Mr. Gunn
He's braver than me....not sure I want to go there. I did share my Paternal Haplogroup and projections of what I might look like fat, bald and old though: http://tinyurl.com/5upk6p9
- Antony Williams
Just wondering how the type of data placed into the public domain here differs from that in Genomes Unzipped? http://www.genomesunzipped.org/project In the latter case it also appears that a lot of care was put into the ethical considerations of the people concerned plus there's strength in numbers.
- Dan Hagon
Think the rub is both a) standards are still lagging behind users (formats for marking up identifications and quantitations are still in development), and b) mass spec manufacturers and key software tools are still not fully supporting the standards that already exist.
- Neil Swainston
I will use CHEMINF, which is a bit cryptic (inherited from OBO), and elaborate (deliberate, to allow proper provenance of data)... paper should be submitted any day now...
- Egon Willighagen
Bioclipse can generate RDF directly from the ONS Google Spreadsheet, but I have not had time yet to look at that... will try that soon.
- Egon Willighagen
Thanks Egon. I don't see a closing to the <chem:Measurement> or an opening for </rdf:Description>. Should the </rdf:Description> be </chem:Measurement>? Also the units are M. Would it be ok just to stick that at the end of the number, e.g. 2.380M, or do I need to put units separately from the value?
- Andrew Lang
Yes, the rdf:Description should be a closing chem:Measurement. It would be better to change the predicate to molarConcentration, I think. That way, the field stays a float.
- Egon Willighagen
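A minimal sketch of what the corrected fragment might look like, checked with Python's standard XML parser. The `chem` namespace URI, the `rdf:about` value and the `rdf:datatype` attribute are assumptions made for illustration; the thread does not give the real CHEMINF identifiers.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
CHEM_NS = "http://example.org/chem#"  # hypothetical placeholder namespace

# Opening and closing tags now match, and the unit lives in the predicate
# name (molarConcentration), so the literal itself is a plain float.
snippet = f"""<rdf:RDF xmlns:rdf="{RDF_NS}" xmlns:chem="{CHEM_NS}">
  <chem:Measurement rdf:about="http://example.org/measurement/1">
    <chem:molarConcentration rdf:datatype="http://www.w3.org/2001/XMLSchema#float">2.380</chem:molarConcentration>
  </chem:Measurement>
</rdf:RDF>"""

root = ET.fromstring(snippet)  # raises ParseError if the tags do not match
measurement = root.find(f"{{{CHEM_NS}}}Measurement")
value = float(measurement.find(f"{{{CHEM_NS}}}molarConcentration").text)

# The mismatch described above (closing with rdf:Description) is rejected.
broken = snippet.replace("</chem:Measurement>", "</rdf:Description>")
try:
    ET.fromstring(broken)
    mismatched_is_rejected = False
except ET.ParseError:
    mismatched_is_rejected = True

print(value, mismatched_is_rejected)
```

Writing the value as 2.380M instead would make `float()` fail, which is the practical reason for keeping the unit out of the literal.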
SVG should be the norm for these types of diagrams. Particularly since it supports RDFa annotations. Inkscape is excellent. I'm also looking into Apache Batik for programmatic generation of diagrams.
- Dan Hagon
I love SVG, but it cannot be used if there are too many points/objects.
- Pierre Lindenbaum
Tooting my own horn here, but our lab has come out with a lightweight sequence format specifically to address the metadata-as-free-text problem. 61 proteomes from the Reference Genome Annotation Project are already available. See http://seqxml.org and http://www.ebi.ac.uk/referen...
- Dave Messina
I do some annotations, clustering and enrichment analysis in Cytoscape via its plugins. I haven't used igraph yet, but I guess it's just a matter of time as soon as I move my analysis entirely to R.
- Pawel Szczesny
Gene2pubmed provides a nice lower bound but what about the upper bound? This article http://psb.stanford.edu/psb-onl... suggests that we may see a recall of about 0.55 for gene2pubmed in identifying genes in articles. That would suggest that the number over all of pubmed may be closer to 6%.
- Benjamin Good
Joachim (http://joachimbaran.wordpress.com/) posted some additional data as a comment on my blog post. He suggested: "...I would now say that MEDLINE's baseline 2010 has 1.7M * 90% / 73% * 90% / 98% = 1.9M gene mentions in its titles + abstracts. That would mean that 19% of titles/abstracts -- for which there is an abstract -- have a gene mention..."
- Benjamin Good
That sounds like a good upper bound - so we've got somewhere between 611,108 and 1,900,000 (*2 if Joachim's data extends to papers without abstracts) articles with information about genes. (With 'information' defined loosely enough that a mention indicates its presence.)
- Benjamin Good
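The arithmetic behind these bounds can be reproduced in a few lines. The factors are applied exactly as quoted from Joachim, since the thread does not explain what each correction stands for. Dividing the gene2pubmed count by the 0.55 recall is an assumed reading of the recall adjustment mentioned earlier, not something the thread states as a formula.

```python
# Joachim's estimate, factors applied in the order quoted in the thread.
medline_estimate = 1.7e6 * 0.90 / 0.73 * 0.90 / 0.98

# gene2pubmed lower bound quoted in the thread.
lower_bound = 611_108

# Assumption: correct the lower bound for a recall of about 0.55 by
# dividing by the recall (articles found / recall = articles expected).
recall = 0.55
recall_adjusted = lower_bound / recall

print(f"Joachim's estimate: {medline_estimate:,.0f}")
print(f"Recall-adjusted gene2pubmed count: {recall_adjusted:,.0f}")
```

The first figure comes out at roughly 1.92 million, consistent with the quoted 1.9M, and the recall-adjusted count lands at about 1.1 million, between the two bounds.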