or have an alternate mashup with impact factors - Deepak
Would be cool, but ISI's kinda mashup-hostile, aren't they? Do they even have an API? - Mr. Gunn
Everyone is hoarding attention data regarding papers. There is no easy way to access individual citation information, number of downloads, etc. Check out the "conversations" about a paper from Alf (http://scintilla.nature.com/co...). It is probably not fast enough to rank PubMed queries but it looks great for single queries. - Pedro Beltrao
20351 journals in ftp://ftp.ncbi.nih.gov/pubmed/J_Medline.txt . Let's find ~200 volunteers who would each find 100 impact factors and complete a shared list of journals on the web :-) - Pierre
Thank you for the link Lars, I'm going to have a closer look at this site - Pierre
What about setting up an open project / open data impact factor website? With the growth of content on PubMed and Open Access in general, a joint effort could easily do this... We'd store things as RDF using the just released Bibliographic Ontology... etc. - Egon Willighagen
+1 Egon. Today I'm writing a trivial prototype to sort the papers from an XML PubMed result; let's discuss this later. Right now I'm about to put a list of journals in mysql (just one table {title,MedAbrr, NCBI-ID, ISSN, impact} ). - Pierre
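A minimal sketch of the one-table idea Pierre describes, assuming Python's built-in sqlite3 as a stand-in for mysql. The column names (med_abbr, ncbi_id), the function names, and the (pmid, issn) shape of a paper are illustrative assumptions, not Pierre's actual prototype; the "impact" column can hold any openly available per-journal score.

```python
import sqlite3


def build_journal_table(rows):
    """Create an in-memory journal table and load (title, med_abbr, ncbi_id, issn, impact) rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE journal ("
        "title TEXT, med_abbr TEXT, ncbi_id TEXT, "
        "issn TEXT PRIMARY KEY, impact REAL)"
    )
    conn.executemany("INSERT INTO journal VALUES (?, ?, ?, ?, ?)", rows)
    return conn


def sort_papers_by_journal_score(conn, papers):
    """Sort (pmid, issn) pairs by their journal's score, highest first.

    Papers whose journal is missing from the table (or has no score) sort last.
    """
    scores = dict(
        conn.execute("SELECT issn, impact FROM journal WHERE impact IS NOT NULL")
    )
    return sorted(
        papers,
        key=lambda paper: scores.get(paper[1], float("-inf")),
        reverse=True,
    )
```

The journal score is looked up by ISSN here only because it is a convenient key; a real prototype would probably have to match on the MEDLINE abbreviation as well, since not every record carries an ISSN.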
@Mr. Gunn: PubMed doesn't have a proper API but it has EUtils - Kambiz Kamrani
EUtils was not implemented for (Journals AND xml). It doesn't contain any information about the impact factors (or RSS, editor, etc...) - Pierre
Kambiz - pubmed has eutils, but you can't get impact factor from that. You need to deal with the luddites at ISI for that, ;-) - Mr. Gunn
Legal issue: can we build a public database of impact factors? The list may belong to ISI. - Pierre
I am almost certain that you would be in legal trouble if you were to make a public database of ISI impact factors. To get access to the impact factors in the first place, your institution must sign a contract with Thomson Scientific. I would be extremely surprised if that contract did not forbid you to redistribute the data to third parties. - Lars Juhl Jensen
Lars, that's what I suspected. Whatever, I'm still writing a program to sort a set of papers. - Pierre
Pierre, that's great because you should be able to sort by eigenfactors :-) On their website they explicitly state that "Eigenfactor is completely free and completely searchable". Since their "Article Influence" score correlates well with the impact factor, you should be able to get a good sorting this way. - Lars Juhl Jensen
It's a great way to shield themselves from critique, isn't it? Forbidding any sort of published comparison? I say screw 'em and use the eigenfactors instead, or alternatively, reverse-engineer the IF algorithm from public data and use that. - Mr. Gunn
If you rev-eng the IF, Thomson will just whine that you got it wrong (while not showing you how they did it "right"). I love the idea of being able to sort by impact, but not impact factor -- I'd rather we all just ignore that particular metric. Pierre, these might also be of interest: http://www.scimagojr.com/, http://www.journal-ranking.com... - Bill Hooker
In the end, we need to move this discussion to what is important. Is it the number of views? Is it the number of downloads? Is it the number of times the paper has been cited? Is it the number of trackbacks? A hybrid (most likely). We need to rethink and redefine impact and come up with our own score (Eigenfactor is just a start). - Deepak
Perhaps the Biogang has found its first target? A "killer app" to kill the Impact Factor could have an enormous effect. Like it or not, funding=resources<requests so there has to be a way to rank and sort scientists. Better metrics will mean better science getting done. - Bill Hooker
Deepak, I was originally thinking about the number of cites of a paper... that's a basic question I have... given some article, what papers cite that work directly or indirectly (by citing a paper that cites the query paper directly). But surely we can do better... Another thing we could bring in is the annotation of the citations... does the paper citing the query paper use the method described in the latter, does it review that matter, does it contradict that paper, or does it just cite it as related literature... - Egon Willighagen
Egon, we should start simple, then think about how we can do it better, and in a way that makes things more meaningful. Love your ideas above. And yes, I think this will be a great BioGang project. - Deepak
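Egon's "directly or indirectly" question can be sketched as a breadth-first walk over a citation graph. This assumes the citation data is already available as a mapping from each paper to the papers that cite it, which is exactly the data the thread says is hard to get; the names cited_by and citing_papers are hypothetical.

```python
from collections import deque


def citing_papers(cited_by, query, max_depth=2):
    """Return {paper: depth} for every paper that cites `query`.

    Depth 1 means a direct citation; depth 2 means the paper cites a paper
    that cites `query`, and so on up to `max_depth`.
    `cited_by` maps a paper id to an iterable of the ids that cite it.
    """
    depth = {query: 0}
    queue = deque([query])
    while queue:
        paper = queue.popleft()
        if depth[paper] == max_depth:
            continue  # do not expand beyond the requested depth
        for citer in cited_by.get(paper, ()):
            if citer not in depth:  # keep the shortest citation path only
                depth[citer] = depth[paper] + 1
                queue.append(citer)
    del depth[query]  # the query paper does not cite itself
    return depth
```

Annotating each edge with the kind of citation (uses the method, reviews, contradicts, related literature) would turn cited_by into a mapping of (citer, annotation) pairs, but that needs curated data this sketch does not assume.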
Is there an industrial impact factor which is based on the patents, products, trials based on the very paper? I would love to take a look at that Technological Impact Factor list, probably quite different from the one based on purely academic merits. - Attila Csordas
Egon, it is very likely that in many cases we could distinguish types of citations by observing how relevant papers are connected (in other words, a paper (paper 3) that cites another paper (paper 2) which is using some method (paper 1) is likely to also cite the original method paper (paper 1)). Your idea would also allow us to find subgroups and possibly merge them (in cases where competitors don't want to cite each other's publications). - Pawel Szczesny
Attila, correct me if I'm wrong, but don't scientific papers and patents contradict? I somehow have the impression that a patent is actually something that _works_ :). - Pawel Szczesny
I would agree with trying to stay away from anything that tries to apply the journal impact factor to an individual paper in that journal (and by extension authors); it's an inappropriate statistic. But still, it's clear we need some way to aggregate the collective judgement of many scientists, in terms of ranking the importance of papers. Some combination of views, downloads, citations, could prolly do it, but how exactly isn't obvious to me. - NatBlair
wouldn't you want to cite it by either the direct citation count for the paper, or by the citation counts for the authors, rather than the impact factor of the entire journal? - Richard Akerman
Richard, that would probably be the best way to go in the end. That would remove some of the stranglehold that publications have (although journal influence will be implicit in whatever metric you use). - Deepak
I agree that it is obviously better to judge individual papers based on how much they actually got cited rather than based on the impact factor of the journal in which they got published. But how will you sort papers that have been published within the past half a year and thus have not been cited yet? By number of downloads? Or is the impact factor perhaps not that bad a proxy for the expected number of future citations? - Lars Juhl Jensen
@Lars: Due to the terribly skewed nature of the distribution of number of citations to papers in any given journal, the impact factor, which signals the mean citations per paper, is never a good proxy for the majority of papers in that journal. Now, downloads and views might be better for more recent things, but I don't think many journals are releasing that info. - NatBlair
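NatBlair's point about skew can be illustrated with a few lines of Python. The citation counts below are made up purely for illustration (they are not data for any real journal), and summarize is a hypothetical helper name.

```python
from statistics import mean, median


def summarize(citations):
    """Compare the mean and median of per-paper citation counts in one journal."""
    m = mean(citations)
    below = sum(1 for c in citations if c < m)
    return {
        "mean": m,
        "median": median(citations),
        "share_below_mean": below / len(citations),
    }


# Invented, heavily skewed counts: two blockbuster papers and eight ordinary ones.
example_counts = [0, 1, 1, 2, 2, 3, 3, 4, 200, 284]
print(summarize(example_counts))
```

In this toy set the two heavily cited papers pull the mean far above what a typical paper receives, so most papers sit below the mean while the median stays close to the ordinary ones.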
I think this would be a great biogang project! - Mr. Gunn
NatBlair - I think we're looking for a journal-independent measure of impact. Certainly papers in some journals will have more citations just because they're read more, but that's also the opportunity for more impact. If you'd like to know "would the same paper have gotten as many citations/reader if it were published in another journal?", well that's a great question, but not one we can answer. Maybe some sort of naive Bayesian approach can find a classifier that predicts the number of citations? - Mr. Gunn
@Mr. Gunn - I agree, and was mostly supporting trying to remove ourselves from the idea of judging a given paper solely by the "Impact Factor" (trademark of ISI) of the journal it is published in. Sry if I wasn't clear. The broader question of how to assess the impact (i.e. the importance/influence/etc.) of a paper is a tough one. Citations are one kind of vote, downloads/views are another. Something like the Faculty of 1000 (esp if opened up to more people) ratings are again another. How to combine em? - NatBlair
Perhaps, rather than Impact Factor as predictor of expected cites for a given paper (bad for reasons NatBlair gives), we could use an extension of the h-index idea. If Deepak's h-index (or whatever) is >> Bill's, then we expect Deepak's latest paper to be cited more than Bill's. Further, if the "cumulative h-index" for Deepak's group is higher than that for Bill's group, we have more evidence for the same prediction; and so on for the department, university, and so on. - Bill Hooker
Aha, there's a word limit. Good, I need one. Last point: my suggestion does not solve the "rich get richer" problem, but I'm not sure any prediction method will... maybe an algorithm based on publication/cite histories of many authors? - Bill Hooker
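For reference, the h-index Bill builds on is the largest h such that an author has h papers with at least h citations each. A minimal sketch follows; the pooled group version is only one possible reading of his "cumulative h-index" (the thread does not define it), and both function names are hypothetical.

```python
def h_index(citations):
    """Largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank
        else:
            break  # counts only get smaller from here on
    return h


def pooled_h_index(per_author_citations):
    """ASSUMPTION: treat a group's 'cumulative h-index' as the h-index of all
    its members' papers pooled together. Co-authored papers would need
    de-duplicating first, which this sketch does not attempt."""
    pooled = [c for author in per_author_citations for c in author]
    return h_index(pooled)
```

Comparing h_index for two authors, then pooled_h_index for their groups, departments and so on, gives the layered evidence Bill describes, with the same "rich get richer" caveat he raises.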
I also think that ranking individual papers on a quality measure of the journal is not very appropriate. As others have said, it really is not a good approximation to the quality of the individual paper due to the long tail distribution (a few papers determine the impact factor). Unfortunately there is no easy/timely way to evaluate single papers. It would be great to have citations, number of downloads and/or citations per paper of the authors. - Pedro Beltrao
There was some related discussion over here: http://tinyurl.com/58s7jh Somewhat connected to Pawel's points - I talked with Julius Lucks and the folks at PLoS about trying to move methodology out of the (supplementary material of) papers and onto services like OpenWetWare. Then you could track uses of a specific protocol and get a sort of impact factor for that protocol as opposed to the papers that they are found in. Thinking about it - we could harvest PLoS to populate OWW as well :) - Cameron Neylon
I see the point about the long-tailed distribution, but I still think that people are underestimating the importance of the journal. I have no doubt that the exact same paper will be downloaded, read, and cited much more if it is published in Nature compared to if it is published in the Bangladesh Journal of Botany (not that there is anything wrong with that journal). Maybe the median number of citations is a better measure than the mean? - Lars Juhl Jensen
The H-index is interesting. I wonder if there's some way to back test it to see how predictive it really is. Like using 5-10 years ago to predict 1-4 years ago. - Mr. Gunn