Martin Fenner › Comments

Nils Reinton
"Over 80 million of free scientific articles, patents, theses and posters" - Nils Reinton from Bookmarklet
Pretty cool - could need an 'advanced search' option... - Björn Brembs
I didn't know this one. - Martin Fenner
Found this in a LinkedIn group. It's made by Brice Sagot, who writes "Please contact me if you have open access articles that are not indexed in this search engine (Open archives, universities websites …). Thanks !". So suggestions for improvements, I am sure, could be directed to him, through LinkedIn then I guess. - Nils Reinton
It's still "powered by Google Custom Search", so the claim "open access articles that are not indexed in this search engine" is far-fetched. Most universities have their robots.txt file set to keep spiders away. - Peter Dawson
Martin Fenner
Google Scholar Citations, Researcher Profiles, and why we need an Open Bibliography -
Martin, is there any insight in how open/close this Google initiative is going to be? Will there be an API, and a 'liberate' functionality like for other services? - Egon Willighagen
Egon, I would love to know. I hope to find out more in the coming weeks. - Martin Fenner
Martin Fenner
In search of the right literature search engine(s) -
Nature Precedings, No. 713. (25 February 2011) Background: Collecting scientific publications related to a specific topic is crucial for different phases of research, health care and ‘effective text mining’. Available bio-literature search engines vary in their ability to scan different sections of articles for the user-provided search terms and/or phrases. Since a thorough scientific analysis of all major bibliographic tools has not been done, their selection has often remained subjective. We have considered most of the existing bio-literature search engines and performed an extensive analysis of 18 literature search engines, over a period of about 3 years. Eight different topics were taken and about 50 searches were performed using the selected search engines. The relevance of retrieved citations was carefully assessed after every search, to estimate the citation retrieval efficiency. Different other features of the search tools... - Martin Fenner
Martin Fenner
Did you receive spam because you published a paper? -
Yes, and I've sent some. *ducks* An argument for preserving a way to connect with authors en masse, as research subjects: I don't know how to maintain these possibilities and avoid spammy spam. hmmm. - Heather Piwowar
I think there are better ways to contact corresponding authors. I'm obviously biased, but I think a unique author identifier such as ORCID would solve some of the problems. At the same time it is essential that the ORCID system is built in such a way that it doesn't make it too easy to spam people, e.g. a profile option "I'm OK to be contacted by questionnaires". - Martin Fenner
Yup, that could work! - Heather Piwowar
Bora Zivkovic
I WILL be at #solo11 this year, finally attending the sister conference to #scio12
Bought my ticket today as well. Looking forward to seeing you again, Bora! - Björn Brembs
Really excited to finally make it. Anton Zuiker and I will do a session... - Bora Zivkovic
I'm really looking forward to seeing both of you in London. The conference is both similar to and very different from #scio11. - Martin Fenner
Martin Fenner
Ten Simple Rules for Building and Maintaining a Scientific Reputation -
PLoS Comput Biol, Vol. 7, No. 6. (30 June 2011), e1002108. Philip Bourne, Virginia Barbour - Martin Fenner
Martin Fenner
Avastin At the FDA Today: Passion Should Lose -
wow that's a big molecule. - Andrew Lang
And a big story ); - Martin Fenner
Martin Fenner
Please join us for the Science Online London Conference in September -
I'm thinking about it... - Björn Brembs
Please go ahead and suggest one or more interesting sessions in the wiki. - Martin Fenner
Starting to look like I may be back in Europe so thinking about it now too, especially if there is a session on crowdsourcing research on disease outbreaks (on top of E. coli, BGI is now trying to do something similar with Scarlet Fever). - Scott Edmunds
Daniel Mietchen
Anyone know of examples of extensive amounts of supplements to published papers? Especially looking for multimedia supplements. seems to be the current record holder at PLoS ONE, with 43 supplementary files. No multimedia, though - just figures and tables. has 27 supplementary files, of which 24 are videos. - Daniel Mietchen
Yes, Neil, and thanks. I was asking a bit in the other direction because the Neanderthal genome paper was already on our list. If anyone knows of a supplement exceeding these 175 pages (which includes 51 figures and 58 tables), though, please share it here. - Daniel Mietchen
Thanks for pointing me to the Neanderthal paper. Used the info in a presentation today. - Martin Fenner
Just noticed that the above comment by @neilfws is gone. I was sad to see him leave, but I understood the reasoning behind that. I do not understand why he deleted the account. For the record, the paper he mentioned is at . - Daniel Mietchen
What happened to neilfws? - Björn Brembs
Björn, Daniel: I assume he simply forgot the lesson of Eva Amsen (she also left FF and deleted her account), assuming that it's like Facebook: you delete your account and everything stays anyway. - Pawel Szczesny
Martin Fenner
Go To Hellman: Our Metadata Overlords and That Microdata Thingy -
Eric Hellman's thoughts on - Martin Fenner from Bookmarklet
Martin Fenner and Pre-Existing Communities -
"I have been reading tweets and blog posts expressing various levels of disappointment and unhappiness about not using RDFa, not using Microformats or not having been developed in the open with the community. Since other people’s perspectives differ from mine, I feel compelled to write down my take." - Martin Fenner from Bookmarklet
Pierre Lindenbaum
I'm waiting for #pubmed 'Author ID': (mid 2011)
I hope this will be shared with ORCID! - Björn Brembs
Agreed - Egon Willighagen
Expect public ORCID service in Spring 2012, developer access a little bit earlier. Will aim to work with PubMed Author ID (and other author identifier systems). - Martin Fenner
I like the end of it where they say "rapidly evolving area" :) ... I obviously have no idea of the challenges involved but from the outside it looks nothing like "rapidly evolving". - Pedro Beltrao
lol @Pedro: indeed! - Björn Brembs
Martin Fenner
ScholarlyArticle - -
Initiative for structured content by Google, Microsoft, Yahoo. Also includes scholarly article. - Martin Fenner from Bookmarklet
Martin Fenner
Data Citation Awareness -
"This guide is intended for eResearch infrastructure support providers and researchers.  It includes some suggestions about citing data as well as discussing issues around data citation and activities underway to develop a culture of data citation." - Martin Fenner from Bookmarklet
Björn Brembs
Wordpress for scientists -
Congratulations! - Martin Fenner
@Martin: Do I understand correctly that your BibTex plugin is not set up for automatic installation via the WP dashboard? - Björn Brembs
Heather Piwowar
Still on the hunt for #Scopus access. Can it be found in the British Library on a Sunday? I'd thought so, but no luck yet. Stay tuned....
do you want me to check something for you or do you need to sit down for a while with it? - Christina Pikas
I need to do about 25 large queries and export the results. I think I'm getting close to a path.... should get access on Thursday. If I don't have success in the next few weeks, I might come knocking to seek collaboration, if we could make that work? - Heather Piwowar
absolutely. just let me know. - Christina Pikas
I would also be happy to help in one way or another. Sciverse developer access sounds like the best solution. - Martin Fenner
Thanks, Martin! The response has been fantastic :) I'll let you know if I need to take you up on offers... - Heather Piwowar
Heather Piwowar
Has anyone thought of citation-shortening, like url-shortening but as work-around for max # cites rules in journals? ugly but necessary? ...
... esp for citing large amounts of data reuse. See here for early version of idea . Mostly thinking of shortening not of each citation but rather as a pointer to a registry entry that expands into all of the actual citations, for instances when you need to cite many datasets, for example - Heather Piwowar
... trick of course to get citation databases to index the expanded versions and link, but in their best interest. - Heather Piwowar
... thought inspired by @cboettig comment on recent blog post: - Heather Piwowar
Citation shortening is often done by citing review papers. - Martin Fenner
True! But that "method" doesn't work as well for citing datasets :) Other ideas for getting around the max#cites issues for mega data reuse? Meta-analyses often include citations in table or supp mat, but then not part of the citation databases, so less credit and links. - Heather Piwowar
How about we campaign to kill off the max # of cites rule, instead? Even those dinosaurs who insist on a print version can print selected references with a note "full refs online". So there's no good reason for arbitrary restrictions on length, # figs, # cites, etc. (Pre-emptive: yes, restrictions can help authors focus and prevent rambling. So can editors, who are always telling me about all the value they add.) - Bill Hooker
Agreed, that would be great, but I think it will be hard. It requires many journals to buy in. At least some journals care a lot about the online and paper copies being the same. - Heather Piwowar
I don't have any good ideas, but I agree with Bill. Any limit to citation length should have a very good reason in the modern day. I don't think citing review articles is very often effective; it just compounds the citation context problem even further. - Steve Koch
Someone suggested this for FigShare saying, I've uploaded 20 datasets and I want to cite different groups of them in different papers, could we have a tool where you group different sets of databases and this generates a new handle (or DOI). It's a great idea, just not top priority on the dev list right now. - science3point0
cool! yup. ideally the mapping would be stored somewhere that citation databases could easily index what the new handle points to. Does that mean that DataCite should handle it, I wonder, so that the mapping is part of the metadata for the collapsed citation and easily accessible to citation databases? - Heather Piwowar
If you look at how the ISA-tab community's metadata tools structure things into investigation, study and assay, using these should provide some of that type of functionality, as many individual (and combinations of) datasets can be grouped under a single investigation. I think they are currently working with DataCite, so future versions of their tools should hopefully allow DOIs to be created for much larger citable units. - Scott Edmunds
[I should declare that we (BGI/GigaScience) are working to give our databases an ISA-tab compliant structure, although a lot of the EBI and a growing list of other databases use it too] - Scott Edmunds
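As an illustration of the "pointer to a registry entry" idea in this thread, here is a minimal sketch in Python. It is a toy, in-memory model only: the `CitationRegistry` class, the `cite-set` handle prefix and the dataset names are all hypothetical, and nothing here reflects how DataCite, FigShare or ISA-tab tools actually work.

```python
from dataclasses import dataclass, field


@dataclass
class CitationRegistry:
    """Toy in-memory registry mapping one short handle to many citations."""

    prefix: str = "cite-set"  # hypothetical handle prefix, not a real DOI prefix
    _sets: dict = field(default_factory=dict)

    def register(self, citations):
        """Store a group of citations and return a single short handle."""
        # Sort and deduplicate so the same group always maps to the same handle.
        key = tuple(sorted(set(citations)))
        for handle, existing in self._sets.items():
            if existing == key:
                return handle
        handle = f"{self.prefix}/{len(self._sets) + 1}"
        self._sets[handle] = key
        return handle

    def expand(self, handle):
        """Return the full citation list that a handle points to."""
        return list(self._sets[handle])


registry = CitationRegistry()
handle = registry.register(["dataset-B", "dataset-A", "dataset-C"])
expanded = registry.expand(handle)
```

A paper would cite only `handle`, while a citation database could call something like `expand` to credit each underlying dataset. Where that mapping should live, and who keeps it persistent, is the open question raised above.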
Steve Koch
To my library friends: question about where to publish original research about library operations ... question from our recent cyberinfrastructure day
At our "cyberinfrastructure day" yesterday, I met a colleague from the UNM library. She approached me with a question about how to publish some research she had done on some aspect of library operations in her work. (Sorry I forget the details.) Her specific desires: (A) to get the information out there without waiting for peer review and (B) to publish in a form that will allow others to build on what she starts. She is also a little concerned about (C) credit, but not as much as A and B. She was thinking Wikipedia when she approached me, but I thought it was not the best venue, since it's mostly original research that she wants to publish. A publicly editable Google Doc or just a wiki on wikispaces or other generic wiki site is an option. But I figured I'd ask library science friends on here if there's a better venue to reach a Library2.0 audience? Thanks! - Steve Koch
I would suggest writing a knol. It's citeable and offers reasonably good metrics. PLoS is already using knols as a publishing platform for their Currents journals. - Jan Jensen
The Digital Libraries section of arXiv could be an option that would cover A and C. - Martin Fenner
Thanks! Great ideas. I will send her a link to this thread. Thanks! - Steve Koch
Exactly the kind of thing we are building Knowledgeblogs for. Can issue DOIs etc. Please get in touch if you might be interested in setting one up. Also check and for examples and for more about how we go about it - Daniel Swan
Martin Fenner
Direct links to figures and tables using component DOIs -
excellent timing, Martin, CTT is preparing to do exactly this from No. 11 onwards - thanks - Claudia Koltzenburg
Claudia, what is your preliminary experience with component DOIs? Are they difficult to implement for a journal? And will you offer them for tables and figures, or also for other content? - Martin Fenner
we are preparing to start on component DOIs so this is exactly what we are looking into at the moment, Martin, so far we were thinking of tables and figures only, and now there is another little golden pearl from Martin Fenner in our to-do bucket to relish :-) will keep you updated, - Claudia Koltzenburg
new question: can PMC deal with component doi in their DTD? (CTT has just passed PMC content review successfully) - should probably check PLoS xml @PMC - Claudia Koltzenburg
I don't know the details, but the example I use in my blog post uses this URL in PubMed Central: In other words, both PubMed Central ID and component DOI. - Martin Fenner
thanks, will find out more - Claudia Koltzenburg
DataCite Hannover branch (our doi registration agency) says CTT is the first they know to be explicitly interested in component doi, they are ready to register them. Now we need to find out more about how we can save extra xml export hours by combining the NLM schema that is focussed on text with the DataCite schema that was created for Data, (re NLM and component doi we will now talk to ZBMed in Cologne) ... and then there are figures, tables, films, audio - would OAI-ORE be of any help? - Claudia Koltzenburg
does anyone know if anyone else is finding out if component doi are difficult to implement for a journal? keen on others' experiences, esp. from fields relevant for PMC, actually keen to learn about some short track, too :-) - Claudia Koltzenburg
Martin Fenner
PubMed and beyond: a survey of web tools for searching biomedical literature -
Database, Vol. 2011 (18 January 2011) The past decade has witnessed the modern advances of high-throughput technology and rapid growth of research capacity in producing large-scale biological data, both of which were concomitant with an exponential growth of biomedical literature. This wealth of scholarly knowledge is of significant importance for researchers in making scientific discoveries and healthcare professionals in managing health-related matters. However, the acquisition of such information is becoming increasingly difficult due to its large volume and rapid growth. In response, the National Center for Biotechnology Information (NCBI) is continuously making changes to its PubMed Web service for improvement. Meanwhile, different entities have devoted themselves to developing Web tools for helping users quickly and efficiently search and retrieve relevant publications. These practices, together with maturity in the field of text mining, have led to an increase in the number and... - Martin Fenner
Martin Fenner
Nice overview. Comments: updated downloads only once a year?? But proper software can fix that, no? And the launch date is a bit sad. It's a shame that projects that involve big organizations think in years, rather than months or weeks. I would love to see ORCID adopt the Release Soon, Release Often paradigm. Simple solutions, quick turn-around, lowering of costs. - Egon Willighagen
Egon, Principles describe the minimal requirements, and a promise we can keep. It's possible that we can provide datasets more often than once a year. As to the speed of the project, I think it is very important that the different stakeholders are involved in all stages of development. It would be easy to launch a system with all the requirements sooner, but you need the support of the community. But I still hope that developers can start playing with the API at the end of this year. - Martin Fenner
Ah, good to know :) And it's a good approach. About the time lines... I think the community is eager to get going... I also strongly believe they would be happy with alpha systems where things are expected to change. I would suggest to not be afraid to release something drafty, where you clearly mark it as such. That has worked best in Open Source, for decades now, and I see no reason... - Egon Willighagen
Egon, slide 28 is actually a screenshot of the Alpha system. If you don't worry about the ORCID identifiers and APIs changing in the coming months, you can start using the development system now. Just send me an email. - Martin Fenner
Me too! :-) - Björn Brembs
peter murray-rust
Southampton’s Blog3 and ScholarlyHTML -
Blog3 looks really interesting, and Ruby on Rails is a great platform. But it is very difficult to build another blogging tool that gets enough traction - the established players are several years ahead. - Martin Fenner
It's not really intended as a blogging platform but as a more generic science communication platform but I haven't seen recent iterations. - Cameron Neylon
Yes, it's very much not just a blogging platform. It's the RDF triples that do it, I understand - allowing you to interrogate the data in a much more useful way. Also very nice automatic data pulling from Chemspider. - Matthew Todd
I think there is scope for generic routines that could be used in other blogging platforms - peter murray-rust
Are there other 'reviews'? Jeremy is interested in them, for future funding applications... - Egon Willighagen
Claudia Koltzenburg
"I'd just read the methods section and be done with it." - asks for component doi, doesn't it?
Good point about using component DOIs not just for tables and figures. Direct link to methods, discussion, bibliography also important. I don't know yet whether you can nest component DOIs, e.g. a figure in the results section. - Martin Fenner
now, nesting is a *really* good point, CTT will test this, too. Further, I wonder what we could learn from microblogging & Co. here? any ideas? E.g. if I referred to this passage <begin>a figure in the results section<end> from your reply without quoting it in quotation marks but putting a link/pointer behind "nesting" that highlights exactly those 6 words that your example consists of - wouldn't that be cool? where is this done? - Claudia Koltzenburg
would there be any use cases for generating component doi the moment they are needed by readers? sth like on-the-fly? - Claudia Koltzenburg
How would on-the-fly work with DOIs? Aren't they persistent identifiers? - Martin Fenner
yes, wasn't sure how to describe my idea better so said "sth like" but meant only one aspect of "on-the-fly", namely the inception: created when needed (and should then be a persistent identifier)... best of both worlds-like. What do AML geeks say to this? - Jason Priem to the thread, please :-) - Claudia Koltzenburg
I'm here! And I'm useless! Not only am I not an AML geek, I don't even know what it stands for.... But I think the idea of on-demand, arbitrarily-scoped DOIs is interesting (and why just for things smaller than a paper? A collection of papers could have a DOI, too.) On-demand DOIs would be key for annotations, which are ultimately just special cases of papers. - Jason Priem
thanks, & sorry, I meant ALM ;-) can you imagine any use case in this regard - "Article Level" Metrics? - Claudia Koltzenburg
Daniel Mietchen
Is there any journal that permits submitting articles in HTML or other web-native formats?
what about equations? - marcin
I'm fine with any web format accepted for submission by a scientific journal (possibly even blogs using or similar). There is a list of wiki-based journals at , so I am mainly after non-wiki journals in this thread (and not after PLoS Currents either, which has many similarities with wikis). - Daniel Mietchen
Let's use EPUB, please! Web native (HTML+CSS zipped with metadata) -- but also easy to grab for offline use. Reflowable. Supported by several major mobile devices and software packages. - Jodi Schneider
There are some: publishes in WordPress and has received HTML submissions. - Jodi Schneider
Background on EPUB: . - Daniel Mietchen
Update: - "XML submission, editorial, publication and dissemination workflow". - Daniel Mietchen
What about journals that fully rely on Open Journal Systems? - Paulo
Seems possible, but I don't know of any OJS-based journals that would allow submissions to be in HTML or XML or other online format. - Daniel Mietchen
Information Research requires XHTML - Heather Piwowar
Re "Let's use EPUB": How do you *get* to ePub from a common writing tool...without adding new software costs to your current situation? (A real question: So far, my attempts--using CALIBRE--have resulted in really crappy ePub. And neither OpenOffice nor Word has ever heard of ePub as an output format.) - walt crawford
Try eCub to convert from XHTML? - Mike Chelen
Mike: Thanks...although, AFAIK, getting to fully-formatted XHTML from Word or OO is also non-trivial. I must be missing something (probably true)... - walt crawford
Walt: What happens when converting to ePub from OO HTML file? It seems to be XHTML compliant and loads ok in eCub. Alternatively plain text could be used as the source format. - Mike Chelen
Mike: Actually, the problem with OO is that it does a truly crappy job of importing style-based Word documents, basically throwing away most formatting. The "alternative" answer is saying "don't do formatting." - walt crawford
I apologize for the threadjack. Maybe many/most articles shouldn't be formatted anyway. I'm dealing with non-scientific book-length items, and losing all the formatting in order to do ePub doesn't work. But that's irrelevant to Daniel's original question. Sorry. - walt crawford
Sorry, I missed this discussion back in August. Not a journal, but arXiv allows manuscript submissions in HTML: - Martin Fenner
thread picked up here re OJS-based journals: - Claudia Koltzenburg
SIGIL is another good tool for creating ePub documents. Apple's answer to Word, Pages, can save in ePub format. - Jan Jensen
Since epub keeps coming up in such discussions, does anyone have a pointer to a good analysis of its strengths, weaknesses, alternatives? As far as I can tell, it seems to work fine for novels and such, but if you want to use it to publish data-intensive papers with lots of tables, figures and equations, it is much less useful. - Daniel Mietchen
ePub is basically XHTML packaged with the CSS and image files into a .zip archive. So it looks as good (or bad) as the XHTML. An example ePub from the PLoS Comp Biol paper for the Beyond the PDF workshop is here: Needs a lot more work, particularly with references and tables, but looks very readable to me. - Martin Fenner
This is a video shot by someone who was driving a car along a road beside the seawall. It is truly a miracle that they survived!! - Ami Iida
As for Bashō, it is . An interesting video, though. - Daniel Mietchen
Walt: Are the Word files in an XML based format such as DOCX? Maybe there is a better tool than Open Office to do the conversion to HTML, or directly to EPUB, since they should all support CSS ok. - Mike Chelen
btw, Libre Office now - Claudia Koltzenburg
Followup: The newest version of LibreOffice does a better job of importing Word documents. I might try an ePub output one of these days. But, Mike, "maybe there is a better tool" begs the question: For ePub to succeed more broadly, us poverty-stricken writers can't be told to go buy more tools. - walt crawford
Walt: Hopefully a conversion between XML based formats should be more practical. Then a variety of tools can include such a feature, hopefully including free and open source utilities. Great to hear about LibreOffice, will have to check out ePub support again. - Mike Chelen
Word2010 *is* XML-based (as was Word2007), that is, .docx. - walt crawford
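To make the "XHTML packaged with CSS into a .zip archive" description from this thread concrete, here is a minimal sketch using only the Python standard library. The file names, the title and the `urn:example:article-1` identifier are placeholders, and the package is deliberately incomplete (it has no navigation file), so a strict validator would probably reject it. It only shows the container layout.

```python
import io
import zipfile

# Tells reading systems where the package file lives inside the archive.
CONTAINER_XML = """<?xml version="1.0" encoding="UTF-8"?>
<container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
  <rootfiles>
    <rootfile full-path="OEBPS/content.opf" media-type="application/oebps-package+xml"/>
  </rootfiles>
</container>
"""

# Package file: metadata, a manifest of all files, and the reading order (spine).
CONTENT_OPF = """<?xml version="1.0" encoding="UTF-8"?>
<package xmlns="http://www.idpf.org/2007/opf" version="2.0" unique-identifier="bookid">
  <metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
    <dc:title>Example article</dc:title>
    <dc:language>en</dc:language>
    <dc:identifier id="bookid">urn:example:article-1</dc:identifier>
  </metadata>
  <manifest>
    <item id="article" href="article.xhtml" media-type="application/xhtml+xml"/>
    <item id="css" href="style.css" media-type="text/css"/>
  </manifest>
  <spine>
    <itemref idref="article"/>
  </spine>
</package>
"""

ARTICLE_XHTML = """<?xml version="1.0" encoding="UTF-8"?>
<html xmlns="http://www.w3.org/1999/xhtml">
  <head>
    <title>Example article</title>
    <link rel="stylesheet" type="text/css" href="style.css"/>
  </head>
  <body><h1>Example article</h1><p>Body text goes here.</p></body>
</html>
"""

STYLE_CSS = "body { font-family: serif; margin: 1em; }\n"


def build_epub(files):
    """Zip the given {path: text} mapping into EPUB-style bytes."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        # The mimetype entry comes first and is stored without compression.
        archive.writestr("mimetype", "application/epub+zip",
                         compress_type=zipfile.ZIP_STORED)
        for path, text in files.items():
            archive.writestr(path, text, compress_type=zipfile.ZIP_DEFLATED)
    return buffer.getvalue()


epub_bytes = build_epub({
    "META-INF/container.xml": CONTAINER_XML,
    "OEBPS/content.opf": CONTENT_OPF,
    "OEBPS/article.xhtml": ARTICLE_XHTML,
    "OEBPS/style.css": STYLE_CSS,
})
```

Writing `epub_bytes` to a file ending in `.epub` gives something to open in a reader. The hard part discussed above, getting well-formatted XHTML out of Word or LibreOffice in the first place, is not addressed by this sketch.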
Andrew Lang
well done - I like the slide about meetings - looks like a lot of fun :) - Jean-Claude Bradley
Thanks Jean-Claude. Style inspired by Cameron. - Andrew Lang
Very nice. - Martin Fenner
Missing a slide about the odds of landing such a job, though. - Bill Hooker
Martin Fenner
Cover Image created by ePub Export Wordpress Plugin -
Updated ePub Export Plugin 1.1 automatically creates cover images. Makes library view look much nicer. - Martin Fenner
Cameron Neylon
What does "themselves" mean? Personally? RT @petermurrayrust #acsanaheim Kuras SOAP 15% of chemists paid OA fees themselves
Good (important) question. - Bill Hooker
The number is actually much higher for some other disciplines (look at slide 52): - Martin Fenner
Mr. Gunn
Here's a Github repository of citation-style-language (CSL) styles. Let's get forking, scholars! #forking -
Is there a good tutorial on how to use CSL? Say, I have JSON from CiteULike (or Mendeley), how do I use CSL to create nicely formatted HTML? Do I need to hack the CSL styles to get HTML output? Do I need to add RDFa to the HTML? In short, where can I find the "CSL Design 101"? - Egon Willighagen
egon +1 What should be easy and obvious is buried somewhere... - Noel O'Boyle
CSL Primer is a good start: - Martin Fenner
Martin, that tells me how to change a CSL sheet, but not how I am expected to use it to convert JSON to XHTML+RDFa... that's what I am looking for... - Egon Willighagen
See ; I know that citeproc-js outputs HTML, and other CSL processors probably do as well. Also, CSL processors aren't very well suited for generating RDFa (CSL is designed for presentational instead of structural output). - Rintze Zelle
How do I convert this to BST for LaTeX, or vice versa? - joergkurtwegner
So far, you don't :). - Rintze Zelle
So, CSL destroys semantics... is that a design choice, or merely current collateral damage? What are the intrinsic problems that make it unsuited for HTML+RDFa (or HTML with microformats) generation? - Egon Willighagen
A design choice. I guess the fundamental problem is that content manipulation with CSL is rather limited (e.g. access to (sub)fields, character escaping, etc.). It's much easier (and more robust) to use a real programming language to create structured output, and I think there's no reason why such a library couldn't be bundled with a CSL processor. Zotero uses JavaScript translators for this, but these depend heavily on the Zotero infrastructure. - Rintze Zelle
@egon - there's nothing stopping people from adding RDFa support to CSL processors (depending on how they're designed I suppose). In fact, my (far from complete) Python implementation uses HTML + RDFa (+ a CSL-specific attribute or two) as its internal model, and so is designed to be able to output RDFa as the core format. - Bruce D'Arcus
@egon - on how to use, you're looking for documentation related to different implementations, the most relevant of which would be citeproc-js (for server or client-side javascript) or pandoc/citeproc-hs (for a wicked fast haskell version that can be used with markdown). - Bruce D'Arcus
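The point made in this thread, that structured output is easier to produce with a general-purpose language than with a CSL style, can be sketched as follows. This is a hand-rolled formatter, not a CSL processor: it does not read any CSL style and is no substitute for citeproc-js or citeproc-hs. The `item_to_html` function is hypothetical, the input follows the CSL-JSON field layout, and the schema.org property names are illustrative choices rather than a recommended mapping. The sample values come from an entry earlier in this feed.

```python
from html import escape

# Sample item in CSL-JSON shape; the values come from an entry earlier in this feed.
ITEM = {
    "type": "article-journal",
    "title": "Ten Simple Rules for Building and Maintaining a Scientific Reputation",
    "author": [
        {"given": "Philip", "family": "Bourne"},
        {"given": "Virginia", "family": "Barbour"},
    ],
    "container-title": "PLoS Comput Biol",
    "issued": {"date-parts": [[2011, 6, 30]]},
}


def item_to_html(item):
    """Render one CSL-JSON item as an HTML paragraph with RDFa Lite attributes."""
    # Each author becomes its own annotated span so the names stay machine-readable.
    authors = ", ".join(
        f'<span property="author">{escape(a["given"])} {escape(a["family"])}</span>'
        for a in item.get("author", [])
    )
    # CSL-JSON stores dates as nested lists: [[year, month, day]].
    year = item["issued"]["date-parts"][0][0]
    return (
        '<p vocab="http://schema.org/" typeof="ScholarlyArticle">'
        f"{authors} "
        f'(<span property="datePublished">{year}</span>). '
        f'<span property="name">{escape(item["title"])}</span>. '
        f'<em>{escape(item["container-title"])}</em>.'
        "</p>"
    )


html_output = item_to_html(ITEM)
```

The citation layout here is fixed in code, which is exactly what CSL styles are meant to avoid. A more realistic design would let a CSL processor handle the presentation and add the semantic attributes in a separate step.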
Martin Fenner
Why aren't more publishers providing their citation style in CSL format? Submitting a paper should be fun not a pain
The PLoS journals are among those not providing a CSL style for download. Only Endnote - Martin Fenner
PLoS should remedy that, IMHO. - Björn Brembs
peter murray-rust
Scholarly HTML – latest thoughts -
One important aspect of Scholarly HTML for me is that we should try out things and see what works and what doesn't. Instead of spending the next 12-24 months trying to define what Scholarly HTML is and should look like. - Martin Fenner
Wasn't that what you all were doing at the workshop? - Egon Willighagen
There wasn't enough time for hacking. I have a half-ready bibliography tool for Wordpress that I hope to finish soon. One question I struggle with is how best to integrate CSL and whether to use citeproc.js or citeproc.php. - Martin Fenner
Please use the .js one... more general, and I might be able to use it for my efforts then... creating HTML from CiteULike JSON... - Egon Willighagen
Citeproc.php was written for the Drupal Biblio module. Integrating it with a Wordpress module would make some sense. But I hear you. - Martin Fenner