OpenSci Info

Following open access, open data, standards, notebooks, and other information sharing that supports science research and education.
Twitter
Mike Chelen
Open-Access-Statistics project DINI - Deutsche Initiative für Netzwerkinformation - http://www.dini.de/projekt...
"Scientific Publications cover a wide variety of publishers, hosts, business models, usage models, publication stages, logical and technical presentation. Therefore it is important to learn which portions of the publication space can be and which agents want to be included in the sampling. For those willing to participate only two aspects are relevant: 1. What data needs to be gathered? 2. How can it be transferred to the statistics provider? Open-Access-Statistics (OA-S) is a joint project adressing these questions. Starting in July 2008 an infrastructure for the standardised accumulation of heterogenous web log data with an emphasis on institutional repositories will be built. In tight cooperation with the Network of Open Access Repositories (OA-N) various added value services will be made available to users." - Mike Chelen from Bookmarklet
Mike Chelen
OpenSci Info
@neilswainston wave promises independent servers and a more robust API compared with IM services, have to wait & see if it is delivered
Mike Chelen
auto download biotorrents from linux command line with rssdler and rtorrent #workinprogress - http://www.biotorrents.net/forums...
"download all files on the biotorrents rss (see http://www.biotorrents.net/links...) - on server using only command line interface server os: debian etch 4.0 and lenny 5.0 bittorrent programs: rtorrent http://libtorrent.rakshasa.no/wiki http://packages.debian.org/search... deluge http://packages.debian.org/search... rss programs: rssdler http://code.google.com/p... podget http://packages.debian.org/search... rss sources: http://www.biotorrents.net/rss... http://www.biotorrents.net/rssdd... http://www.biotorrents.net/rss_per..." - Mike Chelen from Bookmarklet
rssdler works nicely, podget might be easier to install though since there are packages for distros like debian - Mike Chelen
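The RSS step described above can be sketched in a few lines of Python. This is only an illustration of what a tool like rssdler does, not its actual implementation: the feed layout (torrent URLs in `enclosure` or `link` elements), the `torrent_links` helper, and the `tracker.example.org` URLs are all assumptions. The resulting URLs could then be fetched into a directory that the BitTorrent client is set up to watch.

```python
import xml.etree.ElementTree as ET


def torrent_links(rss_xml):
    """Collect one URL per RSS item, preferring <enclosure url=...> over <link>.

    Hypothetical helper: the real biotorrents feed layout is assumed, not verified.
    """
    root = ET.fromstring(rss_xml)
    links = []
    for item in root.iter("item"):
        enclosure = item.find("enclosure")
        if enclosure is not None and enclosure.get("url"):
            links.append(enclosure.get("url"))
            continue
        link = item.findtext("link")
        if link:
            links.append(link.strip())
    return links


# Made-up sample feed; the tracker host and file names are placeholders.
SAMPLE_RSS = """<rss version="2.0"><channel>
<title>example feed</title>
<item><title>dataset one</title>
<enclosure url="http://tracker.example.org/one.torrent" type="application/x-bittorrent"/></item>
<item><title>dataset two</title>
<link>http://tracker.example.org/two.torrent</link></item>
</channel></rss>"""

if __name__ == "__main__":
    for url in torrent_links(SAMPLE_RSS):
        print(url)
```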
Mike Chelen
Fwd: "Wikipedia for academic research". Post a summary of your research to increase its impact. http://acawiki.org/Home (via http://friendfeed.com/plosone...)
it looks like semantic mediawiki? - Mike Chelen
ah thanks, interesting to see which extensions they are using - Mike Chelen
Noteworthy: so far, 18 summaries of articles in PLoS Biol., 12 in PLoS Med. (None in PLoS ONE). - Jim Till
Perhaps noteworthy: the vast majority of edits in the last thirty days are by two people. It's hard to build critical mass... http://acawiki.org/index... - Andrew Su
the site looks pretty new, sometimes starting fresh produces the best results, although it can be useful to find some existing data to import initially - Mike Chelen
The site launched this week. The PLoS articles were seeded using the Editor Summaries that PLoS Bio and PLoS Med routinely publish - Peter Binfield
Peter: wondering if the data was converted from another format, or does PLoS supply RDF directly? is there any description of the process used? - Mike Chelen
We didn't work with them on this (other than to have some early meetings). I suspect they just copied and pasted... However, it can be extracted from our XML file of course. - Peter Binfield
Yes, it's copy and paste at present. A better workflow would be a good enhancement to Semantic MediaWiki if anybody's looking for a project! - Jodi Schneider
Matthew Todd
BioMed Central | about us | Duplicate publication - http://www.biomedcentral.com/info...
BioMed Central's policy on publishing open science defined: "Articles may be submitted to BioMed Central’s journals when data have been previously discussed or posted in such venues as blogs, wikis, social networking websites or lab electronic online notebooks. However, given the rapidly evolving nature of these resources, where discussion of data or manuscripts posted to these venues has subsequently been incorporated into the manuscript, the BioMed Central journal Editors may make their own assessment as to whether there may be duplication in the submitted manuscript." It's excellent that we are gaining some clarity here. To my mind the idea of open science is precisely to have "discussion of data or manuscripts posted to these venues" being subsequently "incorporated into the manuscript". In these cases the BioMed Central manuscript editor decides. We really need some test cases, and we need explicit policies from other publishers. - Matthew Todd from Bookmarklet
Alerted to this by email reply from Iain Hrynaszkiewicz. Initially prompted by this wikipage: http://canwepublishopenproject... - Matthew Todd
If I can just find some bloody time to sort out the paper I should be able to give you a test case pretty soon...submission to go to Biology Direct along with a snapshot of the online lab notebook. - Cameron Neylon
Great, Cameron (wouldn't it be great to be able to buy time?) Submission of grant applications of open science projects will require us to be 100% clear to our referees that open science can be published in good journals. The more publishers agree to accept this kind of paper, the stronger the case that open science can result in high quality, high-impact publications. It's an important message to make clear to those unfamiliar with, or healthily sceptical of, the concept. - Matthew Todd
My suspicion is that if I could buy time I wouldn't be able to afford it :-) - Cameron Neylon
Mat, is this what you are looking for as test cases? We published articles in 2 BMC journals - J. Cheminformatics and Chemistry Central - after writing the articles on a public wiki: http://usefulchem.blogspot.com/2009... and http://usefulchem.blogspot.com/2009... - Jean-Claude Bradley
I think the writing is explicitly covered by the policy. I guess it's the grey area around "discussion of results that later is incorporated" that we need to flesh out a bit more? Actually I think they're just giving themselves wiggle room for silly cases, to be honest, but being able to point to examples is helpful. - Cameron Neylon
Not BMC but http://frontiersin.org/neuroin... was composed mainly at http://en.citizendium.org/wiki... and partly at http://en.citizendium.org/wiki... . To make sure this would not cause problems, I had asked the journal beforehand. Also, by pasting the document versions before and after revision into the wiki, I could... - Daniel Mietchen
Good example Daniel - if we are looking more broadly than BMC I think most fully OA publications would accept pre-prints. I also suspect hybrids would not - for example AuthorChoice at ACS - but I don't know for sure. - Jean-Claude Bradley
Mike Chelen
Mike Chelen
"used as source for Amazon EC2 EBS snapshot: snap-7044d219" - Mike Chelen from Bookmarklet
Mike Chelen
Fwd: Accessing #NCBI Entrez web services with #YQL Yahoo Open Data Table - Egquery example https://login.yahoo.com/config... (via http://friendfeed.com/yql...)
convenient way to query some NCBI services, allowing some server-side processing with results available in XML or JSON - Mike Chelen
Mike Chelen
Mike Chelen
"Links any Document Object Identifiers for resolution with http://dx.doi.org" - Mike Chelen from Bookmarklet
currently works with plaintext DOIs like http://www.plosone.org/article... and with DOI class spans like http://www.nature.com/nature... - Mike Chelen
RegExp missing some chars, e.g., % if URL-encoded. We tend to use 10\.(?:\d{4})/(?:[^ "'<&]+) but that can break too - it's annoying that the DOI spec doesn't limit the chars allowed! This is for searching the whole text (incl. within href), not element-wise so might not suit your code exactly. - Fergus Gallagher
Fergus: going to take a look and give that a try, probably only will try to work with displayed text, but it might be cool to add compatibility with any kind of field attribute or xhtml. thanks! - Mike Chelen
One special case worth looking for is "COiNS" OpenURL. Our (jQuery) code is var z = jQuery(".Z3988:eq(0)"); if (z.length) { var r = z.attr("title").split("&"); for (var i=0; i<r.length; i++) { var x = decodeURIComponent(r[i]); if (/^rft_id=info:doi\/.*?(10\.\d\d\d\d\/.*)/.exec(x)) { var doi = RegExp.$1; doSomething(doi); return; } } } - Fergus Gallagher
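The two approaches in this thread (scanning text with the quoted regular expression, and pulling a DOI out of a COinS title attribute) can be sketched together in Python. This is a rough translation under assumptions, not the bookmarklet's or Fergus's actual code: the function names are made up, the DOI `10.1234/example.5678` is a placeholder, and the pattern keeps the four-digit prefix from the comment above, so it inherits the same limitations (for example, trailing punctuation is swallowed).

```python
import re
from urllib.parse import unquote

# Pattern quoted in the thread: "10.", four digits, "/", then anything up to
# a space, quote, "<" or "&". It can over-match, as noted above.
DOI_PATTERN = re.compile(r"""10\.\d{4}/[^ "'<&]+""")
RESOLVER = "http://dx.doi.org/"


def find_dois(text):
    """Return every substring of text that looks like a DOI."""
    return DOI_PATTERN.findall(text)


def doi_url(doi):
    """Build a resolver link for a DOI."""
    return RESOLVER + doi


def doi_from_coins(title):
    """Extract a DOI from a COinS title attribute (already HTML-decoded).

    Mirrors the jQuery snippet: split on "&", URL-decode each part, and
    look for an rft_id=info:doi/... entry.
    """
    for part in title.split("&"):
        decoded = unquote(part)
        match = re.match(r"rft_id=info:doi/.*?(10\.\d{4}/.*)", decoded)
        if match:
            return match.group(1)
    return None


if __name__ == "__main__":
    # Placeholder DOI, used only for illustration.
    for doi in find_dois("plain text mention of doi:10.1234/example.5678 here"):
        print(doi_url(doi))
    coins = "ctx_ver=Z39.88-2004&rft_id=info%3Adoi%2F10.1234%2Fexample.5678&rft.genre=article"
    print(doi_from_coins(coins))
```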
Mike Chelen
Mike Chelen
"OSCAR3 (Open Source Chemistry Analysis Routines) is software for the semantic annotation of chemistry papers. The modules OPSIN (a name to structure converter) and ChemTok (a tokeniser for chemical text) are also available as standalone libraries." - Mike Chelen from Bookmarklet
Mike Chelen
access PLoS article level metrics by DOI for use in Javascript and more - Mike Chelen
Mike Chelen
PathVisio / WikiPathways tool for creating and analysing biological pathway diagrams - http://www.pathvisio.org/
I'd really like to see that refactored as a collaborative Google Wave gadget. - Dan Hagon
Me too, re: Google Wave gadget. Then add SBML support, use SBGN in the display, support MIRIAM annotations and I can retire penniless. - Neil Swainston
@Neil, there are a few other tools which support SBML and SBGN (see http://sbgn.org/Communi...). WikiPathways seem to be inventing yet another pathway format and don't provide a conversion to any other existing "standard". Shame, as it would benefit everyone if they did. - Frank
Thinking about it a little more, I'd really like to see the above refactored as a collaborative Google Wave Gadget. I've been involved in about five network reconstruction "jamborees" now, which involve flying loads of people around the world to sit in a room and discuss things that they could do with PathVisio (if it supported SBML...) or Payao. Anyways, this costs a fortune (see the... - Neil Swainston
@Neil: WikiPathways is intended exactly for that type of collaborative pathway creation. WikiPathways pathway format is based on, and developed in cooperation with, http://www.genmapp.org. So admittedly it's not a widely supported standard, but at least it wasn't a complete new invention. SBML / SBGN support is on its way. Re Google Wave: unfortunately, all this work predated Google Wave by several years... - Martijn van Iersel
Mike Chelen
"Discussion of issues relating to the use of Debian for science research, including useful packages, particular problems faced by scientists using Debian, how to make Debian more useful to scientists, etc." - Mike Chelen from Bookmarklet
there are a number of useful software packages that have been integrated already, and debian makes a great foundation for science projects since it can run well on servers and desktops. it is also used as the basis for other popular derivatives like ubuntu - Mike Chelen
here are some of the packages currently available: http://blends.alioth.debian.org/science... - Mike Chelen
Mike Chelen
Symposium on the Data Sharing Plans and on the Scientific Benefits of Data Sharing in GEOSS - 16 Nov 2009 - http://sites.nationalacademies.org/PGA...
"The Global Earth Observation System of Systems (GEOSS) 10-Year Implementation Plan explicitly acknowledges the importance of data sharing in achieving the GEOSS vision and anticipated societal benefits. The Plan, endorsed by nearly 60 governments and the European Commission at the Third Earth Observation Summit in Brussels in 2004, highlights the following GEOSS Data Sharing Principles: 1. There will be full and open exchange of data, metadata, and products shared within GEOSS, recognizing relevant international instruments and national policies and legislation. 2. All shared data, metadata, and products will be made available with minimum time delay and at minimum cost. 3. All shared data, metadata, and products being free of charge or no more than cost of reproduction will be encouraged for research and education." - Mike Chelen from Bookmarklet
good to see a specific focus on "open exchange of data" and it will be interesting to hear the plans to achieve this - Mike Chelen
Mike Chelen
Student coalition for open access now represents over 5 million internationally - http://blogs.unimelb.edu.au/library...
"The student Right to Research Coalition, a group of national, international, and local student associations that advocate for governments, universities, and researchers to adopt Open Access practices, has now grown to include some of the most prominent student organizations from the United States and across the world. The recent addition of 8 new organizations brings the number of students represented by the coalition to over 5 million, demonstrating the broad, passionate support Open Access enjoys from the student community." - Mike Chelen from Bookmarklet
helps to appreciate the global reach and scale of these scientific concepts - Mike Chelen
Mike Chelen
NIH Notice on Development of Data Sharing Policy for Sequence and Related Genomic Data - http://grants.nih.gov/grants...
wonder which existing published statements from CC science commons and other groups might discuss some of the topics raised? - Mike Chelen
Mike Chelen
OpenSci Info
@mlangill @opensci is happy to help seed these torrents =) beginning with all_plos_pdf #opensci - http://twitter.com/mikeche...
Mike Chelen
Re: tasks overview wishlist: Canonical citing reference [Debian Science] - http://lists.debian.org/debian-...
"Dear all, last year, Michael opened a discussion to have bibliographic information displayed in package summaries: http://lists.debian.org/msgid-s... In the discussion that followed, we talked about where to store this information, and in which format, since adding more content to the debian/control file is not an easy thing (it ‘costs’ a lot because it goes to pivotal files like the Packages.gz files on our mirrors). A four line summary is available here: http://wiki.debian.org/DebianS... This year, some progresses are being made. For the display, Andreas has modified the ‘Web sentinels’ so that they can display bibliographic informations. See http://debian-med.alioth.debian.org/tasks... for instance. But currently the limitation of the system is that the bibliographic information is in a quite remote location, in the Blends ‘tasks’ files. I am currently working on a new workflow which would help the... more... - Mike Chelen from Bookmarklet
Mike Chelen
Mike Chelen
"Any object in Amazon S3 that can be read anonymously can also be downloaded via BitTorrent. Simply add a "?torrent" query string parameter at the end of the REST GET request for the object." - Mike Chelen
Is that a security risk or a feature? :) - Owen Greaves
Since the files are already public, it should be expected that people would download them. S3 bandwidth isn't free, so letting others help with distribution seems in the interest of the content author :) - Mike Chelen
with a little more info about how the .torrent information is generated, could make a fantastic basis for file distribution - Mike Chelen
This is a great feature of S3. Perl gurus might also want to see: http://search.cpan.org/~qantin... or http://code.google.com/p... - Todd Harris
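The quoted rule (append a `?torrent` query string to the GET URL of a publicly readable object) is simple enough to show as a small helper. This is a sketch based only on that quoted sentence: the function name and the `example-bucket` URL are made up, and nothing here checks that the object exists or is actually readable anonymously.

```python
from urllib.parse import urlsplit, urlunsplit


def s3_torrent_url(object_url):
    """Return the object URL with "torrent" as its query string.

    Hypothetical helper; refuses URLs that already carry a query string
    rather than guessing how the two should be combined.
    """
    parts = urlsplit(object_url)
    if parts.query:
        raise ValueError("expected a plain object URL without a query string")
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "torrent", parts.fragment))


if __name__ == "__main__":
    # Placeholder bucket and key, for illustration only.
    print(s3_torrent_url("http://example-bucket.s3.amazonaws.com/data/archive.tar.gz"))
```

The returned URL would be handed to a BitTorrent client or saved as a .torrent file; how the torrent metadata is generated on the server side is not covered here, as noted in the thread.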
Mike Chelen
Mike Chelen
PLoS articles by citation type, journal and publication year: 2009, 2008, and earlier sharp contrast - http://manyeyes.alphaworks.ibm.com/manyeye...
see which journals received each type of citation (scopus, crossref, and pubmed) and compare among the most recent years - Mike Chelen
Mike Chelen
Shows that PLoS One has an overall equal rate of citation to PLoS Biology, and that more of One's articles have been published in a recent year. - Mike Chelen
Has anyone generated a slightly nicer data object out of this data yet? Been thinking of graphing the correlations of downloads versus citations versus whatever and similar for different journals which really requires a bit of cleaning up the data to be effective but if someone has already done it? - Cameron Neylon
Cameron: what else needs to be done to make the data more usable? the source data here is available as TSV http://manyeyes.alphaworks.ibm.com/manyeye... and CSV or XLS too, is there any other format that would be better? - Mike Chelen
I was wanting to do some analysis that included comparing papers based on time of publication i.e. "what is the average trajectory of downloads?" as well as comparing these across journals so I was hoping someone might have converted to either SQL and/or a set of python objects containing lists of downloads/citations/pageviews by month. Not difficult to do myself but just wondered whether someone else had already. - Cameron Neylon
is this other ManyEyes dataset helpful? http://manyeyes.alphaworks.ibm.com/manyeye... it contains the "per day" figures for most of the metrics, including PDF and XML downloads. some spreadsheet and CSV files are also in a github repository http://github.com/mchelen... which might be convenient for import to SQL - Mike Chelen
That would certainly do one of the things I had in mind but the big problem I was having was with wanting to come up with average initial rates and saturation points to see if there are any characteristics of "hot" vs "slow-burn" papers. I saw some evidence of this in the very crude graph analysis I did when the stats first came out. - Cameron Neylon from twhirl
how could the change in rates for each article be determined given only the totals? while the plos website includes a chart of an article's recent history, the data released so far can show how older and newer articles compare in terms of downloads per day like PDF files http://friendfeed.com/mikeche... - Mike Chelen
Cameron: this shows which years and journals have articles with the most downloads per day http://friendfeed.com/mikeche... is this close to what you have in mind? - Mike Chelen
Looks very nice Mike. We should have all the 'missing' usage data (pre Aug 2005 and first 200 PLoS ONE articles) added sometime this week. - Peter Binfield
Peter: great thanks, looking forward to it! any preferences or suggestions about where people might want to look for or share data analysis results? - Mike Chelen
Mike, just realised that I've got a somewhat different dataset that I think hasn't been publicly released yet which includes all these parameters by month - but as Pete points out there are some dates missing. - Cameron Neylon
Actually Cameron, you don't. We have released the usage data down to the month level (and you may be referring to that), but not the citation/bookmarks/blogs/comment/notes etc data (although we track cumulative data on these items, by the month, we only started tracking it in March, so don't really have enough monthly data to release - though we could if people felt it was valuable) - Peter Binfield
Cameron: how about this to see differences in citation source (scopus, pubmed, crossref) grouped by journal: http://friendfeed.com/mikeche... or similarly to look at the combined citation types of each journal: http://friendfeed.com/mikeche... - Mike Chelen
Peter & Cameron: ah yes, there is history data in the other sheets, hadn't even started looking past the first one =) - Mike Chelen
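The "downloads per day" comparison discussed in this thread can be sketched in Python, given only cumulative totals and a publication date per article. Everything specific here is an assumption for illustration: the column names (`journal`, `published`, `pdf_downloads`), the sample rows, and the snapshot date are made up and do not reflect the layout of the released PLoS files. It averages total downloads over each article's age and groups the rates by journal and publication year; it does not attempt the initial-rate or saturation analysis, which would need the monthly history.

```python
import csv
import io
from collections import defaultdict
from datetime import date


def downloads_per_day(total_downloads, published, as_of):
    """Average daily downloads between publication and the snapshot date."""
    days = (as_of - published).days
    if days <= 0:
        raise ValueError("as_of must be after the publication date")
    return total_downloads / days


def mean_rate_by_journal_year(tsv_text, as_of):
    """Group per-article rates by (journal, publication year) and average them.

    Column names are hypothetical; adjust them to the real export.
    """
    rates = defaultdict(list)
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    for row in reader:
        published = date.fromisoformat(row["published"])
        rate = downloads_per_day(int(row["pdf_downloads"]), published, as_of)
        rates[(row["journal"], published.year)].append(rate)
    return {key: sum(values) / len(values) for key, values in rates.items()}


# Made-up rows, not real PLoS figures.
SAMPLE_TSV = (
    "journal\tpublished\tpdf_downloads\n"
    "Journal A\t2008-01-01\t732\n"
    "Journal A\t2008-07-01\t736\n"
    "Journal B\t2007-01-01\t731\n"
)

if __name__ == "__main__":
    summary = mean_rate_by_journal_year(SAMPLE_TSV, as_of=date(2009, 1, 1))
    for (journal, year), rate in sorted(summary.items()):
        print(journal, year, round(rate, 2))
```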
Mike Chelen
"fpocket is a very fast open source protein pocket (cavity) detection algorithm based on Voronoi tessellation. It was developed in the C programming language and is currently only available as command line driven program. A GUI is in development. fpocket includes two other programs (dpocket & tpocket) that allow you to extract pocket descriptors and test own scoring functions respectively. As the algorithm is very fast it can be used on a large scale level (PDB size for instance)." - Mike Chelen from Bookmarklet
Mike Chelen
Internet Archive: Free Download: CTPUG Meeting 13 - 2008 - Mayavi - http://www.archive.org/details...
"Speaker: Stefan van der Walt" - Mike Chelen from Bookmarklet
Mike Chelen
New Tools for Old Traumas: Using 21st Century Technology to Combat Human Rights Atrocities - http://www.americanprogress.org/events...
"One of the major developments in the human rights field over the past decade has been the increased application of new technologies, such as satellite imaging, database and data analysis tools, medical forensics, mobile phones, and social networking software to situations in which human rights are under threat. The convergence of scientific innovation and human rights advocacy may well represent a major breakthrough in the struggle for human dignity. Full realization of that promise will require far greater collaboration between government, business, the scientific community, and human rights NGOs than we have seen to this point. Our panel will describe ways in which new technologies are revolutionizing human rights work and make recommendations for how the U. S. government can play a leadership role in promoting the nexus between technology and human rights." - Mike Chelen from Bookmarklet
panelists include members of PHR and AAAS - Mike Chelen