Jonathan Eisen
Am looking for systems for my lab to make electronic lab notebooks - suggestions? wiki? OWW? software?
Jonathan - we're collaborating with the group at Southampton (see http://www.ourexperiment.org/racemic...), which is basically blogging functionality, but we are hoping to enhance it as we use it, e.g. to have better automatic interactions with department instruments. It was important for us to be involved with a group that is actively changing its system in response to our needs. It was also crucial to be involved with an open platform. - Matthew Todd
I like WordPress blogs, versioned, subscribe to RSS feeds of your students' work, HUGE GPL community. - Dave Lunt
Depends a lot on what you want from it and what kind of work it is supporting. The Southampton systems are getting a lot better and, possibly more importantly, a lot easier to add functionality onto. For some types of science they are probably good enough - for others, as I guess Mat is finding with wanting to put chemical structures in a native way, it's not quite there yet. I find it quite good for experimental work where there are fairly stereotyped experiments being done, and OK if the experiments keep changing. It's less good currently out of the box for computational work, but it is probably pretty easy to add functionality to autoblog from existing analysis pipelines. I've been working on some Python libraries and hoping to do some Dropbox integration soon. - Cameron Neylon
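To make the autoblogging idea concrete, here is a minimal Python sketch of posting an entry when an analysis step finishes; the endpoint URL, token and payload fields are hypothetical placeholders, not part of any of the systems discussed in this thread.

    # Minimal sketch: post a blog entry automatically when an analysis step finishes.
    # The endpoint URL, token and field names below are hypothetical placeholders.
    import datetime
    import requests

    BLOG_API = "https://example.org/api/posts"   # hypothetical lab-blog endpoint
    API_TOKEN = "replace-with-your-token"        # hypothetical credential

    def autoblog_result(title, output_path, parameters):
        """Push a short post describing one pipeline step and its output file."""
        payload = {
            "title": title,
            "body": "Output written to {0}\nParameters: {1}".format(output_path, parameters),
            "tags": ["autoblog", "pipeline"],
            "timestamp": datetime.datetime.utcnow().isoformat(),
        }
        resp = requests.post(BLOG_API, json=payload,
                             headers={"Authorization": "Bearer " + API_TOKEN})
        resp.raise_for_status()
        return resp.json()

    # Example: call this at the end of an existing analysis script.
    # autoblog_result("BLAST run 42", "results/blast_run42.xml", {"evalue": 1e-5})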
Steve Koch's group are the masters with using OpenWetWare, and WordPress could probably do a lot of what you want if you've got people prepared to do a bit of PHP wrangling... I think the main thing is not to think "I want an ELN" but to think hard about what you want to capture, and how important it is to structure that. A group DropBox folder that people dump Word files into can work just fine if what you want is just a backup and notification mechanism. - Cameron Neylon
I've long wanted somebody to work with Peter Sefton to make ICE into an ELN (http://adlaustralia.org/idea200...) - Bill Hooker
CAMERON - I want a system where people record EVERYTHING they are doing in their research with links to all data, analyses, output, etc. And I want access to it from anywhere. And I want to be able to search it intelligently. Dropbox won't cut it. - Jonathan Eisen
I built our lab site on drupal, and I have considered adding this sort of functionality into it. Development Seed's custom drupal distribution Openatrium (http://openatrium.com/) looks pretty cool too as an "intranet in a box." I can see that being compatible with your concept. - Walton Jones
@Walton - interesting - Matthew Todd
Jon - OK, that's a big ask to make it work. Technically all this is do-able and it is exactly the concept we are working on. Capture everything, and make connections between things. So what is next on my list is trying to connect DropBox to our blog system. This would mean that dropping any file in gets uploaded - the idea being that a web service is watching the subscribed DropBox and that it can make some intelligent guesses about what the filetype is - then push it to the blog with appropriate metadata. The blog in turn provides functionality that makes the linking-up process relatively easy. What we don't have at the moment is any way to effectively leverage the graph that you build up. On the other hand, because we expose everything, Google does that for us reasonably effectively, but not very intelligently. But I think the key for you won't be the technical capability of a system, it will be having user interfaces that work well for your group. But I'd be very interested in trying to figure that out. See http://dl.dropbox.com/u... for a recently rejected grant. - Cameron Neylon
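A rough sketch of the folder-watching idea Cameron describes, assuming the Dropbox folder is synced to a local directory and that some separate upload function pushes files to the blog; the watched path and the upload callable are placeholders, not an existing service.

    # Poll a synced Dropbox folder, guess each new file's type, and hand it to an
    # upload function with basic metadata. Paths and the upload hook are assumptions.
    import mimetypes
    import os
    import time

    WATCH_DIR = os.path.expanduser("~/Dropbox/lab-notebook-inbox")  # hypothetical folder

    def watch_and_push(upload, poll_seconds=30):
        """Poll the synced folder; call upload(path, metadata) for each new file."""
        seen = set(os.listdir(WATCH_DIR))   # files already present are ignored
        while True:
            for name in os.listdir(WATCH_DIR):
                path = os.path.join(WATCH_DIR, name)
                if name in seen or not os.path.isfile(path):
                    continue
                seen.add(name)
                mime, _ = mimetypes.guess_type(path)        # crude filetype guess
                metadata = {
                    "filename": name,
                    "mimetype": mime or "application/octet-stream",
                    "size_bytes": os.path.getsize(path),
                }
                upload(path, metadata)                      # e.g. post to the lab blog
            time.sleep(poll_seconds)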
We've been using Wikispaces as the lab notebook and Google Spreadsheets for numerical data for a few years and it meets our requirements. Both have version tracking and we have code that enables full archiving of the notebook and associated raw data files. They are free and hosted so nothing to install and maintain locally. Google Spreadsheets has a nice API for querying and visualization of data if used as a platform to organize data collected from multiple experiments. - Jean-Claude Bradley
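For illustration, one way to query a Google Spreadsheet from code today is the third-party gspread library; the credentials file, spreadsheet name and column name below are placeholders, and this is not necessarily the API Jean-Claude's group uses.

    # Pull rows from a Google Spreadsheet and filter them in Python.
    # Spreadsheet title, credentials file and column name are hypothetical,
    # and the "solubility_M" column is assumed to hold numeric values.
    import gspread

    gc = gspread.service_account(filename="service-account.json")  # placeholder credentials
    sheet = gc.open("solubility-measurements").sheet1              # placeholder sheet name

    rows = sheet.get_all_records()   # one dict per row, keyed by the header row
    high_solubility = [r for r in rows if r.get("solubility_M", 0) > 0.5]
    print(len(high_solubility), "measurements above 0.5 M")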
@Frank that sounds good. Are there easy ways to link a normal CMS to Git/SVN? They tend not to be very user friendly - a bad interface reduces uptake of the whole system, in my experience. Git plugins maybe for WordPress/Plone/Drupal? - Dave Lunt
I don't think there is any Git plugin for WordPress, but if memory serves there are some wiki frameworks built on a Git backend. I still think worrying about the back end is putting the cart before the horse. Your average lab would probably not benefit from using a versioning system unless they understand what it's for. If they do, then they can probably just use it at the command line in most cases. It's how the front end interacts with day-to-day research activities and the way they are recorded (or not) that matters. - Cameron Neylon
An issue tracker such as Trac integrates well with revision control systems, and includes wiki functions. This provides mostly annotation and discussion for the repository, so contributors must still push changes to the code itself. - Mike Chelen
WordPress is great for logging stuff, but it kind of fails at the intelligent search aspects, unless there's something I'm not aware of. - Mr. Gunn
@MrGunn Good point, I agree. Still, it kind of just works as an ELN, and it's easy to actually start with today. http://thisiscolab.com/blog... - Dave Lunt
My requirements for an ELN are: (1) Support separate projects. (2) Complete text search. (3) Version manage all content. (4) Everything can be linked and tagged. (5) Display syntax-highlighted code snippets. (6) Display images, video, etc. (7) Synchronize with my PDF bibliography of references. (8) Embeddable intelligent spreadsheets. (9) Display RSS feeds and provide an RSS feed. (10) Provide user comment forms. (11) Include an idea forum to propose, rank and track ideas. (12) Bug reporting & tracking for errors. (13) Can edit offline and then push and pull content to the server. (14) Easy to add/update/edit all content from any browser. ** Additionally, two very valuable concepts for an ELN: (15) Linking to other data: a central service that connects me to the content of other scientists using it. Makes it easy to share, discover, connect, link to and use their data/code/references/etc. (16) Linking to other researchers: should connect me to a community of scientific users. Be able to see how they do things, find the right experts and get answers to questions. OWW + Mendeley + GitHub + a handful of other plugins accomplish this for me reasonably well at the moment. - Carl Boettiger
I'm wondering how many people here are experimental (wet-lab) people, and how many are theoretical/computational/electronic people? A lot of the things I'm interested in are to do with capturing the messy process of what happens in an experimental lab so that it can be searched and shared. Is that what everyone else has in mind? - Matthew Todd
My problem is I do both and want one system. Sent from my iPad - Jonathan Eisen from email
I've been working on this tool that seems to answer your needs: www.symbyoz.com. Give it a try? - Joel
Neil, I think the main distinctions are that in the wetlab things aren't automatically captured in the same way that they all sit on a disk somewhere for computational work. One of the big problems I have is getting people to see the value in making a record of all the samples that they create. There are very few systems that make this easy and natural. And it seems to make no sense when most of them get destroyed, but it's the anchor point of any workable system that tracks relationships. Another key issue can be the simple practical issues with having easy access to a computer in a wetlab. This is a big issue in chemistry, usually less so in a biolab. - Cameron Neylon
Actually I'd also throw your comment back at you. Descriptions of computational _process_ would be a lot less problematic if computational scientists stopped thinking that just collecting all the outputs was job done and learnt how to keep a high quality wet lab style notebook :-) Versioning systems capture outputs but rarely is there a good record of what _happened_ beyond some sort of commit message or a log file. - Cameron Neylon
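One lightweight way to capture what _happened_, alongside whatever a versioning system stores, is to log each computational step as it runs. The sketch below is only illustrative: the log path and record fields are arbitrary choices, not an existing standard.

    # Append one JSON record per computational step to a notebook log file,
    # recording more of "what happened" than a commit message would.
    import datetime
    import functools
    import json

    LOG_PATH = "notebook-log.jsonl"  # hypothetical; could live alongside the code in the repo

    def record_step(description):
        """Decorator: log the function call, its arguments and when it ran."""
        def wrap(func):
            @functools.wraps(func)
            def inner(*args, **kwargs):
                started = datetime.datetime.utcnow().isoformat()
                result = func(*args, **kwargs)
                entry = {
                    "step": description,
                    "function": func.__name__,
                    "args": repr(args),
                    "kwargs": repr(kwargs),
                    "started": started,
                    "finished": datetime.datetime.utcnow().isoformat(),
                }
                with open(LOG_PATH, "a") as log:
                    log.write(json.dumps(entry) + "\n")
                return result
            return inner
        return wrap

    # @record_step("trim adaptors from raw reads")
    # def trim_reads(input_fastq, output_fastq): ...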
I agree with the sentiment but not the details. I think a versioning system is overkill and not really the right paradigm for the lab scientist in most cases. In my experience you generate the data once, and then it doesn't change much. You process it, generating new stuff, but the kind of cycling, tweaking, and branching that versioning systems are built to support doesn't really apply as much. You need to capture all of that stuff into a repository of some sort, and versioning should be provided by the back end anyway, but it's building a good capture system - one that records the relationships between these objects - that's important, and conventional versioning systems don't do that. The key problem is the interface between the physical samples and the digital data IMO - if we can make that easy then we're a lot of the way there. A versioning system for samples is what I'd like to see... - Cameron Neylon
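As a toy illustration of a "versioning system for samples", a registry could give each physical sample an id, record which sample it was derived from, and link it to data files. Everything below is an assumption for the sake of the sketch, not an existing tool.

    # Each sample gets an id, an optional parent sample, and links to digital data.
    # Storage format and field names are illustrative only.
    import itertools
    import json

    class SampleRegistry:
        def __init__(self):
            self._counter = itertools.count(1)
            self.samples = {}

        def register(self, description, parent_id=None):
            """Record a new sample, optionally derived from an existing one."""
            sample_id = "S{0:05d}".format(next(self._counter))
            self.samples[sample_id] = {
                "description": description,
                "parent": parent_id,
                "data_files": [],
            }
            return sample_id

        def attach_data(self, sample_id, path):
            """Link a measurement or analysis file to the sample it came from."""
            self.samples[sample_id]["data_files"].append(path)

        def dump(self, path="samples.json"):
            with open(path, "w") as out:
                json.dump(self.samples, out, indent=2)

    # reg = SampleRegistry()
    # parent = reg.register("crude reaction mixture, expt 12")
    # child = reg.register("column fraction 3", parent_id=parent)
    # reg.attach_data(child, "nmr/expt12_fraction3.fid")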
There are so many different kinds of data. One researcher might be managing thousands of microscope images every day, while in the same time another could record only a half dozen numerical results. Saving revisions of the images could be pointless if each is only written once and never modified. Whereas putting a document under revision control is a great way to ensure that nothing important is lost over many edits. There are also large ranges in the interfaces for versioning software, between something like Dropbox where file history is automatically stored, and Git where each commit may be manually entered. - Mike Chelen
Ah OK. I think we have a philosophical difference here then. I don't see processing as versioning, for much of my work at least, because it usually generates new objects of new types. I do agree that where you are manipulating a single object in a repeated way, perhaps with branching, then versioning (and branching) is a good way to think about it, but I'm less sure that it is a useful way of thinking about most of what I do. And my fundamental objection remains that versioning systems (generally) fail to provide a good way of capturing or thinking about the _process_ that converts one thing to another. So I think the provenance problem or the process problem is the more interesting one. I don't think we fundamentally disagree, just the emphasis on what's important is different. - Cameron Neylon
...and I do think that versioning systems (including branch and merge) should be a basic feature of any file system. Just not sure that they need to be surfaced for most users in many use cases. - Cameron Neylon
I'd go further actually, generating, storing, analysing and publishing research objects, explicitly including samples and other physical objects. And I think the "computational thinking" approach might be even better applied to the physical world. - Cameron Neylon
Cameron I think that the requirements for version tracking may differ between labs. In my lab we often have lots of undergrads recording their lab notebooks and it would be unusual if there was no error correction for an experiment at some point. It is also not uncommon that the Google Spreadsheets we use to record the raw data and show the mathematical processing often initially contain errors or omissions. This is why for us the ability to periodically take a snapshot of the entire notebook and linked data files is so important. One key reason is the ability to cite a specific archive from manuscripts written at a certain point in time. - Jean-Claude Bradley
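A minimal sketch of the periodic-snapshot idea, assuming the notebook pages and raw data have already been exported to local directories; Jean-Claude's actual archiving code may work quite differently.

    # Bundle a local export of the notebook pages and linked data files into one
    # date-stamped zip that can be cited later. Directory layout is an assumption.
    import datetime
    import os
    import zipfile

    def snapshot(notebook_dir, data_dir, out_dir="archives"):
        """Write archives/notebook-YYYY-MM-DD.zip containing both directories."""
        os.makedirs(out_dir, exist_ok=True)
        stamp = datetime.date.today().isoformat()
        archive_path = os.path.join(out_dir, "notebook-{0}.zip".format(stamp))
        with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as zf:
            for top in (notebook_dir, data_dir):
                for root, _dirs, files in os.walk(top):
                    for name in files:
                        zf.write(os.path.join(root, name))
        return archive_path

    # snapshot("wiki-export", "raw-data")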
If you want everything consistently done in git the documentation part could be handled with toto - "a git-powered, minimalist blog engine" (http://www.cloudhead.io/toto) (not sure if you need commenting functionality) - Konrad Förstner
We've been very happy with Unfuddle (http://www.unfuddle.com). It includes notebooks, though I don't know how appropriate these would be for experimental data. The integration of the Subversion repository with tickets / milestones / projects is very nice. - Ruchira S. Datta
Biological data, at the point of capture, is really messy, and you often have to iterate several fast optimization kinds of assays or experiments before you hit on the set of conditions under which you can capture clean or meaningful data. Then once you do get close to clean, it's off to the next set of optimizations for the next thing. That's the dynamic that computational tools fail to capture and why I always fell back on Excel. You never have the luxury of working on one experiment long enough to get it tweaked so that the data is clean enough to be run through a workflow without tons of manual intervention. If grad students were techs, doing the same set of assays over and over within defined conditions, yeah, that would work great, but things are always changing, so the more optimized the routine, the smaller the number of situations in which you can use it. - Mr. Gunn
A bit late to the party (via Cameron's post on the Daily Scan), but back in February I was sat in a dull seminar and made these notes for a Data Analysis Deposition System, e.g. to store all the R code used to turn data X + Y into results Z:
- load source code for analysis pipeline
- check data dependencies, on submission and periodically in future (e-mail owner if links go down)
- libraries for code interpretation e.g. R, C, SPSS
- What do physicists use?
- Easiest if pipeline integrated with development of analysis, otherwise might be hard to summarise months of work (this is itself a motivation)
- Most obviously useful for bioinformaticians, but in theory could be applied to any data analysis
- Many interested parties: Universities, Publishers, Government (Government operations services?)
- 'Legal' obligations - OPEN SOURCE!
- Separate DOIs for data, methods, analysis
- Modular research. Sub-version control.
- What about private databases, e.g. Soay sheep? Only applicable if full data released in full.
The idea is motivated by transparent research and publicly available data. Links on data access: http://data.gov.uk/ http://news.bbc.co.uk/1... http://www.gapminder.org/ http://www.ted.com/talks... http://data.un.org/ What do all the genetic epidemiologists use? Things to consider: Minimum information http://www.mibbi.org/index... and data audits promoted by JISC http://www.dcc.ac.uk/tools... Some good attempts: Labkey https://www.labkey.org/ and Omixed http://www.omixed.org/documen... Both of these are mostly about specifying exactly what data is. As such they have to be hard coded and heavily maintained. I was thinking more of a light, linking system to show analysis flow and store links on versions etc. Components would include: data (versions), software (versions), actions and flow, intermediate data (versions), results (versions). Systems could be... - Dave
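As a sketch of the light linking system Dave describes, each analysis step could be recorded as one small record tying inputs, software and outputs together by version or content hash. The field names and the choice of hashing are illustrative assumptions, not an existing system.

    # Append one provenance record per analysis step, linking data, software and
    # results by content hash / version string. Field names are illustrative.
    import datetime
    import hashlib
    import json

    def file_version(path):
        """Identify a data file by the SHA-1 of its contents."""
        with open(path, "rb") as f:
            return hashlib.sha1(f.read()).hexdigest()

    def link_step(action, inputs, software, software_version, outputs,
                  log="analysis-flow.jsonl"):
        """Record a single step in the pipeline and what it connected."""
        record = {
            "action": action,
            "inputs": {p: file_version(p) for p in inputs},
            "software": {"name": software, "version": software_version},
            "outputs": {p: file_version(p) for p in outputs},
            "recorded": datetime.datetime.utcnow().isoformat(),
        }
        with open(log, "a") as out:
            out.write(json.dumps(record) + "\n")

    # link_step("fit mixed model", ["data/x.csv", "data/y.csv"],
    #           "R", "2.11.1", ["results/z.rds"])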
Open source www.Bikalabs.org develops a web- and CMS-based LIMS - used in the chemistry, agriculture, water quality, environment and inter-laboratory sectors. A public health lab branch is currently in development. Your requirements are research orientated; batching per project and full text admin are gaps in Bika that can be plugged. Full Plone CMS functionality is available. Analysis workflow and results management, all content types, LiveSearch etc. are well covered. The next version will be in Plone 3 with full document versioning, which is done manually currently. - lemoene
I've been using git for curating my (text based) data analysis in the social sciences, and I've been impressed with its usefulness for keeping a very detailed history of what I've been doing. The only problem is for large binary objects, but apparently that's being addressed. Gitalist: http://search.cpan.org/perldoc... is a front end to git which can be easily modified to provide the kinds of features useful for researchers (standalone and multi-user -- and there are lots of bioinformatics workers familiar with that kind of code too). - k d
There have been some very interesting discussions here, and a lot of questions that are commonly asked. There appears to be a lot of discussion about using software that has not been specifically designed to be an ELN, and some people find that these are perfect for what they need them to do. However, it is important to outline your requirements (as has been done above). I work for Kinematik (www.kinematik.com), which makes a series of different modules to support R&D, ranging from an ELN to integrated project management, laboratory resource management and more. This is all available through a browser and so it can be accessed from anywhere. There are different options available from all vendors to help suit budgets etc., such as hosted or SaaS. - Aaron Norman
I'm developing a collaboration tool for researchers called SciTecMed [ http://www.scitecmed.com ]. It is like a mashup of dropbox and github. I'd appreciate an opportunity to show the demo and learn more about your needs. - SciTecMed
hey scitecmed - drop me a line - am interested in this jonathan.eisen@gmail.com - Jonathan Eisen
I just discovered this interesting discussion. Two systems for electronic lab notebooks that I don't think have been mentioned so far are: eCAT, a commercial system from Axiope http://www.axiope.com/ and Yogo / Neurosys http://neurosys.msu.montana.edu/ . I haven't tried either myself as yet. - Mark Longair