CSV aint much use without the metadata associated with the bizarre directory structure you're using to store the Excel files :-)
- Cameron Neylon
And I've got to say, I'm not scared of command line tools but if I want to quickly munge, normalize or check some data, then Excel is usually the fastest way to do it. Maybe I type too slow...
- Cameron Neylon
I'm in danger of ranting here. How many times can I click the Like button?
- AJCann
heh, not gonna happen. Here's the use case - Researcher has columns of data & wants to a) do stats b) make a graph. He throws it into Excel, selects the ranges, and gets the result. Next time, his data is different, has different dimensions, different labels, different steps in analysis. Almost everything is a once-off, for which Excel is easier, and it stays this way until you're much later in your project, at which point all your data is already in Excel, so you just keep using it.
- Mr. Gunn
Mr Gunn, I think that's the really key point: the fact that you're doing a different thing every day and you don't start off with a nice collection of scripts to do the processing for you. Not saying it wouldn't be better if you did have them or that it wouldn't be worth building them up, but once you start down one path it's hard to go down another, esp if no-one around you is.
- Cameron Neylon
The other thing that command line people miss is the simple comfort of being able to actually see the data. It also places an interesting limitation on when excel is useful. Once you get beyond a set of things that can be comfortably managed on one screen is when people start to think about shifting to other platforms.
- Cameron Neylon
+1 Cam and MrG. Once you get beyond a dataset that you can see on one screen, maybe with a bit of scrolling, then Excel loses some of the comfort factor. But in addition, when you are always doing small-scale datasets, each one different from the last, it's just as easy to fire up Excel. Where I think "those researchers" lose out is that an Excel habit probably limits what you think to...
- Bill Hooker
I'd *love* to learn more stats and better tools to go with my new mathematical understanding. But I'm not going to, because I simply don't have time. So how about this, data nerds: as well as building me tools, give me some guidelines for making my data palatable to you, so that I can turn to you for help and advice as readily as I now turn to Excel?
- Bill Hooker
Provide better tools... like R? Sorry, I do not understand the question.
- Egon Willighagen
Egon +1 ... I say this as someone who uses Windows to run one and only one application, Excel, which I like a lot. It's great to summarize some pieces of information, do simple pivots, but if you want to do any serious data analysis, data mining, pattern recognition, etc, forget it. The people we get frustrated by are the ones who do multi-worksheet joins. That's asking for trouble, as is having 10000 rows. Is there a place for Excel, yes, but if you want to do serious data analysis on complex datasets, no.
- Deepak Singh
I'm curious too - what else are people going to use? There aren't a lot of tools that are easy to use for entering and processing data. I hope they aren't suggesting researchers use Access instead.
- Elizabeth Brown
Even at EBI where Windows is like wearing a sign round your neck saying 'I'm not a proper geek' people have to take account of the fact that Office is one pervasive little suite (well, Word and Excel anyway) and develop for it sometimes (though the ISA infrastructure stuff just apes it, in Java). When I had a real job I used to generate reams of data which I processed with scripts...
- Chris
MySQL is easy .... has some very nice interfaces and can be learned. R is harder, but if you really care about statistical analysis on large scale data, you can learn that. But the better approach, IMO, is to have a software and informatics team that make data and information available to query and visualize. Where a lot of the complex munging has already been done, and you are now...
- Deepak Singh
Bill H: "Once you get beyond a dataset that you can see on one screen...then Excel loses some of the comfort factor." Interesting. My current project has multiple pages, each 1,305 rows, about 140K cells in all. I was an analyst/programmer for 4 decades. Excel works like a champ for me (with the label row always frozen, of course): What *should* I be using?
- Walt Crawford
Deepak's message sounds fine for Big Science. Is there a solution for Small Independent Research? I mean, "software and informatics team..." not gonna happen.
- Walt Crawford
Is it a bit early to talk about cloud-based mechanical-turk-y core analysis facilities? Either for the pre-processing (as Deepak describes) or the full service (with coauthorship).
- Chris
Walt, that's a good point. However, if we train people correctly and get the funding bodies to see the light. Otherwise, we are not in good shape
- Deepak Singh
Walt, I'm not the one trying to tell you what you *should* be using. I'm gonna step out of this conversation because it pisses me off to be told I'm doing it wrong, ask how to improve, and be told basically "oh, just use the tools we have spent our careers learning... oh, it should only take you a few hours to learn". Feh. This is called Experts' Myopia, and I suspect I am guilty of it...
- Bill Hooker
Bill: Good point. I'm not a scientist at all, but even us lay researchers sometimes amass fairly large datasets--and, frankly, Excel is an awfully good tool with a very gentle learning curve. I just wondered what I was missing...and think I know. "Get the funding bodies to see the light" is going to mean $0 for any independent operator.
- Walt Crawford
Let me step back a little. I am talking about understanding the ability to choose your clustering algorithm, try and find correlations, figure out time series in multivariate systems. There is a class of analysis for which excel is just fine. There are deeper analytical approaches which require understanding of algos, data models, etc. To be able to proceed without those is like...
- Deepak Singh
I'm continually impressed by the number of researchers with no command-line or programming experience in my field who are rapidly mastering or trying to learn R. There's a sense in ecology and applied phylogenetics that you'll need R for anything you do, and muddy-boot field biologists fill every workshop offered. I think this kind of motivation + peer training has certainly started to make in-roads...
- Carl Boettiger
Carl, that's certainly my hope (from someone who does not know much R)
- Deepak Singh
Our bioinformatics core facility does semi-regular classes for members of our institute in R, perl, unix, galaxy, genome browsers, etc. I was actually thinking we should do an Excel class because so many people use it -- if they want to keep using it, they might as well learn it really well. I think part of my job as a bioinformatician is to help biologists learn to use computational tools.
- Madelaine
That's a great idea. It might actually be useful, especially if you can highlight where the relative strengths lie and make the end user make the determination how they are going to use the tools.
- Deepak Singh
"Optimizely is a dramatically easier way for you to improve your website through A/B testing. Create an experiment in minutes with our easy-to-use visual interface with absolutely no coding or engineering required."
- Paul Buchheit
I had a pretty ropey day today. Failing for 45 minutes to do some simple algebra. Transformation overnight that didn't work...again...and just generally being a bit down what with the whole imploding British science funding situation. But in the midst of this we did one very cool and simple experiment, one that worked, and one that actually has some potentially significant implications. The only thing is...I can't tell you about it.
- Cameron Neylon
Yeh, but good to surface it from time to time...just do the check..."yup - this is still a problem...". And the rest of the day was just so entirely shit that it was the kind of thing you want to tell everyone about to cheer yourself up...
- Cameron Neylon
I feel your pain; nothing I do can be in the open for the immediate future, and it grates my cheese something fierce.
- Bill Hooker
I actually find this post refreshing, since it shows that Cameron isn't in some special position where he can happily do open science all the time without resistance, but has to juggle it with old school 'closed' science. It serves as an example that most academic researchers could integrate an open science project or two into their research programs without having to go fully open overnight.
- Andrew Perry
Yeh, definitely a good point Andrew. Hadn't thought about it that way.
- Cameron Neylon
I don't think there is a contradiction Cameron - projects (associated with a specific notebook) are ONS not necessarily people.
- Jean-Claude Bradley
No, I don't think it's necessarily a contradiction. Just that I believe we could make much more rapid and interesting progress if we could open it out a bit.
- Cameron Neylon
Cameron can I ask what benefit you get from this particular closed project?
- Jean-Claude Bradley
It's a connection between a project I happen to know about and another one that we're involved in supporting. So I get the warm feeling of connecting two things together, if it does take off we get some big papers and/or patents, and if it gets used and adopted we've achieved our aim of increasing the impact of the facility in general terms. So I guess credit by association with success, which is essentially what I set up as my own criterion for success in the job.
- Cameron Neylon
The real irony of course being the connection was only made by skating around the edges of what should and shouldn't be disclosed...which make the whole process really irritating...which is what sparked the post in the first place.
- Cameron Neylon
"ThinkUp captures your posts, replies, retweets, friends, followers, and links on social networks like Twitter and Facebook. We'll be adding more networks in the future. ThinkUp stores your social data in a database you control, and makes it easy to search, sort, filter, export, and visualize in useful ways."
- Bill Hooker
Interesting....anyone got it up and running yet?
- Cameron Neylon
Looking to try and implement this in some way on S3.0, will keep you posted. Thanks Bill
- science3point0
Started having a play around on the science3point0.com test site here if anyone is interested, limited by apps at the moment, not sure what they all do? : http://www.collabbook.org/thinkup...
- science3point0
I've got it running and archiving my tweets. I'll let you know how it goes. The people behind it are top notch.
- Mr. Gunn
I was thinking that if it could be ported to a web app, it might be a FF replacement...
- Bill Hooker
What do you mean? Surely it *is* a web app.
- Matt Leifer
You're right, I mis-spoke. (mis-wrote?) I don't mean "ported", I mean hosted; you need your own server to run it at the moment, and I can't imagine that hosting such a thing for a sizeable community would be cheap or easy so it's not something that we could, for instance, browbeat MrGunn into doing for us. :-)
- Bill Hooker
I am not sure how it could work as an FF replacement. As far as I understand, it does tweet archiving, pulls in @ replies to your tweets and does analytics. These are useful functions for an individual twitter user, but I don't see how you can make a FF-like community out of it. There are other open source projects that aim to replicate sites like FF a lot more closely than this, e.g....
- Matt Leifer
Yeah, I'd be happy to set anyone who wants up with an account (with the caveat that I may take it down at any time and if anything breaks - tough shit ;-) ), but it's really for archiving and analytics of activity on other networks, not a network itself.
- Mr. Gunn
To counter this problem, journals should demand that authors submit sufficient detail for the independent assessment of their paper's conclusions. We recommend that all primary data are backed up with adequate documentation and sample annotation; all primary data sources, such as database accessions or URL links, are presented; and all scripts and software source codes are supplied, with instructions. Analytical (non-scriptable) protocols should be described step by step, and the research protocol, including any plans for research and analysis, should be provided (see http://go.nature.com/UaF2Kv). Files containing such information could be stored as supplements by the journal.
- Kubke
RT @bluezaki: RT @natpryce @aaronpk: There are only 2 hard problems in Computer Science: cache invalidation, naming things, and off-by-one errors. /via @t
@Cameron. It is from a survey they sent out.
- Andrew Lang
in that EOS article I blogged, they mentioned a journal that just describes datasets like that... Earth System Science Data. Interesting
- Christina Pikas
related: http://www.biomedcentral.com/1756-05... "By publishing Data Notes (often called “data papers” by other publishers), authors in BMC Research Notes can publish peer-reviewed articles that briefly describe a biomedical data set or database, with the data being readily accessible and attributed to a source." (ObDisclosure: I'm a BMCRN assoc editor, it's a volunteer position)
- Bill Hooker
Christina, Earth System Science Data is the journal for the Pangaea project we talked about earlier today. I could provide contact information...
- Martin Fenner
+1 Neil Why create yet another journal for this? Why not use the existing repositories?
- Mr. Gunn
@Neil, MrG: a cynic would say that Prestige Journals want the Prestige model to carry over to everything, so here they are re-inventing a wheel so that they can slap their brand on it...
- Bill Hooker
Bunch of spreadsheets described by free text and a Nature logo? Not entirely convinced that this would enhance our lives greatly.
- Neil Swainston
I disagree, as I think that where to deposit primary research datasets is an unsolved problem. Supplementary information is obviously not a good solution. And there just aren't repositories (institutional or domain-specific) for all datasets, let alone standard formats.
- Martin Fenner
2 things: (a) how searchable are existing 'general' data repositories (eg datadryad, CRData), what kind of metadata is included in these datasets and how will this compare to the NPG system? (b) what is the best system for long-term stability: a funding-based model or a branding-driven business model?
- Thomas Lemberger
Bear in mind that this is a user survey, not a press release. ;) Also, letting people *upload* documents in any format doesn't mean that a system couldn't process them into a format more suited to discovery.
- Euan
Ah, yes, branding. I forget how much those within an institution care about it (relative to how little those outside care.)
- Mr. Gunn
I had a bit of a rant at a Science Online London panel session on Saturday. As usual when discussing scientific publishing, the dreaded issue of the Journal Impact Factor came up. While everyone complains about metrics, I've found that people in general seem remarkably passive when it comes to challenging their use. Channeling Björn Brembs more than anything else, I said something approximately like the following.
- Cameron Neylon
"...as professional measurers and analysts of the world we should be embarrassed to use JIFs to measure people and papers. It is quite simply bad science." Hear, hear!
- Bill Hooker
Totally OT - I would gladly pay to watch Aussies, Bill and Cameron play tennis (other games may be applicable) and all the proceeds be donated to a worthy cause. #anyonefortennis?
- Graham Steel
Ironically enough, just yesterday I filled in a form required for an application for a professorship, where they wanted to know how many papers I had in which IF journals. Should I be interviewed, this particular position would be so important for me, that for the first time ever, I would probably not say anything about this embarrassing use of the IF, which would normally disqualify them as employers immediately.
- Björn Brembs
Björn, was that Göteborg? I was trying to find the details, but they are apparently using that as part of the official job application process... did not find those details yet, though...
- Egon Willighagen
Now, and that makes it even more embarrassing, it was here in Berlin. When I interviewed in Uppsala I did not see any of this nonsense. If I get the professorship, you can be sure there'll be a lecture or two about IFs. And there will be figures with forms from certain universities...
- Björn Brembs
This whole sad discussion reminds me: why isn't there a tool available, that allows people to construct their own citation list??? I've been doing this by hand for years now: http://bjoern.brembs.net/citatio...
- Björn Brembs
Yes, I use it, but it only pulls from GS which is neither as user friendly nor as 'accurate' as WoS/Scopus. So I use all three, de-duplicate by hand, copy and paste into an HTML editor and format (also by hand). I don't know of any other way to do this. To stay on topic: I think this is currently the best way to replace IF counts when evaluating people: use actual citations.
- Björn Brembs
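The by-hand de-duplication described above could be scripted roughly as follows. This is only a sketch under stated assumptions: the record layout (dicts with `title`, `year` and an optional `doi`) and the function names are hypothetical, and it does not use or imitate any export format or API of Google Scholar, WoS or Scopus.

```python
import re


def normalize_title(title):
    """Lowercase a title and collapse punctuation and whitespace for matching."""
    return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()


def merge_citations(*sources):
    """Merge lists of citation records, keeping the first copy of each paper.

    Each record is a dict with 'title' and 'year' and an optional 'doi'
    (a hypothetical layout). Two records are treated as the same paper if
    they share a DOI, or a normalized title plus year.
    """
    seen = set()
    merged = []
    for records in sources:
        for rec in records:
            keys = {("title", normalize_title(rec["title"]), rec["year"])}
            doi = (rec.get("doi") or "").lower()
            if doi:
                keys.add(("doi", doi))
            if keys & seen:
                # Duplicate: remember any extra identifier it carries, then skip.
                seen |= keys
                continue
            seen |= keys
            merged.append(rec)
    return merged
```

Title matching this crude will miss variants such as translated or abbreviated titles, so the merged list would still want a quick check by eye.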
In Poland JIF is used every single time scientists are evaluated (whether it's a grant or a new position). Also, quite often a lecturer on a science seminar is introduced with mentioning her/his "total IF points". We have also "ministry points" (from Ministry of Science). These are awarded in the same manner as IF - per publication (points are also awarded for writing a syllabus for a...
- Pawel Szczesny
@Björn We're working on something with the intention of delivering this and PM-R has been arguing a lot recently for open citations and open metrics.
- Cameron Neylon
I would have thought that using JIF in a job application process would open an organization up to being sued...
- Cameron Neylon
@Pawel - that was kind of the basis of the prestige vs outcomes riff that I most recently wrote about in the interview with Michael. It's a perfectly reasonable decision for a country, particularly a small country to go for prestige as a way of making a mark. But they shouldn't expect that to lead to either a viable, stable, or particularly valuable research community. If you want those things then you need to optimise for them (which is harder to measure obviously, but most important things are)
- Cameron Neylon
I was thinking the other day about changing my cv and instead of listing 'my publications' start listing the papers that cite my papers (first order) and those that cite those first order papers (second order)) (or some quantification of that sort based on 'order'). A visualization of it could be fun to do too. Then I start wondering whether I should wait until I am out of my continuation period ....
- Kubke
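The "order" idea above can be sketched as a breadth-first walk over a citation graph. This is an illustration only: the `cited_by` mapping (paper to the papers that cite it) and the paper identifiers are assumed inputs, and, as noted further down the thread, getting hold of that data is the hard part.

```python
def citation_orders(cited_by, my_papers, max_order=2):
    """Group citing papers by citation order using breadth-first search.

    cited_by maps a paper id to an iterable of ids of papers citing it.
    Order 1 holds papers citing my_papers directly, order 2 holds papers
    citing those, and so on. Each paper is counted once, at its lowest order.
    """
    seen = set(my_papers)
    frontier = set(my_papers)
    orders = {}
    for order in range(1, max_order + 1):
        next_frontier = set()
        for paper in frontier:
            for citer in cited_by.get(paper, ()):
                if citer not in seen:
                    seen.add(citer)
                    next_frontier.add(citer)
        orders[order] = next_frontier
        frontier = next_frontier
    return orders


# Hypothetical toy graph: "a" and "b" cite "mine"; "c" cites both of them.
toy = {"mine": ["a", "b"], "a": ["c"], "b": ["c", "a"], "c": ["d"]}
orders = citation_orders(toy, ["mine"])
```

The sizes of `orders[1]` and `orders[2]` would give the per-order counts, and the same structure could feed a visualization.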
@Kubke... agreed... if your research is published in a low-ranking journal but used significantly in Nature X publications, what JIF should you fairly take... should we perhaps make a black list of universities where JIFs are used? it seems that SHOUTING is the only way to get things changed these days... :(
- Egon Willighagen
@Egon :) I am on the advisory board for Creative Commons Aotearoa New Zealand, and one thing that came up is that 'opening up' requires a serious change in assessment policies. One example: Let's say someone gets 1000 citations on Nature Precedings (not peer reviewed); shouldn't that count more than zero or 1 citation on a 'peer reviewed' Nature? Should we move from 'peer reviewed' to 'peer accepted'?
- Kubke
Tres interesting, Kubke. >>> "Should we move from 'peer reviewed' to 'peer accepted'?"
- Graham Steel
And depending on who your peers are, we could have top peer, instead of top tier.
- Noel O'Boyle
What if the citing papers all cite the paper to dismiss it, or because it was shown to be fraudulent? You'd need either a citation typology or the possibility to retract papers from the record, the latter being difficult in non-peer-reviewed archives.
- Björn Brembs
@Cameron: Looking forward to that tool!
- Björn Brembs
@Björn It's not so much the tool. That's pretty trivial. It's getting hold of the data that is the problem.... but that's what the project is about.
- Cameron Neylon
Similar issue here as what Bjorn mentioned in the beginning: about to start a tenure-track, and one of the items on my checklist to be eligible for tenure in 5 yrs is "x papers/yr in a journal with IF >= y". Which obviously completely bypasses my open-source work... But at this point in my career there is nothing much that I can do.
- Jan Aerts
I think there are two things I would say to that. One is don't assume that tenure process in 5yrs will look like tenure today. Things are shifting, slowly admittedly, and perhaps too slowly but they are shifting. "Impact" and demonstrated income potential will be very important, both of which your prominence in the Open Source community will help with. Secondly, yes you need some good...
- Cameron Neylon
@Björn wrt. citation typology: here's a recent paper on this very topic: Shotton. CiTO, the Citation Typing Ontology. Journal of Biomedical Semantics 2010, 1(Suppl 1):S6 http://dx.doi.org/10... "..ontology for describing the nature of reference citations in scientific research articles and other scholarly works, both to other such publications and also to Web information resources, and for publishing these descriptions on the Semantic Web. .."
- 'Mummi' Thorisson
@Mummi: nice! This sort of technology needs to be developed and incorporated if citation analyses are to progress.
- Björn Brembs
While there's certainly a use case for a "sent via mobile device" tagline in emails (to make excuses for brevity or spelling, for example) that line has always seemed just a bit much to me.
- Mr. Gunn
I can't turn that tagline off when using my phone for email. Dunno about iphones.
- Bill Hooker
Half of these are sent simply because it was the default setting, rather than a conscious choice.
- Mike Chelen
I find it worrying that most people don't think to try and turn it off and instead whore-out every email they send to Apple. At one point I changed my sig to "Sent from a smartphone that doesn't need to advertise itself in my email sig", but I decided it was too confusing for recipients.
- Andrew Perry
"Dr. Koch's graduate student, Andy Maloney, enjoys being a member of a laboratory that has taken part in the grass roots campaign to create an open access forum for doing science. Andy says, "Communication is key in science. With open notebooks, I'm completely up to speed with everything everyone is doing in the lab. This really helps science progress." To see what Andy's up to in the lab, feel free to check out his open wetware page at: http://openwetware.org/wiki.... Lab Website: http://openwetware.org/wiki... Lab Blog: http://stevekochresearch.blogspot.com"
- Steve Koch
Most excellent. What will you spend the money on?
- Bill Hooker
Andy wrote up the whole application and budget. I believe he asked to put the money towards a better PCR machine, which would help us in making our DNA constructs. I don't know, though, whether he's able to spend at his own discretion within the rules...
- Steve Koch
Mechanical? I don't think I've ever seen such a thing. When I looked at these vintage ones, all my memories of programming stupid little games during class into the graphing calculator came rushing back. I loved that.
- Jennifer Melinn
Sharp EL-512, all the way through high school and college. No longer made, I think, but I loved that thing. I have an EL-531W now, which is pretty nice too.
- Bill Hooker
they don't have the vintage HP ones on that site. My older brother had one - he usually plugged it into the wall outlet. I think the numbers displayed in red.
- Elizabeth Brown
I used to have one of the Casios pictured (not sure exactly which model). I discovered it could do something cool as a kid - if you put it right against a transistor radio and pressed the buttons, you could play tunes like a little synthesizer ... who knows what kind of RF that thing put out. Sadly, like most electronics I owned as a kid, I eventually disassembled it and connected it to other things until it was destroyed, so I don't have it anymore :P
- Andrew Perry
Total luck -- your eyes don't see that way, so I would never have thought of it, but I was trying to take a different shot in the field of flowers and noticed what was happening on the phone screen as I panned across my shadow. Quite made my day. :-)
- Bill Hooker
I have a great shot of my husband on New Year's Eve, looming hugely against our house (which is a geodesic dome, making looming much more fun!) in the shadow of fireworks.
- Mickey Schafer
I'm with Daniel -- I didn't think it was depressing at all.
- Bill Hooker
all that effort and toil for the equivalent of a dent. woe, no?
- Marie
Looking at the even bigger picture, there are millions of PhDs alongside your own, pushing the boundary of knowledge ever farther outward, so in that sense, it is a very good thing. :)
- Shirley Wu
It would be interesting to see how the boundary is shaped by the number of PhDs working to advance knowledge in different fields. e.g. how big/far out does the "theoretical physics" part of the circle go, vs. "cell biology" vs. "ecology" etc? What happens to the circle when new disciplines are added? What would a time-lapse look like? Having visions of crazy infographics here...
- Shirley Wu
Had a quick skim through. Actually the 'novel' high cell density method is almost identical to a method used for making isotopically labeled proteins for NMR (or selenomethionine labeling), designed to save on isotope - grow up in rich media, spin down and switch to a 1/4 volume of minimal media [+label], then induce after a recovery phase (eg Marley et al, J Mol Bio NMR, 2001 http://www.ncbi.nlm.nih.gov/pubmed...).
- Andrew Perry
Bill, that's why I bookmarked it! Major question with these things is often whether they're more fiddly than just growing up 12L and doing a large scale lysis but if you could get the volumes down that much it would be a big help for us. We're very happy when we get 3 mg/L with some of our things, sometimes close to 0.5 mg/L
- Cameron Neylon
Jonathan - we're collaborating with the group at Southampton (see http://www.ourexperiment.org/racemic...) which is basically a blogging functionality but we are hoping to enhance it as we use it, e.g. to have better automatic interactions with department instruments. It was important for us to be able to be involved with a group that are actively changing their system in response to our needs. It was also crucial to be involved with an open platform.
- Matthew Todd
I like WordPress blogs, versioned, subscribe to RSS feeds of your student's work, HUGE GPL community.
- Dave Lunt
Depends a lot on what you want from it and what kind of work it is supporting. The Southampton systems are getting a lot better and possibly more importantly a lot easier to add functionality onto. For some types of science they are probably good enough - for others, as I guess Mat is finding with wanting to put chemical structures in a native way - it's not quite there yet. I find it...
- Cameron Neylon
Steve Koch's group are the masters with using OpenWetWare, WordPress could probably do a lot of what you want if you've got people prepared to do a bit of PHP wrangling...I think the main thing is not to think "I want an ELN" but to think hard about what you want to capture, and how important it is to structure that. A group DropBox folder that people dump Word files into can work just fine if what you want is just a backup and notification mechanism.
- Cameron Neylon
CAMERON - I want a system where people record EVERYTHING they are doing in their research with links to all data, analyses, output, etc. And I want access to it from anywhere. And I want to be able to search it intelligently. Dropbox won't cut it.
- Jonathan Eisen
I built our lab site on drupal, and I have considered adding this sort of functionality into it. Development Seed's custom drupal distribution Openatrium (http://openatrium.com/) looks pretty cool too as an "intranet in a box." I can see that being compatible with your concept.
- Walton Jones
Jon - OK, that's a big ask to make it work. Technically all this is do-able and it is exactly the concept we are working on. Capture everything, and make connections between things. So what is next on my list is trying to connect DropBox to our blog system. This would mean that dropping any file in gets uploaded - the idea being that a web service is watching the subscribed drop box and...
- Cameron Neylon
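The idea of a service watching a shared folder and posting whatever gets dropped in could be sketched like this. It is a rough illustration, not the system being built: it polls a local directory (which a synced Dropbox folder would be), and `post_to_notebook` is a hypothetical callback standing in for whatever the blog system's upload call turns out to be.

```python
import os


def new_files(folder, already_seen):
    """Return the names of regular files in folder that are not in already_seen."""
    current = {
        name
        for name in os.listdir(folder)
        if os.path.isfile(os.path.join(folder, name))
    }
    return sorted(current - already_seen)


def sync_once(folder, already_seen, post_to_notebook):
    """Post each newly dropped file exactly once and remember its name.

    post_to_notebook is a hypothetical callable taking a file path; in a
    real setup it would create a notebook entry for the file.
    """
    posted = []
    for name in new_files(folder, already_seen):
        post_to_notebook(os.path.join(folder, name))
        already_seen.add(name)
        posted.append(name)
    return posted
```

Calling `sync_once` on a timer would give the "watching" behaviour. Tracking names only means an edited file is not re-posted; comparing modification times or checksums would be needed for that.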
I used WordPress for my undergraduate research project and think it will suit your needs perfectly. There are a plethora of plug-ins available that can add additional functionality if there is something you need that isn't provided by default, and in general it's highly customisable.
- Steve Moss
We've been using Wikispaces as the lab notebook and Google Spreadsheets for numerical data for a few years and it meets our requirements. Both have version tracking and we have code that enables full archiving of the notebook and associated raw data files. They are free and hosted so nothing to install and maintain locally. Google Spreadsheets has a nice API for querying and visualization of data if used as a platform to organize data collected from multiple experiments.
- Jean-Claude Bradley
@Frank that sounds good. Are there easy ways to link a normal CMS to GIT/SVN? They tend not to be very user friendly, bad interface, which reduces uptake of the whole system in my experience. GIT plugins maybe for WordPress/Plone/Drupal?
- Dave Lunt
I don't think there is any Git plugin for Wordpress but if memory serves there are some Wiki frameworks built on a Git backend. I still think worrying about the back end is putting the cart before the horse. Your average lab would probably not benefit from using a versioning system unless they understand what they're for. If they do, then they can probably just use it at the command...
- Cameron Neylon
An issue tracker such as Trac integrates well with revision control systems, and includes wiki functions. This provides mostly annotation and discussion for the repository, so contributors must still push changes to the code itself.
- Mike Chelen
Wordpress is great for logging stuff, but it kinda fails at the intelligent search aspects, unless there's something I'm not aware of.
- Mr. Gunn
My requirements for an ELN are: (1) Support separate projects. (2) Complete text search. (3) Version manage all content, (4) Everything can be linked and tagged, (6) Display syntax-highlighted code snippets (7) Display images, video, etc (8) Synchronize with my pdf bibliography of references (9) Embeddable intelligent spreadsheets, (10) Display rss feeds and provide an rss feed, (11)...
- Carl Boettiger
I'm wondering how many people here are experimental (wet-lab) people, and how many are theoretical/computational/electronic people? A lot of the things I'm interested in are to do with capturing the messy process of what happens in an experimental lab so that it can be searched and shared. Is that what everyone else has in mind?
- Matthew Todd
My problem is I do both and want one system.
- Jonathan Eisen
I've been working on this tool that seems to answer your need: www.symbyoz.com. Give it a try?
- Joel
Neil, I think the main distinctions are that in the wetlab things aren't automatically captured in the same way that they all sit on a disk somewhere for computational work. One of the big problems I have is getting people to see the value in making a record of all the samples that they create. There are very few systems that make this easy and natural. And it seems to make no sense...
- Cameron Neylon
Actually I'd also throw your comment back at you. Descriptions of computational _process_ would be a lot less problematic if computational scientists stopped thinking that just collecting all the outputs was job done and learnt how to keep a high quality wet lab style notebook :-) Versioning systems capture outputs but rarely is there a good record of what _happened_ beyond some sort of commit message or a log file.
- Cameron Neylon
I agree with the sentiment but not the details. I think a versioning system is overkill and not really the right paradigm for the lab scientist in most cases. In my experience you generate the data once, and then it doesn't change much. You process it, generating new stuff, but the kind of cycling, tweaking, and branching that VS are built to support doesn't really apply as much. You...
- Cameron Neylon
There are so many different kinds of data. One researcher might be managing thousands of microscope images every day, while in the same time another could record only a half dozen numerical results. Saving revisions of the images could be pointless if each is only written once and never modified. Whereas putting a document under revision control is a great way to ensure that nothing...
- Mike Chelen
Ah ok. I think we have a philosophical difference here then. I don't see processing as versioning, for much of my work at least, because it usually generates new objects of new types. I do agree that where you are manipulating a single object in a repeated way, perhaps with branching, then versioning (and branching) is a good way to think about it but I'm less sure that it is a useful...
- Cameron Neylon
...and I do think that versioning systems (including branch and merge) should be a basic feature of any file system. Just not sure that they need to be surfaced for most users in many use cases.
- Cameron Neylon
I'd go further actually, generating, storing, analysing and publishing research objects, explicitly including samples and other physical objects. And I think the "computational thinking" approach might be even better applied to the physical world.
- Cameron Neylon
Cameron I think that the requirements for version tracking may differ between labs. In my lab we often have lots of undergrads recording their lab notebooks and it would be unusual if there was no error correction for an experiment at some point. It is also not uncommon that the Google Spreadsheets we use to record the raw data and show the mathematical processing often initially...
- Jean-Claude Bradley
If you want everything consistently done in git the documentation part could be handled with toto - "a git-powered, minimalist blog engine" (http://www.cloudhead.io/toto) (not sure if you need commenting functionality)
- Konrad Förstner
We've been very happy with Unfuddle (http://www.unfuddle.com). It includes notebooks, though I don't know how appropriate these would be for experimental data. The integration of the subversion repository with tickets / milestones / projects is very nice.
- Ruchira S. Datta
biological data, at the point of capture, is really messy, and you often have to iterate several fast optimization kinds of assays or experiments before you hit on the set of conditions under which you can capture clean or meaningful data. Then once you do get close to clean, it's off to the next set of optimizations for the next thing. That's the dynamic that computational tools fail...
- Mr. Gunn
A bit late to the party (via Cameron's post on the Daily Scan), but back in February I was sat in a dull seminar and made these notes. A Data Analysis Deposition System e.g. to store all the R code used to turn data X + Y into results Z. - load source code for analysis pipeline - check data dependencies, on submission and periodically in future (e-mail owner if links go down.) -...
- Dave
Open Source www.Bikalabs.org develops web and CMS based LIMS - used in chemistry, agriculture, water quality, environment and inter-laboratory sectors. A public health lab branch is currently in development. Your requirements are research orientated, batching per project and full text admin are gaps in Bika that can be plugged. Full Plone CMS functionality is available. Analysis work...
- lemoene
I've been using git for curating my (text based) data analysis in the social sciences, and I've been impressed with its usefulness for keeping a very detailed history of what I've been doing. The only problem is for large binary objects, but apparently that's being addressed. Gitalist: http://search.cpan.org/perldoc... is a front end to git which can be easily modified to provide...
- k d
There have been some very interesting discussions here, and a lot of questions that are commonly asked. There appears to be a lot of discussion about using software that has not been specifically designed to be an ELN, and some people find that these are perfect for what they need them to do. However, it is important to outline your requirements (as has been done above). I work for...
- Aaron Norman
I'm developing a collaboration tool for researchers called SciTecMed [ http://www.scitecmed.com ]. It is like a mashup of dropbox and github. I'd appreciate an opportunity to show the demo and learn more about your needs.
- SciTecMed
hey scitecmed - drop me a line - am interested in this jonathan.eisen@gmail.com
- Jonathan Eisen
I just discovered this interesting discussion. Two systems for electronic lab notebooks that I don't think have been mentioned so far are: eCAT, a commercial system from Axiope http://www.axiope.com/ and Yogo / Neurosys http://neurosys.msu.montana.edu/ . I haven't tried either myself as yet.
- Mark Longair
we are with you ... in spirit .. it's too late to be working really :)
- Pedro Beltrao
Now listening to Iron Maiden ... the night is young.
- Deepak Singh
This reminds me of Trent Reznor going to the Amityville house to record ... I can see the code already full of lots of paranoid checks and exception catching, just in case something unexpected jumps out :)
- Andrew Perry