Carl says that KochLab students' ONS on OWW partly inspired him to start his own open notebooks. That makes me happy! Props to Carl & Anthony, Larry, Andy, Brian, ...
- Steve Koch
One of us! One of us! :-) :-) This makes me very happy.
- Bill Hooker
"I'm an Open Scientist and this is what I do!" I agree with Bill. This makes me very happy!
- Anthony Salvagno
I added his two notebooks under the theory section. As I understand it, it's theory + computational modeling. Computational modeling could go in either, in my opinion--the kind of results generated are probably more like lab experiments than theory. So, there's probably a need for far more than two categories. Plus, that section of notebook examples is getting too big! That's a good thing, but I think a couple of "examples" should be put in that section, and then the rest should be put in a new page, "Examples of ONS practitioners." Or at least, that section should be moved to the end of the article?
- Steve Koch
Thanks for adding them, Steve. I think computational is more closely related to theoretical since there isn't a clear tradition of what constitutes an "experiment" and how to record it, like there is in experimental sciences. I don't think the Wikipedia editors will allow the creation of a separate page just for examples of ONS, but we certainly can put them at the end if some more examples get collected.
- Jean-Claude Bradley
I'm surprised that Wikipedia editors have been harassing you, Jean-Claude! I've started many articles of questionable significance and never had anyone mark them for deletion. (For example: http://en.wikipedia.org/wiki...) Someday I think the article would have to be changed to "notable open notebook science practitioners," but I don't see why we couldn't have a "List of open notebook science practitioners" page. There are thousands of pages like that. For example: http://en.wikipedia.org/wiki...
- Steve Koch
Carl, you should pipe in on whether your computational work is more similar to theory or experiment! I think the whole spectrum of science is much harder to put into categories than Jean-Claude or I expect. I often see computational work as "experiment." For example, I collaborate with some people doing molecular dynamics (MD) simulations. It's surprising how much like experiments they are. Perhaps the best example (and most amusing to me when I learned it) was the need for a "thermostat" in the simulation. The student spent a few weeks tweaking the MD software until he could get his thermostat working. I'd say recording that process, along with all the failed experiments (and the conditions that led to them), would be very valuable. Very much like what we'd like to do in our experiments. But it'll probably be easier to capture the information for the computational stuff--so I'd expect those open notebooks to be excellent sooner than ours.
- Steve Koch
Steve - it was actually not that easy to get the ONS entry on Wikipedia to not be deleted or redirected - it took 2 tries and lots of documentation to appease the editors - see the discussion page. I guess you don't know how it will play out until you try.
- Jean-Claude Bradley
Sounds like an overly-aggressive guy was obsessed with deleting your article? No rush for anything, but if you think it'd be better with a sub-page of practitioners, I'll create the page someday.
- Steve Koch
Steve - I think it is actually much more difficult to capture the information for computational experiments because it is so easy to generate immense amounts of data. If every tweak is "an experiment" you would end up spending more time documenting what you do than actually doing work. When we did docking we did approach documenting it like a physical experiment so that there is enough information to reproduce (e.g. http://usefulchem.wikispaces.com/D-EXP01... ) but we didn't record every parameter tried. In a wet lab experiment the situation is much more clear cut - if you do something physically in the lab during an experiment you record it. This is a practice that is part of academia and industry - well defined enough that the "lab notebook" has a legal meaning in patent law.
- Jean-Claude Bradley
Interesting, Jean-Claude. And I hadn't been thinking of the legal definition (and I also don't feel like thinking about that now :) ). The MD simulations I was thinking of actually took significant time on a supercomputer for each tweak (I think). So, it would be practical to record everything and treat it as an experiment. Clearly, though, as you say, there are cases where it would be very difficult to capture everything. This is true in some experiments, too, though! For example, I remember that in the Collider Detector at Fermilab (CDF), the first round of data acquisition was all hardware and the job was to filter out 90% (or whatever fraction) of the data so that the bandwidth of the next steps would be sufficient. That seems like a clear example of too much experimental data to save. So, probably the spectrum of open notebook science shouldn't just be one-dimensional from theory to experiment. It seems like at least one more dimension is needed which describes how much information there is to capture.
- Steve Koch
And perhaps another dimension for how much tacit knowledge there is. I think tacit knowledge may be what I had in mind when I said computational science may be easier to do as ONS. Certainly there is tacit knowledge in computational science. But listening to John Hogenesch at ScienceOnline2010, I learned that it's possible for one group to exactly replicate another group's computations, if they use Amazon Web Services as the platform. Or, Deepak pointed out Virtual Machines to me as a way of packaging a computational environment. So it seems easier to at least transfer some of the tacit knowledge. In a wet lab, it's still not possible to do that. So, maybe there're at least three dimensions on which to classify ONS: (1) experiment / theory, (2) volume of information, and (3) volume of tacit knowledge (or maybe ratio of tacit knowledge to explicit knowledge).
- Steve Koch
For Andy's gliding motility assays in our lab, I'd rate it (1) solidly in experiment, (2) low/moderate amounts of information (GBs of image data), (3) moderate/high amounts of tacit knowledge. (Or tacit knowledge that is difficult to capture. Maybe that's what the 3rd dimension is: how difficult the tacit knowledge is to capture.)
- Steve Koch
Interesting discussion! In my mind they are both. In the phylogenetics notebook, I am trying to extend the theory of comparative methods beyond linear models. This field is moving very quickly, and it's no use to suggest new theory without providing software that implements it, or no one will be able to use it. I also find that saving all the data from all the runs I do can be prohibitive. If I want to repeat any simulation or run I describe in the notebook, I (or anyone else) can grab the code as it was that day from the Subversion repository on Google and rerun it. Still figuring out how to get the most out of this!
- Carl Boettiger
The reason I mention the legal definition of a lab notebook is that it leads to an expectation in the scientific community that students will be trained to record their experiments in a fairly consistent way. Perhaps I have the wrong impression here but it seems that there is not a standard way to record computational/theoretical work. [In fact as a postdoc I wanted to change the format of my lab notebook and my supervisor refused because it was not consistent with the requirements of our Research Office]. However, in general I think the acid test for a good notebook is whether an article or patent can be written from the records without significant interaction with the student/researcher who kept it. If the record keeping is so bad that one can't tell what happened in the experiment then the work is wasted. Unfortunately it isn't uncommon for students to repeat experiments from students who kept poor notes and left the lab.
- Jean-Claude Bradley
Carl - one of the benefits I'm finding from the researchers who keep Open Notebooks is learning how science is done in different fields and different groups. This is something that has traditionally been difficult to assess because notebooks are traditionally very private. Over time we're gathering data that will prove handy for discovering how the scientific process actually works - as opposed to the ideal of hypothesis -> experiment design and execution -> evaluation which is widely taught.
- Jean-Claude Bradley
Jean-Claude, that's an excellent point. I've also found that most of my colleagues in wet labs are taught to keep lab notebooks in a rather precise way. I know of only a few theorists in my department who keep any kind of regular notebook, and I've never had that kind of instruction or even encouragement. In computational sciences I'm surprised how few scientists use version management like Subversion, which maintains a revision log, etc., for their code. It will be interesting to see what feedback & suggestions I get on how to keep a notebook effectively.
- Carl Boettiger
Thanks for the feedback, Carl - it will be interesting to see what response you get.
- Jean-Claude Bradley