six! (JCB - Can you pls link to your 'comments feed'?
- mike seyfang
blogs are great for text, and tagging provides a form of structured metadata. hopefully the next step will be to integrate semantic web technologies
- Mike Chelen
try an SSH tunnel, you can even put all the traffic over port 80 if that is the only thing that is open. it only requires a shell account, which is quite common. here is one guide to get started: http://www.engadget.com/2006...
- Mike Chelen
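A rough sketch of what such a tunnel command might look like, assuming the remote shell host runs an SSH server that listens on port 80 (something the server admin has to set up). The helper name `socks_tunnel_command`, the user `alice` and the host `shell.example.org` are made up for illustration; `-N`, `-D` and `-p` are standard OpenSSH client options.

```python
import shlex
from typing import List


def socks_tunnel_command(user: str, host: str,
                         ssh_port: int = 80, local_port: int = 1080) -> List[str]:
    """Build (but do not run) an ssh command that opens a local SOCKS proxy.

    -N          do not run a remote command, just forward traffic
    -D <port>   dynamic (SOCKS) forwarding on the given local port
    -p <port>   port the remote SSH server listens on (assumed to be 80 here)
    """
    return ["ssh", "-N", "-D", str(local_port), "-p", str(ssh_port), f"{user}@{host}"]


# Hypothetical account and host, for illustration only.
cmd = socks_tunnel_command("alice", "shell.example.org")
print(shlex.join(cmd))
```

Pointing a browser's SOCKS proxy setting at `localhost:1080` would then route its traffic through the shell account.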
'record-keeping is appalling'; electronic notebooks/data-sharing—is this going to improve matters? One can imagine that knowing you're going to have to make your data available at some point might incentivize you to keep better notes... on the other hand you might be a contrary bugger and not :)
- Richard P Grant
We recently published a paper about the transcriptome of a particular cell type. We only scratched the surface as to interesting analyses. Datasets are not enough, though, for publication in my mind. But you can't require all potential analyses, either, for a paper - or it's infinite. Scientific publications by necessity are about communication, but being forced to communicate all unfruitful lines of inquiry will really muddy the waters. Being *able* to is a good option to have, though.
- Heather
The "paper" is good for focused communication, linking to data (which can be in other places because of format, eg protein data bank or open notebook). Science is not only about datasharing but also about communicating the ideas arising from the data (to various levels of readers).
- Maxine
The point was made repeatedly yesterday that a paper has to have a 'claim'—it can't just be data.
- Richard P Grant
I need to refine that definition though - I like the idea that the paper makes a 'claim' or a 'statement' describing a model - it has a definite place in (web)space and time. Then we can talk about having access to the material that supported that claim at the time it was made.
- Cameron Neylon
the main problem with a paper is it generally needs to tell a complete "story" - and many perfectly good results get left out as being irrelevant to the story
- Jean-Claude Bradley
On the other hand, many papers describing perfectly good data do not get into journals because the "story" told oversells what the data show.
- Maxine
just putting together some slides on this theme for tomorrow - I think it cuts both ways - the people who can't get a publication and the big datasets which get big papers - lets just publish data as data and be done with it
- Cameron Neylon
And I think the point comes down to this - the easiest way to make the paper well supported by the full set of data and analysis is to have a shareable lab notebook, even if it isn't shared until publication - but once you decide not to publish something then the rest can also be easily made available
- Cameron Neylon
Maxine - too bad we missed you yesterday...but we did meet a lot of people at Nature
- Jean-Claude Bradley
I'm sorry too, I had unavoidable domestic commitments. Glad you met lots of people other than me!
- Maxine
We had a fun time and met some more people, but you did miss me saying something nice about peer review :-)
- Cameron Neylon
OK, Cameron, I'll have to work on getting you to repeat it next time we meet!
- Maxine
"lets just publish data as data and be done with it" you write, Cameron. I agree with the "lets just publish data as data" bit - but not the "be done with it" because of quality control and rewards (jobs, etc) for two things. Communication (as in discussion above) a third - someone who understands the data needs to do this, either the author of it or someone who is pretty close to the techniques, etc, used.
- Maxine
fair point. In no sense was that meant as just dump data any old how - just that what we tend to do at the moment with eg genome sequence papers serves neither the journal nor the data well
- Cameron Neylon
Agree very strongly, Cameron. Journals are dead keen on organised places for data. We are increasingly struggling with problems of hosting on our journal websites innovative and huge formats. As publisher of a paper it is our responsibility to host this online "supplementary" information because an author's own website is not necessarily going to be maintained or accessible. But in some...
- Maxine
Significant social challenges: fears, management issues. There can be answers; need to think through & diversify means of publication and credit for that
- Richard P Grant
congregate around the principle of access to the full facts that support a published claim: currently there are huge problems but we all agree this is something we should try for
- Richard P Grant
'ONS Grand Challenge'—solubility in different solvents
- Richard P Grant
Nature is willing to do profiles of the groups involved. Jenny wants physics & biology to also be involved...
- Richard P Grant
Cameron thinks the experiment itself doesn't matter, at this marketing level. And reckons this project could then feed into biology anyway (drug discovery, after all, is biology...)
- Richard P Grant
J-C wants to enforce rigour of open notebook
- Richard P Grant
... which constrains the subjects that can be 'judged' by the expertise in this room, essentially.
- Richard P Grant
No one has ever looked at (comprehensively) solubility in organic solvents. So it's important, too.
- Richard P Grant
I get that J-C, but you're making the points about fields outside of yours. There has to be some kind of validation from an 'expert', yeah?
- Richard P Grant
yes someone competent in a field can validate a notebook in their field
- Jean-Claude Bradley
Two issues: recording data electronically, and making it open. Not the same thing
- Richard P Grant
Issues of trust... trusting the data, trusting the person who put the data there...
- Richard P Grant
Cameron N: someone's quote on carbon footprint of UK IT infrastructure being equal to CF of air travel - can we afford to backup *everything* (c.f. sequencing projects)
Openness is a continuum, so far we have only been talking about professional scientists and researchers, how far do we want to take this? -> Citizen science e.g. http://galaxyzoo.org
- Cameron Neylon
building tools for the general public and not from the expert
- Cameron Neylon
Data curation and preservation choices - disciplinary data centres, institutional/departmental repositories, federation, national library, 'public' repositories, web archiving, commercial data store, ecosystem of hosted services, none of these, all of these
- Cameron Neylon
if serious stuff goes into it - what is the longevity in the longer term
- Cameron Neylon
are these 'records of science'? If so how important are they? What long term preservation approaches are in place? Funder requirements? Institutional requirement? personal requirement?
- Cameron Neylon
Answer from David Leahy - how far can open go? As far as possible!
- Cameron Neylon
concern of muddying the water (premature picture of reality)
- Cameron Neylon
inspiring wild goose-chases (difference between a theory in the process of being directly disproved and one perpetuated in the face of an underlying error) e.g. small t-construct
- Cameron Neylon
came from someone very famous and 'very reliable' which is interesting - issue of trust again
- Cameron Neylon
data fatigue: spend all the time looking not doing
- Cameron Neylon
territorial pissing to discourage competition (big lab puts stuff out to keep smaller groups out)
- Cameron Neylon
and they might piss just to scare people away even though they actually have no intention of working on it
- Richard P Grant
slave to fashion - swept along by a prevailing view - would this be amplified?
- Cameron Neylon
redundancy - waste or actually robustness
- Cameron Neylon
Cameron is handing out the cyanide to the open bloggers before lunch
- Jean-Claude Bradley
Issues for scientists and their careers - vulnerability to data predators, already a systemic problem in certain fields (people being sent to meetings directly to spy)
- Cameron Neylon
who is affected - fields where ratio of discovery time to discovery validation is very high - hard to find but easy to show once you know
- Cameron Neylon
what are the implications - if you are deprived of a paper it can be terminal
- Cameron Neylon
It's good to be able to eat, even if it is cyanide...
- Richard P Grant
tweaking the rewards system - possible solutions. A short list of genes, data in a universal database, with timestamp and author ID. Could flag some to follow up and put a personal 'I am following this up' on them. Do a bunch of experiments around which you then publish. Someone may ignore the request and write their own paper. Who is the independent arbiter? Journals/entirely new body?
- Cameron Neylon
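A minimal sketch of what one entry in such a database might look like, assuming each entry only carries a gene name, an author ID, a timestamp and an optional follow-up flag. The class `GeneRecord` and its fields are hypothetical, and the sketch does not answer the arbiter question: it only records who flagged what and when.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional


@dataclass
class GeneRecord:
    """Hypothetical database entry: who deposited which gene, and when."""
    gene: str
    author_id: str
    deposited_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc))
    # Hypothetical "I am following this up" flag: ID of whoever claimed it.
    follow_up_by: Optional[str] = None

    def flag_follow_up(self, author_id: str) -> bool:
        """Claim the entry for follow-up; False if someone else already did."""
        if self.follow_up_by is not None and self.follow_up_by != author_id:
            return False
        self.follow_up_by = author_id
        return True


# Made-up gene name and author IDs, for illustration only.
record = GeneRecord(gene="geneA", author_id="author-1")
first_claim = record.flag_follow_up("author-1")
second_claim = record.flag_follow_up("author-2")
print(first_claim, second_claim, record.follow_up_by)
```

Nothing here stops someone from ignoring the flag and publishing anyway, which is why the question of an arbiter remains.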
Could reject paper - enforce authorship across a paper
- Cameron Neylon
maybe not need papers but will still need synthesis
- Richard P Grant
Move beyond the paper - what is the MPU and how is it published? The paper as the story or the milestone. So what is the contribution if it is partial and can these be added up to provide a 'number'?
- Cameron Neylon
I think the dichotomy/continuum is flawed—in that subscription publishing is not 'private' whereas primary data in paper notebooks are.
- Richard P Grant
The problem of repeating experiments where the detail is not available
- Cameron Neylon
Using freely hosted, and reliable, third party services
- Cameron Neylon
re: continuum I think it's a way of presenting one aspect of the issue - but it's an agenda rather than a model
- Cameron Neylon
comments/date stamp on a blog is just a matter of technology, though
- Richard P Grant
Link from a description of the experiment (blog/paper/whatever) back to the experimental record
- Cameron Neylon
it's true it's just technology but there is a lot of background in people's minds about what a 'blog' or a 'wiki' looks like. Versioning is something that has been associated with wikis and not blogs. In the end both are just PHP scripts running over databases
- Cameron Neylon
Use of google docs to represent a further abstraction of experiments - summary and plans
- Cameron Neylon
instruments will talk to instruments and to computers, rather than humans reading the logs
- Cameron Neylon
the issue of training people to do what might be seen as a new type of science
- Cameron Neylon
I think there's a danger of turning 'science' into 'technology'
- Richard P Grant
agreed - but if we get the technology right we could also do more (better) science. I wonder sometimes whether I do science anymore though
- Cameron Neylon
we're still talking about science - this is a drug discovery process example
- Jean-Claude Bradley
Is drug discovery science? Philosophical question ;)
- Richard P Grant
@Richard - in the end, doesn't science become technology and the cycle repeats?
- Rajarshi Guha
drug discovery a science? Ha! More of an art :)
- Rajarshi Guha
dunno Raj, that's a little too Zen for me.
- Richard P Grant
been less focussed on lab teaching but on generic stuff for scientists, how to do presentations, but also feedback from the people who are getting taught
- Cameron Neylon
Outreach - talking to arts/humanities, communicating what a scientist is/does, mention of the Open Lab collection 'the eureka moment', the people who just like the pictures as art
- Cameron Neylon
help and advice - worked in Perl and provided advice on how to do that from a wide range of people, including computer scientists, industrial people
- Cameron Neylon
Sharing - protocols, saving money (by finding the right suppliers), further help, USENET over again
- Cameron Neylon
Problems - causing offence, someone thought it was possible to identify someone anonymously. How much is allowed? What are the limitations? How much detail? Time away from the bench was a central issue, started writing only at home. A little less black and white about it now. What does the boss think?
- Cameron Neylon
Is any information I give actually valuable without context/reagents? Are methods more valuable than data? Need good help to do good experiments with difficult methods. "Gandalf makes cells explode"
- Cameron Neylon
Issues - blogging data that is presented at meetings. Is this the same thing? Stealing vs redistribution. Publication and patent rights.
- Cameron Neylon
yes blogging is disclosure for patent purposes (just like publishing a paper)
- Jean-Claude Bradley
Would like to see - collaborative, more than one 'hard science' blog, central faculty commitment - support, marketing, training. Arts/humanities community don't seem to have the same worries about whether it is valuable or 'what the boss thinks'
- Cameron Neylon
marketing teams are interested in using these things for marketing what the institution does
- Cameron Neylon
last.fm checks what you are listening to and adds that to your profile online
- Cameron Neylon
lets you discover your own stats - over time, artist. Finding people who are interested in the same thing and therefore things that might be similar or of interest to people who are interested in what you are listening to
- Cameron Neylon
desktop works on its own without needing the network effects - takes the paper (pdf) and extracts metadata, full text, references etc - can search tag and group and share
- Cameron Neylon
I approve of going for the 3 major OS simultaneously
- Richard P Grant
but the application works without the collaboration and already provides that added benefit by not requiring you to type in the metadata. This is a wonderful example of what I've been writing a bit about recently. Provide a good app, add a nice bit of extra functionality, and then introduce the killer next step capability once people are drawn in
- Cameron Neylon
I agree it's just a reference manager to start with - but that's what is needed to get people in, you need something that fits in with an existing workflow - once they are sucked in then blow them away with something new
- Cameron Neylon
I've been playing around with it and it has a lot of potential, not least because it is extremely intuitive. waiting to put more in to see if I can get more out.
- Heather
Mendeley sounds very similar to Labmeeting (though with higher profile investors). Anyone have any thoughts on how the two might differ (or not)?
- Shirley Wu
Main community is sharing Taverna workflows
- Cameron Neylon
Packs provide a way of collecting sets of arbitrary stuff both inside and outside of MyExperiment - provides some versioning for inside and outside objects as is possible (primarily via MD5)
- Cameron Neylon
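A minimal sketch of how checksum-based versioning of this kind could work, assuming it simply means recording an MD5 digest of an object's bytes and comparing it later. The function names `md5_digest` and `has_changed` are hypothetical and are not taken from MyExperiment.

```python
import hashlib


def md5_digest(data: bytes) -> str:
    """Return the hex MD5 digest of a blob of bytes."""
    return hashlib.md5(data).hexdigest()


def has_changed(recorded_digest: str, current_data: bytes) -> bool:
    """True if the object's current bytes no longer match the recorded digest."""
    return md5_digest(current_data) != recorded_digest


# Record a digest when an item is added to a pack...
original = b"workflow definition, version 1"
recorded = md5_digest(original)

# ...and compare later to detect that an external object has been modified.
unchanged = has_changed(recorded, original)
modified = has_changed(recorded, b"workflow definition, version 2")
print(unchanged, modified)
```

A digest like this can show that something changed, but not what changed, which fits the "as is possible" caveat for objects held outside the system. MD5 is fine for spotting accidental changes but is not suitable where deliberate tampering is a concern.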
no longer just 'Facebook for Scientists' but different as well
- Cameron Neylon
sharing workflows enables re-use - specific example looking at parasites in animals
- Cameron Neylon
part of a distributed system - not focussing on a monolithic system
- Cameron Neylon
not trying to speed up machine processing but speed up the human side
- Cameron Neylon
some hope about re-trying to use Taverna to process SMILES codes to create virtual molecule libraries
- Jean-Claude Bradley