Soton Open Science Workshop 1/9/08

Room for the workshop on open science at Southampton University on 1 September 2008
Jean-Claude Bradley
Now Cameron is talking about open notebooks
the blog as the lab notebook - with links to files, images, comments - Jean-Claude Bradley
use mobile phone to write to the lab notebook - Jean-Claude Bradley
he's showing graphs of his lab notebook objects interconnected - Jean-Claude Bradley
there are no formal semantics in his system but can use tagging very well - Jean-Claude Bradley
I'm switching to posting comments to my feed - nobody else here :( - Jean-Claude Bradley
Well I'm here... - Richard P Grant
true - 4 people :) - Jean-Claude Bradley
six! (JCB - can you pls link to your 'comments feed'?) - mike seyfang
blogs are great for text, and tagging provides a form of structured metadata. hopefully the next step will be to integrate semantic web technologies - Mike Chelen
Cameron Neylon
Failing to connect to Mogulus - I think the Southampton firewall may prevent us streaming video
Damn those institutional firewalls - mike seyfang
try an SSH tunnel, you can even put all the traffic over port 80 if that is the only thing that is open. it only requires a shell account, which is quite common. here is one guide to get started: - Mike Chelen
Richard P Grant
Geeks in an echo chamber
Good title for a blog - Maxine
agreed - Mike Chelen
Cameron Neylon
LiveScience: Era of Scientific Secrecy Near End -
Nice little addendum to the meeting - commentary at Michael Nielsen's original share - - Cameron Neylon
Richard P Grant
'data-sized peg into a paper-sized hole'—Cameron
'record-keeping is appalling'; electronic notebooks/data-sharing—is this going to improve matters? One can imagine that knowing you're going to have to make your data available at some point might incentivize you to keep better notes... on the other hand you might be a contrary bugger and not :) - Richard P Grant
We recently published a paper about the transcriptome of a particular cell type. We only scratched the tip of the iceberg as to interesting analyses. Datasets are not enough, though, for publication in my mind. But you can't require all potential analyses, either, for a paper - or it's infinite. Scientific publications by necessity are about communication, but being forced to communicate all unfruitful lines of inquiry will really muddy the waters. Being *able* to is a good option to have, though. - Heather
The "paper" is good for focused communication, linking to data (which can be in other places because of format, eg protein data bank or open notebook). Science is not only about datasharing but also about communicating the ideas arising from the data (to various levels of readers). - Maxine
The point was made repeatedly yesterday that a paper has to have a 'claim'—it can't just be data. - Richard P Grant
I need to refine that definition though - I like the idea that the paper makes a 'claim' or a 'statement' describing a model - it has a definite place in (web)space and time. Then we can talk about having access to the material that supported that claim at the time it was made. - Cameron Neylon
the main problem with a paper is it generally needs to tell a complete "story" - and many perfectly good results get left out as being irrelevant to the story - Jean-Claude Bradley
On the other hand, many papers describing perfectly good data do not get into journals because the "story" told oversells what the data show. - Maxine
just putting together some slides on this theme for tomorrow - I think it cuts both ways - the people who can't get a publication and the big datasets which get big papers - lets just publish data as data and be done with it - Cameron Neylon
And I think the point comes down to this - the easiest way to make the paper well supported by the full set of data and analysis is to have a shareable lab notebook, even if it isn't shared until publication - but once you decide not to publish something then the rest can also be easily made available - Cameron Neylon
Makes sense to me, Cameron. - Maxine
Maxine - too bad we missed you yesterday...but we did meet a lot of people at Nature - Jean-Claude Bradley
I'm sorry too, I had unavoidable domestic commitments. Glad you met lots of people other than me! - Maxine
We had a fun time and met some more people, but you did miss me saying something nice about peer review :-) - Cameron Neylon
OK, Cameron, I'll have to work on getting you to repeat it next time we meet! - Maxine
"lets just publish data as data and be done with it" you write, Cameron. I agree with the "lets just publish data as data" bit - but not the "be done with it" because of quality control and rewards (jobs, etc) for two things. Communication (as in discussion above) a third - someone who understands the data needs to do this, either the author of it or someone who is pretty close to the techniques, etc, used. - Maxine
fair point. In no sense was that meant as just dump data any old how - just that what we tend to do at the moment with eg genome sequence papers serves neither the journal nor the data well - Cameron Neylon from fftogo
Agree very strongly, Cameron. Journals are dead keen on organised places for data. We are increasingly struggling with problems of hosting on our journal websites innovative and huge formats. As publisher of a paper it is our responsibility to host this online "supplementary" information because an author's own website is not necessarily going to be maintained or accessible. But in some... - Maxine
Cameron Neylon
“Cameron Neylon is summarizing the Southampton Open Science conf” -
See the comments at the original item - Cameron Neylon
Cameron Neylon
Semantics and interoperability
Yaroslav Nikolaev - Richard P Grant
Hypothesis + Data = Knowledge - Cameron Neylon
objects can be data or knowledge or the connections between the two? - Cameron Neylon
Ontologies connect objects - Cameron Neylon
Material = [Raw]Material + Method + Instrument - a general data model? - Cameron Neylon
quoting JC - -'communicate first, standardise second' - Cameron Neylon
need for clear experimental description ontologies - you still out there Frank? - Cameron Neylon
An 'Open Science' ontology? - Cameron Neylon
discussion of whether ontologies can be used in practice across domains or whether they are always limited and what that means - Cameron Neylon
wanted: ontologies for organic chemistry reactions - Jean-Claude Bradley
Frank - yes do you have a link? - Jean-Claude Bradley
is there one that exists for organic chemistry or does it have to be built? - Jean-Claude Bradley
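The data model floated in this thread (Material = [Raw]Material + Method + Instrument) could be sketched roughly as follows. This is only an illustration of the idea, not anything shown at the workshop: the class names, field names, and the example values are all assumptions made up for the sketch.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Method:
    name: str  # hypothetical label, e.g. "recrystallisation"


@dataclass(frozen=True)
class Instrument:
    name: str  # hypothetical label, e.g. "rotary evaporator"


@dataclass(frozen=True)
class Material:
    """A material is either raw (no source) or derived from another
    material by applying a method on an instrument."""
    name: str
    source: Optional["Material"] = None
    method: Optional[Method] = None
    instrument: Optional[Instrument] = None

    def is_raw(self) -> bool:
        return self.source is None

    def provenance(self) -> List[str]:
        """Names of every material from the raw starting point to this one."""
        chain = []
        current: Optional[Material] = self
        while current is not None:
            chain.append(current.name)
            current = current.source
        return list(reversed(chain))


# Assumed example values, purely for illustration.
raw = Material("crude sample")
purified = Material(
    "purified sample",
    source=raw,
    method=Method("recrystallisation"),
    instrument=Instrument("hotplate"),
)
```

Because each derived material points back at its source, the chain from product to raw material can be walked without any formal ontology, which fits the 'communicate first, standardise second' line quoted above.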
Richard P Grant
CN: a lot of tools available. Some work, some don't. The ones that do are the ones that will plug into stuff they already need
Significant social challenges: fears, management issues. There can be answers; need to think through & diversify means of publication and credit for that - Richard P Grant
congregate around the principle of access to the full facts that support a published claim: currently there are huge problems but we all agree this is something we should try for - Richard P Grant
everybody looks drained :) - Jean-Claude Bradley
J-CB: things that don't work go into the ON too, which is (as) valuable as the things that finally make the paper... *process* - Richard P Grant
great notes Richard - Jean-Claude Bradley
"can we agree that notebooks be available after publication? - Jean-Claude Bradley" - Richard P Grant
Richard P Grant
So what do we do?
Jenny wants to do PR. Beefy letter to Nature... - Richard P Grant
'ONS Grand Challenge'—solubility in different solvents - Richard P Grant
Nature is willing to do profiles of the groups involved. Jenny wants physics & biology to also be involved... - Richard P Grant
Cameron thinks the experiment itself doesn't matter, at this marketing level. And reckons this project could then feed into biology anyway (drug discovery, after all, is biology...) - Richard P Grant
J-C wants to enforce rigour of open notebook - Richard P Grant
... which constrains the subjects that can be 'judged' by the expertise in this room, essentially. - Richard P Grant
No one has ever looked at (comprehensively) solubility in organic solvents. So it's important, too. - Richard P Grant
Bye Jim. - Richard P Grant
Does anyone know some congressmen? - Richard P Grant
well enforce that things called ONS really are - Jean-Claude Bradley
I get that J-C, but you're making the points about fields outside of yours. There has to be some kind of validation from an 'expert', yeah? - Richard P Grant
yes someone competent in a field can validate a notebook in their field - Jean-Claude Bradley
Two issues: recording data electronically, and making it open. Not the same thing - Richard P Grant
Issues of trust... trusting the data, trusting the person who put the data there... - Richard P Grant
very true R - Jean-Claude Bradley
Jim Procter
Cameron N: someone's quote on carbon footprint of UK IT infrastructure being equal to CF of air travel - can we afford to backup *everything* (c.f. sequencing projects)
Jim Procter
Cameron N: Something disruptive: Defining 'Open Science' - or branding - should there be a basic, or 'aspirational' standard for Open Science
Cameron Neylon
Liz Lyons - Curation
Openness is a continuum; so far we have only been talking about professional scientists and researchers. How far do we want to take this? -> Citizen science, e.g. - Cameron Neylon
building tools for the general public and not from the expert - Cameron Neylon
Data curation and preservation choices - disciplinary data centres, institutional/departmental repositories, federation, national library, 'public' repositories, web archiving, commercial data store, ecosystem of hosted services, none of these, all of these - Cameron Neylon
are blogs being archived? Kind of... - Cameron Neylon
if serious stuff goes into it - what is the longevity in the longer term - Cameron Neylon
are these 'records of science'? If so, how important are they? What long term preservation approaches are in place? Funder requirements? Institutional requirement? Personal requirement? - Cameron Neylon
Answer from David Leahy - how far can open go? As far as possible! - Cameron Neylon
role of libraries - Richard P Grant
Alf Eaton - anything in a feed is preserved by google - Cameron Neylon
re googlereader—but only if it's in the feed. 'Below the fold' disappears. - Richard P Grant
Richard P Grant
David Leahy InkSpot Science
Doing drug discovery at home - hopefully in a nice place - Cameron Neylon
Need a good support system for the background infrastructure particularly the computational infrastructure - Cameron Neylon
An on demand workbench for scientists - Cameron Neylon
focus on the doing of science from the instrument to the analysis - Cameron Neylon
focus on drug discovery but should be applicable across a wide range of fields in the future - Cameron Neylon
managing multiple data types - viewers - services management - Cameron Neylon
workflow management, multiple enactment engines, desktop workflow editor, on demand computing - Cameron Neylon
looking for users to drive the development of specific data types etc - Cameron Neylon
he's still putting the infrastructure together - Jean-Claude Bradley
confidentiality is an option - Jean-Claude Bradley
digital signing. - Richard P Grant
not necessarily for long term storage - Jean-Claude Bradley
fine grain security control, digital signing, ambient similarity (?) - finding out what people are doing when it is similar - Cameron Neylon
issue of whether you want to edit workflows on the desktop and/or use them on the web - Cameron Neylon
interested in hosting specialised services - Cameron Neylon
using an auto-QSAR process as an example - Cameron Neylon
QSAR automation - would be helpful for us - not sure when will be ready to try - Jean-Claude Bradley
a competitive workflow process where the system looks at different possible workflows to solve the same problem - Cameron Neylon
SHAPS looks for shape similarity - looks like LASSO - Jean-Claude Bradley
free for open science - Jean-Claude Bradley
he has funding for a few years - Jean-Claude Bradley
subscription for confidentiality and pay for use services - Cameron Neylon
presumably an option is to exploit their proximity to open data as well for value added services - Cameron Neylon
will start within next few weeks - Jean-Claude Bradley
his docking service will be freely available in a few months - that would be very useful - Jean-Claude Bradley
Are there slides of this talk available? - Rajarshi Guha
rajarshi - David is going to put them up on Slideshare I think - Cameron Neylon
Great - Rajarshi Guha
Richard P Grant
'The conditional sceptic'
Ok - lets use this one - Cameron Neylon
concern of muddying the water (premature picture of reality) - Cameron Neylon
inspiring wild goose-chases (difference between a theory in the process of being directly disproved and one perpetuated in the face of an underlying error) e.g. small t-construct - Cameron Neylon
came from someone very famous and 'very reliable' which is interesting - issue of trust again - Cameron Neylon
data fatigue: spend all the time looking not doing - Cameron Neylon
territorial pissing to discourage competition (big lab puts stuff out to keep smaller groups out) - Cameron Neylon
and they might piss just to scare people away even though they actually have no intention of working on it - Richard P Grant
slave to fashion - swept along by a prevailing view - would this be amplified? - Cameron Neylon
redundancy - waste or actually robustness - Cameron Neylon
redundancy not necessarily bad - Richard P Grant
Cameron is handing out the cyanide to the open bloggers before lunch - Jean-Claude Bradley
Issues for scientists and their careers - vulnerability to data predators, already a systemic problem in certain fields (people being sent to meetings directly to spy) - Cameron Neylon
who is affected - fields where ratio of discovery time to discovery validation is very high - hard to find but easy to show once you know - Cameron Neylon
what are the implications - if you are deprived of a paper it can be terminal - Cameron Neylon
It's good to be able to eat, even if it is cyanide... - Richard P Grant
I'm getting a bit hungry too - Jean-Claude Bradley
tweaking the rewards system - possible solutions. A short list of genes, data in a universal database, with timestamp and author ID. Could flag some to follow up and put a personal 'I am following this up'. Do a bunch of experiments around which you then publish. Someone may ignore the request and write their own paper. Who is the independent arbiter? Journals/entirely new body? - Cameron Neylon
Could reject paper - enforce authorship across a paper - Cameron Neylon
maybe not need papers but will still need synthesis - Richard P Grant
Move beyond the paper - what is the MPU and how is it published? The paper as the story or the milestone. So what is the contribution if it is partial, and can these be added up to provide a 'number'? - Cameron Neylon
cultural shift - Richard P Grant
Nature precedings—citable - Richard P Grant
Nature Precedings - author list, doi, votes, commenting, versioning - Jean-Claude Bradley
I should add that this was Jennifer Rohn, but I don't think I can edit the original message... - Richard P Grant
Jenny gave a good overview of possible objections - Jean-Claude Bradley
Cameron Neylon
Jean-Claude Bradley - Open Notebook Science
Closed to open science continuum - Cameron Neylon
I think the dichotomy/continuum is flawed—in that subscription publishing is not 'private' whereas primary data in paper notebooks are. - Richard P Grant
The problem of repeating experiments where the detail is not available - Cameron Neylon
Using freely hosted, and reliable, third party services - Cameron Neylon
re: continuum I think it's a way of presenting one aspect of the issue - but it's an agenda rather than a model - Cameron Neylon
comments/date stamp on a blog is just a matter of technology, though - Richard P Grant
Link from a description of the experiment (blog/paper/whatever) back to the experimental record - Cameron Neylon
it's true it's just technology but there is a lot of background in people's minds about what a 'blog' or a 'wiki' looks like. Versioning is something that has been associated with wikis and not blogs. In the end both are just php scripts running over databases - Cameron Neylon
Use of google docs to represent a further abstraction of experiments - summary and plans - Cameron Neylon
True. - Richard P Grant
using an open standard xy data representation (JCAMP-DX) allows the viewer to play with the raw data - Cameron Neylon
agree on the conflation of subscription with privacy - that's a fair point - Cameron Neylon
also the distinction between availability and accessibility - which is a whole other discussion - Cameron Neylon
using instrumentation as far as possible to capture what really happened versus what the human reports - Cameron Neylon
Googledocs is a nice way of doing this easily - Richard P Grant
instrument reports 'objectively' - Cameron Neylon
paper written on the wiki - still allowed (in some journals) to publish. - Cameron Neylon
References can point directly back to experimental details - even when particular analysis was done on different samples - Cameron Neylon
very powerful—linking to the experiments from the written up report, the 'paper' - Richard P Grant
...and publish it on natureprecedings - Richard P Grant
publishing in both nature precedings and Journal of Visualised Experiments - Cameron Neylon
-> publish the stuff that doesn't make it into papers, but is important to disseminate. - Richard P Grant
there are examples of openness in some areas leading to people getting jobs - Cameron Neylon
I for one welcome our robot overlords - Richard P Grant
instruments will talk to instruments and to computers rather than humans reading the logs - Cameron Neylon
the issue of training people to do what might be seen as a new type of science - Cameron Neylon
I think there's a danger of turning 'science' into 'technology' - Richard P Grant
agreed - but if we get the technology right we could also do more (better) science. I wonder sometimes whether I do science anymore though - Cameron Neylon
we're still talking about science - this is a drug discovery process example - Jean-Claude Bradley
Is drug discovery science? Philosophical question ;) - Richard P Grant
@Richard - in the end, doesn't science become technology and the cycle repeats? - Rajarshi Guha
drug discovery a science? Ha! More of an art :) - Rajarshi Guha
dunno Raj, that's a little too Zen for me. - Richard P Grant
Cameron Neylon
Richard Grant on 'The usual bemused stuff...'
is here from Australia paid by university to attend blogging conference - Cameron Neylon
started off on a blog set up for staff by university of sydney - Cameron Neylon
started in August 2006 at usyd - Cameron Neylon
then moved to 'The Scientist' - Cameron Neylon
why blog? Teaching, outreach, help and advice, sharing data/reagents/protocols - Cameron Neylon
he likes writing .. nice :) - Jean-Claude Bradley
Teaching - seminar, tips, feedback from scientists, methods - still working on how to use blogging effectively for teaching - Cameron Neylon
blogging is still experimental - and that is ok - Jean-Claude Bradley
been less focussed on lab teaching but on generic stuff for scientists, how to do presentations, but also feedback from the people who are getting taught - Cameron Neylon
Outreach - talking to arts/humanities, communicating what a scientist is/does, mention of the Open Lab collection 'the eureka moment', the people who just like the pictures as art - Cameron Neylon
use blogs to get real science in real world - Jean-Claude Bradley
help and advice - worked in perl and provided advice on how to do that from a wide range of people, including computer scientists, industrial people - Cameron Neylon
Sharing - protocols, saving money (by finding the right suppliers), further help, USENET over again - Cameron Neylon
he's still concerned about getting scooped - Jean-Claude Bradley
Problems - causing offence, someone thought it was possible to identify someone anonymously. How much is allowed? What are the limitations? How much detail? Time away from the bench was a central issue, started writing only at home. A little less black and white about it now. What does the boss think? - Cameron Neylon
Doesn't have the answers at the moment - Cameron Neylon
Is any information I give actually valuable without context/reagents? Are methods more valuable than data? Need good help to do good experiments with difficult methods. "Gandalf makes cells explode" - Cameron Neylon
Issues - blogging data that is presented at meetings. Is this the same thing? Stealing vs redistribution. Publication and patent rights. - Cameron Neylon
yes blogging is disclosure for patent purposes (just like publishing a paper) - Jean-Claude Bradley
Would like to see - collaborative, more than one 'hard science' blog, central faculty commitment - support, marketing, training. Arts/humanities community don't seem to have the same worries about whether it is of value or 'what the boss thinks' - Cameron Neylon
marketing teams are interested in using these things for marketing what the institution does - Cameron Neylon
mention Mat Todd and Synaptic Leap - Jean-Claude Bradley
mentioning the synaptic leap and Mat Todd's grant to do open drug synthesis development - Cameron Neylon
challenging assumptions - most people have never heard of blogging, even if they don't see the point - Cameron Neylon
Challenge - is it worth it, if so how does it become mainstream - Cameron Neylon
many INTJ bloggers here! - Jean-Claude Bradley
Richard P Grant
Using FF for this meta-discussion is fantastic, really.
Cameron Neylon
Victor Henning on Mendeley
12 researchers, graduates, and open source developers, funded by co-founders of Last.fm and Skype - Cameron Neylon
segue into Last.fm - can it be applied to research? - Cameron Neylon
Last.fm checks what you are listening to and adds that to your profile online - Cameron Neylon
lets you discover your own stats - over time, artist. Finding people who are interested in the same thing and therefore things that might be similar or of interest to people who are interested in what you are listening to - Cameron Neylon
obvious link to research papers - Cameron Neylon
combines two products - mendeley desktop manages your local papers, mendeley web runs the back end combination of data - Cameron Neylon
value without network effect - Jean-Claude Bradley
metadata extraction sounds cool - Richard P Grant
predicts metadata from text - Jean-Claude Bradley
desktop works on its own without needing the network effects - takes the paper (pdf) and extracts metadata, full text, references etc - can search tag and group and share - Cameron Neylon
can backup whole paper library - Jean-Claude Bradley
web app provides backup and access - provides profiles, social networking and groups (working on the latter) - Cameron Neylon
looking towards recommendations, research trends, and stats by author, paper, or discipline - Cameron Neylon
correction of incorrectly extracted metadata can be shared across users - Cameron Neylon
I'm impressed - doing a live demo but the interface looks nice, does cool stuff and really seems to get a lot of this right - Cameron Neylon
working on improved Word and LaTeX integration - Cameron Neylon
this collaboration aspect gives it real power - Richard P Grant
drag and drop backup to online library - Cameron Neylon
I approve of going for the 3 major OS simultaneously - Richard P Grant
but the application works without the collaboration and already provides that added benefit by not requiring you to type in the metadata. This is a wonderful example of what I've been writing a bit about recently. Provide a good app, add a nice bit of extra functionality, and then introduce the killer next step capability once people are drawn in - Cameron Neylon
rss feeds of various aspects - Cameron Neylon
going to provide option for privacy - interesting to know what the default is because real power will come from large numbers - Cameron Neylon
I think without the collaboration it's just another—very clever, admittedly—reference manager - Richard P Grant
Don't get me wrong, I'm impressed :) - Richard P Grant
I agree its just a reference manager to start with - but that's what is needed to get people in, you need something that fits in with an existing workflow - once they are sucked in then blow them away with something new - Cameron Neylon
I might give this a shot when I get back - Richard P Grant
I've just downloaded it but I won't try to install right now :-) - Cameron Neylon
if it's reasonably bug free then I will probably get the group to use it at RAL - Cameron Neylon
you should live-blog your experience with it ... - Richard P Grant
I'm not sure I can split my attention quite that many ways at the same time - or at least i will wait until lunch in any case... - Cameron Neylon
Here's a shorter YouTube version of the talk I gave: - Victor / Mendeley Team
I've been playing around with it and it has a lot of potential, not least because it is extremely intuitive. waiting to put more in to see if I can get more out. - Heather
Mendeley sounds very similar to Labmeeting (though with higher profile investors). Anyone have any thoughts on how the two might differ (or not)? - Shirley Wu
Richard P Grant
real names -> no instances of vandalism - Richard P Grant
Cameron Neylon
Dave de Roure on MyExperiment -
A social website for scientists (kind of) - mainly for sharing workflows - Cameron Neylon
Main community is sharing Taverna workflows - Cameron Neylon
Packs provide a way of collecting sets of arbitrary stuff both inside and outside of MyExperiment - provides some versioning for inside and outside objects as is possible (primarily via MD5) - Cameron Neylon
no longer just 'Facebook for Scientists' but different as well - Cameron Neylon
sharing workflows enables re-use - specific example looking at parasites in animals - Cameron Neylon
part of a distributed system - not focussing on a monolithic system - Cameron Neylon
not trying to speed up machine processing but speed up the human side - Cameron Neylon
some hope about re-trying to use Taverna to process SMILES codes to create virtual molecule libraries - Jean-Claude Bradley
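The note above that Packs get "some versioning ... (primarily via MD5)" can be illustrated with a small sketch: record a checksum when an object is added to a collection, then compare later to see whether it has changed. How MyExperiment actually does this was not described in the talk notes; the function names below are hypothetical, and only Python's standard `hashlib` module is used.

```python
import hashlib


def md5_of(content: bytes) -> str:
    """Return the hex MD5 digest of some content (a fingerprint, not security)."""
    return hashlib.md5(content).hexdigest()


def has_changed(recorded_md5: str, current_content: bytes) -> bool:
    """True if the content no longer matches the checksum recorded earlier."""
    return md5_of(current_content) != recorded_md5


# Record a fingerprint when the object is first added to a collection...
recorded = md5_of(b"workflow definition, version 1")

# ...and compare later, e.g. after re-fetching an object hosted elsewhere.
unchanged = has_changed(recorded, b"workflow definition, version 1")
changed = has_changed(recorded, b"workflow definition, version 2")
```

A checksum like this only says that something changed, not what changed, which is probably why the notes describe it as versioning "as is possible" for objects held outside the system.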
Jean-Claude Bradley
Cameron is counting people in the room
