Turned non-fan oven on to 140 C to pre-heat. Took a splash of balsamic vinegar, olive oil, orange blossom honey (not that the orange blossom bit is important), couple of grinds of pepper, and a small dollop of soy sauce. Brought to the boil and reduced for a few minutes to prepare glaze.
- Cameron Neylon
Cut two ~200g squares of pork belly and spread the glaze over, recovering the glaze that ran off for later. Placed pork pieces in oven at 140 C.
- Cameron Neylon
I can't stand it anymore. I'm getting some authentic bbq tonight.
- Sung W. Lim
We have more pork, you can come here if you like...
- Cameron Neylon
I'm hungry - good thing going out for lunch now. pics requested!
- Rajarshi Guha
After 90 minutes quickly browned six shallots (small onions, not green shoots) and placed them under the pork. Basted pork with remaining glaze and own juices, covered with foil, and reduced oven to 130 C....now to peel potatoes
- Cameron Neylon
Peel and slice potatoes and boil. Top and tail leeks and put in small oven-proof dish. Cover with chicken stock and put in oven at 180 C
- Cameron Neylon
Mash potato with a good dollop of butter (the secret to great mash is just disturbingly large quantities of butter) and mix in a little of the cooked apple. Fill the ramekins with the mixture and leave to cool while...
- Cameron Neylon
I didn't know you drank olive oil, Dr Nylon.....
- Graham Steel
Ok. We're at T-30 and this is where things have the potential to go pear-shaped. The _plan_ is. Turn bottom oven off, put in plates to warm. Put apple mash ramekins in top oven at 130 C. Remove pork and leave to rest, pouring off juices and combining with stock from leeks, reduce with white wine, to make sauce. Heat butter in frying pan and fry sage leaves, recovering and putting on...
- Cameron Neylon
First move off piste - the onions need a bit more time - glazed them with sticky stuff on pan then grilled under a high heat for a few minutes. Now clearly out smoke. [ed. That's, clearing out smoke obviously...]
- Cameron Neylon
there is a big screen up the front showing the BL room - I'd forgotten that the default UI would show a big smiley picture of me if I set up the item :-)
- Cameron Neylon
Roughly how many folks are there, Cameron?
- Graham Steel
There are probably about 60-70 people in the room I would guess
- Cameron Neylon
Just starting now...introduction to the BL
- Cameron Neylon
Describing digital initiatives at the BL - Shoutout to UK PubMedCentral, beta version of the website coming soon
- Cameron Neylon
BL working with MSR on virtual research environments #blts
- Fiona Bradley
Twitter using #blts will be automatically streamed into the friendfeed room
- Cameron Neylon
Sarah K introducing John - thinking for a while about having a session on the future of scholarly communication
- Cameron Neylon
Overcoming incrementalism is a real challenge - large shifts require big changes in thinking. There is much more to scientific knowledge than the printed paper
- Cameron Neylon
research materials, data, ontologies, annotations, wikis
- Cameron Neylon
What do we do with ontologies, annotations and wikis? Semantically enhanced articles are one way (yes!)
- Fiona Bradley
showing a semantically marked up paper - the one from PLoS Negl Trop Dis that was described in PLoS Comp Biol
- Cameron Neylon
Google works because hyperlinks are sparse but with a fully marked up paper there are very dense links. May need to do something different, as well as "just" doing the markup.
- Cameron Neylon
Sem articles can also include raw data from that article, and relevant data from other work.
- Fiona Bradley
Now showing the http://beta.cell.com - it looks new. "It reminds me of when AOL told us what the future of the web was and it looked a lot like TV..." Want raw data, real information.
- Cameron Neylon
We need radar, not earhorns, it is going to look very different and be disruptive. Just like radar was.
- Cameron Neylon
Elsevier's Article of the Future project. This reminds me a bit of Herbert Van De Sompel's work of a few years ago on linking from data through to article and subsequent citations.
- Fiona Bradley
jove - peer reviewed instructional videos. Web is "forcing its way into science"
- Fiona Bradley
If you'd said in 1992 that the web would destroy bookstores you wouldn't have been believed. Don't believe people when they say they know what the future looks like.
- Cameron Neylon
Four principles of publishing: registration, certification, dissemination, preservation. Registration and dissemination obviously broken by web
- Cameron Neylon
Multiple copies of web data makes it harder to fake/change data.
- Fiona Bradley
Instant registration and dissemination. But that is not enough. Otherwise could set up a random statement generator and wait for the Nobel prize to roll in.
- Cameron Neylon
Preservation: old way was the library. But we don't do so well online. LOCKSS is a good initiative.
- Fiona Bradley
Preservation in the digital world is not so good. Internet archive is not comprehensive. Mention of LOCKSS program. Multiple copies with hashes.
- Cameron Neylon
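A minimal sketch of the "multiple copies with hashes" idea mentioned above: hash each stored copy and flag any copy that disagrees with the majority. This is only an illustration, not the actual LOCKSS protocol; the function names and the sample copies are hypothetical.

```python
import hashlib
from collections import Counter


def digest(data: bytes) -> str:
    """Return the SHA-256 hex digest of one stored copy."""
    return hashlib.sha256(data).hexdigest()


def find_damaged_copies(copies: dict[str, bytes]) -> list[str]:
    """Return the names of copies whose hash disagrees with the majority hash."""
    digests = {name: digest(data) for name, data in copies.items()}
    # The most common digest is treated as the "good" version (an assumption).
    majority_digest, _ = Counter(digests.values()).most_common(1)[0]
    return sorted(name for name, d in digests.items() if d != majority_digest)


# Hypothetical example: three libraries hold a copy, one has been altered.
copies = {
    "library_a": b"Peer review combines two things.",
    "library_b": b"Peer review combines two things.",
    "library_c": b"Peer review combines three things.",
}
print(find_damaged_copies(copies))
```

With enough independent copies, a changed or corrupted copy stands out because its hash no longer matches the others.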
Peer review combines two things and confuses them. 1) Is it sound? 2) How important is it?
- Cameron Neylon
First is (hopefully) more objective than the second. Showing a funny slide about peer review. I don't know the source. "Strange Matter"?
- Cameron Neylon
is a piece of work impactful? Very complex. Peer review is there so we are not disproven?
- Fiona Bradley
new work displaces knowledge - it is harder to get the first paper published on a topic than the 3rd or 4th.
- Fiona Bradley
PLoS ONE example - separating validity from impact - can anyone point me to the paper that correlated a person's publication impact with the likelihood of rejecting papers? Just got queried but can't find in Friendfeed right at the moment
- Cameron Neylon
showing the sem web stack, and discussing the legal aspects of the web. "what is a copy?"
- Fiona Bradley
Power of standardisation: TCP/IP and HTML examples. How to do for scientific information? Need three layers, technical, legal, and semantic. Semantic web doesn't really work yet. Legal infrastructure is getting there but issues as science is often not about making a copy - therefore copyright doesn't apply
- Cameron Neylon
FYI, if following on Twitter, petermurrayrust and peterballantyne are both live tweeting the event. Use #BLTS at search.twitter.com.
- Jill O'Neill
There are as many names for genes as there are names for coffee
- Cameron Neylon
Need common names to be able to share anything semantically.
- Cameron Neylon
need to agree on common names for things to create an ontology. common annotation needs common names. last challenge is federation.
- Fiona Bradley
Federation: There are at least 1000 databases with over 250 different terms of use for nucleic acids. But using climate change/polar info from Polar year project. Lots and lots of pieces. Trying to make informed decisions about climate change. 399 different databases with different terms on reindeer herding and climate change?!?! Did I really get that right?!?
- Cameron Neylon
Discipline exceptionalism: "No one else appreciates how complicated our work is, therefore we need to write our own definitions and terms and policies"
- Cameron Neylon
better to not do things, rather than do them wrong, with disastrous consequences?
- Fiona Bradley
If we get it wrong - won't just be garbage in garbage out, but amplified garbage out. Systems could be unstable. Single copyright case brings the whole thing down. Wrong protein to wrong gene, people get sick and die.
- Cameron Neylon
legal, technical and semantic resistance points in the wire before we have genuine articles coming out of the system.
- Fiona Bradley
Moglen's analogy to barriers to open source. Ask not why the electrons want to move around the circuit. Ask what the resistance of the wire is? Key Question: What's the resistance and how do we overcome it?
- Cameron Neylon
you're welcome graham, send you my chiropractor's bill later :-)
- Cameron Neylon
wondering how many other people here are library people (other than BL staff of course!)? what can we do to help develop open science?
- Fiona Bradley
Hmmm, not sure how many are library people per se. Most seem to be publishing or academic trouble makers...
- Cameron Neylon
gathering up the crowd for questions
- Fiona Bradley
today's OKFN virtual meeting is now in full swing - can I keep up both? I'll try :)
- Graham Steel
Q: Author ID. How are people associated with the work they do? (huge issue!) A: Open ID unsatisfactory. there are several other services. Libs and publishers the right providers for this (note: some libs working on name registers)
- Fiona Bradley
honesty a huge issue in science. This issue has to be resolved. when people generate data, they should create carefully using common names for things from the start so data can be interoperable.
- Fiona Bradley
John: It's a complex problem, solution probably going to come from a stable institution, Universities, governments, publishers. (Me: Significant question mark over universities' stability though)
- Cameron Neylon
John: Deciding what to throw away may become the big problem as data explodes
- Cameron Neylon
Who will store massively huge data sets? When do you make the decision to delete?
- Fiona Bradley
what made the web was good timing, and openness.
- Fiona Bradley
Complicated question about: "to what extent can you draw an analogy between 'mixing up scientists and the web and seeing what comes out free' and natural evolution". John answers talking about walled garden net services circa 1992. The web won out, in an evolutionary sense, because it was open and anyone could play.
- Cameron Neylon
lots of dumb ideas will die, lots of good ideas will die because of bad timing, but some will survive. the vision for semantic articles now is unlikely to be how they eventually look
- Fiona Bradley
Q: researchers aren't really predisposed to share? A: but publication by patenting is disclosure and sharing. Issue of acclaim and notice to the author. This gives the incentive to share. Need to be able to track impact of blogging, sharing data and connection with impact and acclaim. Then they will share. Need new ways to get credit in science for your work
- Fiona Bradley
Stephane Goldstein: "Is there a mistaken heroic assumption that scientists like to share" [giggles from audience]. John: Publication and patenting are sharing. Not because people want to but because people get registration and validation. Therefore a question of providing incentives to share materials and links. Key is enabling people to claim credit through things other than...
- Cameron Neylon
Q: in this interconnected world, is the structure of academic articles still valid? A: yes. procedure, method, abstract still valid. (lots of raised hands for extra comment!)
- Fiona Bradley
comment from audience - data needs to be truly available because it is generally not possible to recreate someone else's work and test their hypothesis without cost.
- Fiona Bradley
reading through the thread... "a session on the future of scholarly communication" sounds like a great topic for future shindigs
- Daniel Mietchen
comment from audience: no one really reproduces experiments though. they work to test hypotheses of others. John: but the parts you need to reproduce are the things you need to build. Research needs to be reproducible (and open?) - even if not done in practice
- Fiona Bradley
papers as an advertisement for research? (discussing comments by Alma Swan on future of article - the new role of science blogs etc)
- Fiona Bradley
Q. from the OKFN session that I would like to put to John and post back "Can we tell a story about people who have been helped by open data in development yet, or is it still theoretical"
- Graham Steel
batteries about to go here. Ok - will try to ask that question - schedule is slipping here. John doesn't like making imprecise short answers...
- Cameron Neylon
comment: not all data is bad data. But need to be able to spot systematic errors. Data needs to be annotated.
- Fiona Bradley
q: standards for describing articles, e.g. MODS, METS. who is going to fix it? A: look at screeds against HTML in the early days. The simplest standard that is extensible and correctable wins. But there is also a certain amount of luck involved.
- Fiona Bradley
sem web will allow us to map contributors in one piece of data with authors in another: find both. But must be able to weed out what is merely similar.
- Fiona Bradley
q: peer review should evaluate impact and quality together? a: this is the best reason to do peer review, so you know what to trust like brands - like you might trust coca cola.
- Fiona Bradley
"being a peer reviewer is a difficult burden, being on an editorial board looks good on your cv"
- Fiona Bradley
and what if, instead of listing editorial boards on your cv, everyone could see the reviews you did in public?
- Daniel Mietchen
q: key motivation for researchers is street cred from their peers - right now that is through citations but there are other models. how do you build these new mechanisms in a networked world? a: something that can be printed for a review meeting that shows impact?
- Fiona Bradley
q: do we have a responsibility to make articles accessible in the online environment?
- Fiona Bradley
comment: still a need to constrain the size of publications so that they still make their point, but also using the potential of new technology well
- Fiona Bradley
q: role for the BL in this? a: work on DOIs for data, unique names for things is essential. norm setting role for libraries as a partner with publishers.
- Fiona Bradley
comment: this forum provides a neutral ground to discuss these issues, is good. (valid point!)
- Fiona Bradley
q: there is too much to read! traditional method of reading needs tweaking. if your lib doesn't subscribe, you can't access materials. a: open access is the solution (laughter). we use journal brand as a proxy to help determine what to read. we need to be able to evaluate information: is it worth reading?
- Fiona Bradley
great talk thanks to John, the room and the BL
- Fiona Bradley
constraining size of publications is certainly useful, but could be much more effective if we'd replace the introduction of manuscripts by hyperlinks to collaboratively written encyclopedic articles that are updated as new results come in - http://bit.ly/4ftzn . Similarly, the methods and results sections could simply link to the relevant data pages at places like OpenWetWare, and the article itself could focus on the discussion aspect.
- Daniel Mietchen
Some people make a lot of money out of textbooks but not very many. Very long tail.
- Cameron Neylon
Big text books take a lot of work - how does the textbook look in a web based world. What is a textbook for if the university doesn't exist any more.
- Cameron Neylon
chris is pretty god like ... but not christ, i dont think. :)
- Kaitlin Thaney
thanks for that - typing not good at this time in the morning - insufficient coffee...
- Cameron Neylon
Chris - the lack of return on text book authoring is an opportunity for new types of approach.
- Cameron Neylon
There is no shortage of text books, there is a glut
- Cameron Neylon
Alicia, "I would pay $20 if I knew it went straight to the author"
- Cameron Neylon
discussing the concept of "beta" versions of books. The Pragmatic Programming series.
- Cameron Neylon
Alicia - benefits for people who need accessible versions
- Cameron Neylon
John - PDF sucks, need something much better than PDF for flexibility.
- Cameron Neylon
Martin - but want access to multiple formats
- Cameron Neylon
Bosco - print on demand is getting cheap and very high quality
- Cameron Neylon
Sean - open access text books is not a technical problem but a social problem - how do you fund the content
- Cameron Neylon
Project at stanford to support making course material creative commons
- Cameron Neylon
Problems - how do you create new content (resource). If use legacy content then there are problems with copyright infringing content
- Cameron Neylon
Chris - Wikipedia a reference, not a teaching tool. These things are different. Need to provide a route through the content for the average student.
- Cameron Neylon
Needs to be seen from the perspective of the teacher.
- Cameron Neylon
Good question - what do teachers use a textbook for? How do you leverage the needs of the teacher to support the creation of better objects that could replace textbooks.
- Cameron Neylon
Bosco - major contribution of publishers is the editor - collaborative authoring is a challenge for long documents
- Cameron Neylon
debate over the value of a well integrated long work - clearly works well when you actually work through the whole book - does anyone ever really do that?
- Cameron Neylon
John - would like guidance through the process of using a "text" - where you've been and where you're going - quizzes, checkups etc..
- Cameron Neylon
comment from boston: what to do about the various flavors of CC licenses on OCW content? most have BY-NC, some materials with SA - a design decision made without the forethought of interoperability, reuse (in the context of virtual textbooks) and remix. how/ what to do?
- Kaitlin Thaney
How much does it cost to fly in eight people for four weeks to get a draft together. Suggesting around $50k, so 500 people at $100 each. Bosco points out many people who are setting up courses for the first time could have an interest in putting the work in.
- Cameron Neylon
Kaitlin - I'll try and feed that in. I think people here are relatively unaware that these already exist. Do you know what the best current examples are?
- Cameron Neylon
Question: Is this something Google.org might fund? (that's a reminder to myself)
- Cameron Neylon
Chris - in the world where we have $50k to support one of these could put out a call for proposals
- Cameron Neylon
would have to check OCW policies, but i know that they're not all consistent across the board for various universities across the world, and most definitely not CC-BY ... lemme look :) thanks cameron. just something to noodle on.
- Kaitlin Thaney
Need to identify a good area where a current text doesn't exist
- Cameron Neylon
Martin - is the two week retreat model a good one or a wider online collaboration
- Cameron Neylon
Me: need the "retreat" or a focussed effort to get the seed content in place
- Cameron Neylon
Martin - is a retreat per se great or does it need to be cheaper? What is the right number of people?
- Cameron Neylon
Chris - what field is small enough and needs a textbook- me thinking Small Angle Scattering but not sure this is of interest to this group really
- Cameron Neylon
K, so to be part of the OCW consortium, the content needs to be subject to a CC license, it seems, or at least that is strongly recommended. MIT is CC-BY-NC-SA, but other institutes, such as Michigan State, don't cite a CC license and provide some content read-only. So short answer ... it varies. Ugh.
- Kaitlin Thaney
Sean discussing status of Stanford efforts on open curriculum content - currently in a holding pattern
- Cameron Neylon
Asking the question - why is there no secondary market for teaching materials given the popularity of Blackboard and Moodle
- Cameron Neylon
Fabbiano talking about NZ universities purchasing Springer textbooks "like journals" on a chapter by chapter basis
- Cameron Neylon
The books are purchased as electronic versions by libraries, as far as I know as a whole, not chapter by chapter, but you can download only the chapter of interest
- Kubke
science has a strong culture of attribution, but can be tough to do on the web
- Duncan Hull
I was too busy following the discussion, forgot to comment here.
- Martin Fenner
Thanks for the input from afar, KT, esp, your first comment - rumors of my divinity are greatly exaggerated. :-)
- Chris Patil
Sorry, coming in late. Before O'Reilly, programming etc. books weren't exactly what we've become used to. I wonder, even in a non open-textbook world, what would happen if a publisher enabled people who are perhaps not that well known to just write stuff.
- Deepak Singh
Just came across wikibook (http://en.wikibooks.org/wiki...) through this article (http://mashable.com/2009...) (via @edNZ). I also read (where?) a suggestion to have students 'edit' wikipedia pages instead of handing in papers in their courses. Maybe a collaborative effort having the students contribute to wikibook content as part of their course requirements might be a good place to start?
- Kubke
OH well, I came across this page in wikibook: http://en.wikibooks.org/wiki... I guess any project will have to have better editorial control?
- Kubke
July 28, 2009 - San Francisco Commonwealth Club - Joi Ito and John Wilbanks
- Richard Akerman
lisa, had an email from a woman in the UK asking me at SC for your contact info. said it was in regards to stefan gleanzer? message me @kaythaney when you get a chance :) thanks!
- Kaitlin Thaney
any other ideas? (are you interested?)
- Benjamin Good
Hell yes, but if there are people like Timo and John and the other signatories on board not sure exactly what I can add. Also trying desperately to cut down on my level of over-commitment and this sounds like an actually _doing_ something group...
- Cameron Neylon
ya - the plan is to actually do something this time ;). I think it's an ideal role for a young keener looking to make a name. It could grow into a real position (paid) within about a year I think. You might consider signing up as a potential reviewer for the group. They have three participation levels - chair, member, and reviewer.
- Benjamin Good
Tell me more about this? URL? (I'm coming to the discussion late.)
- D0r0th34
They are trying to set up a bunch of working groups to collaboratively create a (working..) version of the semantic web in the life sciences. It's cool.
- Benjamin Good
definitely interested - is there any info on what the roles really entail - and whether there is real resource to do things? Sounds very positive in many ways and the initial meeting looked very interesting.
- Cameron Neylon
not sure how long i can make a post here.. but let me try this. Here is the draft proposal for the group i'm talking about. Attribution (micro- and nano-credits) Chair: Benjamin Good Group members: Jan Velterop (co-chair), Geoffrey Bilder, Peter Suber, Jill Sorensen Reviewers: Mark Wilkinson, Stephen Uzzo, Marc van Driel Scope: The provenance of triples needs to be captured and...
- Benjamin Good
but it is a large group with some very powerful members - if the politics play out well I think they build something cool
- Benjamin Good
Well, feel free to put my name (Dorothea Salo) in front of people -- a few of them already know me (Wilbanks, Bilder, Suber). Insofar as I can participate without physical travel, I'm in. I have spec-writing and -editing experience that I would be willing to contribute to the cause.
- D0r0th34
Hi Dorothy, the best route if you would like to participate is probably to head over to here (http://conceptweblog.wordpress.com/declara...) and post a comment, but I will also float your name over to Barend (Mons) who is my contact in the group and its main driver.
- Benjamin Good
Nice post, Bora. On a related topic: many physicists will preferentially cite arXiv preprints, instead of the later journal version of the same paper, just because the preprints are more reliably available. The American Physical Society then goes through and replaces the preprint references with journal references. This drives many people (including myself) crazy.
- Michael Nielsen
People should always cite the most relevant source of information and that is sometimes on a web page (most journals have allowed that for some time now). We have cited both blog posts and wiki pages in our last paper without any problems http://www.jove.com/index...
- Jean-Claude Bradley
Before I read either of these, peer review is valuable (as you know, Zivkovic). I realize ease of distribution will push passionate researchers to participate w/o financial incentive, but it will take time for the academic ecosystem to warm to an environment that prevents this long-held method for extra economic benefit in a career that is already under-compensated.
- coldbrew
Yes, of course, it is valuable. But a) it is not as all-powerful, trumping everything else, as some people insist and b) peer-review can have other forms than the traditional form.
- Bora Zivkovic
Yup, that is the stupid format I blasted when it first came out. Linked in my crayfish post....
- Bora Zivkovic
Interesting comments on the post itself - the one by deadalus2u, for example.
- Bora Zivkovic
Wow. I'd never seen that NLM format (not surprisingly). The date the blog began? The city in which the blog is published? Good thing I don't do scholarly/peer-reviewed papers; I thought the author, post title, date, blog name, and blog URL were more than sufficient (and I wonder about the blog URL, given site changes).
- Walt Crawford
Walt, I agree. Also: "accessed on Date" is useful.
- Bora Zivkovic
As for including the blog URL, I think it's going to be really hard for anyone--scientist or non-scientist--to take something seriously if it doesn't have a permalink.
- Victor Ganata
Correct. And URL to homepage does not cut it. It has to be the permalink to the exact post that is cited. Imagine citing something as just "Nature" without providing year, issue and pages? It's the same for blogs.
- Bora Zivkovic
So what happens when you cite a blog post and then 10 years later the site dies/loses interest and the content is no longer readily available?
- Brian Krueger - LabSpaces
Google Cache, WayBack Machine and there are other, newer, better services now I hear.
- Bora Zivkovic
Does your library have a copy of the journal Mendel published his pea stuff in? Hardcopy?
- Bora Zivkovic
Brian: To add to Bora's last response, if scientists start to cite, then I think it's likely dedicated archiving facilities will follow. The US Library of Congress now maintains an archive of more than 100 legal blogs, for example.
- Michael Nielsen
Yes, WebCite, thanks Bill, that's the one I was thinking about. Actually the SVPOW post about this has been deposited there already (see the comment thread there).
- Bora Zivkovic
Wayback machine and google cache can't be considered long term solutions. Their coverage is spotty at best. I like the looks of webcite, but again, the central server needs to be run and hosted by a group that will never lose funding (WebCite is run out of UToronto) or go out of business. It seems to me like it should be run by the NIH (or some other government agency)
- Brian Krueger - LabSpaces
@Brian: I believe you can now submit to the Wayback machine to make sure it covers what you want covered.
- Bill Hooker
As in journalism vs. blogging, here also the tussle bet. processed info (other papers) & (perceived) transient info (in blogs) is a reason for not citing blog posts in sci. papers.
- Arunn
The situation won't improve unless scientific papers also are written/reported in an open, transient fashion - allowing peer review itself to be done as "comments" under the respective "research blog papers" with a fail-proof mechanism for ownership precedence (if necessary).
- Arunn
@Arunn you would need something more powerful than the Web and possibly more specific for that, ja?
- Rudolf Olah
"It turns out that Elsevier put out six such journals, sponsored by industry. The Elsevier chief executive, Michael Hansen, has now admitted that they were made to look like journals, and lacked proper disclosure. "This was an unacceptable practice and we regret that it took place," he said."
- Michael Nielsen
A Tweet from Ben (Goldacre) earlier today indicates that the BBC World Service will be running something about this later today. No url as yet though.
- Graham Steel
Very glad this story has gone mainstream
- Garret McMahon
Thanks Kaitlin :) The software is not open, but the underlying dictionaries are available under the Creative Commons Attribution License. We are working on adding user editing and on setting up an API to allow Reflect to be used from other services.
- Lars Juhl Jensen
Just to address the discussion in advance: I am aware that copyright law likely does not apply to dictionaries. But in that case they can anyway be used for any purpose allowed by CC-BY, and scientific decency would demand the attribution part of the license :)
- Lars Juhl Jensen
Lars, any interest in using reflect on the Gene Wiki? There's a fair bit of raw text there that could definitely benefit from entity recognition (and subsequent wikilinking). http://en.wikipedia.org/wiki...
- Andrew Su
Andrew, you can already use Reflect on any Wikipedia page you want; simply install the Firefox plugin and press the button :) It may also be worth noting that we are working on adding Wikipedia terms in general to our dictionary, which means that Wikipedia terms can be highlighted in any web page, and that a reduced version of the Wikipedia page will be shown in our popup if you click on them.
- Lars Juhl Jensen
One more thing to add: we also have a Javascript button that can be added to any web page to allow viewers to Reflect the page even if they do not have our plugin installed. So if the people behind GeneWiki are interested, one could imagine adding this to all the GeneWiki pages.
- Lars Juhl Jensen
Lars any thoughts about an API that would enable others to leverage the service in their applications and UIs? That might allow you to monetize Reflect as well
- Deepak Singh
Deepak, if you look a little bit further up in the thread you will see that we are working on an API ;) However, we currently have no plan to monetize Reflect, largely because we will soon have collaborative editing of the dictionaries. It would seem a bit unfair to ask the community to help make the dictionaries for free and then afterwards charge money if the community wants to use the result of their efforts.
- Lars Juhl Jensen
Lars ... I can read, I promise :-). I am all for not monetizing this, but APIs always keep that option available, even when the dictionaries are open. Thanks
- Deepak Singh
I know you can read Deepak - the question is just if you can read as fast as FF sends new text in your direction ;-)
- Lars Juhl Jensen
Ah, the science API monetization battle...
- Richard Akerman
Oh is there a battle over that? I must have missed it ... shuffles back to my peaceful world
- Lars Juhl Jensen
Hi Lars, yes, this is something I'd want to provide for all Gene Wiki users, and while I think the javascript button is cool, I'd want to convert recognized entities to wikilinks en masse. So it's not about providing reflect to readers, it's about using the reflect output to change the WP page. The API would be great for this, unless you wanted to collaborate before the API is officially released...
- Andrew Su
@Andrew: once the Wikipedia terms are in (and we're trying to come up with ideas to avoid the trivial terms), it would certainly be possible to "wikify" pages with the help of Reflect. Wikipedia conventions say that only the first occurrence of a term should be converted into a link (http://en.wikipedia.org/wiki...), so the current Reflect method would have to be tweaked a bit
- Michael Kuhn
@Andrew: We are still working on adding Wikipedia terms to Reflect and have started to design an API. Once that is in place, it should be possible to use it for your purposes; my idea is that the API will tell which terms have been found and the code at your end will have to do the actual "wikification" based on that (i.e. convert the first instance to a link). Would that interface make sense to you?
- Lars Juhl Jensen
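A rough sketch of the client-side "wikification" step described above, assuming the API simply returns a list of recognized terms. The Reflect API was still being designed at this point, so the `terms` list, the function name, and the sample sentence are all hypothetical stand-ins; only the first occurrence of each term is turned into a wikilink, following the Wikipedia convention mentioned earlier in the thread.

```python
import re


def wikify_first_occurrence(text: str, terms: list[str]) -> str:
    """Wrap only the first occurrence of each recognized term in [[...]]."""
    if not terms:
        return text
    # Try longer terms first so a short term cannot split a longer match.
    ordered = sorted(set(terms), key=len, reverse=True)
    pattern = re.compile(r"\b(" + "|".join(re.escape(t) for t in ordered) + r")\b")
    seen: set[str] = set()

    def link(match: re.Match) -> str:
        term = match.group(1)
        if term in seen:
            return term  # later occurrences stay as plain text
        seen.add(term)
        return f"[[{term}]]"

    return pattern.sub(link, text)


# Hypothetical input: terms as they might come back from an entity-recognition API.
text = "BRCA1 interacts with TP53. BRCA1 is also linked to RAD51."
terms = ["BRCA1", "TP53"]
print(wikify_first_occurrence(text, terms))
```

Doing the replacement in a single pass avoids re-matching text inside a link that was already created. Matching here is case-sensitive and exact, which a real implementation would probably need to relax.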
Michael and Lars, sounds great. Let me know then when you have something we can start playing with...
- Andrew Su
Michael, yup, that's something we're working on. We're shooting for something that's modeled after Diberri's template filler (http://toolserver.org/~diberr...)...
- Andrew Su
Lars & Michael, any interest in linking from the reflect popup to BioGPS for reference gene expression data? e.g., http://biogps.gnf.org/generep... Those data have been cited 1000+ times (though I know citation rates aren't cool in certain circles around here.) ;)
- Andrew Su
Andrew, it is our plan that the right-most small image in the protein popup will show where in the body the gene is expressed. It is not at the top of our priority list, but it could be a good student project for the right student.
- Lars Juhl Jensen
Glynn is becoming a little too dogmatic isn't he?
- Deepak Singh
Yes, and no. The point that someone needs to take some stand on these things is reasonable. I'm not sure I think that particular one is a battle worth fighting though. I did give the people in question an earful about it though...
- Cameron Neylon
Focus on lengthening your line http://is.gd/qRfg instead of dissecting the line of others. This discussion reminds me of an old Zen riddle I just read. Zen and the art of Open Science
- Jim Hardy
The way to overcome difficulties is by strengthening oneself. Raise your level of knowledge, further develop your own skills and, like the first line, any problem automatically becomes smaller. http://is.gd/qRgD OK, I am done proselytizing
- Jim Hardy
But I think his fundamental assumption is incorrect. Scientists are way more open to open source than to open data. Their reasons for using closed source are mostly pragmatic. In almost every case, where an open alternative exists which is good enough people use that.
- Deepak Singh
That is true - the software isn't really the key point - but it's a rallying point which those that came out of the OSS movement understand and know how to argue. Data is a much more slippery concept for most I think. Possibly also a sense that those advocates don't want to equate scientists with government in their attitude to data - as in they're giving us the benefit of the doubt?
- Cameron Neylon
@Deepak, we've been here before, but I still say if you were right people would use Open Office. It *is* good enough -- the remaining barrier is a short learning curve and simple laziness.
- Bill Hooker
Bill, I've tried to use it (didn't have office at home), but after several months of frustration gave up and spent the money
- Deepak Singh
But I have yet to see a case (and I am sure there are some) where you get dictated to. As someone trying to sell software into academia, the preference for open source was a big reason why selling scientific software is hard. R is popular because it is useful and in the end companies like Insightful couldn't compete against both SAS and R. Sun Grid Engine works, which is why people use it. And this works not just for open source software but for any competitor.
- Deepak Singh
Let me pose this question. Matlab and Mathematica are fabulous applications. They are completely closed source. Does that mean that all science done with Matlab and Mathematica is bad science since we don't know what the engines are doing?
- Deepak Singh
@Deepak, I say no way. I heavily use LabVIEW and don't feel it is contrary to my goals for open science. One thing about LabVIEW...even though the compiler is expensive, I can create executables that anyone can install for free. (except the image processing library costs a license fee to install)
- Steve Koch
You should feel free to use your proprietary ruler to measure whatever you want, but scientific repeatability means you need to tell me exactly what you measured. As rulers are (theoretically) interchangeable this makes sense.
- Paul J. Davis
Paul, no disagreements on that point.
- Deepak Singh
And a slightly OT WTF for Open Data, my division at work (which is 100% philanthropic) sequenced a bacterial genome a couple years ago and we're being told that we're not allowed to use data derived from that genome by one of the major academic databases.
- Paul J. Davis
I don't know that the ruler analogy fits what I'm thinking about, Paul. But I guess what Cameron is saying, and I agree with, is that the sharing of the data is much more important than whether the "ruler" is open source. "Rulers" that are free to acquire are not worth much if the data are not available to work with. Conversely, use of multiple "rulers" is actually a good check on whether the rulers are working properly.
- Steve Koch
I'm perfectly willing to share any of my LabVIEW source code, and in some cases we have already, including one sourceforge project. It's just that the proprietary compiler is quite expensive. To me, that is consistent with openness and scientific repeatability.
- Steve Koch
And a slightly more OT comment about OSS, if you've ever released software that says "Free to academics, corporations contact us" you are not allowed to use any of my OSS contributions. Any contributions I make to public knowledge are only intended for consumption by other corporations.
- Paul J. Davis
The "ruler" that is free to others (such as Python, Ruby, whatever) actually has a very high cost to me. If you include my salary, it's a cost much higher than an individual purchase of LabVIEW to learn how to use another application. And it's highly likely I would not achieve a level of efficiency in terms of controlling data acquisition that I currently have. Not to mention throwing away a decade's worth of software.
- Steve Koch
Although on that point, Niall Haslam made a good point on my post which is that it may be a high cost on my time but what about the students? Arguably more productive for them to learn to use OSS tools, particularly in the long term if they then move on to a lab that can't afford the closed source ones. It's a slightly longer term view. I should point out that Niall was partly my PhD student and was forced by a postdoc, partly against my wishes originally, to use entirely open source tools. He's walked that road.
- Cameron Neylon
The key point for me is that OSS can work as an open platform. In the long term with the right population of skilled people this is extensible and integrable in a way that closed source isn't and can't be. But this is a long way away from the day to day concerns of your average scientist.
- Cameron Neylon
That's a good point, Cameron (or I guess I should say Niall). I still, though, would lump the goal of using open source "free" software in with a goal of using non-proprietary supplies & hardware in the lab. I think it's often unrealistic and the science can be open while using proprietary tools (software, hardware, equipment, materials, supplies).
- Steve Koch
We've had pressure to make ChemSpider Open Source. We can't..it's on the Microsoft platform. So people said support MySQL. We produced ChemSpider over a few months in evenings and weekends. I'd love to see an Open Source platform that can host a database of >21 million compounds
- Antony Williams
There are enough platforms that would do that. Yahoo uses MySQL, Amazon uses key value stores for a big chunk of its data (and relational databases), I know a few postgres based solutions handling 100s of millions of records. You can architect something like this in open source
- Deepak Singh
Having said that, I don't think you need to go open source. Your architecture is an implementation detail. Your data and APIs are the critical part
- Deepak Singh
Deepak: depends... if your architecture does not touch your data sure... as soon as it starts processing your data, then black boxes are evil.
- Egon Willighagen
Egon, quite true. I am assuming that ChemSpider is more of an API to access and integrate data, rather than manipulate
- Deepak Singh
Keep in mind that ChemSpider, too, processes data... it normalizes, might here and there assume stereochemistry, be limited by file format limitations... but I guess it depends on which API bits you actually use if you are using processed data or not...
- Egon Willighagen
Open sourcing those components shouldn't be a problem, since those aren't exactly differentiators. Probably useful transferring some of those lessons anyway
- Deepak Singh
ChemSpider actually comes with a number of molecular properties calculated with proprietary software... I have been thinking of putting up a service to provide CDK-based alternatives for them, but never really got around to that... but, if you happen to know a JavaScript library for XMPP chatting, I might just have what we need...
- Egon Willighagen
We have just announced the winners of the Science Blogging Challenge (The challenge was to get a senior scientist to start blogging, announced at last year's Nature Network Science Blogging conference). - http://network.nature.com/groups...
Well frak. I had a senior scientist blogging and was going for gold and somehow I missed the deadline and have only myself to blame. Congrats Shirley and Russ. *stubs toe on purpose*
- Karen James
I think your point has legs, although I would quibble with the "anti"-evidence sobriquet. It seems to me most people are like this: the concept of validating hunches with objective data doesn't really apply. Outside science, R&D and perhaps some market analysis, I'm not sure anyone actually gives much thought to the prediction/validation model. It's counter-intuitive for Joe Bloggs. Which is a sad failure of education.
- Chris Cotsapas
Yes, you are correct that not too many people pay much attention to the prediction/validation model. But I think it is clear that Bush was pushing against evidence in every arena - Law, Military, Environment, Medicine, Security, Privacy, etc. I think we have not seen this in any president possibly ever and certainly not for a long time.
- Jonathan Eisen
To Carole Goble (myExperiment): did you experience network effects? Carole says, a bit of seeding and curation needed at the beginning to put a foundation in place and ensure users of quality. Since trust is so important, a process for creating a safe environment is crucial, a place where people still retain some control over how much they share, etc.
- Shirley Wu
Getting users to tag things can still be a challenge. Might be because tagging was in an unnatural place in the workflow.
- Shirley Wu
Sean Mooney from Indiana University: "If you build it, they WON'T come." Note, he helped develop Laboratree, one of the other so-called "facebook for scientists". He says one thing they've emphasized is that funding institutions, grant providers etc appreciate when researchers indicate they will deposit, share, make available their data, or use these collaborative tools
- Shirley Wu
Nigam: something to talk about is the contributor to user ratio. In wikipedia, it's really low, millions of users to thousands of contributors. In science wikis, the ratio is closer to 1. You can't crowdsource effectively with a ratio like that
- Shirley Wu
Drew Endy now talking about OpenWetWare, a wiki resource that started in his lab at MIT by students. When his students started sharing protocols and resources on a lab wiki, other labs took notice and it's now pretty big, ~500 active users. But now the funding has stalled out and they want to get some data on how much impact OWW is having on research
- Shirley Wu
Showing a slide: Idea cycle --> research cycle --> publication cycle. Where are the gaps? what are the priorities? Open science not just open data, but all three cycles
- Shirley Wu
Example: current publishing cycle way too slow compared to current state of collaboration and authoring tools like Google Docs.
- Shirley Wu
Phil Bourne: there's a dichotomy. Do we just refactor things we already have, or do things completely differently? Refactoring can be inefficient but it can be effective in the short term
- Shirley Wu
Steve Brenner asks Drew Endy: what metrics could you use to measure usable outcomes for OWW? Drew answers: one metric for him could be has any user recorded all of a research progress on the wiki? How long did it take? etc.
- Shirley Wu
The problem of a lack of concrete success stories is true not only for OWW but much more generally for open science. It looks a bit like a chicken and egg problem: a lack of identifiers/rewards that create incentives, and a lack of success stories to justify the changes in rewards.
- Pedro Beltrao
Heather Piwowar mentions: Information behavior issues like these are of great interest to information science field - lots of research and conferences going on so useful place to look for ideas
- Shirley Wu
Cameron mentions one example of completely open recording of the complete research cycle: Jean-Claude Bradley and UsefulChem
- Shirley Wu
Citation and credit is tricky - a lot of people might be using it but don't think to cite it. "No one cites infrastructure." "You know you've been successful when you don't get cited." Also not always clear what it is you're supposed to cite with all these new-fangled web objects
- Shirley Wu
Phil Bourne: idea of "tokens" issued for contributions. But lots of grey areas. What are the relative merits of reviewing a grant vs a paper vs writing a blog post etc. Also what makes sense in one field doesn't necessarily translate into other fields.
- Shirley Wu
One thing we could do to make science more open? Carole Goble - change whole idea of citation, and ensure persistence of identity, sustainability. Phil Bourne - needs to come from funding, funders need to be pro-active. Enormous strides already made in Open Access.
- Shirley Wu
Show of hands from those with refereed grants - how many have made comments in their grants about data sharing and availability? Most PIs in the room raised hands. Larry Hunter mentions that many grants don't get funded if you don't mention data sharing or how you will make your research and results accessible. But there are cultural differences - genome field shares more than cancer field, e.g.
- Shirley Wu
Russ Altman notes that for many studies, especially clinical ones, it's not a matter of not wanting to share but it's often a lifetime of work following a very specific clinical cohort, and their entire career depends on them publishing 10-20 papers on that cohort etc so there are other considerations. Cameron says but what if instead of 10 or 20 papers they could get 40 papers out of collaborations? But that is open for debate
- Shirley Wu
Nigam Shah: One thing we could do to make science more open? Demonstrated utility and return on investment
- Shirley Wu
Heather Piwowar: One thing we could do to make science more open? Be brave. Be brave in being the change you want to see.
- Shirley Wu
Larry Hunter quotes ____ McClure: "New science out of other people's data"
- Shirley Wu
Nigam notes: openness is meaningless without context and annotation. If you don't know what the parameters are for the experiment and the data, the data is useless
- Shirley Wu
Drew Endy notes: surprising to him that he has never come across a community of people whose job it is to make research better. Mike Wong from SFSU mentions there are some people who provide infrastructure for scientists. But problem is that previously not permanent staff, just transient students etc
- Shirley Wu
Drew Endy: Real need to combine the social support that is often there with real technological development. Audience member: problem is that these activities (developing and improving research infrastructures) not often recognized as research.
- Shirley Wu
The success of OWW comes from the community, not the technology
- Graham Steel
So now Drew Endy's One Thing We Could Do: incentivize infrastructure R&D
- Shirley Wu
Phil Bourne: we may need to combine a top-down (funders, policy-makers) with a bottom-up (scientists, grassroots) approach and meet in the middle
- Shirley Wu
Dave de Roure: examples of this in the UK with funded "Virtual Science Environments"
- Shirley Wu
Phil Bourne: key component is to demonstrate that we have impacted science in specific ways.
- Shirley Wu
Dave de Roure's One Thing We Could Do: Connect and present success stories
- Shirley Wu
Larry Hunter notes: the National Centers for Biocomputing were conceived partly to accelerate scientific discovery through infrastructure and tools. 1. Driving biological problem. 2. Develop tools to solve problems. 3. Demonstrate impact of tools.
- Shirley Wu
Quo's One Thing We Could Do: Separate the camps. Open vs. non-open. We also need to reflect on our own identity and what makes a scientist. Do we define ourselves in terms of publications? citations? the data? How might this change? How do we take this into account?
- Shirley Wu
It occurs to me that a lot of people in this discussion are established PIs. Just this fact is a huge difference from some of previous discussions I have seen in real life or online where we would discuss the need to get PIs to be aware of these topics.
- Pedro Beltrao
Carole: mentions a big fear which is mis-representation. What if someone uses your open protocol for a completely inappropriate purpose and then cites you? These are important issues to address
- Shirley Wu
thank you, cameron! took the words right out of my mouth ...
- Kaitlin Thaney
Have only caught bits of this but nice wrap up Cameron
- Graham Steel
doesn't tell us how much isn't being shared, who isn't sharing it, why not, does it matter, and what can be done about it
- Cameron Neylon
highlighting the results of a number of studies - surveys, manual reviews, citation analyses
- Cameron Neylon
How much data gets shared? Depends a great deal on datatype. Lots of DNA sequences, virtually no proteomics spectra. Talking about the data that supports published studies
- Cameron Neylon
Did I see the article about the difference between biologists and physicists in The Life Scientists room?
- Paul J. Davis
many historical reasons for these issues
- Cameron Neylon
can't remember - it rings some bells though - heather pipes her citeulike papers into friendfeed so might have seen it there?
- Cameron Neylon
That's in relation to data sharing I should've mentioned. Basic idea was that not every physicist can have their own LHC to generate data, so they share/cite extensively.
- Paul J. Davis
self reported withholding - many fewer report denying data than report being unable to get data from others. People are fibbing? Or not responding to email? Or just not available in the appropriate databases?
- Cameron Neylon
80% of scientists report they have positive experiences from data sharing. 45% positive only and 35% mixed experiences
- Cameron Neylon
Also, the transientness of email. I could be getting requests for help with old software and have no idea because forwarded emails are getting lost.
- Paul J. Davis
only 17% reported only negative experiences
- Cameron Neylon
withholding is associated with industry links, competitiveness of the field, being male. 40% of scientists say that data sharing was discouraged during training!?!
- Cameron Neylon
Human subject research is _not_ associated with data withholding
- Cameron Neylon
main reason for withholding is just too much work to share, second main reason is the wish to publish more
- Cameron Neylon
In social science there is more emphasis on wanting to publish, want exclusive use, but a big reason is data confidentiality
- Cameron Neylon
Useful and interesting stats on data sharing.
- Pedro Beltrao
This was based on a survey of requests for data that was associated with published papers but the data was not available
- Cameron Neylon
benefits are both societal and personal - but difficult to measure. Study by Catherine Ball (Nat Biotech 2004) suggests the benefit could be 20-25% of the cost of generating the data
- Cameron Neylon
ok - think I've mangled that - read the paper!
- Cameron Neylon
Personal benefits of data sharing - Heather's PLoS ONE paper on increased citations - citations increase when data is available
- Cameron Neylon
What incentives are valued? The major reason (70%) - "If I really thought it would really benefit others", second most "if required for future funding/publication",
- Cameron Neylon
What would make it easier- No 1. More time and money, No 2-5. Need help, better tools
- Cameron Neylon
Incentives for quality: If quality was visible to others, if I had help, if I noticed others had higher quality, if the archivist nagged me, at the bottom (5%) If I had released it sooner (easier to share earlier)
- Cameron Neylon
Do journal mandates work? Journals with enforceable policies have more shared datasets, unenforceable is somewhere in between. Mandates appear to be strong predictors
- Cameron Neylon
Once shared is it always there? No, URL decays, email decays, in 6 top journals, 5% unavailable after 2 years, 10% unavailable after 5 years
- Cameron Neylon
even when hosted on the journal websites (but not all of it was - still some is lost from journal sites)
- Cameron Neylon
In some cases do the costs outweigh the benefits? Not much research going on on this
- Cameron Neylon
Summary: Although many researchers share, many don't, depends on many factors; relative value of incentives is surprising, withholding is correlated with the usual suspects; much room for future research, particularly on ROI - opportunities for grants on this!
- Cameron Neylon
ontologies and content acquisition: 1. Naming things 2. Naming relationships 3. Logic of combining relationships 4. Realisation that manual curation will not scale, leading to a new love of text mining
- Cameron Neylon
trends in content acquisition: increased structure, collaborative curation platforms (lots of wikis but not just wikis), integration of text mining in curation (BioLit etc)
- Cameron Neylon
Knewco - concept web and wikiprofessional - prepopulation by lightweight textmining of various databases
- Cameron Neylon
using wiki as an interface to the "concept web" - also programmatic access etc.
- Cameron Neylon
note: Knewco is not completely open ...
- Kaitlin Thaney
SWAN is at the other extreme of very structured content acquisition - wikis can only give a consensus. SWAN is a scientific discourse ontology, research statements can contradict. Curation interface is structured and form based - much comes from text mining but then has to be curated
- Cameron Neylon
yes - previous speaker made point that knewco is not open as well - started off with open source aspirations
- Cameron Neylon
how do you get people to contribute - Forms are a great aid for insomnia :-)
- Cameron Neylon
integration of text mining and curation - the idea of "just enough ontology"
- Cameron Neylon
there is a tension between what linguists want in an ontology and what the users will want
- Cameron Neylon
CbioC a project for capturing relationships between entities and abstracts
- Cameron Neylon
use of ontologies is not widespread - possibly because of 1> Lack of one stop shop, lack of tools to use ontologies for annotation, damn missed the last one
- Cameron Neylon
The key ingredients needed 1> Just enough ontology 2> appropriate use of NLP in the curation workflow. NCBO is developing tools
- Cameron Neylon
Has anyone used SWAN? Is it as awesome as it sounded?
- Paul J. Davis
I'd never heard of it but it looks pretty good
- Cameron Neylon
An annotator service that essentially text mines whatever you are using and adds tags that are associated with ontologies
- Cameron Neylon
our scientists have been working closely with tim clark and the SWAN team in order to build upon it. i'm not tech savvy enough to fully tinker around with it just yet though.
- Kaitlin Thaney
Biomedical resource database - marked up version of public databases aggregated together. Example of needing to search for the right things - if the relationships are not marked up you don't get the different terms
- Cameron Neylon
As we've said, Kaitlin won't be able to make the workshop, but we'll be using the themes of this presentation to kick off the discussion at the end of the workshop. It's a bit of a large file (6.9Mb) but it's worth the wait.
- Cameron Neylon
Thaney's call for public domain data usage according to community norms should be extended to publications as well. This approach towards data will yield great results as it facilitates the learning and communication fundamental to scientific study and research. Creative Commons licenses are one powerful tool to move towards that goal.
- Mike Chelen
Thank you for posting, Cameron. I wish I could attend in person, but will be here for the virtual part of the conversation.
- Kaitlin Thaney