ISMB/ECCB Stockholm 2009

17th Annual International Conference on Intelligent Systems for Molecular Biology & 8th European Conference on Computational Biology
The talk-specific feeds will be created each day shortly before the start of the first presentation. Find talk-specific blogs by searching here for the authors, the title of the talk or the talk identifier as given in the program (like HL03 for the 3rd Highlight paper). The feeds can also be accessed on the conference pages in the corresponding sections (SIGs, Keynotes, Proceedings Track, Technology Track and Highlights), and the last few blogs are shown on our web-portal page.

Happy blogging !!
Hedi Hegyi
we couldn't get in to the party either - so we went to the Bishop's Arms - a lively English pub on Vasagatan - about 100m from the Sheraton where the party was held. No music, thank God, so we were able to carry on meaningful conversations. I spotted these folks outside the Sheraton party :-)
paris 280.jpg
paris 282.jpg
ISMB
ISMB/ECCB
Allyson Lister
SIG: Bio-Ontologies: CiTO, the Citation Typing Ontology and its use for the annotation of reference lists and visualization of citation networks, David Shotton
added characterization to citations present on websites using CiTO. - Allyson Lister
"refuted by" seems a heavily value laden notion! That might irritate some authors. - Phil Lord
can also use CiTO to characterize cited works - Allyson Lister
Not heard of FRBR before, but all seems sensible to me - Phil Lord
@Phil me either. Agree that it sounds good - Allyson Lister
By the way: does anyone know some OWL/RDF schema/ontology for representing user feedback on things like papers and the like? Like on Amazon or eBay: user comment plus 1-to-5 star vote. I know PLoS is doing something similar. - Marco Brandizi
Wordpress has extended RDF for representing blogs with comments also - Phil Lord
A relevant (probably well-known) related paper can be found here: http://www.ploscompbiol.org/article... - Robert Hoehndorf
the paper about CiTO has come out in the meantime: http://www.jbiomedsem.com/content... - Michael Kuhn
Ruchira S. Datta
Live Coverage of Intelligent Systems for Molecular Biology/European Conference on Computational Biology (ISMB/ECCB 2009) http://www.ploscompbiol.org/article...
"[T]he International Society for Computational Biology (ISCB), organizers of the Intelligent Systems for Molecular Biology/European Conference on Computational Biology (ISMB/ECCB) 2009 conference decided to actively support future blogging efforts. The live blogging efforts described here can be seen as a model for future conferences, with the organizers providing a tight link between the FriendFeed ISMB/ECCB 2009 room (https://friendfeed.com/ismbecc...) and the conference Web site in the ISCB Web portal (http://www.iscb.org/ismbecc...)." - Ruchira S. Datta
A big thank you to the conference organizers for making this a success! - Ruchira S. Datta
Thanks to Allyson Lister and all our co-authors. - Ruchira S. Datta
I was thinking of going to this conference next time around, if I can get a travel grant. How much protein structure stuff is there? - Donnie Berkholz
There's a whole 3Dsig for two days that's all structure. - Ruchira S. Datta
@Donnie 3DSig is a good meeting. I went 2 years ago and really enjoyed it. I'd like to go again this year http://bcb.med.usherbrooke.ca/3dsig10... - Adam Kraut
Ruchira S. Datta
BioPathways SIG: Nicolas Le Novère "The Systems Biology Graphical Notation"
supposedly biologist-friendly pathway maps are completely unintelligible - Ruchira S. Datta
by contrast, electric circuit diagrams, even very complex ones, can be understood by high school students - Ruchira S. Datta
Systems Biology Graphical Notation http://sbgn.org a limited number of well-defined symbols for biochemical and cellular events - Ruchira S. Datta
3 languages: process diagrams, entity relationships, and activity flow - Ruchira S. Datta
Thanks for updating us - I'm reading with interest from the DAM SIG room in between talks! - Allyson Lister
process diagrams: biochemistry; entity relationships: molecular biology; activity flow: physiology/genetics - Ruchira S. Datta
we're "projecting" the biological process onto different dimensions (the different languages) - Ruchira S. Datta
process diagrams: entity pool nodes (things that can be counted): containers, and process nodes: pipes - Ruchira S. Datta
also connecting arcs, which connect the various nodes w/ logical operators - Ruchira S. Datta
syntactic rules define what can connect with what, and diagram rules show how to render well-formed expressions - Ruchira S. Datta
can contract large diagrams into small ones - Ruchira S. Datta
color is strictly insignificant, so can photocopy diagrams - Ruchira S. Datta
entity relationships can be viewed as rules - Ruchira S. Datta
a process diagram representation of the same process would result in combinatorial explosion; the rule expresses this much more compactly - Ruchira S. Datta
activity flow never represents molecules, only the activities of those molecules - Ruchira S. Datta
have also annotations (not strictly part of SBGN), e.g., GO annotations - Ruchira S. Datta
SBGN is supported by many tools - Ruchira S. Datta
new version is about to be released - Ruchira S. Datta
have upcoming fora and hackathons - Ruchira S. Datta
it is compatible with BioPAX - Ruchira S. Datta
SBGN is more concerned with the graphic representation, so you can import BioPAX pathways and visualize them with SBGN - Michael Kuhn
Science Daily summary of the August 8 Nature Biotechnology article: http://www.sciencedaily.com/release... reshared from Jenny Morman http://ff.im/6uWIo - Ruchira S. Datta
Iddo Friedberg
Absolut standards: report from the Metagenomics Metadata and Metaanalysis 2009 meeting. Part 1 - http://bytesizebio.net/index...
ISMB/ECCB
TT26: Nicolas Le Novère - BioModels Database, a database of curated and annotated quantitative models with Web Services and analysis tools
Lots of things are called models. He's NOT going to talk about HMM, Bayesian models, sailboat models, supermodels :) - Allyson Lister
Models and their description/metadata need to be accessible - Allyson Lister
Models have to be encoded in SBML and follow the MIRIAM guidelines - Allyson Lister
There's been a steady increase in the numbers of models in BioModels. There are about 35000 reactions and about 400 models. - Allyson Lister
Standard search functionality available from their website at the EBI (http://www.ebi.ac.uk/biomodels) - Allyson Lister
Can export in CellML, BioPAX and others (though the SBML is the curated, perhaps more "trusted", version). - Allyson Lister
You can also just extract portions of the models: these will end up as valid SBML models in their own right. - Allyson Lister
@Allyson: Sorry to be picky, but the abstract says: "Models are accepted in two common formats, SBML and CellML." :-) - Dagmar
@Dagmar - good point. I thought so as well, but that's what Nicolas said. Can we say it's his fault for saying the wrong thing about his database? ;) (If indeed it is the wrong thing?) - Allyson Lister
ISMB/ECCB
HL30: Robert Murphy - Automated Analysis of Patterns in Human Protein Atlas Images
Allyson Lister
SIG: DAM and BOSC Joint Session: BioHDF: Open binary file formats for large-scale data management, Mark Welsh, geospiza.com
...aka Toward Scalable Bioinformatics Infrastructures - Allyson Lister
@Allyson: Thanks for setting up all those talk threads! - Oliver Hofmann
Ouch: "Byte-ing off more than you can chew"!! - Allyson Lister
HDF5 is a model and file format for large complex data: http://www.hdfgroup.org/HDF5... - Brad Chapman
@Oliver - no problem - I just check before posting that someone hasn't done it before me :D - Allyson Lister
Allyson -- seconded. Great work. Nice to have others here. I refuse to comment on that pun-filled slide. - Brad Chapman
Problem: handling multiple next-gen sequencing data sets while still being able to drill down to the original sequence reads. Aim is to generate domain-specific HDF5 extensions to move away from a flat-file format. - Oliver Hofmann
@Brad - it's definitely more fun to liveblog in a community of bloggers :) - Allyson Lister
HDF5 is a fairly complicated API. BioHDF layers a biology oriented interface on top of it, targeted at next generation sequencing especially. Appears to sit on top of one or more HDF databases. Having trouble finding the code itself. Geospiza BioHDF page is here: http://www.geospiza.com/researc... - Brad Chapman
I wonder when the time is right to start referring to "current-gen sequencing", soon we'll have to start saying "next-next-gen" when talking about single-molecule sequencing. - Andy Jenkinson
The BioHDF page seems to be here: http://hdfgroup.com/project... Edit: Nevermind, that link resolves to the HDF front page... - Oliver Hofmann
He has a bagful of thumb drives with the HDF software pre-loaded. good idea! - Allyson Lister
@Andy - very funny :) I know people who, similarly, hate to hear the phrase "post-genomic era"! - Allyson Lister
Do we get to keep the drives :) ? - Oliver Hofmann
plug for Piotr's talk in the Bio* update session tomorrow - Jim Procter
I'm still confused. Hierarchical data. Random access to the data. Is this not a file system? Why not use a filesystem? - Phil Lord
Allyson — thanks for blogging my talk at BOSC / ISMB. And, I'm afraid we're already using the term "next-next-gen" to describe technologies like those of Helicos or Oxford Nanopore. - Mark Welsh
Ah, the pun-filled slide... in my defense, those were other people's puns; however, I'm guilty of propagating them (anything for a laugh in a technology talk). - Mark Welsh
You can download the current BioHDF prototype software from the "BioHDF Command Line Tools" link here: http://www.hdfgroup.org/project... - Mark Welsh
The BioHDF slides are available here on slideshare (along with all the other talks from the conference): http://www.slideshare.net/bosc... - Mark Welsh
There will be a scivee.tv presentation of our ISMB poster also -- looks like it's not up yet though. - Mark Welsh
HDF is kind of like a file system within a single binary file. However, those "files" (called datasets in HDF) are multi-dimensional arrays with each element being an arbitrarily complex data structure. - Mark Welsh
@Mark - thanks for the extra information and comments. It's always good to be able to go back and look over the slides. :) - Allyson Lister
Ruchira S. Datta
That has a private feed, Ruchira :) - Allyson Lister
Sorry, I fixed it. - Ruchira S. Datta
Perhaps I'm not refreshing properly, but it still seems to be private :) - Allyson Lister
Oops, hopefully fixed it for real this time. Sorry! - Ruchira S. Datta
works fine now.. thanks for blogging this! - Peter Menzel
I was just requested to hide the feed, so it's now private again and will remain private. - Ruchira S. Datta
Ok - no worries! - Allyson Lister
Allyson Lister
ISMB Light Factory: Party on Wednesday night open thread
Left the party relatively early to allow the long queues outside to move into the party (at least by one person!). had fun - thanks to the organizers - Allyson Lister
yes, thanks to the organizers - Ruchira S. Datta
there is no dancing at math conferences - Ruchira S. Datta
now you know why i'm in bioinformatics ;-) - Ruchira S. Datta
@Ruchira - very good point! - Allyson Lister
Yeah, you need to reserve some time for the coverage! :) - Egon Willighagen
deeply disappointed that nobody live-blogged the party - I had expected real-time updated playlist and more ;) - Lars Juhl Jensen
Would be fun :) Seeing the comments become more non-sense as the commenter gets drunk... - Egon Willighagen
Well, about the playlist: Good tracks included James Brown, various reggae stuff, and Tainted Love :) - Allyson Lister
wished the venue would have been bigger so everyone could get in. - David Sexton
did y'all hear about the thief? took someone's backpack, someone's purse, and someone's camera. owner of the backpack noticed it was missing, walked outside and saw this guy with an enormous backpack. when he called to the guy, the guy ran. it was the same guy who was bothering a lot of women earlier (in one case, took her purse). i was so annoyed by him getting in my face that i left--so my things are safe, luckily for me! heard all this from the owner of the backpack at the services desk. - Ruchira S. Datta
@David I don't think that they thought so many would attend :) Victim of its own success? - Allyson Lister
@Ruchira - that's awful! I'm really glad I put my bag in a locker. So, a few people have descriptions of him? And does that mean he was a conference attendee, or just gate-crashed? Yikes... - Allyson Lister
@Allyson, I expect so. You may have noticed him at some point--short, dark complexion. I hadn't seen him before the party, but that doesn't mean anything. I was asked only "CS or bio?" at the door, but I had my badge on. - Ruchira S. Datta
Hey guys, Allyson is correct. Based on last year's party, we expected far fewer people to come. For everyone who came and could not get in initially and had to wait or got frustrated and left -- we are really sorry. It is a priority for next year to make sure that doesn't happen again. Regarding the thief, we believe he snuck in aided by a couple of friends who distracted the security.... - Dave Messina
Lars Juhl Jensen
Wordle contributors cloud v3 - now weighted by comment length
ISMBECCB09_FriendFeed_contributors_weighted.png
As *demanded* by Roland Krause - but it is not corrected for mean attendance, complexity of the presented material, and smileys. Nor did I include the many excellent blog posts from Allyson and Oliver in the counts. These improvements are left as an exercise for the reader ;-) - Lars Juhl Jensen
@Lars, constantly amazed by how much you get done on the side ;) thanks! - Oliver Hofmann from iPhone
Yes, I'd like to add my thanks - fantastic! :) - Allyson Lister
Thanks a lot Lars! Now you can apply machine learning to infer what areas the different FFers are interested in... ;-) - Ruchira S. Datta
Or I could just take the old-fashioned approach and read what you all wrote ;-) - Lars Juhl Jensen
Lars Juhl Jensen
Thanks to everyone for all the live commentary! In return, I give you a Wordle cloud based on all the comments that were posted in this room :-)
ISMBECCB2009_FriendFeed_Wordle.png
Surprised 'Allyson' is not one of the large keywords :) - Oliver Hofmann from iPhone
I used only the comments themselves - not the names of the people who posted them - Lars Juhl Jensen
Maybe make a Wordle out of the contributors? - Cameron Neylon
Did someone spot the 'Open' yet? - Egon Willighagen
Cameron, contributors cloud done and posted :) - Lars Juhl Jensen
I see a homologous word in the two Wordles! Latin data = Sanskrit datta, "given". :-) - Ruchira S. Datta
@Oliver - to answer your question there are a number of "@" names in the wordle, if you look closely, but yeah, it's very nice of @Lars to do both wordles! I can see @allyson and @oliver for instance - Allyson Lister
Looks like we should ramp up our vocabulary of verbs - use, using and used are all prominent items by themselves. - Roland Krause
I'm amazed by the difference between the Wordle based on abstracts (http://larsjuhljensen.tumblr.com/post...) and this one. The abstracts seem to focus on methods whereas the FriendFeed comments focus on data. - Lars Juhl Jensen
At least partially because it is much easier to quickly describe a test data set than a complex method that might require notation or a schema, I think - Oliver Hofmann from iPhone
Lars Juhl Jensen
Wordle contributors cloud v2 (now with first and last names concatenated)
ISMBECCB2009_FriendFeed_Contributors2.png
Apologies to those of you with foreign characters in your names, which got mangled by Wordle :-/ - Lars Juhl Jensen
you can use a tilde to keep words together but maintain the space http://www.wordle.net/faq#spa... - Simon Cockell
Frank, me too! I wonder if this Wordle excluded the non talk specific things (e.g., FF shut us down) - Ruchira S. Datta
No it was just a quick'n'dirty hack: download all comments as JSON in batches of 30 topics, extract FriendFeed user names, paste into Wordle, submit to FriendFeed ;-) - Lars Juhl Jensen
well, that would probably explain it then :-) - Ruchira S. Datta
I don't think so Ruchira - the distance from Ally to you is more than 150 comments. I doubt you made *that* many non-talk comments (but I haven't checked). - Lars Juhl Jensen
Simon, thanks - I won't pester people with a v3 of this cloud, though ;-) - Lars Juhl Jensen
Think Allyson (and to some extent I) switched to the blog posts and stopped pasting over comments to ff when coverage was already good -- and Ruchira, your coverage was fast and incredibly thorough :) - Oliver Hofmann
I demand that the analysis is normalized by comment length! And mean attendance! And complexity of the presented material! And smileys! - Roland Krause
@Roland you are very funny! If you want to know my hypotheses, then: @Ruchira should be first for FF comments, as she definitely ramped up her commenting over the week, even including the SIG comments (which I guess were included :)); also, as @Oliver says, both of us mainly switched to blog posts as the week went on, especially for talks where other FFers were about. I'm not demanding that Lars re-write to pull down word counts from comments or blog posts, though :) (But it would be interesting! :D) - Allyson Lister
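The quick hack Lars describes in this thread (download comments as JSON, extract user names, feed them to Wordle) could look roughly like the sketch below. The JSON layout (`entries`, `comments`, `from`, `name`, `body`) is a hypothetical stand-in and is not claimed to match the real FriendFeed export; the tilde join follows Simon's tip, and the length weighting mirrors the v3 cloud.

```python
import json
from collections import Counter


def contributor_weights(batches, by_length=False):
    """Tally commenters across JSON batches.

    `batches` is an iterable of JSON strings. The schema used here is
    hypothetical (an assumption for illustration only).
    With by_length=True each comment counts by its character length.
    """
    weights = Counter()
    for raw in batches:
        for entry in json.loads(raw)["entries"]:
            for comment in entry.get("comments", []):
                # Tildes keep first and last names together in Wordle.
                name = comment["from"]["name"].replace(" ", "~")
                weights[name] += len(comment["body"]) if by_length else 1
    return weights


# Toy batch with made-up comments.
batch = json.dumps({"entries": [{"comments": [
    {"from": {"name": "Lars Juhl Jensen"}, "body": "hi"},
    {"from": {"name": "Allyson Lister"}, "body": "hello there"},
    {"from": {"name": "Lars Juhl Jensen"}, "body": "ok"},
]}]})

print(contributor_weights([batch]).most_common())
print(contributor_weights([batch], by_length=True).most_common())
```

Counting comments and weighting by length can rank contributors differently, which is the point Roland raises about normalization.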
ISMB/ECCB
Keynote: Webb Miller - Bioinformatics Methods to Study Species Extinctions
10 Steps to Success in Bioinformatics - sebi
has been 40 yrs in computers, 20 yrs in bioinformatics: but doesn't have enough money to retire (tongue-in-cheek) - Michael Kuhn
Step 1: Become a biologist. - sebi
extinction - Venkata P. Satagopam
Extinction: How to save endangered species - Peter Menzel
which species are in trouble? - Venkata P. Satagopam
Here's the article "10 steps to success..." http://www.iscb.org/iscb-pu... - Michael Kuhn
3 stories - Venkata P. Satagopam
(out of batteries, not sure the iPhone is adequate for this. So far a great intro :) ) - Oliver Hofmann from iPhone
3 studies: Tasmanian tiger, Mammoth, ... - Peter Menzel
explains how DNA can be found in hairs. - Peter Menzel
For extinct species, mitochondrial genome is easier to sequence, as it is up to 1,000 times more abundant than genomic DNA - sebi
aDNA = ancient DNA - Peter Menzel
background of ancient DNA ... - Venkata P. Satagopam
what can be learned from aDNA? - Venkata P. Satagopam
1. phylogenetics - Peter Menzel
2. population genetics - Peter Menzel
This has some parallels to the last talk in today's session in T1 by RE Green: Neandertal mitochondrial DNA to place them (and us) in the phylogenetic tree - sebi
3. directly observe evolutionary rates - Peter Menzel
4. observe evolution of function - Peter Menzel
short reads are good enough for aDNA chunks - Peter Menzel
next-generation seq of aDNA - Venkata P. Satagopam
aDNA from hairs can easily be decontaminated - Peter Menzel
aDNA from hair shafts enclosed in a plastic bag like .... - Venkata P. Satagopam
Tasmanian tiger: extinction date 7/9/1936 - Peter Menzel
over 700 known specimens - Venkata P. Satagopam
thylacine = tasmanian tiger - Peter Menzel
attempts to study thylacine DNA - Venkata P. Satagopam
observations - Venkata P. Satagopam
2 mitochondrial genomes in GenBank - Peter Menzel
in GenBank two genes 12S, cytb both were wrong - Venkata P. Satagopam
both wrong at 10% of nucleotides - Peter Menzel
tasmanian devil closest relative to tiger - Peter Menzel
split at ~40m years ago - Peter Menzel
30% of new sequence data are from nuclear genome - Peter Menzel
-> maybe sequence whole genome? - Peter Menzel
one imp question ... did an epidemic contribute to extinction? - Venkata P. Satagopam
Epidemic probable cause of extinction - Peter Menzel
(the quiet comments are hilarious. 'we sequenced one from Stockholm swimming in alcohol. Long way from home, extinct, drowning his sorrows...') - Oliver Hofmann from iPhone
story 2 wooly mammoth - Venkata P. Satagopam
Mammoth: - Peter Menzel
hair sample from eBay - Peter Menzel
we found sample on ebay - Venkata P. Satagopam
for 90$ :-) - Peter Menzel
most samples are from Siberia - Peter Menzel
have now 18 different complete mtDNA sequences - Peter Menzel
what we draw from this analysis ....mammoth separated from living elephants 6m years ago - Venkata P. Satagopam
like chimp and human - Peter Menzel
want to sequence the full nuclear genome - Peter Menzel
already got 0.7-fold coverage - Peter Menzel
= 3.3 billion bp - Peter Menzel
assuming real size of genome is 4.7Gb - Venkata P. Satagopam
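A quick check of the arithmetic in the three comments above, as a minimal sketch. The 0.7-fold coverage and the assumed 4.7 Gb genome size come from the notes; the function name is only illustrative.

```python
def sequenced_bases(fold_coverage: float, genome_size_bp: float) -> float:
    """Total sequenced bases implied by a given fold coverage."""
    return fold_coverage * genome_size_bp


total = sequenced_bases(0.7, 4.7e9)
print(f"{total / 1e9:.2f} Gbp")  # close to the 3.3 billion bp quoted above
```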
aa identity is nearly 99.8% between mammoth and elephant - Peter Menzel
99.4% overall nucleotide identity - Peter Menzel
This equals 1 mutation per protein - sebi
(all this excluding indels) - Peter Menzel
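The "1 mutation per protein" figure follows from the 99.8% amino acid identity if a typical protein is a few hundred residues long. The sketch below assumes a length of 500 aa, which is an illustrative assumption and was not stated in the talk notes.

```python
def expected_differences(identity: float, length: int) -> float:
    """Expected number of differing residues for a given fractional identity."""
    return (1.0 - identity) * length


# 500 aa is an assumed "typical" protein length, not a figure from the talk.
print(round(expected_differences(0.998, 500), 2))
```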
90% seq from the sample from the wooly mammoth - Venkata P. Satagopam
Mammoths differ from all other vertebrates in some highly conserved genes, such as MRTO - Venkata P. Satagopam
maybe due to living in cold places - Peter Menzel
extinction scenario ...human killed them all - Venkata P. Satagopam
certainly not - Venkata P. Satagopam
extinction scenario: killed by humans.. became too warm... -> but that's not true - Peter Menzel
humans arrived too late in Siberia - Peter Menzel
3rd story Tasmanian devil - Venkata P. Satagopam
tasmanian devil now - Peter Menzel
threatened by extinction due to cancer - Peter Menzel
cancer cells passed around by biting each other - Peter Menzel
cancer is passed through infection (biting other individuals, spreading cancer cells) - sebi
some resistant individuals - sebi
Trying to preserve the species, keeping the gene pool large enough - sebi
compared two individuals: one who is resistant, one who died - Peter Menzel
Finding out about the resistance: considering an ortholog of a human tumor suppressor gene - sebi
the guy who died, has a mutation at an extremely conserved position - Peter Menzel
but not much is known what it means.. - Peter Menzel
want to look at non-coding nuclear SNPs - Peter Menzel
closest sequenced relative is opossum - Peter Menzel
90m years of separation - Peter Menzel
-> computational problem: SNPs without a reference genome - Peter Menzel
produce sequences from many individuals instead - Peter Menzel
and compare SNPs there - Peter Menzel
-> extract population structure - Peter Menzel
seems to work.. but no final data... - Peter Menzel
Quick summary: Bioinformatics very helpful, but much more data needed - Peter Menzel
Role of genetic diversity in extinction still not really known.. no simple story - Peter Menzel
but genetic diversity not necessarily a good indicator for risk of extinction.. - Peter Menzel
e.g. panda has high diversity, but endangered - Peter Menzel
TV documentary from Australia: http://www.abc.net.au/catalys... - Peter Menzel
(only to get the blog on top of the ISCB portal site; the figures messed up our layout) - Reinhard Schneider
Related: 11 Extinct Animals That Have Been Photographed Alive: http://ecoworldly.com/2009... - Peter Menzel
ISMB/ECCB
HL58: Caroline Friedel - A new method for high-resolution gene expression analysis
Metabolic tagging of newly transcribed RNA - Diego M. Riaño-Pachón
Able to measure decay and de novo synthesis - Diego M. Riaño-Pachón
this method tries to solve a long-standing problem in transcriptomics, - Diego M. Riaño-Pachón
i.e., bias for differential expression of short-lived RNAs - Diego M. Riaño-Pachón
and thus in subsequent analysis - Diego M. Riaño-Pachón
solution is to perform gene expression analysis of newly transcribed RNA - Diego M. Riaño-Pachón
using 4-thiouridine - Diego M. Riaño-Pachón
after tagging, normal analysis techniques for microarray or RNA-seq can be used - Diego M. Riaño-Pachón
but of course new methods could exploit features specific for the tagging of newly transcribed RNA - Diego M. Riaño-Pachón
directly observe changes in de novo transcription - Diego M. Riaño-Pachón
observe changes relative to basal transcription - Diego M. Riaño-Pachón
no bias for short-lived transcripts - Diego M. Riaño-Pachón
It can offer evidence of the stability of RNA (decay) - Diego M. Riaño-Pachón
(room pretty full) - Diego M. Riaño-Pachón
New methods for normalization are necessary when comparing total RNA to newly transcribed RNA - Diego M. Riaño-Pachón
after normalization the RNA half-life can be calculated using an exponential decay model - Diego M. Riaño-Pachón
the exp. technique with the new normalization increases the precision on the estimation of RNA half-life - Diego M. Riaño-Pachón
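As a rough illustration of the exponential decay model mentioned above: at steady state, the fraction of RNA that is pre-existing after a labeling time t is exp(-k t) = 1 - R, where R is the normalized ratio of newly transcribed to total RNA. The half-life is then ln(2)/k. This is a sketch under that steady-state assumption, not necessarily the exact estimator from the talk, and the function and parameter names are hypothetical.

```python
import math


def rna_half_life(new_to_total_ratio: float, labeling_time: float) -> float:
    """Half-life from the newly transcribed / total RNA ratio.

    Assumes steady state and first-order decay, so that
    1 - ratio = exp(-decay_rate * labeling_time).
    The result is in the same time unit as labeling_time.
    """
    if not 0.0 < new_to_total_ratio < 1.0:
        raise ValueError("ratio must lie strictly between 0 and 1")
    decay_rate = -math.log(1.0 - new_to_total_ratio) / labeling_time
    return math.log(2.0) / decay_rate


# If half of the RNA is newly transcribed after 60 minutes of labeling,
# the half-life equals the labeling time.
print(rna_half_life(0.5, 60.0))
```

Short-lived transcripts give ratios close to 1 and long-lived ones ratios close to 0, so errors in normalization hit the two extremes hardest.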
Q: is 4-thiouridine toxic to the cell? A: No, at least in the time points looked at. - Diego M. Riaño-Pachón
Q: is there a bias towards transcript length? A: They did not see any significant correlation - Diego M. Riaño-Pachón
(only to get the blog on top of the ISCB portal site; the figures messed up our layout) - Reinhard Schneider
ISMB/ECCB
HL57: Erik Sonnhammer - FunCoup: global networks of functional coupling in eukaryotes
How to reconstruct networks - experimental networks are incomplete. - Roland Krause
Up to 300.000 interactions are proposed for human - Roland Krause
only 35000 known - Ruchira S. Datta
Each experimental method can give you more than 20% of the interactions. - Roland Krause
experiments have high false negative and false positive rates - Ruchira S. Datta
e.g., false positives from in vitro experiments: the interaction may never happen in a living cell - Ruchira S. Datta
Interactions have to be combined and evaluated. - Roland Krause
there are many kinds of evidence for functional coupling - Ruchira S. Datta
Lots of evidence for functional coupling, not only from PPI but also from localization, gene expression, interacting domains, TFBS, miRNAs. - Roland Krause
domain interactions - Venkata P. Satagopam
Integrate different kind of data from various organisms. - Roland Krause
Some links are continuous, some binary, etc. - Roland Krause
using naive Bayesian training - Venkata P. Satagopam
full Bayesian training would be too computationally heavy - Ruchira S. Datta
Naive Bayesian training, going from continuous data to distinct bins. - Roland Krause
compare with positive and negative reference datasets - Ruchira S. Datta
There is no negative reference data set out there, genes in different compartment might actually interact. - Roland Krause
therefore use random examples as negative set - Ruchira S. Datta
calculate enrichment as likelihood ratio=P(+)/P(-) - Venkata P. Satagopam
Learn log likelihood ratios for each evidence, requires large negative set. - Roland Krause
sum all the log-likelihood ratios to get full bayesian score - Ruchira S. Datta
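The scheme described in the comments above (bin continuous evidence, estimate a likelihood ratio per bin from positive and random negative reference pairs, then sum the logs) can be sketched as follows. This is a generic naive Bayes illustration, not FunCoup's actual code; the bin edges, the pseudocount and all names are assumptions.

```python
import math
from bisect import bisect_right


def bin_index(value, edges):
    """Map a continuous evidence value to a discrete bin."""
    return bisect_right(edges, value)


def train_llr(pos_values, neg_values, edges, pseudocount=1.0):
    """Per-bin log likelihood ratio log(P(bin | +) / P(bin | -)).

    The pseudocount (an assumption) avoids division by zero in empty bins.
    """
    n_bins = len(edges) + 1
    pos_counts = [pseudocount] * n_bins
    neg_counts = [pseudocount] * n_bins
    for v in pos_values:
        pos_counts[bin_index(v, edges)] += 1
    for v in neg_values:
        neg_counts[bin_index(v, edges)] += 1
    pos_total, neg_total = sum(pos_counts), sum(neg_counts)
    return [math.log((p / pos_total) / (n / neg_total))
            for p, n in zip(pos_counts, neg_counts)]


def naive_bayes_score(evidence, models):
    """Sum the per-evidence LLRs for one gene pair (naive independence)."""
    return sum(llr[bin_index(evidence[name], edges)]
               for name, (edges, llr) in models.items()
               if name in evidence)


# Toy data: made-up co-expression values for reference pairs.
edges = [0.3, 0.7]
coexpr_llr = train_llr([0.8, 0.9, 0.75, 0.5], [0.1, 0.2, 0.4, 0.8], edges)
models = {"coexpression": (edges, coexpr_llr)}
score = naive_bayes_score({"coexpression": 0.85}, models)
print(coexpr_llr, score)
```

Missing evidence types simply contribute nothing to the sum, which is one reason the naive formulation is convenient when data sets cover different gene pairs.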
4 different flavors of training sets: metabolic pathway, signaling pathway, physical ppi, and complexes from UniProt - Ruchira S. Datta
Training sets are from KEGG (metabolic and signaling), HPRD and Complexes from Uniprot. - Roland Krause
using the curated data only for training/validation, not as input to the networks - Ruchira S. Datta
No curated data used in the network, only in training. - Roland Krause
7 model organisms and human are combined with 50 individual data sets. - Roland Krause
Convert log scores to confidence scores. - Roland Krause
predict coupling between 2 genes, for each model FC-PI, FC-CM, FC-ML, FC-SL model - Venkata P. Satagopam
convert into confidence scores, which may be different in different models - Ruchira S. Datta
convert the Bayesian score using the probability of functional coupling, which is unknown but they just set to 1/1000 ad hoc - Ruchira S. Datta
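One standard way to turn a summed log likelihood ratio into a confidence value is Bayes' rule in odds form: posterior odds = prior odds × likelihood ratio. The sketch below uses the 1/1000 prior quoted in the comment and assumes natural-log scores; whether FunCoup uses exactly this formula is not confirmed by the notes.

```python
import math


def confidence(log_lr_sum: float, prior: float = 1e-3) -> float:
    """Posterior probability of coupling from a summed natural-log LR.

    prior is the assumed probability of functional coupling for a
    random gene pair (1/1000 in the talk notes).
    """
    prior_odds = prior / (1.0 - prior)
    posterior_odds = prior_odds * math.exp(log_lr_sum)
    return posterior_odds / (1.0 + posterior_odds)


for s in (0.0, 5.0, 10.0):
    print(s, round(confidence(s), 4))
```

With no evidence (score 0) the confidence equals the prior, so a pair needs a combined likelihood ratio of about 999 before coupling becomes more likely than not.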
algorithmic innovations- we have to develop new evidence scores - Venkata P. Satagopam
Used input from PPI as continuous data, using experimental counts. - Roland Krause
discretization caused problems - Ruchira S. Datta
they figured out how to discretize - Ruchira S. Datta
test significance with chi-squared test - Ruchira S. Datta
integrate evidence from orthologs used by InParanoid - Ruchira S. Datta
using inparanoid orthologs - Venkata P. Satagopam
Used transfer of ortholog information under the same Bayesian framework - Roland Krause
new phylogenetic patterns using orthologs - Ruchira S. Datta
Except for yeast, most species had more information transferred rather than generated for the organism. - Roland Krause
most support comes from other species; otherwise we may miss links - Venkata P. Satagopam
many links would not have been found without evidence from other species - Ruchira S. Datta
only yeast is well supported all by itself - Ruchira S. Datta
validated networks, TCGARN science 2008 - Venkata P. Satagopam
Validation using cancer pathways, recovering 29 of 36 links, found an additional 25. Not entirely independent. - Roland Krause
Independent validation from recovering tumour mutation sets. - Roland Krause
applications - 1 . exploring local networks, 2. analyzing network conservation - Venkata P. Satagopam
Easy exploration of the data sources leading to an edge in the network. - Roland Krause
graph visualization: started with Medusa, but it didn't fulfill needs - Ruchira S. Datta
now developed JSquid (sp?), published last year - Ruchira S. Datta
used jSquid .. Bioinformatics 2008 - Venkata P. Satagopam
New view of the human disease network, build tree of disease interactions. - Roland Krause
conclusions - Venkata P. Satagopam
to discover novel functional coupling between genes - Venkata P. Satagopam
Cancers group together, neurological diseases do not. - Roland Krause
can expand gene sets such as pathways - Venkata P. Satagopam
(only to get the blog on top of the ISCB portal site; the figures messed up our layout) - Reinhard Schneider
ISMB/ECCB
HL61: Lee Newberg - Global Measures of Uncertainty: Long Overdue in Computational Molecular Biology
two papers: webb-robertson & lawrence; newberg ... lawrence - Michael Kuhn
what is a good e-value? - Diego M. Riaño-Pachón
E- and p-values are small when random data is unlikely to do well - Michael Kuhn
E-values and p-values tell you about random data but not whether there are other solutions. - Roland Krause
but there still might be other solutions with equally good E or p values - Michael Kuhn
need to define confidence / credibility limits - Michael Kuhn
e-values and p-values are poor proxies to credibility - Diego M. Riaño-Pachón
an example from RNA 2D prediction, MFE is not the best representative - Diego M. Riaño-Pachón
today's goal: compute a global measure of representativeness of a point estimate - Michael Kuhn
solution spaces are immense, but we often choose a particular point estimate... misleading e.g. if there's a bimodal solution - Michael Kuhn
Goal: Compute global measure of representativeness of a point estimate - Peter Menzel
Many problems are tackled by dynamic programming, HMMs, etc, collectively they are Hidden Boltzmann Models - Roland Krause
compute / estimate credibility: look at distribution of differences from point estimate - Michael Kuhn
SW alignment of nt sequences, score distribution can be modeled by exponential distribution. For a 3000x3000 alignment, the Fourier approach is fast. - Roland Krause
Will not explain all about Fourier Transform (in 15 minutes). - Roland Krause
use the credibility limit, is a simple number, even a biologist can understand that - Diego M. Riaño-Pachón
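The idea in the comments above (look at how far alternative solutions lie from the point estimate and summarize that as one number) can be sketched like this. It assumes you already have samples drawn from the solution space and uses Hamming distance as a stand-in difference measure; both are illustrative assumptions, and the talk's Fourier-based computation is not reproduced here.

```python
import math


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def credibility_limit(point_estimate, samples, distance=hamming, level=0.95):
    """Smallest distance d such that at least `level` of the samples
    lie within d of the point estimate."""
    dists = sorted(distance(point_estimate, s) for s in samples)
    k = max(0, math.ceil(level * len(dists)) - 1)
    return dists[k]


# Toy example with made-up sampled solutions.
samples = ["AAAA", "AAAT", "AATT", "TTTT"]
print(credibility_limit("AAAA", samples, level=0.5))
print(credibility_limit("AAAA", samples, level=1.0))
```

A small limit means most of the probability mass sits near the reported solution; a large one warns that the point estimate is not representative, for example when the solution space is bimodal.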
(only to get the blog on top of the ISCB portal site; the figures messed up our layout) - Reinhard Schneider
ISMB/ECCB
HL62: Richard Green - A Complete Neandertal Mitochondrial Genome Sequence Determined by High-Throughput Sequencing
Neandertals: closest extinct relative. existed 400000-30000 yrs ago - Marcel Martin
chimps are closest living relatives, but deviated longer ago - Marcel Martin
first DNA from extinct species was in 1985 from the Quagga - Marcel Martin
Neandertal mitochondrial genome, as it is present at about 1000 copies per cell vs 2 copies of genomic DNA - sebi
mitochondrial genome is easier to recover since there are many copies per cell - Marcel Martin
Mitochondrial genome is useful for tracking maternal lineages, and it accumulates mutations slowly -- ideal for building trees - sebi
seems like there was no interbreeding between Neandertals and modern humans - Marcel Martin
Deep sequencing with high-throughput next generation sequencing, used to be direct PCR - sebi
ancient DNA fragments are just 60nt in length - Marcel Martin
Roche/454 and Illumina sequencing was used, no need to fragment DNA (more fragmented than one would like anyway) - Marcel Martin
many C to T transitions, also G to A - Marcel Martin
COX2 protein has 5 differences between chimp and human, 4 of 5 differences happened in the last 600,000 yrs, so Neandertals also have 4 different AAs compared to Homo sapiens - sebi
Maybe fast evolving sites? Reverting to previous (=monkey), more advantageous AAs? - sebi
sequencing errors: 3% of all Cs are Ts, same for G->A. Reason: C is deaminated to U, then seen as T - Marcel Martin
higher probability for deamination at the ends of fragments. Perhaps because cytosine deamination is 100x faster in single-stranded DNA and the ends of fragments are single-stranded - Marcel Martin
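A rough sketch of how the C-to-T error rate mentioned above could be tallied, assuming reads are already aligned to a reference without gaps and supplied as equal-length (reference, read) string pairs. The function name `c_to_t_rate` and the toy sequences are made up for illustration; this is not the MIA assembler's method.

```python
def c_to_t_rate(aligned_pairs):
    """Fraction of reference C positions that appear as T in the reads."""
    c_total = 0
    c_as_t = 0
    for reference, read in aligned_pairs:
        for ref_base, read_base in zip(reference.upper(), read.upper()):
            if ref_base == "C":
                c_total += 1
                if read_base == "T":
                    c_as_t += 1
    # Avoid division by zero when the reference contains no C at all.
    return c_as_t / c_total if c_total else 0.0


# Toy data: 4 reference Cs in total, 2 of them read as T.
pairs = [("ACCGT", "ATCGT"), ("CCAA", "CTAA")]
print(c_to_t_rate(pairs))
```

Restricting the count to the first or last few positions of each fragment would be one way to check the end-of-fragment effect.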
MIA: mapping iterative assembler. manuscript in preparation - Marcel Martin
Measuring protein evolution with a ratio of non-synonymous differences to synonymous differences dN/dS: indicative of a small Neandertal population size - sebi
ISMB/ECCB
Keynote: Mathias Uhlen - A global view on protein expression based on the Human Protein Atlas
Introduction: Works a lot on affinity reagents. Invented and developed pyrosequencing technology (http://en.wikipedia.org/wiki...) now used in 454 - Allyson Lister
Outline of the talk - 1. systematic biology - introduction, 2. HPR project, 3. The Human Protein Atlas - Venkata P. Satagopam
18th century - biologist. 19th - chemist (1/3 of elements discovered in Sweden in this century). 20th - physicists and at the end, computer scientist. He'd now like to say that the 21st century is the century of medicine. - Allyson Lister
HPR one of the largest projects in Sweden wrt funding, about 100 million euro so far - Oliver Hofmann
An impressive log-scale plot of number of bases sequenced since 1965. - Allyson Lister
developer of sequencing by synthesis via pyrosequencing in late 90s -- basis of 454 technology - Andrew Su
Personalized genomics ... 454 technology developed in our lab - Venkata P. Satagopam
Bioinformatics is the key in the new era of genomics. - Allyson Lister
95% of drugs (still) aimed at proteins - Oliver Hofmann
(Personal opinion: I like how it's not the "post-genomic era", but a new era of genomics :) ) - Allyson Lister
95% of drugs today target proteins. Thus, studying proteins is studying for the future - Diego M. Riaño-Pachón
Systems biology /omics is going to be fantastic in the next 10 years. - Allyson Lister
(That, or we find better ways of interfering with RNA) - Oliver Hofmann
Image of contradictory sign in Paris: you know where you want to go, but not how to get there. - Allyson Lister
We know where we want to go (characterize all proteins), but not sure how to get there (due to a lack of high-throughput methods) - Oliver Hofmann
The generation game - Nature july 7, 2007 - Venkata P. Satagopam
antibodies are the core tool for probing proteins - Andrew Su
but they are too cross-reactive - Diego M. Riaño-Pachón
Human antibody initiative HAI - Venkata P. Satagopam
human antibody initiative (HAI) -- M. Uhlen, M. Snyder, P. Hudson -- generate comprehensive and validated antibody collection - Andrew Su
validation of commercially available antibodies - Venkata P. Satagopam
average success rate of commercial antibodies is 49% - Andrew Su
for some companies 100% of antibodies work fine, for some companies 0%; in general about 50% work fine - Venkata P. Satagopam
Antibodypedia -- a portal for validated antibodies (we need to add a link from Gene Wiki...) - Andrew Su
From the website: "The antibodypedia is a community-based portal showing application-specific validation of publicly available antibodies to human protein targets. Each protein binder (antibody or other affinity reagent) has been scored in an application-specific manner into three main categories (supportive, uncertain and non-supportive)" - Oliver Hofmann
@Oliver - nice! - Allyson Lister
If you have 2 antibodies, you can compare results in various assay platforms so he wants to develop paired antibodies for every protein target. - Allyson Lister
Nat Methods 2008: High-throughput method to identify epitopes - Oliver Hofmann
published a paper in Nature Methods 6 months ago (December 2008) - Venkata P. Satagopam
(bummer, antibodypedia doesn't use mediawiki so can't assess current usage...) - Andrew Su
(I am entirely too short-sighted to read the author lists half of the time. Sigh) - Oliver Hofmann
HPR - The human proteome resource - Venkata P. Satagopam
(grumble grumble, antibodypedia creates YAI -- yet another identifier) - Andrew Su
(++ HPA -- uses ensembl gene IDs...) - Andrew Su
hpr is a multi-disciplinary program - Venkata P. Satagopam
(Proteinatlas seems to be using ENSG/ENSP/UniProt) - Oliver Hofmann
The gene factory does about 200 clones per week, and is in full production. - Allyson Lister
(HPA ids- mapped to uniprot also) - Venkata P. Satagopam
Close to 34.000 clones in the database - Oliver Hofmann
(@Andrew - indeed) - Allyson Lister
200 clones per week, 33,925 clones total (all human?) - Andrew Su
Open source (but in-house?) LIMS developed - Oliver Hofmann
(would be interesting to compare to OriGene collection of mammalian clones) - Andrew Su
The antigen design uses PRESTIGE, which is a bioinformatics approach to select antigens using the protein epitope signature tag (PrEST). - Allyson Lister
antigen design -- used PRESTIGE, a bioinformatics approach to select antigens for antibodies - Venkata P. Satagopam
readout of this project is protein expression profiling - Venkata P. Satagopam
readouts -- immunohistochemistry (IHC) and IF (immunofluorescence) - Andrew Su
Organ, tissue, cellular and sub-cellular expression profiling on a protein basis - Oliver Hofmann
apply antibodies to tissue arrays (cancer focus, I think) - Andrew Su
(140+ human samples, around 200 tissues.. did someone catch the numbers?) - Oliver Hofmann
(FAQ from the website: spatial distribution of proteins in 48 different normal tissues and 20 different cancer types as well as 47 different human cell lines) - Oliver Hofmann
annotation of images taking place in Mumbai in India - Venkata P. Satagopam
image annotation -- difficult problem. automated analysis would be good, but now using Indian pathologists for manual annotation - Andrew Su
$60 / 500 images for annotation? - Andrew Su
Confocal microscopy for subcellular localization, difficult to scale up to high-throughput - Oliver Hofmann
high-throughput subcellular localization in A-431 (squamous cell carcinoma), U-251MG (glioma), ??? - Andrew Su
(I think Bob Murphy talked on a similar project to map subcellular localization of proteins en masse...) (Oh, looks like it's a collaboration between the two...) - Andrew Su
They have a SVM that seems to be able to annotate 28 different parts of the cell. - Allyson Lister
2TB data each week (courtesy of 50.000 images in the same time) - Oliver Hofmann
2/3 of data come from in house data and 1/3 comes from different companies - Venkata P. Satagopam
(I wonder how the protein expression compares with our gene expression atlas http://www.ncbi.nlm.nih.gov/pubmed...) - Andrew Su
About 33% of the sample space done (6850 genes) - Oliver Hofmann
progress - started in 2005 , last week released version 5. 8,832 antibodies, covering 1/3 genes in uniprot - Venkata P. Satagopam
(And I suppose there's always more to do.. check for splice variants, truncated versions...) - Oliver Hofmann
Most recent release: 7 mln images. - Allyson Lister
(@Andrew: if there is overlap in the cell lines that could be an easy correlation analysis) - Oliver Hofmann
The next 5 yrs are also about getting the paired antibodies mentioned earlier. - Allyson Lister
all antibodies available to the public - Venkata P. Satagopam
(good good, HPA is already BioGPS plugin.. http://biogps.gnf.org/#goto=p... </shameless_plug>) - Andrew Su
central questions in proteomics - Venkata P. Satagopam
(Rodent atlas seems a bit redundant to Allen Brain Atlas? -- Oh, ABA is via in situ / RNA, this is protein. Again, would be interesting to compare...) - Andrew Su
how many proteins are expressed in a given cell? - Venkata P. Satagopam
how many protein are tissue specific? - Venkata P. Satagopam
@Oliver, probably not directly comparable by exact cell lines, but might be worth comparing by parental tissue. Need to wait until they allow downloading of data though... - Andrew Su
Ensembl "thinks" that there are up to 23,000 genes, but UniProt "thinks" 20,000; the true number is probably within that range (for genes coding for proteins). ("thinks" in scare quotes, as databases don't think - yet) - Allyson Lister
the size of human membrane proteome .. 5,514 human membrane proteins; covering 26% of protein-encoded genes - Venkata P. Satagopam
proteins expressed in normal cells - 6,800 antibodies towards (>25% of all protein encoding genes). 65 normal cell types (from 45 different tissue types) - Venkata P. Satagopam
70% of proteins expressed in a given cell, approx even distribution across # of cell lines (not what we observed on gene expression data, which had distinct peaks at tissue-specific and ubiquitous) - Andrew Su
80% of proteins expressed on average in cell lines (surprisingly high to me...) - Andrew Su
(@Andrew: protein selection might be biased towards the ones that are well expressed / had known antibodies / ...) - Oliver Hofmann
9% of proteins cell type specific, 62% expressed across 3 different cell types - Oliver Hofmann
ubiquitous expression, but differing levels - Andrew Su
(interesting cytoscape visualizations of cell type / tissue specificity) - Andrew Su
In the Atlas: < 2% specific to a single cell type (84 proteins), well known ones like insulin - Oliver Hofmann
Includes a number of uncharacterized proteins with no known function - Oliver Hofmann
PROSPECTS: PROteomics SPECification in Time and Space - Allyson Lister
"Complementary technologies, including mass spectrometry, cryoelectron microscopy and cell imaging will be applied in innovative ways to capture transient protein complexes and the spatial and temporal dimensions of entire proteomes." - Oliver Hofmann
MCF-7 data with IHC, Mass spec - Oliver Hofmann
next generation sequencing of cDNAs from the U2-OS human cell line: 76% of genes detected by mRNA seq - Venkata P. Satagopam
76% genes detected by next gen mRNA sequencing in U2-OS - Andrew Su
again -- mostly ubiquitous expression (now on mRNA level), but differing levels - Andrew Su
high fractions of all proteins expressed in human cells, tissues and organs - Venkata P. Satagopam
Lack of specificity is not good news for those looking for good antibody targets for therapeutic purposes - Oliver Hofmann
the quantity of proteins, rather than their presence /absence, is the key to cell identity - Diego M. Riaño-Pachón
few cell-specific proteins (<1%) and group-specific proteins (<10%) - Venkata P. Satagopam
find biomarkers for early detection of disease ... it is very good for mankind - Venkata P. Satagopam
(Proteome 2008) Suspension bead arrays - Oliver Hofmann
mg/ml to pg/ml range (dynamic range of protein concentration in blood 10^12 ) - Oliver Hofmann
working on kidney disease in collaboration with AstraZeneca .... for detection of biomarkers - Venkata P. Satagopam
Developing 'next generation' plasma profiling, scale to one million assays / month - Oliver Hofmann
(has he mentioned availability of these antibodies? Commercially available? antibody-producing lines?) - Andrew Su
They're part of ENGAGE. - Allyson Lister
commercially available @Andrew - Allyson Lister
first draft of the human proteome by 2014 - Venkata P. Satagopam
Aim to have the draft version of the human proteome by... see above - Oliver Hofmann
(@Andrew I think via Prestige Antibodies (I remember the cute advertising slide). Does that mean advertising works?) - Allyson Lister
Nature, "The big ome" - 24 April 2008, editorial - Allyson Lister
tissue-specificity is achieved by precise regulation of protein levels in space and time - Venkata P. Satagopam
Prestige Antibodies through Sigma: http://www.sigmaaldrich.com/life-sc... - Andrew Su
Science, 26 Sep 2008, vol 321, pages 1758-1761 - Venkata P. Satagopam
"Proteomics Ponders Prime Time", Science, 26 September 2008, in response to the Nature article - Allyson Lister
new lab in Stockholm coming soon, Science for Life Laboratory - Venkata P. Satagopam
New Science for Life laboratory being established, see http://www.newsdesk.se/pressro... - Oliver Hofmann
Q: importance of splice isoforms -- A: complexity that is currently not considered due to technical complexity (to be saved for second phase) - Andrew Su
Q: conclusions on tissue specificity have bias based on antibody availability? A: bias of commercial antibodies possible, but only 1/3 of data. Data they are generating based on walking down chromosomes (I think?), so don't expect bias... Also, some of ubiquitous expression is due to cross-reactivity. (first mention of this...) - Andrew Su
Q: perspective for gene therapy or antisense therapy, more generally non-protein based therapies. A: pharma shifting from small molecules to biologics (not sure about a "shift" rather than "expansion"). Gene therapy problem is getting into all relevant cells. Ubiquitous expression of proteins the root cause of side effects for protein-based targets, possibly... - Andrew Su
Ruchira S. Datta
Birds of a Feather session: Semantic Web-Linked Data, organized by Eric Neumann, in T5
linked data is: a simple set of 4 guidelines for publishing RDF data on the Web (over HTTP), developed by Tim Berners-Lee in 2006 - Ruchira S. Datta
1. Use URIs as names for things (globally unique identity). 2. Use HTTP URIs (everyone has a web browser/client) 3. When someone looks up a URI, provide useful information...in the form of RDF data. 4. Include links to other URIs (foster discovery of additional information). - Ruchira S. Datta
Context-independent identifiers (URIs) would make things so much more useful and interoperable - like Lego pieces. - Ruchira S. Datta
Some want to get the semantics exactly right and use formal logic and OWL, but here we're emphasizing just the linkability of things. - Ruchira S. Datta
A URI can only refer to one thing, but one thing can have several URIs, unfortunately. - Ruchira S. Datta
several years ago, tried to use LSIDs (life science ids): thing:something:something:identifier. But this can only be recognized by some particular software, not a web browser. Strong influence from W3C to use HTTP URIs, per the rule of least power: do what requires the least technology. Even Mark Wilkinson who was touting LSIDs has come around to HTTP URIs. - Ruchira S. Datta
A commenter says LSIDs still exist, it's just that they can extract them from HTTP URIs. - Ruchira S. Datta
There are other proposals, e.g., shared names; Neumann prefers even less constraint than shared names. - Ruchira S. Datta
So, now if you put in a URI you get something back. You should be able to get RDF back. UniProt does this: if you put .rdf on the end of the URI, you'll get the data back as RDF. - Ruchira S. Datta
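A small sketch of the ".rdf on the end of the URI" convention described above. The function name `rdf_uri` and the example.org address are hypothetical placeholders, and the code only builds the address; it does not contact any server or claim anything about how UniProt implements this.

```python
def rdf_uri(resource_uri: str, suffix: str = ".rdf") -> str:
    """Return the address at which the RDF form of a resource would be
    requested, assuming the 'append .rdf' convention from the session."""
    base = resource_uri.rstrip("/")
    # Leave the URI alone if it already points at the RDF form.
    if base.endswith(suffix):
        return base
    return base + suffix


# Hypothetical record URI, used purely for illustration.
record = "http://example.org/protein/X1"
print(rdf_uri(record))
```

The point of the convention is that the same HTTP URI serves as both the name of the record and the way to fetch data about it, so colleagues can reference the URI instead of copying the data.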
Now colleagues can just use the URIs in order to reuse the data; don't need to copy the data. - Ruchira S. Datta
someone says the UniProt accession is an identifier, whereas the URI is a way to get at the thing through the web - Ruchira S. Datta
identifiers can overlap, but HTTP URIs put things in unique namespaces - Ruchira S. Datta
it needs to be stable: when you put this out, you're establishing a contract with the community that it's not going to change - Ruchira S. Datta
currently, the url, e.g., http://www.uniprot.org/uniprot... is also the URI. the second part is the identifier of the record and the part before the slash is the namespace. At http://purl.bioontology.org, we separate the namespace and the url. So going there we have a PO box that can eternally forward it. - Ruchira S. Datta
Problem: this assumes http://purl.bioontology.org may go away. Thus the Banff Manifesto. - Ruchira S. Datta
Reduce the likelihood of catastrophic failure by consolidating it into an institution, e.g., Stanford University, with longevity. - Ruchira S. Datta
This just pushes the problem onto purl.bioontology.org. - Ruchira S. Datta
The domain name can be transferred, so why do we need purl.bioontology.org? - Ruchira S. Datta
Transferring a zillion domain names is a pain, transferring one domain name is easy. The institution commits to maintaining that domain. - Ruchira S. Datta
It should be not just the institution, but the community--the community will continue to live on. - Ruchira S. Datta
Knowledge should be monotonic: it grows and doesn't disappear. Even if a particular effort dries up, the URIs should still be valid so we can still see what was there. - Ruchira S. Datta
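The "PO box that can eternally forward" idea from the persistence discussion can be sketched as a lookup table keyed by namespace. Everything here is an assumption made for illustration: the hosts, the `FORWARDING` table and the `resolve` function are hypothetical, and real PURL services do the forwarding with HTTP redirects rather than an in-memory dictionary.

```python
from urllib.parse import urlsplit

# Hypothetical forwarding table: stable namespace -> current home of the records.
FORWARDING = {"uniprot": "http://current-host.example.org/records/"}


def resolve(purl, table=FORWARDING):
    """Split a persistent URL into namespace and identifier, then
    return the address the request would be forwarded to."""
    path = urlsplit(purl).path.strip("/")
    namespace, _, identifier = path.partition("/")
    if not identifier or namespace not in table:
        raise KeyError(f"cannot resolve {purl!r}")
    return table[namespace] + identifier


print(resolve("http://purl.example.org/uniprot/X1"))

# If the data moves, only the table entry changes; published URLs stay valid.
moved = {"uniprot": "http://new-home.example.org/entries/"}
print(resolve("http://purl.example.org/uniprot/X1", moved))
```

This is why transferring one resolver domain is easier than transferring many: the published identifiers never mention the current host.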
The Linking Open Data Project: A community project started within the W3C Semantic Web Education & Outreach group in 2007 - Ruchira S. Datta
The LOD (Linking Open Data) "cloud", May 2007: many projects with various links between them, e.g., MusicBrainz, FOAF, DBpedia, etc. - Ruchira S. Datta
By March 2008, had tripled - Ruchira S. Datta
you can put any kind of data up and make it available to SPARQL queries - Ruchira S. Datta
By September, WordNet and various other dbs had come in - Ruchira S. Datta
March 2009: life sciences comes in, with Bio2RDF - Ruchira S. Datta
now you can find the data that is in NCBI and UniProt in RDF format, but not the experimental data yet - Ruchira S. Datta
to make this useful for interesting research, will need URIs, and to figure out what are the rules that are important for life sciences - Ruchira S. Datta
when you publish using this data, how is your data that builds on top of it going to be available and linked from it? - Ruchira S. Datta
we don't really have this concept in life sciences yet, people don't know about it - Ruchira S. Datta
suppose one looks for a concept in the LOD cloud, like "heart"; how do we know which thing to query? BioOntology, DBpedia, etc? - Ruchira S. Datta
one can't do the Google on it yet - Ruchira S. Datta
bioontology guy hates Google analogy; Google gives millions of hits, but we want the contextual query - Ruchira S. Datta
i protest, have to have indexing before ranking - Ruchira S. Datta
this doesn't solve the problem of redundancy: we want the facts about a protein, regardless of their source - Ruchira S. Datta
bioontology guy says you don't need the index to answer the query, just to answer it fast - Ruchira S. Datta
someone else says, need the indexing in order to do the clustering - Ruchira S. Datta
she says you need semantic web overlays. we need hierarchical indexing environment in order to do this at scale - Ruchira S. Datta
one needs to be able to query on an abstraction - Ruchira S. Datta
bioontology guy: what's more important, query or browsing? - Ruchira S. Datta
someone else: even browsing, if something is 3 links away, may not even go there - Ruchira S. Datta
bio2rdf guy says: first we ask everyone simultaneously: do you know about this? then we ask what do you know about it? we have implemented Shared Names. But the URI just goes to the original record. Many people have said many things about the same entity. - Ruchira S. Datta
nobody wants to have to read all the papers in MedLine. the punchline is the links: how does this protein relate to others. if we don't trust a link, *then* we want to drill down - Ruchira S. Datta
if there are 5 million sources of "A is related to B", we don't want to read all of them, we just want to know that there are 5 million of them. we also want to know the kind of evidence, e.g., particular kind of experiment. Then the user can decide whether to trust it. - Ruchira S. Datta
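One way to picture the "just tell me how many sources and what kind of evidence" request is a simple tally over assertions. The record layout (dictionaries with `subject`, `object` and `evidence` keys) and the function name `summarize_link` are assumptions for this sketch, not a format proposed in the session.

```python
from collections import Counter


def summarize_link(assertions, subject, obj):
    """Count how many sources assert 'subject is related to obj',
    broken down by the kind of evidence behind each assertion."""
    counts = Counter(
        a["evidence"]
        for a in assertions
        if a["subject"] == subject and a["object"] == obj
    )
    return {"total": sum(counts.values()), "by_evidence": dict(counts)}


# Toy assertions; names and evidence types are invented for illustration.
assertions = [
    {"subject": "A", "object": "B", "evidence": "two-hybrid"},
    {"subject": "A", "object": "B", "evidence": "text-mining"},
    {"subject": "A", "object": "B", "evidence": "text-mining"},
    {"subject": "A", "object": "C", "evidence": "two-hybrid"},
]
print(summarize_link(assertions, "A", "B"))
```

The user sees the summary first and drills down into individual sources only when a link looks untrustworthy.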
At this conference, enormous number of people mining data. We should be able to see their results as easily as the original sources. - Ruchira S. Datta
Great coverage! Thanx! It's like being there... would have loved to sneak in on this BoF... - Egon Willighagen
Egon: feel free to wander in... - Ruchira S. Datta
we want just the local subnetwork, not the text. we want the facts - Ruchira S. Datta
people want question answering - Ruchira S. Datta
Microsoft bought Powerset for this purpose (now part of Bing) - Ruchira S. Datta
someone says this turned out to be crap, e.g., "Psoriasis causes arms" - Ruchira S. Datta
the question can be a small subgraph, not necessarily an English sentence - Ruchira S. Datta
when we want information about a protein, there are only a limited number of kinds of things we can be interested in, so the software can guide the query context-sensitively - Ruchira S. Datta
we need to distinguish the problem of document retrieval from query formulation - Ruchira S. Datta
we shouldn't just think of scientists, but also other kinds of users - Ruchira S. Datta
this will all be possible, but many people are currently just reinventing RDF over and over again - Ruchira S. Datta
how many here are producers of RDF? roughly 8 - Ruchira S. Datta
put the things that we create in RDF, e.g., if you make the intersection of this fact with this paper, you are in charge of minting that URI - Ruchira S. Datta
if everyone does this, then this facilitates cross-references and exchanges - Ruchira S. Datta
Nophar Geifman has been working with Eytan Ruppin on finding cliques in GO around different diseases. A thing like that should have an URI, so other people can use it. - Ruchira S. Datta
can we do micro-experiments so by ISMB next year, we can prove this concept - Ruchira S. Datta
paradigm shift between hypothesis-driven query versus, the data throws the hypothesis at you - Ruchira S. Datta
what's important is the use case: what is the question you can answer that would make them go "wow"? - Ruchira S. Datta
David Huynh is working with Freebase and has developed Parallax, a faceted browser - Ruchira S. Datta
he also developed Exhibit - Ruchira S. Datta
faceted browsing makes more sense to biologists - Ruchira S. Datta
look at Google's Wonder Wheel - Ruchira S. Datta
Jane McGonigal at SciFoo camp designs games, first slide was World of Warcraft: if you harnessed the collective brainpower that youngsters spend on WoW every day, you could rewrite Wikipedia every day! - Ruchira S. Datta
we need to figure out how to pull people in - Ruchira S. Datta
I mentioned during the session, but forgot to link here (hard to talk and type at the same time!), Marti Hearst's new book _Search User Interfaces_ http://searchuserinterfaces.com - Ruchira S. Datta
How to pull people in - that's the challenge to get this working. Could we get some seed money into this scientific effort and have it distributed with mechanisms similar to Google Ads? If so, we could have students and scientists putting effort into this rather than into some obscure webservers, blogs etc. with Google Ads. But how to generate the seed money? Government grants, donations or pay for usage? - Bo Servenius
Ruchira S. Datta
FFers open thread
I'm here and will be attending the BioPathways SIG. I figured there should be some place to say non-talk-specific things like "I've arrived!", so here it is. - Ruchira S. Datta
Will be switching between DAM, BOSC and the BioPathways. Once I've figured out which talk is when ;) - Oliver Hofmann
Uhoh. I sense a lack of places to recharge notebooks. This is going to be fun during the session breaks.. - Oliver Hofmann
I foresee a problem with the SIG posts in FF - only one heading, and many, many speakers.... Hopefully will be OK for people following. - Allyson Lister
The projectionist kindly suggested I power up in the projection cubicle during the coffee break. - Ruchira S. Datta
Network is slowing down to a crawl for me. - Oliver Hofmann
Sitting at home, following the conference. Great job folks! My only wish is that there was some place where the speakers would upload their presentations. - Lars Juhl Jensen
Help! Anyone got a UK plug adaptor with them I could borrow for an hour or so? Will happily exchange laptop charge for beer/coffee. - Cass Johnston
Oliver, I've been having that problem too. There were some times when I had posted a comment and wanted to post another one, but the first hadn't finished posting, so I had to wait for the post to finish, hold my comment in my head, and also pay attention to what the speaker was currently saying... - Ruchira S. Datta
Talked to the organizers. They are aware of the problem, IT is looking into increasing bandwidth. Yay. - Oliver Hofmann
@Oliver well done! Perhaps we could ask someone to look into more power extension leads? :) - Allyson Lister
Anyone up for a FF lunch / dinner / drink at some point during ISMB? - Oliver Hofmann
Sounds good. I'm about to go orienteering, though, so will be out of reach this evening... - Ruchira S. Datta
Monday and Tuesday are the only "free" nights, with poster sessions going from 5:45 to 8:30 pm. Shall we just set Monday night for the FF meet-up? When would people want to leave the conference at the earliest? 8 pm? - Michael Kuhn
Monday's going to be busy meeting everyone; I'd probably try and stick around 'till 8pm. Meet at the registration booth close to the entrance at 8:15? Can decide on a venue depending on our numbers and whether people are willing to travel to the city center. - Oliver Hofmann
I think I would prefer 8:40. I have a poster. I haven't figured out yet whether it's Monday or Tuesday, but others might have posters too? Last year, at least, I was explaining my poster to people pretty much constantly until the poster session closed. - Ruchira S. Datta
Monday or Tuesday is good for me - either way, and any time is fine. It's a good idea! - Allyson Lister
8:45 on Monday then, same place (registration area)? Do we have any locals with suggestions? - Oliver Hofmann
ok - see you all in registration at 8:45 monday :) - Allyson Lister
If everyone has already seen Gamla Stan by that time, we could go to Södermalm--plenty of good restaurants. I'm staying there. - Ruchira S. Datta
@Ruchira: works for me as long as we have a guide to get us there :) - Oliver Hofmann
@Allyson There are lots of power extensions along the windows just inside the T hall. Let me know if those are still too full and I'll see what I can do about getting more. - Dave Messina
@Oliver The conference organizers have bought high-speed internet. Anyone can go to the registration desk for a free login. Spread the word! - Dave Messina
@Dave, yep, got mine this morning. World of a difference. - Oliver Hofmann
@Dave, any chance of getting extension cords to the other conference rooms? Starved for outlets at K21 :) - Oliver Hofmann
I'll be arriving today, would it be worth packing an extension cord with multiple sockets? - Roland Krause
@Roland: This is always a good idea. Problem is that some rooms only have power on the stage and in the projection booth. - Michael Kuhn
Organizers also have extra extension cords for emergencies, but even the main conference room has no outlets. Best to recharge during the breaks, but that means leaving the notebook unattended... - Oliver Hofmann
photos from ISMB/ECCB, please put on flickr under tag "ismbeccb2009" - Burkhard Rost
The projectionist here just unplugged my power cord from their outlet and said none of us will be allowed to use their outlet; we'll need to use the outlets in Hall C. He said the organizers did not pay for extra outlets, so we will still have this power problem. - Ruchira S. Datta
Well, at least I'll be able to sit closer to the front where I can see better. - Ruchira S. Datta
Even the "free" network is considerably faster and more responsive today. Well… we'll see what happens when the first coffee break hits. - Johann Visagie
@Ruchira: Ouch. That's not going to help. We can get extension cords and use them in the smaller meeting rooms though. - Oliver Hofmann
@Johann agreed - I also think the network is faster today - Allyson Lister
I love the fierce, ofttimes violent competition for the limited resource that is power outlets during coffee break! - Johann Visagie
It seems a simple #ismb is the twitter hashtag of choice. Also note Burkhard's suggestion above to use "ismbeccb2009" as Flickr tag. (Hey! Flickr tags can has spaces! ;) - Johann Visagie
@Johann: Bring your own socket multiplier and you are loved by everyone ;-) - Oliver Hofmann
Any chance of leaving / recharging our desktops at the registration desk during the breaks? Would rather not leave mine in a conference room unattended. - Oliver Hofmann
Lonely FFer near the registration desk is looking for some chatting... anyone? - Egon Willighagen
Sure, I'll wander by momentarily. - Ruchira S. Datta
I'm sitting on the ground in front of the registration desk, wearing a black t-shirt... - Egon Willighagen
Any poster presenters here on FriendFeed? Maybe you can start a new post in the ISMB/ECCB room and attach your poster PDF? - Egon Willighagen
Great idea, Egon! Not being at the conference, I would very much appreciate being able to see some of the posters on FF. - Lars Juhl Jensen
Lars, are you in Stockholm? - Egon Willighagen
What about special sessions, I do not see them here. - Diego M. Riaño-Pachón
I'm having a special FF sessions outside, but I am the only participant :) (after a nice meet up with Ruchira earlier) - Egon Willighagen
@Egon - shame - I was offline for lunch. At least I assume it was at lunch that you said that. FF doesn't put up the time each comment was made - so add times please if you're talking about meetups! Hopefully see most of you this evening after the poster session anyway :) - Allyson Lister
P.S. The network seems to have slowed down again. Anyone else notice, or is it just me? - Allyson Lister
@Dave thanks for the power extension cable - except for the main victoria hall (as we discussed) it's really helped a lot :) - Allyson Lister
@Allyson, I'm still around... in front of E.A.T. now.... - Egon Willighagen
@Egon - ah, but I'm in a talk! :) - Allyson Lister
@Egon, no I decided not to go this year - Lars Juhl Jensen
@Allyson: As half of the conference is probably sharing the username / password for the closed network I am not surprised it is slowing down :) - Oliver Hofmann
Maybe it is because I am happily uploading Jmol 11.7.45 to SourceForge ? :) - Egon Willighagen
@Allyson Glad to help! - Dave Messina
@Oliver Hopefully it's that rather than someone downloading Genbank. :) Still plenty of FREE logins for the closed wifi are available at the registration desk for anyone who wants one. - Dave Messina
Fun to see how quickly many of us have found a workaround for the 'Oops'... :) - Egon Willighagen
Is there a place for posting jobs/student positions available to ISMB attendees? (i.e. http://tinyurl.com/brinkma...) I'm not at the ISMB this year, so am definitely checking out FF and related resources and thank you all for your posts! - Fiona Brinkman
Does anyone know how to 'filter out' all group:ismbeccb2009 messages in my main feed? So that I can just open the ISMB/ECCB 2009 group in a separate window? - Egon Willighagen
Thanks to everyone - organizers, reviewers, presenters, friends, bloggers etc etc etc : I must away, but had loads of fun! :) - Allyson Lister
@Allyson, safe travels. @all, see you in Boston. Feel free to PM/mail me for tips where to stay, eat or party in the city. - Oliver Hofmann
nice meeting you, Allyson, and have a safe trip! - Ruchira S. Datta
I am leaving too, was nice meeting you all and have a good trip back. - Roland Krause
Conference finally closed. - Peter Menzel
i'm now at the Stockholm airport en route to Cambridge for another conference (!) http://www.functionalgenomics.org.uk/section... It was fun meeting some of you and microblogging together. Hope we'll see each other around at future conferences. Next year in Boston! - Ruchira S. Datta
@Egon: if you go to the ismbeccb09 room and click the "(edit)" link to the right of "Lists" then you can deselect it from your Home feed. - Lars Juhl Jensen
A top-ten of talks (well, not really - just the top ten of my ISMB posts based on number of hits). What was everyone interested in enough to click through to my blog? Find out: http://themindwobbles.wordpress.com/2009... - Allyson Lister
Lars Juhl Jensen
Wordle cloud of the contributors (as requested by Cameron Neylon)
ISSMECCB2009_FriendFeed_Contributors.png
no, sorry - I have not found a way to make Wordle keep words together - Lars Juhl Jensen
with an underscore ? - Pierre Lindenbaum
yes, Frank - that will obviously work (see new post) - Lars Juhl Jensen
Oliver Hofmann
ISMB/ECCB Stockholm 2009: Bioinformatics Core Facilities Workshop, Fran Lewitter
Project priorities: just long term misses new customers, grant opportunities. Focus on money or project size misses pilot projects, does not allow diversification. Suggestion: based on merit, what allows the core to grow in new directions, expand, supports the institutional community in general - Oliver Hofmann
Hiring: 50% FTE available for a new person, enough to get someone started. Identify new technologies, hire to get an early start. Consultants can fill gaps. - Oliver Hofmann
Time: maintain an overview of timeframes (putative project starts). Wrap up projects to avoid task switching overhead as much as possible. Make time for the planning stages. - Oliver Hofmann
Expectations: be open and transparent with regards to availability, feasibility, stick to realistic time estimates and turn down work if the resources are not available. Collaborate between Cores - Oliver Hofmann
Cancer UK, Cambridge Core. Prioritize short tasks, genomic tasks. Use queuing system, use steering committee to guide on participation in long term researcher-based projects - Oliver Hofmann
usually 6-10 projects per person at any given time (ouch...) - Oliver Hofmann
Manage workload: define scope early, manage using collaboration software, deliver data in stages - Oliver Hofmann
automate and standardize - Oliver Hofmann
Training / empowering researchers the most important aspect - Oliver Hofmann
Report regularly to all groups / departments, communicate - Oliver Hofmann
Request co-authorship (but do not seem to require it) - Oliver Hofmann
Start of discussion - Oliver Hofmann
Current developments and changes: focus on next-gen - Oliver Hofmann
Benefits of training - Oliver Hofmann
Chargeback model. half of participants fully supported by their institutes - Oliver Hofmann
Hourly charges between $70 and $120 - Oliver Hofmann
Authorship: mostly just acknowledgement, many cores with focus on master's level students - Oliver Hofmann
Collaboration: central place to share knowledge of methods, tools, evaluation. One place to deposit this information could be the http://www.bioinfo-core.org/index... - Oliver Hofmann
Try to find ways to pool information and resources (similar to the BOSC OpenBio projects) - Oliver Hofmann
Switch of topics on how to handle next-gen seq influx - Oliver Hofmann
What kind of questions are being asked, can they handle the data themselves, and is there any way to build re-useable workflows? Experience seems to be that so far no question (beyond the assembly step) has come up twice. - Oliver Hofmann
Develop method agnostic tools that help the biologist to get a handle on their own data, Eg http://www.bioinformatics.bbsrc.ac.uk/project... - Oliver Hofmann
Mikhail Spivakov
Special Session 6: Regulatory Genome Architecture and Noncoding Mutations in Human Disease
Talk: Veronica van Heyningen, "Lessons from regulatory mutations in developmental anomalies" - Mikhail Spivakov
Highly conserved non-coding DNA seqs emerge as regulator elements - bind multiple TFs - Mikhail Spivakov
Genes with complex expression patterns require complex control - Mikhail Spivakov
Disease-associated chromosomal breaks outside gene highlight distant regulatory elements, often within neighboring genes - Mikhail Spivakov
Different mechanisms of disruption: - Mikhail Spivakov
key elements separated from transcriptional unit - total loss of function - Mikhail Spivakov
altered chromatin organisation (originally position effect variegation) - Mikhail Spivakov
Pax6, Sox2, Otx2 - genes with an important role in eye development that are also important in the brain; continue to be expressed in the adult organism as well - Mikhail Spivakov
Explore Pax6 long-range control elements - up to 180 kb away from the gene - Mikhail Spivakov
use distant elements in reporter transgenic analysis to look at expression patterns they drive - different elements drive expression in different tissues at different times. however, find large overlap, so there's no easy rule according to which distant regulatory elements work together to establish the expression pattern - Mikhail Spivakov
some distant elements are within a neighboring housekeeping gene - Mikhail Spivakov
study deletions in various regions by using a reporter construct, compare with isolated enhancer function - Mikhail Spivakov
find an element in Pax6 intron that drives expression in the (zebrafish) heart - unexpectedly - Mikhail Spivakov
also find a Pax6-response element: autoregulation - Manuel et al, Development, 2007 - Mikhail Spivakov
a mutation in a Pax6 reg element located in the intron of the neighboring gene has been associated with (a mild form of) epilepsy - Mikhail Spivakov
some enhancers work only at later developmental stages (eg, E17.5) - Mikhail Spivakov
need methods for phenotype prediction based on regulatory mutations - eg, enhancer-driven knockouts - Mikhail Spivakov
in zebrafish, there are two pax6 forms - pax6a (expressed in the brain), pax6b (expressed in the pancreas), both are expressed in the eye - Mikhail Spivakov
the regulatory elements for the brain were lost from pax6b, pax6a has lost its neighboring gene (ie, the reg elements located within it), pax6b retained a part of it - Mikhail Spivakov
Move to Sox9 - Mikhail Spivakov
haploinsufficiency causes campomelic dysplasia and sex reversal - the breakpoints can be very far away, the severity of phenotype decreases with the breakpoint position being further away - however mutations causing Pierre Robin syndrome may localise to a very specific long distance element ~1.1 Mb away from the gene, small deletions at 1.4 Mb upstream _AND_ 1.4 Mb downstream. How these elements work together is yet to be determined. - Mikhail Spivakov
Another example: Van der Woude severe phenotype caused by intragenic mutations in IRF6; a regulatory variant of IRF6 (a SNP upstream of the gene, disrupting an AP2alpha binding site) predisposes to sporadic CLP (cleft lip and palate) - Mikhail Spivakov
Discuss developmental abnormalities caused by mutations in SHH (sonic hedgehog) - Mikhail Spivakov
a related phenotype is produced by an insertion in the intron of a neighboring gene - LMBR1 - Mikhail Spivakov
Pax6 and Sox2 interact to cross- and autoregulate; mutations in the "double" binding sites (ie for both) in different cells result in mutant phenotypes. Note that TF dosage is important. - Mikhail Spivakov
Next talk: James P. Noonan, The Role of Developmental cis-Regulatory Change in Human Evolution - Mikhail Spivakov
what makes us human? unique abilities <- developmental changes (brain, limbs etc). What's the molecular basis of these traits? - Mikhail Spivakov
changes in gene regulation vs changes in genes themselves. perhaps there's a lot of the former (in the evolution of primates) - Mikhail Spivakov
two approaches: 1) in vivo characterization of cis-regulatory modules with human-specific developmental functions, 2) whole-transcriptome comparative analysis - Mikhail Spivakov
start with (1). how to identify developmentally active CRMs in the human genome? - Mikhail Spivakov
the more conserved an element is the more likely it'll function as an enhancer - Mikhail Spivakov
also look at chromatin signatures - for example, p300 binding sites are 90% predictive of enhancers (Visel et al, Nature 2009) - Mikhail Spivakov
look for noncoding elements that are conserved across species but rapidly evolve in humans - human-accelerated conserved noncoding sequences (HACNSs), Prabhakar et al, Science 2006. - Mikhail Spivakov
Developed stats models to measure evolutionary rates, estimate probability of human-specific substitution, calculate likelihood of observed pattern of substitutions in each noncoding element - estimate acceleration p-value. Identified 992 HACNSs out of ~10000 conserved elements - Mikhail Spivakov
associate with neighboring genes -> GO analysis. only 1/734 functional categories showed significant association: cell adhesion! More detailed look - neuronal cell adhesion, axon guidance, synapse formation - Mikhail Spivakov
Do HACNSs function as reg elements in vivo? Transgenic assays in mouse (select elements based on overall constraint and human-specific acceleration). Many elements show specific developmental patterns of expression in these assays - Mikhail Spivakov
Focus on HACNS1 - associated with gbx2, highly conserved in all terrestrial vertebrates, evolving 4x faster than neutral rate in human, sequence changes are fixed. The sequence change is associated with a gain of function: human-specific anterior limb expression. - Mikhail Spivakov
this gain of function has been achieved by 13 substitutions in the sequence compared to chimp (proved by "humanizing" the chimp region and "chimping" the human region by making these substitutions) - Mikhail Spivakov
drives expression in the anterior-most digit at e13.5 - may contribute to the development of longer thumbs in the human compared to chimp/orangutan? - Mikhail Spivakov
HACNSs: selection or biased gene conversion? Biased gene conversion - postulated to increase the fixation rate of AT to GC substitutions. Mutagenic effect of recombination hotspots + preference in recombination-associated DNA repair toward GC vs AT pairs. - Mikhail Spivakov
Synergy between positive selection and BGC in HACNS1? - Mikhail Spivakov
HACNSs are not biased towards high-recombining regions. The human-specific substitution rate is not elevated outside of HACNS1 - but GC rate is. - Mikhail Spivakov
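The biased gene conversion question above turns on whether substitutions skew from AT towards GC. A minimal sketch of that tally, assuming two already-aligned sequences of equal length; the function name and the toy sequences are hypothetical and not from the talk:

```python
def classify_substitutions(ancestral, derived):
    """Count weak-to-strong (A/T -> G/C) and strong-to-weak (G/C -> A/T) changes."""
    weak, strong = set("AT"), set("GC")
    counts = {"W->S": 0, "S->W": 0, "other": 0}
    for a, d in zip(ancestral.upper(), derived.upper()):
        # Skip identical columns, gaps and ambiguous bases.
        if a == d or a not in weak | strong or d not in weak | strong:
            continue
        if a in weak and d in strong:
            counts["W->S"] += 1
        elif a in strong and d in weak:
            counts["S->W"] += 1
        else:
            counts["other"] += 1  # A<->T or G<->C, neutral with respect to GC content
    return counts


# Toy example (made-up sequences): every change here goes from A/T to G/C.
chimp_like = "ATTAGCATTA"
human_like = "GTCAGCGTTG"
counts = classify_substitutions(chimp_like, human_like)
print(counts)
```

A strong excess of W->S over S->W changes in an element would be compatible with biased gene conversion, though it does not by itself rule out selection.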
Move from genotype to phenotype: comparative gene expression profiling in developing cortex. - Mikhail Spivakov
Mouse vs human fetal forebrain, look at gene expression profiles of several cortical stem cells by RNAseq. Purify cell populations by laser capture microdissection. (collaboration with Rakic/Ayoub @ Yale). Recover 50 ng of total RNA from ~10,000 cells, which is more than enough for RNAseq. Comparison of different cell populations in the developing mouse cortex - very high correlation of expression. Differentially expressed genes "make sense". - Mikhail Spivakov
Ultimately, want to associate differentially expressed genes with HACNSs - Mikhail Spivakov
Next talk: Steve Montgomery, Population genomics of human gene expression using next-generation sequencing technology - Mikhail Spivakov
looked at two different quantitative traits: exon reads and splicing pairs - Mikhail Spivakov
~80% reads map to known exons. 40% reads span multiple exons (insert size: ~150bps) - Mikhail Spivakov
data normalisation of RNAseq data is still a challenge. used a simple scaling approach - Mikhail Spivakov
test data filter criterion: <10% of individuals equal to zero - Mikhail Spivakov
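The two preprocessing steps just mentioned (simple scaling, then the zero-count filter) could look roughly like this. It is a sketch under assumptions: the talk did not say what the counts were scaled to, so the common total `target` is hypothetical, as are the function names and the toy data.

```python
def scale_counts(counts_by_individual, target=1_000_000):
    """Rescale each individual's exon counts to a common total (simple scaling).

    Assumes every individual has at least one mapped read.
    """
    scaled = {}
    for individual, counts in counts_by_individual.items():
        total = sum(counts.values())
        scaled[individual] = {exon: c * target / total for exon, c in counts.items()}
    return scaled


def passes_zero_filter(exon, counts_by_individual, max_zero_fraction=0.10):
    """Keep an exon only if fewer than 10% of individuals have zero reads for it."""
    zeros = sum(1 for counts in counts_by_individual.values() if counts.get(exon, 0) == 0)
    return zeros / len(counts_by_individual) < max_zero_fraction


# Toy data: 10 individuals; exon e2 has zero reads in exactly one of them.
raw = {f"ind{i}": {"e1": 10 + i, "e2": 0 if i == 0 else 5} for i in range(10)}
scaled = scale_counts(raw, target=100)
kept = [exon for exon in ("e1", "e2") if passes_zero_filter(exon, raw)]
print(kept)
```

With ten individuals, a single zero already reaches the 10% threshold, so `e2` is dropped under the strict "<10%" reading.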
see ~ 11,000 genes, corresponding to ~95,000 exons; ~23,000 exon pairs -> from 1 lane per individual. - Mikhail Spivakov
there's a decent correlation between any two individuals (same sample same run: corr=0.93, diff sample, same run: corr=0.89, diff sample, diff run: corr=0.86) - Mikhail Spivakov
Correlation of exon reads within genes (test for freq of alternative splicing) - corr is good, but not perfect for exons with high mean read count, so some information is missed by array analyses that assume a fixed exon structure for each gene. - Mikhail Spivakov
Increasing the number of lanes / individuals increases the number of identified splice variants => many of the alternatively spliced transcripts are rare - Mikhail Spivakov
Detecting eQTLs with RNAseq data: assume an additive model of gene expression, for each SNP look at the number of reads, use spearman rank correlation to assess the association - Mikhail Spivakov
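The association test described above can be sketched in plain Python. Assumptions: genotypes are coded 0/1/2 (copies of the minor allele, matching an additive model), the read counts are made up, and no p-value is computed here (a permutation test or a statistics library would be needed for that).

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; assumes neither input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# One SNP, six individuals (toy numbers): genotype vs normalised exon read count.
genotypes = [0, 0, 1, 1, 2, 2]
exon_reads = [12, 15, 22, 20, 31, 35]
rho = spearman(genotypes, exon_reads)
print(round(rho, 3))
```

Using ranks makes the test insensitive to the heavy-tailed distribution of read counts, which is presumably why a rank correlation was preferred over a linear fit.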
detected 2431 genes (4219 exons) associated with SNPs at p~0.01 - Mikhail Spivakov
eQTLs detected with exon counts centred around TSSs - Mikhail Spivakov
consider using RNAseq to detect SNP association with known splice variants - took 1767 known splice variants from Ensembl, tested for association against splicing pair counts, data look promising - Mikhail Spivakov
Q: what cells used? A: EBV-transformed lymphoblastoid cells. Q: is this a bit of a constraint to look only at these cells? A: looked at eQTLs in other lymphoblastoid cell lines collected long time ago, see a lot of correlation. Did not look at other tissues. - Mikhail Spivakov
Q: did they look at 20% unannotated exons? A: No, can't do much until new gene models are built from them. - Mikhail Spivakov
Q: where are the SNPs? Association with HCEs? Gene functional categories enriched for SNPs? - Mikhail Spivakov
A: in 1Mb windows, see enrichment for eQTLs right around the TSSs rather than remote regulatory regions. Not a large number fall within known regulatory elements. - Mikhail Spivakov
Next talk: Nadav Ahituv, Deconstructing gene regulatory elements - Mikhail Spivakov
Geneticists are good at finding genes from sequence, can do much less with detecting regulatory elements - in the vast space of 98% of the genome that is non-coding. Also don't know the effect of nucleotide substitutions at regulatory elements. - Mikhail Spivakov
What can we do about it? 1) use comparative genomics to find conserved non-coding regions. 2) high-throughput characterisation of enhancers - Mikhail Spivakov
At Berkeley, looked at extremely conserved non-coding sequences (n~3000, 70% similarity, >=100bp human-fugu) and ultraconserved (n~250, 100% similarity). Many of them turn out to drive expression in transgenic assays in the mouse. Now moved to UCSF, are setting up the same assays in zebrafish - easier to move it to high-throughput analysis - Mikhail Spivakov
enhancer.lbl.gov: resource for experimentally validated human noncoding fragments with gene enhancer activity as assessed in transgenic mice. - Mikhail Spivakov
These data can be used in a variety of ways. They aim to understand the regulatory code at enhancers. Look for limb-specific signatures, check whether they see them at limb-associated genes, check for new regions with these signatures in the human genome, test in zebrafish (expression in fin), if positive check in mouse (expression in limb) - Mikhail Spivakov
look at ~480 ultraconserved elements (>50% non-coding, 100% identity and >=200bp human-mouse-rat), depleted for SNPs - Mikhail Spivakov
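For readers unfamiliar with the definition, "100% identity over a minimum length" amounts to scanning a multiple alignment for long runs of identical columns. A minimal sketch, assuming pre-aligned sequences of equal length with "-" for gaps; the function name, the toy sequences and the small `min_length` used in the example are illustrative only:

```python
def identical_runs(aligned_seqs, min_length=200):
    """Return (start, end) column ranges (end exclusive) where all sequences agree."""
    runs, start = [], None
    length = len(aligned_seqs[0])
    # Iterate one column past the end so a run touching the end is closed too.
    for col in range(length + 1):
        same = (
            col < length
            and aligned_seqs[0][col] != "-"
            and all(s[col] == aligned_seqs[0][col] for s in aligned_seqs)
        )
        if same and start is None:
            start = col
        elif not same and start is not None:
            if col - start >= min_length:
                runs.append((start, col))
            start = None
    return runs


# Toy three-way alignment: one mismatch at column 6 splits the sequence in two runs.
human = "ACGTACGTAA"
mouse = "ACGTACCTAA"
rat = "ACGTACGTAA"
print(identical_runs([human, mouse, rat], min_length=4))
```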
extreme sequence constraint = extreme functional constraint? - Mikhail Spivakov
took four UCEs: positive in enhancer assays, near genes that, when mutated, lead to either a lethal or sexual development phenotype, variable in size - Mikhail Spivakov
generated mouse knockouts for these UCEs - heterozygous KOs are fine, even homozygotes have no apparent phenotype!!! - Mikhail Spivakov
so why is that? - Mikhail Spivakov
maybe we just can't detect the phenotype? - Mikhail Spivakov
maybe there's gene / reg module redundancy? It certainly exists - eg at UCEs upstream of ARX - Mikhail Spivakov
maybe it's because mutations will cause gain-of-function (by binding of other TFs) rather than loss-of-function? in this case, deleting the whole thing altogether won't help identify this - Mikhail Spivakov
support for this model: replaced a limb enhancer in mouse with a bat enhancer that has only 20bp difference -> observed limb elongation. Also bioinformatic support showing that these seq's are prone to substitutions and indels (?) - Mikhail Spivakov
Take four UCE that they KO'd, put an 83bp indel and introduced a 3bp change, both chosen more or less randomly. Very preliminary results on the 83bp insertion into two of the enhancers: made mice, after 10 generations of backcross, het:het mating recovers fewer heterozygous and homozygous mice than expected. - Mikhail Spivakov
ISMB/ECCB
TT48: David Croft - Reactome tools for expression data and pathway visualization
Reactome collects processes at the cellular level. - Gabriele Sales
Plans for a new visualization for the website. - Gabriele Sales
At the top of the reactome webpage there is a link pointing to the new interface. - Gabriele Sales
The left bar of the interface collects links to single pathways. - Gabriele Sales
Google Map-like buttons: zoom in / out, scroll. - Gabriele Sales
By clicking on a compound, you get a list of the other pathways it participates in. - Gabriele Sales
Free text searches: multiple results displayed in a tab. - Gabriele Sales
The ENFIN project: http://www.enfin.org/ - Gabriele Sales
ISMB/ECCB
TT47: Chris Rawlings - Semantic Data Integration for Systems Biology Research
Also speaking: Catherine Canevet and Paul Fisher - Allyson Lister
Two systems to integrate data. - Gabriele Sales
BBSRC-funded research collaboration in Newcastle, Manchester, and Rothamsted : ONDEX and Taverna - Allyson Lister
Demo on the integration and validation of yeast metabolome models. - Gabriele Sales
Taverna is a workflow workbench to integrate tools (including web services). - Gabriele Sales
Taverna can link ONDEX and external (ex. PubMed) data sources via the web-service interface. - Gabriele Sales
When ONDEX works with Taverna, instead of using the pipeline manager you use the ONDEX web services and access ONDEX from Taverna. This means you can use Taverna to pull data into ONDEX. - Allyson Lister
Outline of the demo: starting from Jamboree Network SBML, parse it into Ondex, remove currency metabolites and annotate using network analysis results. - Gabriele Sales
Then switch to Taverna. Identify orphans, retrieve related enzymes, assemble a PubMed query and link results to the graph. - Gabriele Sales
the workflow for relevant pubmed entry retrieval seems staggeringly complex - but I guess this is because each input appears as a node in the workflow (is that correct ?) - Jim Procter
# ondex vs cytoscape? - Andrew Su
@Andrew - not sure - ondex uses Jung as the underlying layout engine (and Alan Kuchinsky of Cytoscape is asking a question!) - Jim Procter
Question: how does the system scale for large amounts of data? - Gabriele Sales
@Jim, but are there similar use cases covered between the two systems? - Andrew Su
@Andrew I think there are, but the response to that question seemed to suggest that ondex was aimed at de novo network generation rather than massive visualization (this needs to be clarified) - Jim Procter
@Jim, thanks... - Andrew Su
ISMB/ECCB
HL54: Mona Singh - Search and discovery of recurring patterns with interactomes
Hairballs. - Roland Krause
Different large scale data sets, genetic, phosphorylation etc. exist but how to interpret? - Roland Krause
Add protein annotations, sequence, structure, motifs, domains, functional characterization. - Roland Krause
Particularly interested in interaction domains, analyzing cellular organization and interactomes. Can we discover and analyze recurring patterns? - Roland Krause
Analogy to multiple alignment of protein sequences with PROSITE pattern - Roland Krause
For network schemas, nodes are descriptions of proteins, e.g. domains. An extension of the network by attributes, e.g. PFAM or PROSITE. - Roland Krause
Examples for network schema are homologous pathways. - Roland Krause
Netgrep has visual interface. - Roland Krause
No details of the algorithm, tricky problem but can solve it fast. - Roland Krause
Triangles, quad topology (linear combination of four), Y star topology of four. - Roland Krause
Automatic discovery of overrepresented schemas with direct interactions and sequence motifs. Using Y2H with filtering to address quality of the data set. - Roland Krause
Counting occurrences of labelled subgraphs, brute force approach using Netgrep. - Roland Krause
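For the simplest case (pair schemas), the brute-force count boils down to tallying, for every interaction, the annotation pairs found on its two proteins. This is only an illustrative sketch, not Netgrep itself; the function name, protein names and annotations are made up.

```python
from collections import Counter
from itertools import product


def count_pair_schemas(edges, annotations):
    """Count how many interactions match each unordered pair of annotations."""
    counts = Counter()
    for a, b in edges:
        schemas = {
            tuple(sorted((x, y)))
            for x, y in product(annotations.get(a, ()), annotations.get(b, ()))
        }
        counts.update(schemas)  # each interaction counts once per schema
    return counts


# Toy interactome with hypothetical proteins and domain/motif labels.
edges = [("P1", "P2"), ("P2", "P3"), ("P3", "P4")]
annotations = {
    "P1": {"SH3"},
    "P2": {"kinase", "PxxP"},
    "P3": {"SH3"},
    "P4": {"kinase"},
}
counts = count_pair_schemas(edges, annotations)
print(counts.most_common())
```

Raw counts like these are what later get compared against randomized networks, since frequent annotations will produce frequent schemas by chance.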
score each possible schema - Ruchira S. Datta
count how frequently it occurs, but different annotations have different frequencies - Ruchira S. Datta
need to account for overrepresentation - Ruchira S. Datta
Some annotations are more frequent than others, so the score has to incorporate the frequency. - Roland Krause
Randomized network - Roland Krause
take each network, randomize it, count how often the annotation occurs in the randomized network and compute the overrepresentation term - Ruchira S. Datta
Estimate false discovery rate, look at features with FDR < 0.05 - Roland Krause
count how often the score occurs in the random graph, to estimate the false discovery rate - Ruchira S. Datta
when making the random graph with the pair schemas, preserve degree - Ruchira S. Datta
for triplet schemas, also preserve annotation - Ruchira S. Datta
For 3-vertex schemas, need to preserve pairwise annotation counts. - Roland Krause
for higher order schemas, also preserve triplets - Ruchira S. Datta
used Stub Wiring from Uri Alon's group for randomizing the pairwise schema - Ruchira S. Datta
no known algorithm for randomizing uniformly, need to approximate - Ruchira S. Datta
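A degree-preserving randomization in the spirit of stub wiring can be sketched as follows. Assumptions: the graph is undirected and simple, attempts that create self-loops or duplicate edges are simply discarded and retried, and only node degree is preserved (the annotation-preserving variants for triplets and higher-order schemas are not attempted here). This is not the implementation used in the talk.

```python
import random
from collections import Counter


def stub_rewire(edges, rng, max_tries=1000):
    """Randomize an undirected simple graph while preserving every node's degree."""
    # Each edge contributes one "stub" (half-edge) to each of its endpoints.
    stubs = [node for edge in edges for node in edge]
    for _ in range(max_tries):
        rng.shuffle(stubs)
        pairs = [tuple(sorted(stubs[i:i + 2])) for i in range(0, len(stubs), 2)]
        no_self_loops = all(a != b for a, b in pairs)
        if no_self_loops and len(set(pairs)) == len(pairs):
            return pairs
    raise RuntimeError("no simple graph found; increase max_tries")


def degrees(edges):
    """Degree of every node in an edge list."""
    return Counter(node for edge in edges for node in edge)


# Toy network; the seed is fixed only to make the example repeatable.
edges = [("A", "B"), ("A", "C"), ("B", "D"), ("C", "D"), ("D", "E")]
randomized = stub_rewire(edges, random.Random(0))
print(randomized)
```

Counting a schema in many such randomized networks gives the background distribution against which the observed count can be scored.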
151 pairwise schema [...] - Roland Krause
do the proteins making up these schemas share biological processes? - Ruchira S. Datta
Hypergeometric evaluation of schemas. - Roland Krause
check enrichment versus background - Ruchira S. Datta
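The hypergeometric enrichment check mentioned here is the usual upper-tail test: given how many proteins in the background carry a biological-process term, how surprising is the number seen among the proteins in a schema? A minimal sketch with made-up numbers; the function and parameter names are hypothetical:

```python
from math import comb


def hypergeom_enrichment_p(population, annotated, sample, hits):
    """P(X >= hits) when drawing `sample` proteins from `population`,
    of which `annotated` carry the term of interest."""
    upper = min(annotated, sample)
    total = comb(population, sample)
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(hits, upper + 1)
    )
    return tail / total


# Toy numbers: 20 background proteins, 5 annotated; a schema covers 4 proteins, 3 annotated.
p = hypergeom_enrichment_p(population=20, annotated=5, sample=4, hits=3)
print(round(p, 4))
```

In practice one p-value per schema and term is computed, so some multiple-testing correction would be needed on top of this.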
Graph of the pairwise schema network. - Roland Krause
For triplets, nodes are themselves pairwise schemas. - Roland Krause
Hubs in the graph are ras, kinase and [...] - Roland Krause
cross-interactomics: repeated the process in human - Ruchira S. Datta
Example of the DUP family in yeast, and pair comparison between human and yeast. - Roland Krause
Human-specific schemas, some consist of domains that also exist in yeast. - Roland Krause
Q. Robustness of the resulting networks. A. Removed 5%, found the network to be stable. Different results in different organisms. - Roland Krause
Q. Use of subgraph sampling? A. Can use any topology for searches. - Roland Krause
# Not sure I understand the question and answer but they do. - Roland Krause
Q. (Ruchira) Human-yeast network did you check orthology? A. Used PFAM annotation, did not worry about orthology. Some motifs in human are from the expansion. - Roland Krause
ISMB/ECCB
HL55: Michael Sammeth - The Computational Exploration of (Alternative) Splicing Mechanisms
Numbering all possible splice variant sites, assign ASCII symbol to each site, "alternative splicing code" that captures all differences between alternative splicing variation - sebi
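The idea of numbering the variable splice sites and giving each a symbol can be illustrated with a small sketch. It is loosely modelled on the description above; the symbols chosen here ("^" for donor, "-" for acceptor, "0" for a variant with no variable site), the input format and the function name are assumptions, and the exact strings produced by ASTALAVISTA may differ.

```python
def splicing_code(variant_a, variant_b):
    """Encode the difference between two splice variants as a compact string.

    Each variant is a set of (position, kind) splice sites, with kind
    'donor' or 'acceptor'. Sites shared by both variants are ignored.
    """
    symbol = {"donor": "^", "acceptor": "-"}
    # Variable sites are those used by exactly one of the two variants,
    # numbered by genomic position.
    variable = sorted(set(variant_a) ^ set(variant_b))
    number = {site: i + 1 for i, site in enumerate(variable)}

    def encode(variant):
        own = [s for s in variable if s in variant]
        return "".join(f"{number[s]}{symbol[s[1]]}" for s in own) or "0"

    return ",".join(sorted((encode(variant_a), encode(variant_b))))


# Exon skipping: the second variant includes an extra exon (acceptor 200, donor 300).
skipping = {(100, "donor"), (400, "acceptor")}
inclusion = skipping | {(200, "acceptor"), (300, "donor")}
print(splicing_code(skipping, inclusion))

# Alternative donor: the two variants use different donor sites.
donor_a = {(100, "donor"), (400, "acceptor")}
donor_b = {(150, "donor"), (400, "acceptor")}
print(splicing_code(donor_a, donor_b))
```

Because only the relative order and type of the variable sites enter the string, the same kind of event at different loci gets the same code, which is what makes such codes countable across a genome.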
ASTALAVISTA web service generates these strings at http://genome.imim.es/astalav... - sebi
ESTs can be truncated, and still splicing events can be detected as a sub-structure - sebi
"Bubbles" are cyclic graphs, complete events of splicing. Can construct bubble hierarchies for very complex splicing loci. - sebi
Workflow: Gene annotation -> database of events -> visualization with ASTALAVISTA; optionally add RNA-Seq data for expression levels of events (even simulated data from the Flux Capacitor) - sebi