ISMB/ECCB Stockholm 2009

17th Annual International Conference on Intelligent Systems for Molecular Biology & 8th European Conference on Computational Biology
The talk-specific feeds will be created each day shortly before the start of the first presentation. Find talk-specific blogs by searching here for the authors, the title of the talk, or the talk identifier as given in the program (like HL03 for the 3rd Highlight paper). The feeds can also be accessed on the conference pages in the corresponding sections: SIGs, Keynotes, Proceedings Track, Technology Track and Highlights; the last few blogs are shown on our web-portal page.

Happy blogging !!
Ruchira S. Datta
FFers open thread
I'm here and will be attending the BioPathways SIG. I figured there should be some place to say non-talk-specific things like "I've arrived!", so here it is. - Ruchira S. Datta
Will be switching between DAM, BOSC and the BioPathways. Once I've figured out which talk is when ;) - Oliver Hofmann
Uhoh. I sense a lack of places to recharge notebooks. This is going to be fun during the session breaks.. - Oliver Hofmann
I foresee a problem with the SIG posts in FF - only one heading, and many, many speakers.... Hopefully will be OK for people following. - Allyson Lister
The projectionist kindly suggested I power up in the projection cubicle during the coffee break. - Ruchira S. Datta
Network is slowing down to a crawl for me. - Oliver Hofmann
Sitting at home, following the conference. Great job folks! My only wish is that there was some place where the speakers would upload their presentations. - Lars Juhl Jensen
Help! Anyone got a UK plug adaptor with them I could borrow for an hour or so? Will happily exchange laptop charge for beer/coffee. - Cass Johnston
Oliver, I've been having that problem too. There were some times when I had posted a comment and wanted to post another one, but the first hadn't finished posting, so I had to wait for the post to finish, hold my comment in my head, and also pay attention to what the speaker was currently saying... - Ruchira S. Datta
Talked to the organizers. They are aware of the problem, IT is looking into increasing bandwidth. Yay. - Oliver Hofmann
@Oliver well done! Perhaps we could ask someone to look into more power extension leads? :) - Allyson Lister
Anyone up for a FF lunch / dinner / drink at some point during ISMB? - Oliver Hofmann
Sounds good. I'm about to go orienteering, though, so will be out of reach this evening... - Ruchira S. Datta
Monday and Tuesday are the only "free" nights, with poster sessions going from 5:45 to 8:30 pm. Shall we just set Monday night for the FF meet-up? When would people want to leave the conference at the earliest? 8 pm? - Michael Kuhn
Monday's going to be busy meeting everyone; I'd probably try and stick around 'till 8pm. Meet at the registration booth close to the entrance at 8:15? Can decide on a venue depending on our numbers and whether people are willing to travel to the city center. - Oliver Hofmann
I think I would prefer 8:40. I have a poster. I haven't figured out yet whether it's Monday or Tuesday, but others might have posters too? Last year, at least, I was explaining my poster to people pretty much constantly until the poster session closed. - Ruchira S. Datta
Monday or Tuesday is good for me - either way, and any time is fine. It's a good idea! - Allyson Lister
8:45 on Monday then, same place (registration area)? Do we have any locals with suggestions? - Oliver Hofmann
ok - see you all in registration at 8:45 monday :) - Allyson Lister
If everyone has already seen Gamla Stan by that time, we could go to Södermalm--plenty of good restaurants. I'm staying there. - Ruchira S. Datta
@Ruchira: works for me as long as we have a guide to get us there :) - Oliver Hofmann
@Allyson There are lots of power extensions along the windows just inside the T hall. Let me know if those are still too full and I'll see what I can do about getting more. - Dave Messina
@Oliver The conference organizers have bought high-speed internet. Anyone can go to the registration desk for a free login. Spread the word! - Dave Messina
@Dave, yep, got mine this morning. World of a difference. - Oliver Hofmann
@Dave, any chance of getting extension cords to the other conference rooms? Starved for outlets at K21 :) - Oliver Hofmann
I'll be arriving today, would it be worth packing an extension cord with multiple sockets? - Roland Krause
@Roland: This is always a good idea. Problem is that some rooms only have power on the stage and in the projection booth. - Michael Kuhn
Organizers also have extra extension cords for emergencies, but even the main conference room has no outlets. Best to recharge during the breaks, but that means leaving the notebook unattended... - Oliver Hofmann
photos from ISMB/ECCB, please put on flickr under tag "ismbeccb2009" - Burkhard Rost
The projectionist here just unplugged my power cord from their outlet and said none of us will be allowed to use their outlet; we'll need to use the outlets in Hall C. He said the organizers did not pay for extra outlets, so we will still have this power problem. - Ruchira S. Datta
Well, at least I'll be able to sit closer to the front where I can see better. - Ruchira S. Datta
Even the "free" network is considerably faster and more responsive today. Well… we'll see what happens when the first coffee break hits. - Johann Visagie
@Ruchira: Ouch. That's not going to help. We can get extension cords and use them in the smaller meeting rooms though. - Oliver Hofmann
@Johann agreed - I also think the network is faster today - Allyson Lister
I love the fierce, ofttimes violent competition for the limited resource that is power outlets during coffee break! - Johann Visagie
It seems a simple #ismb is the twitter hashtag of choice. Also note Burkhard's suggestion above to use "ismbeccb2009" as Flickr tag. (Hey! Flickr tags can has spaces! ;) - Johann Visagie
@Johann: Bring your own socket multiplier and you are loved by everyone ;-) - Oliver Hofmann
Any chance of leaving / recharging our desktops at the registration desk during the breaks? Would rather not leave mine in a conference room unattended. - Oliver Hofmann
Lonely FFer near the registration desk is looking for some chatting... anyone? - Egon Willighagen
Sure, I'll wander by momentarily. - Ruchira S. Datta
I'm sitting on the ground in front of the registration desk, wearing a black t-shirt... - Egon Willighagen
Any poster presenters here on FriendFeed? Maybe you can start a new post in the ISMB/ECCB room and attach your poster PDF? - Egon Willighagen
Great idea, Egon! Not being at the conference, I would very much appreciate being able to see some of the posters on FF. - Lars Juhl Jensen
Lars, are you in Stockholm? - Egon Willighagen
What about special sessions, I do not see them here. - Diego M. Riaño-Pachón
I'm having a special FF session outside, but I am the only participant :) (after a nice meet up with Ruchira earlier) - Egon Willighagen
@Egon - shame - I was offline for lunch. At least I assume it was lunch when you said that. FF doesn't put up the time each comment was made - so add times please if you're talking about meetups! Hopefully see most of you this evening after the poster session anyway :) - Allyson Lister
P.S. The network seems to have slowed down again. Anyone else notice, or is it just me? - Allyson Lister
@Dave thanks for the power extension cable - except for the main victoria hall (as we discussed) it's really helped a lot :) - Allyson Lister
@Allyson, I'm still around... in front of E.A.T. now.... - Egon Willighagen
@Egon - ah, but I'm in a talk! :) - Allyson Lister
@Egon, no I decided not to go this year - Lars Juhl Jensen
@Allyson: As half of the conference is probably sharing the username / password for the closed network I am not surprised it is slowing down :) - Oliver Hofmann
Maybe it is because I am happily uploading Jmol 11.7.45 to SourceForge ? :) - Egon Willighagen
@Allyson Glad to help! - Dave Messina
@Oliver Hopefully it's that rather than someone downloading Genbank. :) Still plenty of FREE logins for the closed wifi are available at the registration desk for anyone who wants one. - Dave Messina
Fun to see how quickly many of us have found a workaround for the 'Oops'... :) - Egon Willighagen
Is there a place for posting jobs/student positions available to ISMB attendees? (i.e. I'm not at the ISMB this year, so am definitely checking out FF and related resources, and thank you all for your posts!) - Fiona Brinkman
Does anyone know how to 'filter out' all group:ismbeccb2009 messages in my main feed? So that I can just open the ISMB/ECCB 2009 group in a separate window? - Egon Willighagen
Thanks to everyone - organizers, reviewers, presenters, friends, bloggers etc etc etc : I must away, but had loads of fun! :) - Allyson Lister
@Allyson, safe travels. @all, see you in Boston. Feel free to PM/mail me for tips where to stay, eat or party in the city. - Oliver Hofmann
nice meeting you, Allyson, and have a safe trip! - Ruchira S. Datta
I am leaving too, was nice meeting you all and have a good trip back. - Roland Krause
Conference finally closed. - Peter Menzel
i'm now at the Stockholm airport en route to Cambridge for another conference (!) It was fun meeting some of you and microblogging together. Hope we'll see each other around at future conferences. Next year in Boston! - Ruchira S. Datta
@Egon: if you go to the ismbeccb09 room and click the "(edit)" link to the right of "Lists" then you can deselect it from your Home feed. - Lars Juhl Jensen
A top-ten of talks (well, not really - just the top ten of my ISMB posts based on number of hits). What was everyone interested in enough to click through to my blog? Find out: - Allyson Lister
Lars Juhl Jensen
Wordle cloud of the contributors (as requested by Cameron Neylon)
no, sorry - I have not found a way to make Wordle keep words together - Lars Juhl Jensen
with an underscore ? - Pierre Lindenbaum
yes, Frank - that will obviously work (see new post) - Lars Juhl Jensen
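The underscore suggestion in this thread can be sketched in a few lines. This is a minimal illustration only: it assumes each comment ends with " - Author Name" (as on this page) and that the word-cloud tool treats an underscore-joined name as a single token, as the thread suggests. The function name `contributor_tokens` is hypothetical.

```python
from collections import Counter


def contributor_tokens(comments):
    """Count comments per author, joining name parts with underscores.

    Assumes every comment line ends with ' - Author Name'; lines without
    that suffix are skipped.
    """
    counts = Counter()
    for line in comments:
        # rpartition splits on the LAST ' - ', so dashes inside the
        # comment text itself do not confuse the author extraction.
        _, separator, author = line.rpartition(" - ")
        author = author.strip()
        if separator and author:
            counts[author.replace(" ", "_")] += 1
    return counts


if __name__ == "__main__":
    sample = [
        "no, sorry - I have not found a way - Lars Juhl Jensen",
        "with an underscore ? - Pierre Lindenbaum",
    ]
    for token, n in contributor_tokens(sample).items():
        # Repeat each token n times so its size in the cloud reflects n.
        print(" ".join([token] * n))
```

Repeating each token once per comment is one simple way to make the cloud size reflect how much each person contributed.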
Oliver Hofmann
ISMB/ECCB Stockholm 2009: Bioinformatics Core Facilities Workshop, Fran Lewitter
Project priorities: just long term misses new customers, grant opportunities. Focus on money or project size misses pilot projects, does not allow diversification. Suggestion: based on merit, what allows the core to grow in new directions, expand, supports the institutional community in general - Oliver Hofmann
Hiring: 50% FTE available for a new person, enough to get someone started. Identify new technologies, hire to get an early start. Consultants can fill gaps. - Oliver Hofmann
Time: maintain an overview of timeframes (putative project starts). Wrap up projects to avoid task switching overhead as much as possible. Make time for the planning stages. - Oliver Hofmann
Expectations: be open and transparent with regards to availability, feasibility, stick to realistic time estimates and turn down work if the resources are not available. Collaborate between Cores - Oliver Hofmann
Cancer UK, Cambridge Core. Prioritize short tasks, genomic tasks. Use queuing system, use steering committee to guide on participation in long term researcher-based projects - Oliver Hofmann
usually 6-10 projects per person at any given time (ouch...) - Oliver Hofmann
Manage workload: define scope early, manage using collaboration software, deliver data in stages - Oliver Hofmann
automate and standardize - Oliver Hofmann
Training / empowering researchers the most important aspect - Oliver Hofmann
Report regularly to all groups / departments, communicate - Oliver Hofmann
Request co-authorship (but do not seem to require it) - Oliver Hofmann
Start of discussion - Oliver Hofmann
Current developments and changes: focus on next-gen - Oliver Hofmann
Benefits of training - Oliver Hofmann
Chargeback model. half of participants fully supported by their institutes - Oliver Hofmann
Hourly charges between 70 and 120$ - Oliver Hofmann
Authorship: mostly just acknowledgement, many cores with focus on master's level students - Oliver Hofmann
Collaboration: central place to share knowledge of methods, tools, evaluation. One place to deposit this information could be the - Oliver Hofmann
Try to find ways to pool information and resources (similar to the BOSC OpenBio projects) - Oliver Hofmann
Switch of topics on how to handle next-gen seq influx - Oliver Hofmann
What kind of questions are being asked, can they handle the data themselves, and is there any way to build re-useable workflows? Experience seems to be that so far no question (beyond the assembly step) has come up twice. - Oliver Hofmann
Develop method agnostic tools that help the biologist to get a handle on their own data, Eg - Oliver Hofmann
Mikhail Spivakov
Special Session 6: Regulatory Genome Architecture and Noncoding Mutations in Human Disease
Talk: Veronica van Heyningen, "Lessons from regulatory mutations in developmental anomalies" - Mikhail Spivakov
Highly conserved non-coding DNA seqs emerge as regulator elements - bind multiple TFs - Mikhail Spivakov
Genes with complex expression patterns require complex control - Mikhail Spivakov
Disease-associated chromosomal breaks outside gene highlight distant regulatory elements, often within neighboring genes - Mikhail Spivakov
Different mechanisms of disruption: - Mikhail Spivakov
key elements separated from transcriptional unit - total loss of function - Mikhail Spivakov
altered chromatin organisation (originally position effect variegation) - Mikhail Spivakov
Pax6, Sox2, Otx2 - genes with important role in eye development that are also important in the brain; continue to be expressed in the adult organism as well - Mikhail Spivakov
Explore Pax6 long-range control elements - up to 180 kb away from the gene - Mikhail Spivakov
use distant elements in reporter transgenic analysis to look at expression patterns they drive - different elements drive expression in different tissues at different times. however, find large overlap, so there's no easy rule according to which distant regulatory elements work together to establish the expression pattern - Mikhail Spivakov
some distant elements are within a neighboring housekeeping gene - Mikhail Spivakov
study deletions in various regions by using a reporter construct, compare with isolated enhancer function - Mikhail Spivakov
find an element in Pax6 intron that drives expression in the (zebrafish) heart - unexpectedly - Mikhail Spivakov
also find a Pax6-response element: autoregulation - Manuel et al, Development, 2007 - Mikhail Spivakov
a mutation in a Pax6 reg element located in the intron of the neighboring gene has been associated with (a mild form of) epilepsy - Mikhail Spivakov
some enhancers work only at later developmental stages (eg, E17.5) - Mikhail Spivakov
need methods for phenotype prediction based on regulatory mutations - eg, enhancer-driven knockouts - Mikhail Spivakov
in zebrafish, there are two pax6 forms - pax6a (expressed in the brain), pax6b (expressed in the pancreas), both are expressed in the eye - Mikhail Spivakov
the regulatory elements for the brain were lost from pax6b, pax6a has lost its neighboring gene (ie, the reg elements located within it), pax6b retained a part of it - Mikhail Spivakov
Move to Sox9 - Mikhail Spivakov
haploinsufficiency causes campomelic dysplasia and sex reversal - the breakpoints can be very far away, the severity of phenotype decreases with the breakpoint position being further away - however mutations causing Pierre Robin syndrome may localise to a very specific long distance element ~1.1 Mb away from the gene, small deletions at 1.4 Mb upstream _AND_ 1.4 Mb downstream. How these elements work together is yet to be determined. - Mikhail Spivakov
Another example: Van der Woude severe phenotype caused by intragenic mutations in IRF6; a regulatory variant of IRF6 (a SNP upstream of the gene, disrupting an AP2alpha binding site) predisposes to sporadic CLP (cleft lip and palate) - Mikhail Spivakov
Discuss developmental abnormalities caused by mutations in SHH (sonic hedgehog) - Mikhail Spivakov
a related phenotype is produced by an insertion in the intron of a neighboring gene - LMBR1 - Mikhail Spivakov
Pax6 and Sox2 interact to cross- and autoregulate; mutations in the "double" binding sites (ie for both) in different cells result in mutant phenotypes. Note that TF dosage is important. - Mikhail Spivakov
Next talk: James P. Noonan, The Role of Developmental cis-Regulatory Change in Human Evolution - Mikhail Spivakov
what makes us human? unique abilities <- developmental changes (brain, limbs etc). What's the molecular basis of these traits? - Mikhail Spivakov
changes in gene regulation vs changes in genes themselves. perhaps there's a lot of the former (in the evolution of primates) - Mikhail Spivakov
two approaches: 1) in vivo characterization of cis-regulatory modules with human-specific developmental functions, 2) whole-transcriptome comparative analysis - Mikhail Spivakov
start with (1). how to identify developmentally active CRMs in the human genome? - Mikhail Spivakov
the more conserved an element is the more likely it'll function as an enhancer - Mikhail Spivakov
also look at chromatin signatures - for example, p300 binding sites are 90% predictive of enhancers (Visel et al, Nature 2009) - Mikhail Spivakov
look for noncoding elements that are conserved across species but rapidly evolve in humans - human-accelerated conserved noncoding sequences (HACNSs), Prabhakar et al, Science 2006. - Mikhail Spivakov
Developed stats models to measure evolutionary rates, estimate probability of human-specific substitution, calculate likelihood of observed pattern of substitutions in each noncoding element - estimate acceleration p-value. Identified 992 HACNSs out of ~10000 conserved elements - Mikhail Spivakov
associate with neighboring genes -> GO analysis. only 1/734 functional categories showed significant association: cell adhesion! More detailed look - neuronal cell adhesion, axon guidance, synapse formation - Mikhail Spivakov
Do HACNSs function as reg elements in vivo? Transgenic assays in mouse (select elements based on overall constraint and human-specific acceleration). Many elements show specific developmental patterns of expression in these assays - Mikhail Spivakov
Focus on HACNS1 - associated with gbx2, highly conserved in all terrestrial vertebrates, evolving 4x faster than neutral rate in human, sequence changes are fixed. The sequence change is associated with a gain of function: human-specific anterior limb expression. - Mikhail Spivakov
this gain of function has been achieved by 13 substitutions in the sequence compared to chimp (proved by "humanizing" the chimp region and "chimping" the human region by making these substitutions) - Mikhail Spivakov
drives expression in the anterior-most digit at e13.5 - may contribute to the development of longer thumbs in the human compared to chimp/orangutan? - Mikhail Spivakov
HACNSs: selection or biased gene conversion? Biased gene conversion - postulated to increase the fixation rate of AT to GC substitutions. Mutagenic effect of recombination hotspots + preference in recombination-associated DNA repair toward GC vs AT pairs. - Mikhail Spivakov
Synergy between positive selection and BGC in HACNS1? - Mikhail Spivakov
HACNSs are not biased towards high-recombining regions. The human-specific substitution rate is not elevated outside of HACNS1 - but GC rate is. - Mikhail Spivakov
Move from genotype to phenotype: comparative gene expression profiling in developing cortex. - Mikhail Spivakov
Mouse vs human fetal forebrain, look at gene expression profiles of several cortical stem cells by RNAseq. Purify cell populations by laser capture microdissection. (collaboration with Rakic/Ayoub @ Yale). Recover 50 ng of total RNA from ~10,000 cells, which is more than enough for RNAseq. Comparison of different cell populations in the developing mouse cortex - very high correlation of expression. Differentially expressed genes "make sense". - Mikhail Spivakov
Ultimately, want to associate differentially expressed genes with HACNSs - Mikhail Spivakov
Next talk: Steve Montgomery, Population genomics of human gene expression using next-generation sequencing technology - Mikhail Spivakov
looked at two different quantitative traits: exon reads and splicing pairs - Mikhail Spivakov
~80% reads map to known exons. 40% reads span multiple exons (insert size: ~150bps) - Mikhail Spivakov
data normalisation of RNAseq data is still a challenge. used a simple scaling approach - Mikhail Spivakov
test data filter criterion: <10% of individuals equal to zero - Mikhail Spivakov
see ~ 11,000 genes, corresponding to ~95,000 exons; ~23,000 exon pairs -> from 1 lane per individual. - Mikhail Spivakov
there's a decent correlation between any two individuals (same sample same run: corr=0.93, diff sample, same run: corr=0.89, diff sample, diff run: corr=0.86) - Mikhail Spivakov
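The scaling, filtering and correlation steps mentioned in the last few comments can be illustrated roughly as follows. This is a sketch under stated assumptions, not the speaker's pipeline: "simple scaling" is taken to mean rescaling each individual's counts to a common total (here the mean library size), the zero filter is applied per exon across individuals, and the correlation is Pearson's. All function names are hypothetical.

```python
from math import sqrt


def scale_to_common_total(counts_by_individual):
    """Rescale each individual's read counts to the mean library size.

    Assumes every individual has at least one mapped read.
    """
    totals = {name: sum(c) for name, c in counts_by_individual.items()}
    target = sum(totals.values()) / len(totals)
    return {
        name: [count * target / totals[name] for count in counts]
        for name, counts in counts_by_individual.items()
    }


def passes_zero_filter(values_across_individuals, max_zero_fraction=0.10):
    """Keep an exon only if fewer than 10% of individuals have zero reads."""
    zeros = sum(1 for v in values_across_individuals if v == 0)
    return zeros / len(values_across_individuals) < max_zero_fraction


def pearson(x, y):
    """Pearson correlation between two equally long lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)
```

With counts scaled this way, `pearson(scaled["a"], scaled["b"])` gives the kind of individual-to-individual correlation quoted above.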
Correlation of exon reads within genes (test for freq of alternative splicing) - corr is good, but not perfect for exons with high mean read count, so some information is missed by array analyses that assume a fixed exon structure for each gene. - Mikhail Spivakov
Increasing the number of lanes / individuals increases the number of identified splice variants => many alternatively spliced transcripts are rare - Mikhail Spivakov
Detecting eQTLs with RNAseq data: assume an additive model of gene expression, for each SNP look at the number of reads, use spearman rank correlation to assess the association - Mikhail Spivakov
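The association test described here can be sketched for a single SNP and exon. This is a hedged illustration: it assumes the additive model is encoded as the minor-allele count (0, 1 or 2) per individual, and it uses a permutation p-value as a stand-in, since the notes do not say how significance was computed. All names are hypothetical.

```python
import random
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1  # mean of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = average
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(vx * vy)


def permutation_p_value(genotypes, read_counts, n_perm=1000, seed=0):
    """Two-sided permutation p-value for the Spearman correlation."""
    rng = random.Random(seed)
    observed = abs(spearman(genotypes, read_counts))
    shuffled = list(read_counts)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(spearman(genotypes, shuffled)) >= observed - 1e-12:
            hits += 1
    return (hits + 1) / (n_perm + 1)
```

A rank-based statistic is a natural fit for read counts, which are skewed and sensitive to outliers.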
detected 2431 genes (4219 exons) associated with SNPs at p~0.01 - Mikhail Spivakov
eQTLs detected with exon counts centred around TSSs - Mikhail Spivakov
consider using RNAseq to detect SNP association with known splice variants - took 1767 known splice variants from Ensembl, tested for association against splicing pair counts, data look promising - Mikhail Spivakov
Q: what cells used? A: EBV-transformed lymphoblastoid cells. Q: is this a bit of a constraint to look only at these cells? A: looked at eQTLs in other lymphoblastoid cell lines collected long time ago, see a lot of correlation. Did not look at other tissues. - Mikhail Spivakov
Q: did they look at 20% unannotated exons? A: No, can't do much until new gene models are built from them. - Mikhail Spivakov
Q: where are the SNPs? Association with HCEs? Gene functional categories enriched for SNPs? - Mikhail Spivakov
A: in 1Mb windows, see enrichment for eQTLs right around the TSSs rather than remote regulatory regions. Not a large number fall within known regulatory elements. - Mikhail Spivakov
Next talk: Nadav Ahituv, Deconstructing gene regulatory elements - Mikhail Spivakov
Geneticists are good at finding genes from sequence, can do much less with detecting regulatory elements - in the vast space of 98% of the genome that is non-coding. Also don't know the effect of nucleotide substitutions at regulatory elements. - Mikhail Spivakov
What can we do about it? 1) use comparative genomics to find conserved non-coding regions. 2) high-throughput characterisation of enhancers - Mikhail Spivakov
At Berkeley, looked at extremely conserved non-coding sequences (n~3000, 70% similarity, >=100bp human-fugu) and ultraconserved (n~250, 100% similarity). Many of them turn out to drive expression in transgenic assays in the mouse. Now moved to UCSF, are setting up the same assays in zebrafish - easier to move it to high-throughput analysis - Mikhail Spivakov
resource for experimentally validated human noncoding fragments with gene enhancer activity as assessed in transgenic mice. - Mikhail Spivakov
These data can be used in a variety of ways. They aim to understand the regulatory code at enhancers. Look for limb-specific signatures, check whether they see them at limb-associated genes, check for new regions with these signatures in the human genome, test in zebrafish (expression in fin), if positive check in mouse (expression in limb) - Mikhail Spivakov
look at ~480 ultraconserved elements (>50% non-coding, 100% identity and >=200bp human-mouse-rat), depleted for SNPs - Mikhail Spivakov
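The "100% identity" criterion can be made concrete with a small scan over an alignment. This is a sketch, assuming the input is a gapped multiple alignment of equal-length strings (for example human, mouse and rat rows) and that the length threshold is a free parameter. The function name is hypothetical and this is not the procedure used to build the published element set.

```python
def perfect_identity_runs(aligned_seqs, min_length=200):
    """Return half-open (start, end) column ranges of perfect identity.

    A column counts as identical when no sequence has a gap ('-') there
    and all sequences carry the same character.
    """
    length = len(aligned_seqs[0])
    if any(len(seq) != length for seq in aligned_seqs):
        raise ValueError("aligned sequences must have equal length")

    runs = []
    start = None
    # Iterate one column past the end so a run touching the end is closed.
    for col in range(length + 1):
        identical = (
            col < length
            and aligned_seqs[0][col] != "-"
            and all(seq[col] == aligned_seqs[0][col] for seq in aligned_seqs)
        )
        if identical:
            if start is None:
                start = col
        else:
            if start is not None and col - start >= min_length:
                runs.append((start, col))
            start = None
    return runs
```

On real genomes the coordinates would still have to be mapped from alignment columns back to each species' assembly.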
extreme sequence constraint = extreme functional constraint? - Mikhail Spivakov
took four UCEs: positive in enhancer assays, near genes that, when mutated, lead to either a lethal or sexual development phenotype, variable in size - Mikhail Spivakov
generated mouse knockouts for these UCEs - heterozygous KOs are fine, even homozygotes have no apparent phenotype!!! - Mikhail Spivakov
so why is it? - Mikhail Spivakov
maybe we just can't detect the phenotype? - Mikhail Spivakov
maybe there's gene / reg module redundancy? It certainly exists - eg at UCEs upstream of ARX - Mikhail Spivakov
maybe it's because mutations will cause gain-of-function (by binding of other TFs) rather than loss-of-function? in this case, deleting the whole thing altogether won't help identify this - Mikhail Spivakov
support for this model: replaced a limb enhancer in mouse with a bat enhancer that has only 20bp difference -> observed limb elongation. Also bioinformatic support showing that these seq's are prone to substitutions and indels (?) - Mikhail Spivakov
Took the four UCEs that they KO'd, put in an 83bp indel and introduced a 3bp change, both chosen more or less randomly. Very preliminary results on the 83bp insertion into two of the enhancers: made mice, after 10 generations of backcross, het:het mating recovers fewer heterozygous and homozygous mice than expected. - Mikhail Spivakov
TT48: David Croft - Reactome tools for expression data and pathway visualization
Reactome collects processes at the cellular level. - Gabriele Sales
Plans for a new visualization for the website. - Gabriele Sales
At the top of the reactome webpage there is a link pointing to the new interface. - Gabriele Sales
The left bar of the interface collects links to single pathways. - Gabriele Sales
Google Map-like buttons: zoom in / out, scroll. - Gabriele Sales
By clicking on a compound, you get a list of the other pathways it participates in. - Gabriele Sales
Free text searches: multiple results displayed in a tab. - Gabriele Sales
The ENFIN project: - Gabriele Sales
TT47: Chris Rawlings - Semantic Data Integration for Systems Biology Research
Also speaking: Catherine Canevet and Paul Fisher - Allyson Lister
Two systems to integrate data. - Gabriele Sales
BBSRC-funded research collaboration in Newcastle, Manchester, and Rothamsted : ONDEX and Taverna - Allyson Lister
Demo on the integration and validation of yeast metabolome models. - Gabriele Sales
Taverna is a workflow workbench to integrate tools (including web services). - Gabriele Sales
Taverna can link ONDEX and external (ex. PubMed) data sources via the web-service interface. - Gabriele Sales
When ONDEX works with Taverna, instead of using the pipeline manager you use the ONDEX web services and access ONDEX from Taverna. This means you can use Taverna to pull in data into ONDEX. - Allyson Lister
Outline of the demo: starting from the Jamboree network SBML, parse it into Ondex, remove currency metabolites and annotate using network analysis results. - Gabriele Sales
Then switch to Taverna. Identify orphans, retrieve related enzymes, assemble a PubMed query and link results to the graph. - Gabriele Sales
the workflow for relevant pubmed entry retrieval seems staggeringly complex - but I guess this is because each input appears as a node in the workflow (is that correct ?) - Jim Procter
# ondex vs cytoscape? - Andrew Su
@Andrew - not sure - ondex uses Jung as the underlying layout engine (and Alan Kuchinsky of Cytoscape is asking a question!) - Jim Procter
Question: how does the system scale for large amounts of data? - Gabriele Sales
@Jim, but are there similar use cases covered btw the two systems? - Andrew Su
@Andrew I think there are but the response to that question seemed to suggest that ondex was aimed at de novo network generation rather than massive visualization (this needs to be clarified) - Jim Procter
@Jim, thanks... - Andrew Su
HL54: Mona Singh - Search and discovery of recurring patterns with interactomes
Hairballs. - Roland Krause
Different large scale data sets, genetic, phosphorylation etc. exist but how to interpret? - Roland Krause
Add protein annotations, sequence, structure, motifs, domains, functional characterization. - Roland Krause
Particularly interested in interaction domains, analyzing cellular organization and interactomes. Can we discover and analyze recurring patterns? - Roland Krause
Analogy to multiple alignment of protein sequences with PROSITE pattern - Roland Krause
For network schemas, nodes are descriptions of proteins, e.g. domains. An extension of the network by attributes, e.g. PFAM or PROSITE. - Roland Krause
Examples for network schema are homologous pathways. - Roland Krause
Netgrep has visual interface. - Roland Krause
No details of the algorithm, tricky problem but can solve it fast. - Roland Krause
Triangles, quad topology (linear combination of four), Y star topology of four. - Roland Krause
Automatic discovery of overrepresented schemas with direct interactions and sequence motifs. Using Y2H with filtering to address quality of the data set. - Roland Krause
Counting occurrences of labelled subgraphs, brute force approach using Netgrep. - Roland Krause
score each possible schema - Ruchira S. Datta
count how frequently it occurs, but different annotations have different frequencies - Ruchira S. Datta
need to account for overrepresentation - Ruchira S. Datta
Some annotations are more frequent than others; the score has to incorporate the frequency. - Roland Krause
Randomized network - Roland Krause
take each network, randomize it, count how often the annotation occurs in the randomized network and compute the overrepresentation term - Ruchira S. Datta
Estimate false discovery rate, look at features with FDR < 0.05 - Roland Krause
count how often the score occurs in the random graph, to estimate the false discovery rate - Ruchira S. Datta
when making the random graph with the pair schemas, preserve degree - Ruchira S. Datta
for triplet schemas, also preserve annotation - Ruchira S. Datta
For 3-vertex schemas, need to preserve pairwise annotation counts. - Roland Krause
for higher order schemas, also preserve triplets - Ruchira S. Datta
used Stub Wiring from Uri Alon's group for randomizing the pairwise schema - Ruchira S. Datta
no known algorithm for randomizing uniformly, need to approximate - Ruchira S. Datta
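The counting-and-randomizing idea for pairwise schemas can be sketched as below. Several things here are assumptions rather than the speaker's method: a pairwise schema is taken to be an unordered pair of annotation labels on the two ends of an interaction, the randomization uses double-edge swaps (a swapped-in technique that also preserves every node's degree, not the stub wiring mentioned in the talk), and significance is an empirical p-value rather than the FDR procedure described. All names are hypothetical.

```python
import random
from collections import Counter


def count_pair_schemas(edges, annotations):
    """Count, per label pair, the edges whose endpoints carry those labels."""
    counts = Counter()
    for u, v in edges:
        schemas = {
            tuple(sorted((a, b)))
            for a in annotations.get(u, ())
            for b in annotations.get(v, ())
        }
        counts.update(schemas)  # each edge counts once per schema
    return counts


def degree_preserving_shuffle(edges, n_swaps, rng):
    """Randomize a simple undirected graph by double-edge swaps."""
    edges = [tuple(e) for e in edges]
    present = {frozenset(e) for e in edges}
    for _ in range(n_swaps):
        i, j = rng.sample(range(len(edges)), 2)
        (a, b), (c, d) = edges[i], edges[j]
        if rng.random() < 0.5:
            c, d = d, c
        # Proposed rewiring: a-b, c-d  ->  a-d, c-b
        if a == d or c == b:
            continue  # would create a self-loop
        new1, new2 = frozenset((a, d)), frozenset((c, b))
        if new1 in present or new2 in present or new1 == new2:
            continue  # would create a duplicate edge
        present -= {frozenset((a, b)), frozenset((c, d))}
        present |= {new1, new2}
        edges[i], edges[j] = (a, d), (c, b)
    return edges


def schema_enrichment(edges, annotations, n_random=200, seed=0):
    """Return {schema: (observed count, empirical p-value)}."""
    rng = random.Random(seed)
    observed = count_pair_schemas(edges, annotations)
    at_least = Counter()
    for _ in range(n_random):
        shuffled = degree_preserving_shuffle(edges, 10 * len(edges), rng)
        random_counts = count_pair_schemas(shuffled, annotations)
        for schema, obs in observed.items():
            if random_counts[schema] >= obs:
                at_least[schema] += 1
    return {
        schema: (obs, (at_least[schema] + 1) / (n_random + 1))
        for schema, obs in observed.items()
    }
```

Because annotations stay attached to their proteins while edges are rewired, frequent annotations are automatically discounted. Preserving pairwise or triplet annotation counts for higher-order schemas, as described above, would need a more constrained shuffle than this one.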
151 pairwise schema [...] - Roland Krause
do the proteins making up these schemas share biological processes? - Ruchira S. Datta
Hypergeometric evaluation of schemas. - Roland Krause
check enrichment versus background - Ruchira S. Datta
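The hypergeometric check against a background can be written out directly. This is a generic sketch of the standard test; how the talk defined the background set and the "successes" (for example, proteins annotated with a given biological process) is not recorded in these notes, so the parameter names are assumptions.

```python
from math import comb


def hypergeom_upper_tail(k, population, successes, draws):
    """P(X >= k) when drawing `draws` items without replacement.

    population: number of background proteins
    successes:  background proteins carrying the annotation of interest
    draws:      proteins in the schema instances being tested
    k:          how many of those carry the annotation
    """
    upper = min(successes, draws)
    total = comb(population, draws)
    favourable = sum(
        comb(successes, i) * comb(population - successes, draws - i)
        for i in range(max(k, 0), upper + 1)
    )
    return favourable / total
```

A small p-value means the proteins in a schema share the annotation more often than a random draw from the background would.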
Graph of the pairwise schema network. - Roland Krause
For triplets, nodes are themselves pairwise schemas. - Roland Krause
Hubs in the graph are ras, kinase and [...] - Roland Krause
cross-interactomics: repeated the process in human - Ruchira S. Datta
Example of the DUP family in yeast, and pair comparison between human and yeast. - Roland Krause
Human-specific schemas, some consist of domains that also exist in yeast. - Roland Krause
Q. Robustness of the resulting networks. A. Removed 5%, found the network to be stable. Different results in different organisms. - Roland Krause
Q. Use of subgraph sampling? A. Can use any topology for searches. - Roland Krause
# Not sure I understand the question and answer but they do. - Roland Krause
Q. (Ruchira) Human-yeast network did you check orthology? A. Used PFAM annotation, did not worry about orthology. Some motifs in human are from the expansion. - Roland Krause
HL55: Michael Sammeth - The Computational Exploration of (Alternative) Splicing Mechanisms
Numbering all possible splice variant sites, assign ASCII symbol to each site, "alternative splicing code" that captures all differences between alternative splicing variation - sebi
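A toy version of such a code might look like the sketch below. It is only in the spirit of the comment and is not claimed to reproduce the ASTALAVISTA notation: the choice of '^' for donor sites and '-' for acceptor sites, the use of "0" for a variant with no private sites, forward-strand coordinates, and all function names are assumptions made for illustration.

```python
def splice_sites(exons):
    """Return the set of (position, symbol) splice sites of one transcript.

    exons: list of (start, end) tuples on the forward strand.
    Transcript start and end are not splice sites and are ignored.
    """
    exons = sorted(exons)
    sites = set()
    for index, (start, end) in enumerate(exons):
        if index > 0:
            sites.add((start, "-"))  # acceptor at the start of an exon
        if index < len(exons) - 1:
            sites.add((end, "^"))    # donor at the end of an exon
    return sites


def splicing_code(variant_a, variant_b):
    """Encode the difference between two variants as a short string.

    Sites shared by both variants are dropped; the remaining sites are
    numbered by genomic position and written out per variant.
    """
    sites_a, sites_b = splice_sites(variant_a), splice_sites(variant_b)
    private = sorted(sites_a ^ sites_b)  # sites used by exactly one variant
    number = {site: i + 1 for i, site in enumerate(private)}

    def encode(sites):
        own = sorted(site for site in sites if site in number)
        return "".join(f"{number[site]}{site[1]}" for site in own) or "0"

    return f"{encode(sites_a)},{encode(sites_b)}"
```

Under these assumptions a skipped exon yields "1-2^,0" and an alternative donor yields "1^,2^", so events of the same type get the same string regardless of their coordinates.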
ASTALAVISTA web service generates these strings at - sebi
ESTs can be truncated, and still splicing events can be detected as a sub-structure - sebi
"Bubbles" are cyclic graphs, complete events of splicing. Can construct bubble hierarchies for very complex splicing loci. - sebi
Workflow: Gene annotation -> database of events -> visualization with ASTALAVISTA; optionally add RNA-Seq data for expression levels of events (even simulated data from the Flux Capacitor) - sebi
HL56: Kevin Cohen - Nominalization and alternations in the language of molecular biology: Implications for text mining
nominalizations are dominant in biomedical texts, e.g. "expression" is much more used than "express" - Michael Kuhn
Cohen et al, 2008 [not a very useful reference]: nominalizations are more difficult to handle than verbs, but can yield higher precision - Michael Kuhn
alternation: variations like active/passive. much less characterized for nouns than for verbs - Michael Kuhn
e.g. pre-nominal arguments: agent: "phenobarbital treatment" or patient: "cancer treatment" - Michael Kuhn
previous work has tried to handle nominalizations, e.g. Ono et al (2001): interactions, association, complex and binding - Michael Kuhn
use in-house tool to do annotation of nominalizations - Michael Kuhn
alternations are very diverse, contrary to previous prediction that there would only be a limited number of alternations in scientific literature (sub-language model) - Michael Kuhn
HL53: Chris Taylor - Promoting coherent minimum reporting guidelines for biological and biomedical investigations: the MIBBI project
Standards are hugely dependent on their respective communities - Allyson Lister
Many biological experiments can fall under an ISA (Investigation-Study-Assay) structure. - Allyson Lister
You should then use three types of standards: syntax (FuGE, ISA-TAB etc), semantics, and scope. MIBBI is all about scope. - Allyson Lister
Why do we care about standards? Data exchange, comprehensibility, and scope for reuse. - Allyson Lister
"Metaprojects": FuGE, OBI, ISA-TAB - draw together many different domains and present in structure/semantics useful across all. - Allyson Lister
When the independent MI projects overlap, arbitrary decisions on wording and substructuring make integration difficult. This makes it hard to take parts of different guidelines - not very modular. This is what MIBBI helps with. - Allyson Lister
MIBBI promotes gradual integration of checklists. - Allyson Lister
Nature Biotechnology 26, 889 - 896 (2008) doi:10.1038/nbt0808-889 - Allyson Lister
HL52: Georg Zeller - Characterizing transcriptome plasticity using whole-genome tiling arrays and machine learning
Motivation: find all genes in the model plant Arabidopsis thaliana. - Gabriele Sales
profile transcriptome changes under stress - Michael Kuhn
tiling arrays: cost-effective, but noisy - Michael Kuhn
normalization pipeline: background correction to reduce imaging artifacts - Michael Kuhn
quantile normalization: comparison between arrays - Michael Kuhn
Normalization: background correction, quantile normalization between arrays, transcript normalization (to remove probe sequence bias) - Gabriele Sales
transcript normalization: probe sequence bias reduced - Michael Kuhn
Not all exon probes show the same level. - Gabriele Sales
assume that ideal transcripts have constant level of expression, try to predict deviation between actual and ideal signal intensities - Michael Kuhn
Sequence-based quantile normalization: exon and intron sequences are normalized to 2 distinct levels - sebi
transcript normalization improves normalization between background and exon signals - Michael Kuhn
developed own segmentation algorithm - Michael Kuhn
Goal: classify tiling array probes as exonic, intronic or intergenic. - Gabriele Sales
mSTAD: a hidden Markov-SVM - sebi
mSTAD approach: Zeller et al., Pac Symp Biocomp 2008 - Michael Kuhn
mSTAD approach: a state is associated to each probe given its level and its context. - Gabriele Sales
mSTAD vs Affymetrix TAS - Gabriele Sales
can also find new transcripts that are not covered by ESTs/cDNAs so far - Michael Kuhn
can validate 37 out of 47 tested cases - Michael Kuhn
Examples of differential expression: stresses induce the transcription of extra regions. - Gabriele Sales
mSTAD can identify previously unannotated exons - sebi
find new stress-induced TARs - Michael Kuhn
could be: stress-induced alternative transcripts, overlapping transcripts, or new stress-induced genes - Michael Kuhn
Validation via RT-PCR. - Gabriele Sales
Zeller et al, Plant J, 2009 - Michael Kuhn
predict new genes if distance >500 nt to the nearest annotated gene - Michael Kuhn
last part of the talk: analysis of dicer mutant (DCL1). DCL1 is involved in processing miRNA precursor genes into functional miRNA - Michael Kuhn
in dcl1 knock-out mutants, substrate transcripts of DCL1 accumulate - Michael Kuhn
findings: many intergenic TARs are targets of DCL1 - Michael Kuhn
dcl1 is also involved in the silencing of transposons. - Gabriele Sales
DCL1 is involved in transposon silencing: find strong expression of transposons in dcl1 mutant - Michael Kuhn
outlook: include sequence effects (e.g. GC bias, seen in RNA seq) into transcript normalization - Michael Kuhn
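The quantile normalization step between arrays mentioned in the notes above can be illustrated with a small sketch. This is not the pipeline from the talk: the function name and the toy intensities are assumptions, and ties are handled naively (by sort order) rather than by averaging.

```python
def quantile_normalize(arrays):
    """Force every array to share the same distribution of values.

    `arrays` is a list of equally long lists of probe intensities,
    one list per array. The value at each rank is replaced by the
    mean of the values holding that rank across all arrays.
    """
    n = len(arrays[0])
    sorted_arrays = [sorted(a) for a in arrays]
    # Reference distribution: mean across arrays at each rank.
    reference = [
        sum(s[rank] for s in sorted_arrays) / len(arrays) for rank in range(n)
    ]
    normalized = []
    for a in arrays:
        # Indices of `a` ordered from smallest to largest value.
        order = sorted(range(n), key=lambda i: a[i])
        out = [0.0] * n
        for rank, idx in enumerate(order):
            out[idx] = reference[rank]
        normalized.append(out)
    return normalized


# Hypothetical intensities for three probes on two arrays.
probe_intensities = [[5.0, 2.0, 3.0], [4.0, 1.0, 6.0]]
normalized = quantile_normalize(probe_intensities)
```

After this step both arrays contain exactly the same set of values, so remaining differences between arrays reflect the ranking of probes rather than array-wide intensity shifts.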
HL51: Jüri Reimand - GraphWeb: functional analysis of genomic networks
A tool for large, genome scale networks is applied to yeast data. - Roland Krause
The modular nature of molecular cell biology can be seen in today's large networks. A broad overview over the different, complex networks. - Roland Krause
Many biological entities can be seen as networks, e.g. proteins and genes, measured by DNA binding, physical and genetic interactions, which can be combined into heterogeneous networks. - Roland Krause
Edges are not born equal, we need edge-weights, easy in gene expression networks. SAGA and SWI/SNF example. Assign local weights to rank edges in the network - Roland Krause
Data sets are not born equal. Small-scale data sets are more reliable than global approaches. - Roland Krause
# Not sure I agree fully here, but he is merely motivating the weights approach. - Roland Krause
Some data sets are trusted more than others. - Roland Krause
Different scores for global and local weights can be combined. - Roland Krause
Hairball. - Roland Krause
Everything is interconnected, no way to interpret. Needs to be dissected. - Roland Krause
Cliques, hubs, sets of neighbors can be used to find the modules. Graph clustering methods are unsupervised approaches, using MCL or betweenness centrality clustering. - Roland Krause
Functional enrichment can be studied using miRNAs, KEGG, Reactome, GO, TFBS. - Roland Krause
The statistical perspective can be computed by hypergeometric tests, could be normalized with the number of proteins in the module. - Roland Krause
Graph partitioning to detect dense regions, then enrichment, only highlight those with high scores. - Roland Krause
Analyze public data set from yeast, breaking down into hub based modules. - Roland Krause
Ah, the proteasome. - Roland Krause
Interacting with YHB1, probably makes sense to a yeast biologist. - Roland Krause
Export to Cytoscape for visualization. - Roland Krause
An Estonian endeavor (Postdoc positions open) - Roland Krause
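The hypergeometric enrichment test mentioned in the notes above can be written down in a few lines. This is a generic sketch of the one-sided test, not GraphWeb's code: the function and parameter names are assumptions, and no multiple-testing correction or module-size normalization is applied.

```python
from math import comb


def enrichment_p_value(population, annotated, module_size, hits):
    """One-sided hypergeometric p-value, P(X >= hits).

    population   number of proteins in the background
    annotated    background proteins carrying the annotation (e.g. a GO term)
    module_size  number of proteins in the module
    hits         module proteins carrying the annotation
    """
    total = comb(population, module_size)
    upper = min(annotated, module_size)
    tail = sum(
        comb(annotated, i) * comb(population - annotated, module_size - i)
        for i in range(hits, upper + 1)
    )
    return tail / total


# Hypothetical example: 10 background proteins, 5 annotated,
# a module of 5 proteins of which all 5 are annotated.
p = enrichment_p_value(population=10, annotated=5, module_size=5, hits=5)
```

A small p-value means the module contains more annotated proteins than expected when drawing the same number of proteins at random from the background.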
TT49: Lukas Marsalek - BALLView: a molecular viewer and modelling tool with real-time raytracing capabilities
Link to preliminary pictures. Definitely looks cool and I hope it will be useful as well. - Pawel Szczesny
TT46: Richard Smith - InterMine – open source data warehouse and query interface
HL59: Florian Leitner - Comparative community assessments for applied biomedical text mining: BioCreative II challenge and metaservices.
HL60: Liang Chen - Studying alternative splicing regulatory networks through partial correlation analysis
Allyson Lister
wow - found power in victoria hall. Still two sockets free - lefthand side, 4th row up, first aisle in (i.e. not on the far left)
Thanks so much Allyson! - Ruchira S. Datta
Glad to be of service @Ruchira. - Allyson Lister
PT37: Michiaki Hamada - Predictions of RNA Secondary Structure by Combining Homologous Sequence Information
Need for better RNA secondary structure prediction with increasing awareness of functional ncRNAs - Cass Johnston
Most algorithms at the moment don't allow pseudoknots - Cass Johnston
Minimum Free Energy approaches: Mfold, RNAfold etc. But many structures close to MFE - Cass Johnston
Maximizing expected accuracy CONTRAfold etc. - Cass Johnston
CentroidFold (their algorithm) is an MEA tool. Performs better than RNAfold, Sfold, CONTRAfold... (not sure what the test set was) - Cass Johnston
Using homology to further improve accuracy of structure prediction (previous approaches: RNAalifold, McCaskill) - Cass Johnston
Sankoff sequence/structure alignment of sets of homologous sequences plus MEA. Computationally unfeasible. - Cass Johnston
Approximate the Sankoff method such that it is practical to run the method even for long RNA sequences - Cass Johnston
Compared CentroidAliFold to other state of the art methods. Outperforms conventional secondary structure prediction (i.e. MFE-based) and outperforms everything except RAF (comparable) for approaches using homology too. - Cass Johnston
Much quicker than RAF - Cass Johnston
Method uses Nussinov-style dynamic programming to predict secondary structure. Maximises the sum of base pairing probabilities in the predicted secondary structure. - Cass Johnston
Hamada, Bioinformatics 25 465-473 (2009). Poster U53 - Cass Johnston
New software is called CentroidHomfold and will be available soon from - Cass Johnston
Question: Algorithm tested on structural RNAs, can it be adapted to handle mRNAs etc with more flexible structures? Answer: Possible, but non-trivial - Cass Johnston
Competitors: PETfold - Peter Menzel
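The Nussinov-style dynamic programming mentioned in the notes above, which maximises the sum of base-pairing probabilities, can be sketched as follows. This is a simplified illustration and not the CentroidFold or CentroidHomfold code: the function name, the `min_loop` parameter and the toy probabilities are assumptions, and only the optimal score is computed (no traceback, no pseudoknots).

```python
def max_pairing_score(prob, n, min_loop=3):
    """Best total base-pair probability over nested (non-crossing) structures.

    prob      dict mapping (i, j) with i < j to a base-pairing probability
    n         sequence length
    min_loop  minimum number of unpaired bases enclosed by a pair
    """
    if n == 0:
        return 0.0
    best = [[0.0] * n for _ in range(n)]
    for span in range(1, n):
        for i in range(n - span):
            j = i + span
            # Case 1: position i or position j is left unpaired.
            score = max(best[i + 1][j], best[i][j - 1])
            # Case 2: i pairs with j, enclosing the interval (i+1, j-1).
            if span > min_loop:
                inner = best[i + 1][j - 1] if i + 1 <= j - 1 else 0.0
                score = max(score, inner + prob.get((i, j), 0.0))
            # Case 3: split into two independent sub-intervals.
            for k in range(i + 1, j):
                score = max(score, best[i][k] + best[k + 1][j])
            best[i][j] = score
    return best[0][n - 1]


# Hypothetical pairing probabilities for a 4-base toy sequence.
toy_probs = {(0, 3): 0.9, (1, 2): 0.8, (0, 2): 0.95, (1, 3): 0.95}
nested_score = max_pairing_score(toy_probs, 4, min_loop=0)
```

In the toy example the two highest-probability pairs, (0, 2) and (1, 3), cross each other, so the recursion picks the nested pairs (0, 3) and (1, 2) instead.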
Keynote: Thomas Lengauer - Chasing the AIDS Virus
Background in mathematics and Computer Science - Allyson Lister
protein bioinformatics, computational drug screening and design - Ruchira S. Datta
previously full prof at U. Paderborn - Ruchira S. Datta
on steering board of ECCB since its founding - Ruchira S. Datta
an exciting story, downstream of the bulk of computational biology in the medical field - Ruchira S. Datta
process flow usually ends with finding and optimizing potential drug targets - Ruchira S. Datta
Start when the drugs are available on the marketplace and they support personalized medicine, and which drugs to give to AIDS patients. - Allyson Lister
Personalized medicine, they start when the drugs are in the market place - Diego M. Riaño-Pachón
in this case, support difficult decision of a doctor: what drug to give to the AIDS patient - Ruchira S. Datta
33m HIV infected patients in 2007 - Peter Menzel
25 M deaths since 1981 - Gabriele Sales
greatly affected in Africa - Ruchira S. Datta
AIDS awareness is waning - Oliver Hofmann
Europe appears little affected, but this may be deceptive - Ruchira S. Datta
...and increasing infection rate nowadays - Peter Menzel
AIDS awareness campaigns have waned in recent years, and as a consequence there is an increase in infection rates again. - Allyson Lister
At least in Germany, AIDS awareness is reducing - Diego M. Riaño-Pachón
'AIDS almost a lost cause' (no way it is under control currently) - Oliver Hofmann
AIDS is rampant and almost a lost cause. AIDS is nowhere near under control. - Ruchira S. Datta
small molecule: 10000 letters - Ruchira S. Datta
HIV virus: small genome (10k bases) - Gabriele Sales
double single-strand RNA genome - Gabriele Sales
virus attaches via surface proteins to the T-cell - Ruchira S. Datta
HIV has a duplicated single-stranded RNA genome - Peter Menzel
transferred into nucleus via viral protein called integrase - Ruchira S. Datta
no way to get the virus out of an infected cell - Ruchira S. Datta
It's not possible to remove the virus from the infected cell. - Gabriele Sales
viral particles assemble in a complex process, not completely understood - Ruchira S. Datta
during maturation process, strings of proteins are cut into their functional parts - Ruchira S. Datta
viral protein protease makes virus effective at infecting a new cell - Ruchira S. Datta
AIDS virus is by far the best understood of all viruses - Ruchira S. Datta
Drug design starts with understanding the life cycle. Best understood virus. - Oliver Hofmann
life cycle of HIV is already understood very well - Peter Menzel
HIV is the best understood virus - Diego M. Riaño-Pachón
Various drugs blocking different phases of the virus infection. - Gabriele Sales
There are a number of drugs that block the fusion of the virus with the cell, 17 blocking reverse transcriptase, etc. - Allyson Lister
17 drugs blocking reverse transcription - Diego M. Riaño-Pachón
one new drug blocks attachments of virus to cell; another blocks fusion; 17 drugs block reverse transcription; 1 drug blocks integration; 8 block maturation - Ruchira S. Datta
17 or 70? :) - Allyson Lister
will explain why just one drug doesn't suffice - Ruchira S. Datta
HIV is extremely dynamically evolving, possibly the most dynamically evolving virus known - Ruchira S. Datta
17 :-) - Peter Menzel
moving target: over 10 million virus particles turned over per day per patient - Ruchira S. Datta
A drug may be efficient against the wild type, but not against mutants. - Gabriele Sales
wild type viruses are most fit under natural condition; drug will be very effective on wild type but will very quickly select for resistance - Ruchira S. Datta
Always going to be minority variants resistant to the drug which will escape drug treatment - Oliver Hofmann
reverse transcriptase is a bad copier, enabling variation every time the virus replicates - Ruchira S. Datta
There is no drug targeting all mutants. - Gabriele Sales
Hence drug cocktails that catch all variants. Doesn't work, best we can do is postpone the onset - Oliver Hofmann
therefore need drug cocktail that catches all of them, but this is utopia and doesn't happen; there is no drug therapy that works forever - Ruchira S. Datta
The virus always wins, we can only postpone the loss - Diego M. Riaño-Pachón
HAART: highly active anti-retroviral therapy; administer at least two drugs of different classes (targeting different proteins, working in different ways) - Ruchira S. Datta
number of viral RNA in the blood is a major clinical indicator - Peter Menzel
therapy is effective for some time, until new strain develops that is resistant - Ruchira S. Datta
50 copies is the current limit of detection for blood tests (unclear per what?) - Oliver Hofmann
Detection limit: 50 copies of the virus. - Gabriele Sales
this is the main question in treating patients, and is very difficult - Ruchira S. Datta
so far only viral genome, not host genome, is being considered - Ruchira S. Datta
people have built mutation tables: synopsis of global clinical experience of how virus responds to treatment - Ruchira S. Datta
Mutation tables: collection of responses of the virus to various treatments. - Gabriele Sales
In the past, they've built mutation tables - global collection of clinical experience - Allyson Lister
e.g., protease inhibitors - Ruchira S. Datta
Gotta love the overlap in comments! :) - Allyson Lister
Allyson: heh - Ruchira S. Datta
@Allyson: too many bloggers here ;-) - Gabriele Sales
(I am now waiting before posting.. only to see everyone else waiting, too... gah) - Oliver Hofmann
An expert group will build this table. - Allyson Lister
Mutation table created by a group of experts, after heated discussions - Diego M. Riaño-Pachón
a particular SNP suggests that the virus is resistant to atazanavir - Ruchira S. Datta
I say just go with the flow and just get some overlap - it's good! - Allyson Lister
Allyson: knowledge of the crowd.. :-) - Peter Menzel
can't collect a lot of data: if have seen resistance, don't want to subject future patients to this therapy - Ruchira S. Datta
@Peter - definitely! - Allyson Lister
but if there are too many mutations in the table, won't be able to administer therapy to any patient--every patient will have some of these - Ruchira S. Datta
Tables limited because mutations are not acting independently. - Gabriele Sales
mutation tables carry not enough information.. - Peter Menzel
-> expert systems ? - Peter Menzel
interdependencies cannot be captured by mutation tables; need rule-based expert systems - Ruchira S. Datta
Mutations act in the context of the remaining genome (and the host genome) - Oliver Hofmann
Mutation table ignores the context of mutations and synergies - Diego M. Riaño-Pachón
unfortunately, the medical community calls these "algorithms" - Ruchira S. Datta
virologists ask: "Is this kind of resistance analysis objective?" "Can we not let the clinical data speak for themselves?" i.e., circumvent political process of decisionmaking of what goes in the tables - Ruchira S. Datta
Started by building a clinical database. - Gabriele Sales
Then the computational biologists in his group enter the picture, at the request of MDs - Diego M. Riaño-Pachón
at the start, no clinical database existed - Ruchira S. Datta
Avoiding community decisions by querying clinical databases.. which did not exist at the beginning of the project - Oliver Hofmann
phenotypic data: extract patient's virus and expose it to drugs - Peter Menzel
phenotypic data: expose the virus to different drug concentrations in cell culture - Ruchira S. Datta
observe fitness of resistant vs wild type curve of how much drug needed to suppress replication - Ruchira S. Datta
measure the "resistance factor": how much more drug is necessary to keep a mutant under control than the wild type - Gabriele Sales
One value: drug increase required to overcome mutation effect - Oliver Hofmann
quantify as "resistance factor" - Ruchira S. Datta
map viruses to drug concentrations that are effective - Peter Menzel
only a few labs can do this kind of analysis - Ruchira S. Datta
but this data is too expensive and too slow to make for clinicians - Allyson Lister
people have resorted to viral genome, as sequencing the viral genome is easy and fast - Ruchira S. Datta
bioinformatic resistance analysis can replace phenotypic lab test - Ruchira S. Datta
now: sequencing individual virus' genomes - Peter Menzel
Multivariate statistical learning approaches - Oliver Hofmann
multivariate statistical learning on db of genotype-phenotype pairs - Ruchira S. Datta
Training on a genotype-phenotype pairs database (1000+ HIV variants). - Gabriele Sales
The training data is the genotype-phenotype pairs of 1000+ HIV variants - Allyson Lister
"Every HI virus is ugly" - Peter Menzel
want statistical model that will either regress or classify - Ruchira S. Datta
quality criteria: predictive power, interpretability - Ruchira S. Datta
A statistical model that regresses or classifies. Statistical power is not the only factor, results need to be interpretable - Oliver Hofmann
doctors in the field want interpretable models, and will sacrifice a few % accuracy for this - Ruchira S. Datta
regression: estimate resistance factor - Ruchira S. Datta
Clinicians require interpretable models, why they make the predictions they do - Diego M. Riaño-Pachón
Classification into two classes: susceptible or not. - Gabriele Sales
classification based on cutoff values - Ruchira S. Datta
interpretable model is a decision tree - Ruchira S. Datta
Decision tree classifier: tease out interdependence between different mutations - Oliver Hofmann
along branches of decision tree, query different amino acid positions - Ruchira S. Datta
Decision tree encoding the amino acids conferring resistance. - Gabriele Sales
Beerenwinkel et al. PNAS 2002 99 (12) 8271-6 - Allyson Lister
much more informative than mutation tables. - Peter Menzel
virus can be resensitized by multiple mutations - Ruchira S. Datta
Allows finding re-sensitization effects - Oliver Hofmann
one decision tree for each drug - Diego M. Riaño-Pachón
Web tool: - Gabriele Sales, so far most used clinical tool by them - Ruchira S. Datta
one particular patient was a difficult case, convinced doctors that this tool has some use - Ruchira S. Datta
Genotype is aligned to the wt and mutations are identified - Allyson Lister
Identifies genome variations (alignment to the wild type) - Oliver Hofmann
on server, using regression with linear SVMs, not classification with decision trees - Ruchira S. Datta
now uses SVMs instead of decision tree - Peter Menzel
List of drugs by estimated effect (based on SVM). - Gabriele Sales
gives estimated resistance factor - Diego M. Riaño-Pachón
Using linear SVM for regression: a line for each drug and have est resistance factor, and normalization with Z-score, and the scored mutations. - Allyson Lister
use z-scores, as absolute values of resistance factors are not comparable btw drugs - Ruchira S. Datta
Each drug with an estimated resistance factor, Z-score (for comparative purposes) and a list of scored mutations based on their weight - Oliver Hofmann
this difficult patient is full of mutations, has resistance to every known drug per the mutation table - Ruchira S. Datta
"out-therapy" - doctors say positively they can't help him any more - Ruchira S. Datta
but they saw that some of these mutations actually resensitize! - Ruchira S. Datta
Some mutations, which confer resistance to some things (e.g. 76V in the anecdotal example) actually confer re-sensitisation and therefore would have a positive effect. Couldn't have been done with mutation tables! - Allyson Lister
Give one drug to retain re-sensitization mutation, add second drug to exploit the re-sensitization effect - Oliver Hofmann
one mutation conferring resistance to two drugs, resensitized the virus to other two drugs - Diego M. Riaño-Pachón
this could not have been found via the mutation table; the patient was on the recommended therapy from March 2003 until April 2009 and blood was clear of virus - Ruchira S. Datta
natural next question: predict in what direction the virus will evolve under a given drug therapy - Ruchira S. Datta
Now: model the viral evolution - Peter Menzel
Next question is: how the virus will evolve in reaction to a certain drug? - Gabriele Sales
not possible by mutation table - Ruchira S. Datta
hacking viral evolution neat. :) - Nav
The virus does not change randomly; it follows specific mutational paths. - Gabriele Sales
The virus "chooses" mutation paths, do not know why - Diego M. Riaño-Pachón
simulated by fair mutations, but the virus does not mutate by flipping a fair coin, it chooses useful mutations (!) don't know how it does that - Ruchira S. Datta
virus follows specific mutational paths into resistance - Ruchira S. Datta
('chooses' is probably not the right word for this :) ) - Oliver Hofmann
want to find such paths in the database - Ruchira S. Datta
Oliver: hopefully not, it's very strange - Ruchira S. Datta
What are the preferred directions the virus goes? - Diego M. Riaño-Pachón
Not enough data to make accurate models. - Gabriele Sales
Longitudinal data missing - Oliver Hofmann
would like longitudinal data, but have cross-sectional data: lots of patients, but few data points on each one - Ruchira S. Datta
used to build mutagenetic trees - Ruchira S. Datta
Viral evolution is modeled using tree structures. - Gabriele Sales
the TAM1 path is found by seeing the virus does *not* follow every possible path - Ruchira S. Datta
mutagenetic trees with probabilities on branches - Peter Menzel
method derives mixtures of mutagenetic trees, not single ones - Ruchira S. Datta
22% of data follow one tree, 78% another - Ruchira S. Datta
can make time models, to estimate average time until a mutation is acquired - Ruchira S. Datta
predict length of tunnel created for the virus by drug therapy, on its path to resistance - Ruchira S. Datta
Estimate probability that a virus will acquire a certain resistance within a given time - Oliver Hofmann
applet predicts which therapy appears most promising - Ruchira S. Datta
Can try to maximize that duration - Oliver Hofmann
Now: Algorithm calculates success probabilities for a certain therapy suggested by the software - Peter Menzel
Do they actually follow the other tree? Could they be transitioning through the other mutations faster than you are sampling? - Nav
therapy optimization with THEO - Ruchira S. Datta
this strange sampling of effective paths reminds me of quantum computing - Ruchira S. Datta
Doctors still do not trust computers to the very end. Good! - Peter Menzel
EuResist: Europe-wide collection of resistance data - Ruchira S. Datta
Rosen-Zvi, Altmann et al Bioinformatics, 2008 - Gabriele Sales
server akin to geno2pheno on the internet; ROC curves show it has better performance than expert systems - Ruchira S. Datta
Most 'fun' consortium he's been involved in (would love to know the criteria for that :) ) - Oliver Hofmann
so does THEO - Ruchira S. Datta
German database of ineffective therapies (thanks to honesty of doctors) - Ruchira S. Datta
without THEO, chance of failure is >24%, with THEO <10% - Ruchira S. Datta
Error in therapy classification: 24% without THEO, below 15% with it. - Gabriele Sales
geno2pheno accessed from 30 countries - Ruchira S. Datta
2/3 AIDS patients in Germany treated with geno2pheno server - Peter Menzel
also need to keep finding new drugs - Ruchira S. Datta
Constant need for new drugs as the virus _will_ evolve with time towards resistance - Oliver Hofmann
one target: the viral entry - Ruchira S. Datta
the cell cooperates in drawing the virus in - Ruchira S. Datta
nice movie showing HIV infection of the cell. Wish I could link that - Oliver Hofmann
CD4 receptor and coreceptor on human cell; first GP120 attaches to CD4, then to coreceptor - Ruchira S. Datta
then virus drills down and the particles fuse - Ruchira S. Datta
so coreceptor protein will be the target; there is one drug targeting this, from Pfizer, Maraviroc - Ruchira S. Datta
some people cannot be infected. - Allyson Lister
1% of Caucasian population does not have this coreceptor, and cannot be infected with HIV - Ruchira S. Datta
1% of caucasians do not have the coreceptor - Peter Menzel
some viral variants can use a different coreceptor (CCR5 vs CXCR4) - Ruchira S. Datta
once you're in therapy the virus can switch - Allyson Lister
people with the CCR5 deletion don't get AIDS, so the virus first goes through here, but it later switches to the other one - Ruchira S. Datta
@Oliver, thanks for the link - Diego M. Riaño-Pachón
have duotropic strain; need genotypic assay - Ruchira S. Datta
have genotypic prediction of viral tropism; this server has the most hits - Ruchira S. Datta
the lab is in San Francisco; German doctors don't want to send samples all the way there and are incented to use the genotypic test - Ruchira S. Datta
35 amino acids in V3 loop; used to look at residues 11 and 25, but multivariate server does much better - Ruchira S. Datta
supply method with structural descriptor of the V3 loop, and use it in the predictor, which increases the sensitivity - Ruchira S. Datta
results shown were toy, based on clonal data - lab sample of *single* virus strain - Ruchira S. Datta
Viral sequences are ambiguous in numerous positions. - Gabriele Sales
in patient have several strains, so ambiguous base calls; in practice, these are "bulk data" and greatly reduce sensitivity - Ruchira S. Datta
therefore add clinical correlates such as virus load, CD4 load, etc. that are easily drawn from patient - Ruchira S. Datta
Add extra information -- clinical parameters -- to increase sensitivity - Oliver Hofmann
still phenotypic test gives much more accurate result, so clinical value is contested - Ruchira S. Datta
Sensitivity goes from 80% to 40% if you move from clonal to bulk data. - Allyson Lister
study shows that in clinical picture, no difference btw genotypic and phenotypic test - Ruchira S. Datta
move away from Sanger sequencing to increase accuracy - Allyson Lister
still not satisfying, so want to increase accuracy - Ruchira S. Datta
move away from Sanger sequencing; if virus occurs in only 10%, won't even see it - Ruchira S. Datta
therefore use deep sequencing - Ruchira S. Datta
Sanger sequencing can now be replaced by deep sequencing, solving ambiguities. - Gabriele Sales
need to assemble 454 reads - Ruchira S. Datta
coreceptor of 35 aa can be resolved by single read - Ruchira S. Datta
ultra-deep sequencing for getting sequence information for the whole quasi species - Peter Menzel
Up to 130k sequences (reads?) from a single drop of blood - Oliver Hofmann
elbow shaped curve in one case, less of one coreceptor type => drug against coreceptor likely to be effective - Ruchira S. Datta
need to establish cutoffs; with prediction specificity of 90%, size of X4 minority must exceed 5%; call this R5 and use the drug - Ruchira S. Datta
o/w, call it X4, drug will be ineffective - Ruchira S. Datta
but need to choose parameter values - Ruchira S. Datta
There might be better descriptors, but establishing these in the community is going to be difficult - Oliver Hofmann
Zero delay between results and bedside application (unlike traditional drug development with lead times of 10+ years) - Oliver Hofmann
Niko Beerenwinkel worked on haplotype prediction, now at ETH Zürich - Ruchira S. Datta
this is not just academic software, need this to be highly available, not just dependent on grad students - Ruchira S. Datta
Rolf Kaiser, w/o exposure to computers, conceived of this project; thankful for his vision - Ruchira S. Datta
formed society for furthering the software, stable under various losses of funding - Ruchira S. Datta
(only to get the blog on top of the ISCB portal site) - Reinhard Schneider
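The resistance factor and the z-score comparison between drugs described in the keynote notes above can be illustrated with a short sketch. This is not the geno2pheno implementation: the function names, the use of log10 resistance factors and the reference values are assumptions made for illustration, and all numbers are made up.

```python
import math
import statistics


def resistance_factor(ic50_mutant, ic50_wild_type):
    """How many times more drug the mutant needs than the wild type."""
    return ic50_mutant / ic50_wild_type


def z_score(log_rf, reference_log_rfs):
    """Standardize a log resistance factor against a reference set.

    `reference_log_rfs` is assumed to hold log resistance factors
    measured for the same drug, so that scores become comparable
    between drugs whose absolute resistance factors are not.
    """
    mean = statistics.mean(reference_log_rfs)
    sd = statistics.stdev(reference_log_rfs)
    return (log_rf - mean) / sd


# Hypothetical measurements: drug concentration needed to suppress replication.
rf = resistance_factor(ic50_mutant=20.0, ic50_wild_type=2.0)
# Hypothetical reference distribution of log10 resistance factors for one drug.
z = z_score(math.log10(rf) + 1.0, reference_log_rfs=[0.0, 1.0, 2.0])
```

Ranking drugs by such a standardized score, rather than by the raw resistance factor, is in the spirit of the comment that absolute resistance factors are not comparable between drugs.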
Roland Krause
Another good reason to micro-blog: Colleagues that fell asleep in the session can use it to catch up with the talk (name known but withheld).
Allyson Lister
I have a spare Vasa museum ticket from a friend who unfortunately can't make it :( His misfortune may be your gain. I'm in the keynote, left hand side on the aisle by the power cables. Find me there before I leave the keynote and the ticket's yours!
Michael Kuhn
eQTL special session: Andrew Su
patterns of gene annotation in EntrezGene: few well-studied genes, almost 50% of all genes have 0 linked references - Michael Kuhn
want to use GWAS to discover disease genes - Michael Kuhn
use mouse diversity panel - Michael Kuhn
quite a bit of phenotypic variability in mouse strains - Michael Kuhn
Haplotype Association Mapping, Pletcher et al 2004, McClurg et al 2006 and 2007 - Michael Kuhn
eQTL maps: Probe set genomic position vs. QTL/SNP genomic position - Michael Kuhn
cis-QTL band: strong diagonal, trans-QTL bands - Michael Kuhn
trans: non-local regulation through diffusible factors - Michael Kuhn
functional analysis of trans bands: rank according to association score, gene enrichment test on GO / KEGG - Michael Kuhn
can use this to identify novel pathway members (in OxPhos pathway) - Michael Kuhn
add orthogonal data set: many tissues in one mouse strain (as opposed to normal eQTL data: same tissue in many strains) - Michael Kuhn
--> Gene Atlas - Michael Kuhn
get 10 inferred new pathway members, sometimes see that the genes are just missing annotations - Michael Kuhn
can also look for putative regulators - Michael Kuhn
find Cyclin H as potent OxPhos regulator, validate experimentally - Michael Kuhn
technical caveat: nonparametric estimation of background requires careful permutation strategy, Breitling R et al, 2008, PLoS Genetics - Michael Kuhn
Thanks for the live-blogging Michael! Your notes made me realize I never included a reference to the PLoS Genetics paper describing the work. It is here: - Andrew Su
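The cis/trans distinction on the eQTL map in the notes above (probe set position versus SNP position) can be sketched as a simple classification. The function name and the 1 Mb window are assumptions for illustration, not values given in the talk.

```python
def classify_eqtl(probe_chrom, probe_pos, snp_chrom, snp_pos, cis_window=1_000_000):
    """Label an association as "cis" (near the diagonal) or "trans".

    An association is called cis when the SNP lies on the same
    chromosome as the probe set and within `cis_window` base pairs
    of it; everything else is called trans.
    """
    same_chrom = probe_chrom == snp_chrom
    if same_chrom and abs(probe_pos - snp_pos) <= cis_window:
        return "cis"
    return "trans"


# Hypothetical associations.
local_hit = classify_eqtl("chr1", 5_000_000, "chr1", 5_200_000)
distant_hit = classify_eqtl("chr1", 5_000_000, "chr7", 5_200_000)
```

Associations labeled cis make up the strong diagonal of the map; many trans associations sharing one SNP region form the trans bands that were ranked and tested for GO / KEGG enrichment.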
TT42: Karim Chine - Computational Biology in the cloud, towards a federative and collaborative R-based platform
Eamonn Maguire talking on behalf of Karim Chine - Gabriele Sales
Java program (web app or web start app) built on R and Scilab - Gabriele Sales
Biocep-R - Allyson Lister
RESTful API. - Allyson Lister
BIOCEP-R has advanced graphics - more than with regular R - Allyson Lister
Built on top of R and Scilab - Cass Johnston
wow - their software is an "ecosystem" :) - Allyson Lister
Computational scripts: R / Python / Groovy - Gabriele Sales
The BIOCEP computational open platform ecosystem: computational data sources, resources, components, GUIs, web services and scripts - Allyson Lister
Data storage: local, NFS, FTP, S3 - Gabriele Sales
Components such as R, Bioconductor; GUIs with collaborative views; Scripting (R/Python/Ruby); stateless web services; NFS/FTP/S3 storage; cluster/grid support - Oliver Hofmann
The R Virtualization is like a mini-desktop. - Allyson Lister
Server side spreadsheet sync'd on all clients. - Gabriele Sales
Visual GUI builder in Netbeans. - Gabriele Sales
Creation of EC2 workers on demand. Supports multiple worker pools. - Gabriele Sales
Biocep can automatically add workers on the cloud on demand given a certain load - Oliver Hofmann
Uses Amazon Elastic Cloud (Is that right?) - Allyson Lister
A bit of free ads for AWS... - Lars G. T. Jorgensen
Looks like the kind of tools needed to get people to utilize the cloud... - Lars G. T. Jorgensen
Showing us the R console and some simple operations - Allyson Lister
FYI: AWS are giving grants for use of their services: - Cass Johnston
The web services part means you can use BIOCEP to connect to a cloud instance. - Allyson Lister
(It's actually an incredible wrapper / workbench to tie together all kinds of different tools and algorithms) - Oliver Hofmann
Showing client - server interactions on a single machine. The automatically synchronized spreadsheet. - Gabriele Sales
Go find Eamonn if you have any questions! - Allyson Lister
Looks very promising.. I especially liked the easy selection of spreadsheet cells and using them in the virtual R workbench. - Peter Menzel
PT47: Oliver Stegle - Predicting and Understanding the Stability of G-Quadruplexes
G-Quadruplexes are stable structures of RNA and DNA - Roland Krause
Typically of the form GGGACTAAGGGACTTCCCACTTGG - Roland Krause
Will form spontaneously, have role in transcriptional control and telomeres. - Roland Krause
Are these patterns really stable? It's the first indicator of a functional role. Melting temp will be a proxy for stability - Allyson Lister
Overrepresented in promoter regions [Huppert and Balasubramanian, 2007] - Roland Krause
Melting temperature can be predicted and experimentally verified. It's low throughput though, rules are limited, complicated non-linear relationships. - Roland Krause
Gaussian processes (GP) regression with different error rates across the sequence. - Roland Krause
Gives the posterior distribution of function values given a training set. - Roland Krause
Needs a covariance function (kernel), a likelihood model and hyperparameters. - Roland Krause
Product ansatz to construct a joint covariance function of concentration and sequence. - Roland Krause
Uses common k-mer substrings. - Roland Krause
Use the squared distance of the sequence features such as composition, length of spacers etc. - Roland Krause
these structures are so stable sometimes they never melt. - Allyson Lister
Training data is noisy with outliers, sometimes only gives a bound. - Roland Krause
Non-Gaussian likelihood model is robust, incorporates observations as bounds. - Roland Krause
Hyperparameters estimated either with MAP or MCMC sampling. - Roland Krause
they look at 260 quadruplexes (one of the first data sets available for quadruplexes) - Allyson Lister
Dataset with 260 G-quadruplexes, compared to SVMs and GPs. - Roland Krause
GPs outperform SVMs in mean squared error and mean log probability. - Roland Krause
Gives confidence in the prediction. - Roland Krause
With a 50/50 training split, the predictions (with the error bars) always overlap with the "truth" line, sometimes with a large uncertainty. Everything is predicted within 10 degrees C. - Allyson Lister
The relevance of the features for the hyperparameters can be shown, one of the length parameters is most important. - Roland Krause
Genome-wide GQ prediction in human identifies 359,548 candidate sequences. - Roland Krause
60% is in the 10° Tm range (which is pretty good). - Roland Krause
Are they functional and in promoter regions? - Roland Krause
Quadruplexes are overrepresented in promoter regions by an order of magnitude compared with anywhere else - Allyson Lister
Weak hint; not much more can be expected from only 260 examples. - Roland Krause
Just about significant. - Roland Krause
Future work includes further biological validation. - Roland Krause
Blog post: (Interesting talk - some of it was a little over my head though!) - Allyson Lister
My question: Are GQs connected to high-GC promoters? A. Only further validation will tell. - Roland Krause
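Editor's sketch of the Gaussian process regression idea from the notes above (posterior over function values given a training set, with a covariance function, a likelihood model and hyperparameters). This is not the speaker's model: it assumes a plain squared-exponential kernel on a hypothetical numeric feature and ordinary Gaussian noise, whereas the talk described a product kernel over sequence and concentration and a robust non-Gaussian likelihood. The hyperparameters are fixed by hand here rather than estimated with MAP or MCMC, and the toy values are made up.

```python
import numpy as np


def sq_exp_kernel(A, B, length_scale=1.0, signal_var=1.0):
    """Squared-exponential covariance between the rows of A and B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return signal_var * np.exp(-0.5 * d2 / length_scale**2)


def gp_predict(X_train, y_train, X_test, noise_var=0.1, **kernel_args):
    """Posterior mean and variance of a GP with Gaussian observation noise."""
    y_mean = y_train.mean()  # use the training mean as a constant prior mean
    K = sq_exp_kernel(X_train, X_train, **kernel_args)
    K = K + noise_var * np.eye(len(X_train))
    K_s = sq_exp_kernel(X_test, X_train, **kernel_args)
    K_ss = sq_exp_kernel(X_test, X_test, **kernel_args)

    # Solve with a Cholesky factor instead of inverting K directly.
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train - y_mean))
    mean = y_mean + K_s @ alpha
    v = np.linalg.solve(L, K_s.T)
    var = np.diag(K_ss) - (v**2).sum(axis=0)
    return mean, var


# Toy data: one made-up feature per sequence and a made-up melting temperature.
X_train = np.array([[0.0], [1.0], [2.0]])
y_train = np.array([50.0, 60.0, 70.0])
X_test = np.array([[1.0], [10.0]])
mean, var = gp_predict(X_train, y_train, X_test, noise_var=1e-6)
print(mean, var)
```

The predictive variance is what provides the error bars mentioned in the notes: it is small near the training points and grows back to the prior variance far from them.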
Michael Kuhn
eQTL special session: Rob Williams, The phenotype revolution
genetic reference populations: "fixed" populations - Michael Kuhn
BXD genetic reference population - Leopold Parts
BXD: homozygous strain - Michael Kuhn
F2 cross between DBA/2J (oldest inbred) and C57BL/6J (standard black) - Leopold Parts
inbreed the crosses, headed towards 100+ lines - Leopold Parts
have homozygous permutation of the parental genomes - Michael Kuhn
inbred strains originally to avoid southerns - Leopold Parts
can select species and tissues - Michael Kuhn
can resemble the same genome over and over again: thus can suppress environmental influence - Michael Kuhn
dense microphenotypes for the bxd strains, specifically brain - Leopold Parts
few recombination events a problem - large linkage blocks with a lot of genes - Leopold Parts
can use geneNetwork to straightforwardly get a QTL map - Leopold Parts
and zoom in on cis and trans associations, follow up genes, create expression network - Leopold Parts
then plot heatmaps for hypothesis generation - Leopold Parts
explore trans bands for the bxd collection. One for the pigment mutation. Many usually artefactual (in mouse datasets) - Leopold Parts
TT45: James Cavalcoli - Using the NCIBI Suite of Integrated Tools and Data
HL48: Israel Steinfeld - Architecture of CpG methylation in the human genome
DNA methylation: modification of cytosines in CpG dinucleotides, maintained across cell divisions - Marcel Martin
CG dinucleotide content in HG: 1%, expected: 4.5% - Marcel Martin
CpG islands: regions on DNA that contain many CpGs. 28000 islands annotated in HG. almost all of them are near gene promoters - Marcel Martin
mDIP: methyl-DNA immunoprecipitation assay, similar to ChIP-chip. 244k DNA methylation array - Marcel Martin
island methylation score (IMS): average signal for all probes mapped to it. bimodal distribution. housekeeping genes are methylated (i.e., on one side of the distribution) - Marcel Martin
approx 15 samples (different tissues). almost all are not methylated (~70%) - Marcel Martin
Nature: Sp1 elements protect a CpG island from de novo methylation, Michael Brandeis et al, Nature 371, September 1994 - Marcel Martin
DRIM Discovering Rank Imbalanced Motifs: - Marcel Martin
use machine learning to distinguish between methyl. and nonmethyl. islands. - Marcel Martin
UMR: undermethylated region (?) - Marcel Martin
designed a new tiling array that covers all predicted UMRs - Marcel Martin
conclusions: 4400 predicted regions were confirmed as UMRs. 923 of the UMRs are placed near known TSS. no one-to-one correspondence between CpG islands and nonmethylated regions. also: yes, there is tissue-specific methylation (didn't go into detail) - Marcel Martin
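Editor's sketch for the CpG depletion noted above (about 1% observed CG dinucleotides against 4.5% expected, a ratio of roughly 0.22). A minimal illustration, assuming the commonly used observed/expected CpG ratio, (number of CpG x sequence length) / (number of C x number of G); the notes do not say how the speaker computed the expected value, and the function name and test sequences are hypothetical.

```python
def cpg_obs_exp(seq):
    """Observed/expected CpG ratio of a DNA sequence.

    Expected CpG count assumes C and G occur independently:
    expected = (#C * #G) / length, so ratio = #CpG * length / (#C * #G).
    """
    seq = seq.upper()
    n = len(seq)
    c = seq.count("C")
    g = seq.count("G")
    cpg = seq.count("CG")  # "CG" cannot overlap itself, so count() is exact
    if c == 0 or g == 0:
        return 0.0
    return cpg * n / (c * g)


def gc_content(seq):
    """Fraction of bases that are G or C."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq) if seq else 0.0


# Made-up sequences: a CpG-rich stretch and a CpG-free stretch.
print(cpg_obs_exp("CGCGCGCG"), gc_content("CGCGCGCG"))
print(cpg_obs_exp("CCAGGTTA"), gc_content("CCAGGTTA"))
```

Island annotations typically combine a threshold on this ratio with thresholds on GC content and length, evaluated in a sliding window.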
HL49: Andrei Zinovyev - Robust simplifications of multiscale biochemical networks
model reduction: simplify models in order to understand - Ruchira S. Datta
e.g., "Consider a spherical cow..." A farmer hires a physicist to help with milk production, who takes half a year to produce a paper beginning thusly. - Ruchira S. Datta
complexity of data and complexity of models: non-identifiable models still have robust properties - Ruchira S. Datta
See e.g. Chen et al, Molecular Systems Biology 5:239, 2009 - Ruchira S. Datta
given only the order relations between the model parameters, can we provide robust first and second order solutions? - Ruchira S. Datta
biological systems are hierarchical and multiscale (an observation, not a theorem) - Ruchira S. Datta
structure: functional modules, motifs; scales: time scales, concentration scales - Ruchira S. Datta
this makes it possible to neglect small quantities in favor of larger ones, given proper theory - Ruchira S. Datta
asymptotic approximations in chemical kinetics: quasi-equilibrium (fast reactions), quasi-steady state, ..., quasistationary - Ruchira S. Datta
enzymatic catalysis in quasiequilibrium vs quasistationary approximations gives very different results - Ruchira S. Datta
rate limiting step: steady state rate is determined by slowest reaction in the chain - Ruchira S. Datta
what is the equivalent for a complex network? - Ruchira S. Datta
dominant dynamical system (DDS): auxiliary minimal dynamical system which gives the main asymptotic terms of the stationary state and relaxation in terms of well separated time scales - Ruchira S. Datta
not unique - Ruchira S. Datta
monomolecular networks with time separation can be solved without exact knowledge of kinetic rates - Ruchira S. Datta
Theorem: for such systems, the eigenvalues have only -1, 0, 1 values. These are determined only by the network topology and the order of the parameters. - Ruchira S. Datta
DDS for linear networks: cycle gluing; cut into cycles and contract until have noncyclic system, then solve that - Ruchira S. Datta
what about non-linear systems with non-monomolecular reactions? - Ruchira S. Datta
if one concentration changes much more slowly, treat it as a parameter - Ruchira S. Datta
1) identify linear or pseudo-linear subsystems; 2) neglect small quantities (use idempotent algebra) - Ruchira S. Datta
model reduction preserves dynamics; need to identify critical and non-critical model parameters - Ruchira S. Datta
characteristic functions are ratios of monomials in the initial parameters - Ruchira S. Datta
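Editor's sketch of the rate-limiting-step remark above, for the simplest case: a single monomolecular cycle A1 -> A2 -> ... -> An -> A1 with first-order rate constants. At steady state every step carries the same flux w = k_i * c_i, so with total concentration b the flux is w = b / sum(1/k_i), which approaches k_min * b when one constant is much smaller than the rest. This illustrates the principle only, not the speaker's algorithm for general networks; the function name and rate constants are made up.

```python
def cycle_steady_state(rate_constants, total=1.0):
    """Steady state of a monomolecular cycle A1 -> A2 -> ... -> An -> A1.

    rate_constants[i] is the first-order constant of the step leaving
    species i; `total` is the conserved sum of all concentrations.
    Returns (flux, concentrations).
    """
    inv_sum = sum(1.0 / k for k in rate_constants)
    flux = total / inv_sum  # same flux through every step
    concentrations = [flux / k for k in rate_constants]
    return flux, concentrations


# Made-up constants with one clearly slowest step (k = 1.0).
ks = [1000.0, 1.0, 500.0]
flux, conc = cycle_steady_state(ks)
print(flux, min(ks))  # flux is close to the slowest constant times total
print(conc)           # almost all mass sits before the slow step
```

With well-separated constants, nearly all of the mass accumulates in the species that feeds the slowest step, which is why only the ordering of the parameters matters to first order.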
HL47: Keren Lasker - Fitting multiple components into a cryoEM map of their assembly
cryoEM has become a standard tool for structural characterization of large protein complexes - Anne Tuukkanen
Complex modeling by using EM maps and minimizing an objective function. The objective function includes terms for geometric complementarity, a fitting score and a term for envelope penetration. - Anne Tuukkanen
Andrew Su
Anyone interested in donating/selling a ticket to the Vasa dinner on Wednesday? If so, please email me -- asu at gnf dot org...
@Andrew - I have a ticket you can have for free - I've just come into possession of it. Meet at Victoria hall just as coffee starts? It'll be empty and you'll be able to see me - Allyson Lister
emailing now... - Allyson Lister