17th Annual International Conference on Intelligent Systems for Molecular Biology & 8th European Conference on Computational Biology. The talk-specific feeds will be created each day shortly before the start of the first presentation. Find talk-specific blogs by searching here for the authors, the title of the talk, or the talk identifier as given in the program (like HL03 for the 3rd Highlight paper). The feeds can also be accessed on the conference pages in the corresponding sections: SIGs, Keynotes, Proceedings Track, Technology Track and Highlights. The last few blogs are shown on our web-portal page.
I'm here and will be attending the BioPathways SIG. I figured there should be some place to say non-talk-specific things like "I've arrived!", so here it is.
- Ruchira S. Datta
Will be switching between DAM, BOSC and the BioPathways. Once I've figured out which talk is when ;)
- Oliver Hofmann
Uhoh. I sense a lack of places to recharge notebooks. This is going to be fun during the session breaks..
- Oliver Hofmann
I foresee a problem with the SIG posts in FF - only one heading, and many, many speakers.... Hopefully will be OK for people following.
- Allyson Lister
The projectionist kindly suggested I power up in the projection cubicle during the coffee break.
- Ruchira S. Datta
Network is slowing down to a crawl for me.
- Oliver Hofmann
Sitting at home, following the conference. Great job folks! My only wish is that there was some place where the speakers would upload their presentations.
- Lars Juhl Jensen
Help! Anyone got a UK plug adaptor with them I could borrow for an hour or so? Will happily exchange laptop charge for beer/coffee.
- Cass Johnston
Oliver, I've been having that problem too. There were some times when I had posted a comment and wanted to post another one, but the first hadn't finished posting, so I had to wait for the post to finish, hold my comment in my head, and also pay attention to what the speaker was currently saying...
- Ruchira S. Datta
Talked to the organizers. They are aware of the problem, IT is looking into increasing bandwidth. Yay.
- Oliver Hofmann
@Oliver well done! Perhaps we could ask someone to look into more power extension leads? :)
- Allyson Lister
Anyone up for a FF lunch / dinner / drink at some point during ISMB?
- Oliver Hofmann
Sounds good. I'm about to go orienteering, though, so will be out of reach this evening...
- Ruchira S. Datta
Monday and Tuesday are the only "free" nights, with poster sessions going from 5:45 to 8:30 pm. Shall we just set Monday night for the FF meet-up? When would people want to leave the conference at the earliest? 8 pm?
- Michael Kuhn
Monday's going to be busy meeting everyone; I'd probably try and stick around 'till 8pm. Meet at the registration booth close to the entrance at 8:15? Can decide on a venue depending on our numbers and whether people are willing to travel to the city center.
- Oliver Hofmann
I think I would prefer 8:40. I have a poster. I haven't figured out yet whether it's Monday or Tuesday, but others might have posters too? Last year, at least, I was explaining my poster to people pretty much constantly until the poster session closed.
- Ruchira S. Datta
Monday or Tuesday is good for me - either way, and any time is fine. It's a good idea!
- Allyson Lister
8:45 on Monday then, same place (registration area)? Do we have any locals with suggestions?
- Oliver Hofmann
ok - see you all in registration at 8:45 monday :)
- Allyson Lister
If everyone has already seen Gamla Stan by that time, we could go to Södermalm--plenty of good restaurants. I'm staying there.
- Ruchira S. Datta
@Ruchira: works for me as long as we have a guide to get us there :)
- Oliver Hofmann
@Allyson There are lots of power extensions along the windows just inside the T hall. Let me know if those are still too full and I'll see what I can do about getting more.
- Dave Messina
@Oliver The conference organizers have bought high-speed internet. Anyone can go to the registration desk for a free login. Spread the word!
- Dave Messina
@Dave, yep, got mine this morning. World of a difference.
- Oliver Hofmann
@Dave, any chance of getting extension cords to the other conference rooms? Starved for outlets at K21 :)
- Oliver Hofmann
I'll be arriving today, would it be worth packing an extension cord with multiple sockets?
- Roland Krause
@Roland: This is always a good idea. Problem is that some rooms only have power on the stage and in the projection booth.
- Michael Kuhn
Organizers also have extra extension cords for emergencies, but even the main conference room has no outlets. Best to recharge during the breaks, but that means leaving the notebook unattended...
- Oliver Hofmann
photos from ISMB/ECCB, please put on flickr under tag "ismbeccb2009"
- Burkhard Rost
The projectionist here just unplugged my power cord from their outlet and said none of us will be allowed to use their outlet; we'll need to use the outlets in Hall C. He said the organizers did not pay for extra outlets, so we will still have this power problem.
- Ruchira S. Datta
Well, at least I'll be able to sit closer to the front where I can see better.
- Ruchira S. Datta
Even the "free" network is considerably faster and more responsive today. Well… we'll see what happens when the first coffee break hits.
- Johann Visagie
@Ruchira: Ouch. That's not going to help. We can get extension cords and use them in the smaller meeting rooms though.
- Oliver Hofmann
@Johann agreed - I also think the network is faster today
- Allyson Lister
I love the fierce, ofttimes violent competition for the limited resource that is power outlets during coffee break!
- Johann Visagie
It seems a simple #ismb is the twitter hashtag of choice. Also note Burkhard's suggestion above to use "ismbeccb2009" as Flickr tag. (Hey! Flickr tags can has spaces! ;)
- Johann Visagie
@Johann: Bring your own socket multiplier and you are loved by everyone ;-)
- Oliver Hofmann
Any chance of leaving / recharging our desktops at the registration desk during the breaks? Would rather not leave mine in a conference room unattended.
- Oliver Hofmann
Lonely FFer near the registration desk is looking for some chatting... anyone?
- Egon Willighagen
I'm having a special FF session outside, but I am the only participant :) (after a nice meet up with Ruchira earlier)
- Egon Willighagen
@Egon - shame - I was offline for lunch. At least I assume it was lunch when you said that. FF doesn't put up the time each comment was made - so add times please if you're talking about meetups! Hopefully see most of you this evening after the poster session anyway :)
- Allyson Lister
P.S. The network seems to have slowed down again. Anyone else notice, or is it just me?
- Allyson Lister
@Dave thanks for the power extension cable - except for the main victoria hall (as we discussed) it's really helped a lot :)
- Allyson Lister
@Allyson, I'm still around... in front of E.A.T. now....
- Egon Willighagen
@Allyson: As half of the conference is probably sharing the username / password for the closed network I am not surprised it is slowing down :)
- Oliver Hofmann
Maybe it is because I am happily uploading Jmol 11.7.45 to SourceForge ? :)
- Egon Willighagen
@Oliver Hopefully it's that rather than someone downloading Genbank. :) Still plenty of FREE logins for the closed wifi are available at the registration desk for anyone who wants one.
- Dave Messina
Fun to see how quickly many of us have found a workaround for the 'Oops'... :)
- Egon Willighagen
Is there a place for posting jobs/student positions available to ISMB attendees? (i.e. http://tinyurl.com/brinkma...) I'm not at the ISMB this year, so am definitely checking out FF and related resources and thank you all for your posts!
- Fiona Brinkman
Does anyone know how to 'filter out' all group:ismbeccb2009 messages in my main feed? So that I can just open the ISMB/ECCB 2009 group in a separate window?
- Egon Willighagen
Thanks to everyone - organizers, reviewers, presenters, friends, bloggers etc etc etc : I must away, but had loads of fun! :)
- Allyson Lister
@Allyson, safe travels. @all, see you in Boston. Feel free to PM/mail me for tips where to stay, eat or party in the city.
- Oliver Hofmann
nice meeting you, Allyson, and have a safe trip!
- Ruchira S. Datta
I am leaving too, was nice meeting you all and have a good trip back.
- Roland Krause
i'm now at the Stockholm airport en route to Cambridge for another conference (!) http://www.functionalgenomics.org.uk/section... It was fun meeting some of you and microblogging together. Hope we'll see each other around at future conferences. Next year in Boston!
- Ruchira S. Datta
@Egon: if you go to the ismbeccb09 room and click the "(edit)" link to the right of "Lists" then you can deselect it from your Home feed.
- Lars Juhl Jensen
A top-ten of talks (well, not really - just the top ten of my ISMB posts based on number of hits). What was everyone interested in enough to click through to my blog? Find out: http://themindwobbles.wordpress.com/2009...
- Allyson Lister
Project priorities: just long term misses new customers, grant opportunities. Focus on money or project size misses pilot projects, does not allow diversification. Suggestion: based on merit, what allows the core to grow in new directions, expand, supports the institutional community in general
- Oliver Hofmann
Hiring: 50% FTE available for a new person, enough to get someone started. Identify new technologies, hire to get an early start. Consultants can fill gaps.
- Oliver Hofmann
Time: maintain an overview of timeframes (putative project starts). Wrap up projects to avoid task switching overhead as much as possible. Make time for the planning stages.
- Oliver Hofmann
Expectations: be open and transparent with regards to availability, feasibility, stick to realistic time estimates and turn down work if the resources are not available. Collaborate between Cores
- Oliver Hofmann
Cancer UK, Cambridge Core. Prioritize short tasks, genomic tasks. Use queuing system, use steering committee to guide on participation in long term researcher-based projects
- Oliver Hofmann
usually 6-10 projects per person at any given time (ouch...)
- Oliver Hofmann
Manage workload: define scope early, manage using collaboration software, deliver data in stages
- Oliver Hofmann
Try to find ways to pool information and resources (similar to the BOSC OpenBio projects)
- Oliver Hofmann
Switch of topics on how to handle next-gen seq influx
- Oliver Hofmann
What kind of questions are being asked, can they handle the data themselves, and is there any way to build re-useable workflows? Experience seems to be that so far no question (beyond the assembly step) has come up twice.
- Oliver Hofmann
key elements separated from transcriptional unit - total loss of function
- Mikhail Spivakov
altered chromatin organisation (originally position effect variegation)
- Mikhail Spivakov
Pax6, Spx2, Otx2 - genes with important role in eye development that are also important in the brain; continue to be expressed in the adult organism as well
- Mikhail Spivakov
Explore Pax6 long-range control elements - up to 180 kb away from the gene
- Mikhail Spivakov
use distant elements in reporter transgenic analysis to look at expression patterns they drive - different elements drive expression in different tissues at different times. however, find large overlap, so there's no easy rule according to which distant regulatory elements work together to establish the expression pattern
- Mikhail Spivakov
some distant elements are within a neighboring housekeeping gene
- Mikhail Spivakov
study deletions in various regions by using a reporter construct, compare with isolated enhancer function
- Mikhail Spivakov
find an element in Pax6 intron that drives expression in the (zebrafish) heart - unexpectedly
- Mikhail Spivakov
also find a Pax6-response element: autoregulation - Manuel et al, Development, 2007
- Mikhail Spivakov
a mutation in a Pax6 reg element located in the intron of the neighboring gene has been associated with (a mild form of) epilepsy
- Mikhail Spivakov
some enhancers work only at later developmental stages (eg, E17.5)
- Mikhail Spivakov
need methods for phenotype prediction based on regulatory mutations - eg, enhancer-driven knockouts
- Mikhail Spivakov
in zebrafish, there are two pax6 forms - pax6a (expressed in the brain), pax6b (expressed in the pancreas), both are expressed in the eye
- Mikhail Spivakov
the regulatory elements for the brain were lost from pax6b, pax6a has lost its neighboring gene (ie, the reg elements located within it), pax6b retained a part of it
- Mikhail Spivakov
haploinsufficiency causes campomelic dysplasia and sex reversal - the breakpoints can be very far away, the severity of phenotype decreases with the breakpoint position being further away - however mutations causing Pierre Robin syndrome may localise to a very specific long distance element ~1.1 Mb away from the gene, small deletions at 1.4 Mb upstream _AND_ 1.4 Mb downstream. How these elements work together is yet to be determined.
- Mikhail Spivakov
Another example: Van der Woude severe phenotype caused by intragenic mutations in IRF6; a regulatory variant of IRF6 (a SNP upstream of the gene, disrupting an AP2alpha binding site) predisposes to sporadic CLP (cleft lip and palate)
- Mikhail Spivakov
Discuss developmental abnormalities caused by mutations in SHH (sonic hedgehog)
- Mikhail Spivakov
a related phenotype is produced by an insertion in the intron of a neighboring gene - LMBR1
- Mikhail Spivakov
Pax6 and Sox2 interact to cross- and autoregulate; mutations in the "double" binding sites (ie for both) in different cells result in mutant phenotypes. Note that TF dosage is important.
- Mikhail Spivakov
Next talk: James P. Noonan, The Role of Developmental cis-Regulatory Change in Human Evolution
- Mikhail Spivakov
what makes us human? unique abilities <- developmental changes (brain, limbs etc). What's the molecular basis of these traits?
- Mikhail Spivakov
changes in gene regulation vs changes in genes themselves. perhaps there's a lot of the former (in the evolution of primates)
- Mikhail Spivakov
two approaches: 1) in vivo characterization of cis-regulatory modules with human-specific developmental functions, 2) whole-transcriptome comparative analysis
- Mikhail Spivakov
start with (1). how to identify developmentally active CRMs in the human genome?
- Mikhail Spivakov
the more conserved an element is the more likely it'll function as an enhancer
- Mikhail Spivakov
also look at chromatin signatures - for example, p300 binding sites are 90% predictive of enhancers (Visel et al, Nature 2009)
- Mikhail Spivakov
look for noncoding elements that are conserved across species but rapidly evolve in humans - human-accelerated conserved noncoding sequences (HACNSs), Prabhakar et al, Science 2006.
- Mikhail Spivakov
Developed stats models to measure evolutionary rates, estimate probability of human-specific substitution, calculate likelihood of observed pattern of substitutions in each noncoding element - estimate acceleration p-value. Identified 992 HACNSs out of ~10000 conserved elements
- Mikhail Spivakov
associate with neighboring genes -> GO analysis. only 1/734 functional categories showed significant association: cell adhesion! More detailed look - neuronal cell adhesion, axon guidance, synapse formation
- Mikhail Spivakov
Do HACNSs function as reg elements in vivo? Transgenic assays in mouse (select elements based on overall constraint and human-specific acceleration). Many elements show specific developmental patterns of expression in these assays
- Mikhail Spivakov
Focus on HACNS1 - associated with gbx2, highly conserved in all terrestrial vertebrates, evolving 4x faster than neutral rate in human, sequence changes are fixed. The sequence change is associated with a gain of function: human-specific anterior limb expression.
- Mikhail Spivakov
this gain of function has been achieved by 13 substitutions in the sequence compared to chimp (proved by "humanizing" the chimp region and "chimping" the human region by making these substitutions)
- Mikhail Spivakov
drives expression in the anterior-most digit at e13.5 - may contribute to the development of longer thumbs in the human compared to chimp/orangutan?
- Mikhail Spivakov
HACNSs: selection or biased gene conversion? Biased gene conversion - postulated to increase the fixation rate of AT to GC substitutions. Mutagenic effect of recombination hotspots + preference in recombination-associated DNA repair toward GC vs AT pairs.
- Mikhail Spivakov
Synergy between positive selection and BGC in HACNS1?
- Mikhail Spivakov
HACNSs are not biased towards high-recombining regions. The human-specific substitution rate is not elevated outside of HACNS1 - but GC rate is.
- Mikhail Spivakov
Move from genotype to phenotype: comparative gene expression profiling in developing cortex.
- Mikhail Spivakov
Mouse vs human fetal forebrain, look at gene expression profiles of several cortical stem cells by RNAseq. Purify cell populations by laser capture microdissection. (collaboration with Rakic/Ayoub @ Yale). Recover 50 ng of total RNA from ~10,000 cells, which is more than enough for RNAseq. Comparison of different cell populations in the developing mouse cortex - very high correlation of expression. Differentially expressed genes "make sense".
- Mikhail Spivakov
Ultimately, want to associate differentially expressed genes with HACNSs
- Mikhail Spivakov
Next talk: Steve Montgomery, Population genomics of human gene expression using next-generation sequencing technology
- Mikhail Spivakov
looked at two different quantitative traits: exon reads and splicing pairs
- Mikhail Spivakov
~80% reads map to known exons. 40% reads span multiple exons (insert size: ~150bps)
- Mikhail Spivakov
data normalisation of RNAseq data is still a challenge. used a simple scaling approach
- Mikhail Spivakov
test data filter criterion: <10% of individuals equal to zero
- Mikhail Spivakov
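The scaling and filtering steps noted above could look roughly like the sketch below. The notes do not say which scaling was used, so this assumes "simple scaling" means dividing each individual's counts by that individual's total and rescaling to a common depth. The function names, the `target` parameter and the data layout are hypothetical, not taken from the talk.

```python
def scale_counts(counts, target=1_000_000):
    """Scale each individual's exon counts to a common library size.

    counts: dict mapping individual -> dict mapping exon -> raw read count.
    Assumption (not from the talk): scaling = counts per `target` reads.
    """
    scaled = {}
    for individual, exon_counts in counts.items():
        total = sum(exon_counts.values())
        factor = target / total if total else 0.0
        scaled[individual] = {exon: c * factor for exon, c in exon_counts.items()}
    return scaled


def passes_zero_filter(values, max_zero_fraction=0.10):
    """Keep an exon only if fewer than 10% of individuals have a zero count."""
    zeros = sum(1 for v in values if v == 0)
    return zeros / len(values) < max_zero_fraction
```

For example, an exon with one zero among 20 individuals (5%) passes the filter, while one zero among 10 individuals (exactly 10%) does not.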
see ~ 11,000 genes, corresponding to ~95,000 exons; ~23,000 exon pairs -> from 1 lane per individual.
- Mikhail Spivakov
there's a decent correlation between any two individuals (same sample same run: corr=0.93, diff sample, same run: corr=0.89, diff sample, diff run: corr=0.86)
- Mikhail Spivakov
Correlation of exon reads within genes (test for freq of alternative splicing) - corr is good, but not perfect for exons with high mean read count, so some information is missed by array analyses that assume a fixed exon structure for each gene.
- Mikhail Spivakov
Increasing the number of lanes / individuals increases the number of identified splice variants => many alternatively spliced transcripts are rare
- Mikhail Spivakov
Detecting eQTLs with RNAseq data: assume an additive model of gene expression, for each SNP look at the number of reads, use spearman rank correlation to assess the association
- Mikhail Spivakov
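A minimal sketch of the association test described above: Spearman's rho is the Pearson correlation of the two rank vectors. It assumes genotypes are coded as minor-allele counts (0, 1, 2), which matches an additive model; the toy data and function names are hypothetical. The sketch does not compute a p-value (in practice that could come from a permutation test or from `scipy.stats.spearmanr`).

```python
def average_ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rho: Pearson correlation of the rank vectors.

    Undefined (division by zero) if either input is constant.
    """
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical toy data: one SNP, one exon, six individuals.
genotypes = [0, 0, 1, 1, 2, 2]
exon_reads = [10.0, 12.0, 19.0, 22.0, 30.0, 35.0]
rho = spearman(genotypes, exon_reads)
```

Working on ranks makes the test insensitive to the skewed distribution of read counts, which is presumably why a rank correlation was preferred over a linear fit.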
detected 2431 genes (4219 exons) associated with SNPs at p~0.01
- Mikhail Spivakov
eQTLs detected with exon counts centred around TSSs
- Mikhail Spivakov
consider using RNAseq to detect SNP association with known splice variants - took 1767 known splice variants from Ensembl, tested for association against splicing pair counts, data look promising
- Mikhail Spivakov
Q: what cells used? A: EBV-transformed lymphoblastoid cells. Q: is this a bit of a constraint to look only at these cells? A: looked at eQTLs in other lymphoblastoid cell lines collected long time ago, see a lot of correlation. Did not look at other tissues.
- Mikhail Spivakov
Q: did they look at 20% unannotated exons? A: No, can't do much until new gene models are built from them.
- Mikhail Spivakov
Q: where are the SNPs? Association with HCEs? Gene functional categories enriched for SNPs?
- Mikhail Spivakov
A: in 1Mb windows, see enrichment for eQTLs right around the TSSs rather than remote regulatory regions. Not a large number fall within known regulatory elements.
- Mikhail Spivakov
Next talk: Nadav Ahituv, Deconstructing gene regulatory elements
- Mikhail Spivakov
Geneticists are good at finding genes from sequence, can do much less with detecting regulatory elements - in the vast space of 98% of the genome that is non-coding. Also don't know the effect of nucleotide substitutions at regulatory elements.
- Mikhail Spivakov
What can we do about it? 1) use comparative genomics to find conserved non-coding regions. 2) high-throughput characterisation of enhancers
- Mikhail Spivakov
At Berkeley, looked at extremely conserved non-coding sequences (n~3000, 70% similarity, >=100bp human-fugu) and ultraconserved (n~250, 100% similarity). Many of them turn out to drive expression in transgenic assays in the mouse. Now moved to UCSF, are setting up the same assays in zebrafish - easier to move it to high-throughput analysis
- Mikhail Spivakov
enhancer.lbl.gov: resource for experimentally validated human noncoding fragments with gene enhancer activity as assessed in transgenic mice.
- Mikhail Spivakov
These data can be used in a variety of ways. They aim to understand the regulatory code at enhancers. Look for limb-specific signatures, check whether they see them at limb-associated genes, check for new regions with these signatures in the human genome, test in zebrafish (expression in fin), if positive check in mouse (expression in limb)
- Mikhail Spivakov
look at ~480 ultraconserved elements (>50% non-coding, 100% identity and >=200bp human-mouse-rat), depleted for SNPs
- Mikhail Spivakov
took four UCEs: positive in enhancer assays, near genes that when mutated lead to either a lethal or sexual development phenotype, variable in size
- Mikhail Spivakov
generated mouse knockouts for these UCEs - heterozygous KOs are fine, even homozygotes have no apparent phenotype!!!
- Mikhail Spivakov
maybe there's gene / reg module redundancy? It certainly exists - eg at UCEs upstream of ARX
- Mikhail Spivakov
maybe it's because mutations will cause gain-of-function (by binding of other TFs) rather than loss-of-function? in this case, deleting the whole thing altogether won't help identify this
- Mikhail Spivakov
support for this model: replaced a limb enhancer in mouse with a bat enhancer that has only 20bp difference -> observed limb elongation. Also bioinformatic support showing that these seq's are prone to substitutions and indels (?)
- Mikhail Spivakov
Take four UCE that they KO'd, put an 83bp indel and introduced a 3bp change, both chosen more or less randomly. Very preliminary results on the 83bp insertion into two of the enhancers: made mice, after 10 generations of backcross, het:het mating recovers fewer heterozygous and homozygous mice than expected.
- Mikhail Spivakov
Taverna is a workflow workbench to integrate tools (including web services).
- Gabriele Sales
Taverna can link ONDEX and external (ex. PubMed) data sources via the web-service interface.
- Gabriele Sales
When ONDEX works with Taverna, instead of using the pipeline manager you use the ONDEX web services and access ONDEX from Taverna. This means you can use Taverna to pull in data into ONDEX.
- Allyson Lister
Outline of the demo: starting from Jamboree Network SBML, parse it into Ondex, remove currency metabolites and annotate using network analysis results.
- Gabriele Sales
Then switch to Taverna. Identify orphans, retrieve related enzymes, assemble a PubMed query and link results to the graph.
- Gabriele Sales
the workflow for relevant pubmed entry retrieval seems staggeringly complex - but I guess this is because each input appears as a node in the workflow (is that correct ?)
- Jim Procter
@I think there are but the response to that question seemed to suggest that ondex was aimed at denovo network generation rather than massive visualization (this needs to be clarified)
- Jim Procter
Different large scale data sets, genetic, phosphorylation etc. exist but how to interpret?
- Roland Krause
Add protein annotations, sequence, structure, motifs, domains, functional characterization.
- Roland Krause
Particularly interested in interaction domains, analyzing cellular organization and interactomes. Can we discover and analyze recurring patterns?
- Roland Krause
Analogy to multiple alignment of protein sequences with PROSITE pattern
- Roland Krause
For network schemas, nodes are descriptions of proteins, e.g. domains. An extension of the network by attributes, e.g. PFAM or PROSITE.
- Roland Krause
Examples for network schema are homologous pathways.
- Roland Krause
No details of the algorithm, tricky problem but can solve it fast.
- Roland Krause
Triangles, quad topology (linear combination of four), Y star topology of four.
- Roland Krause
Automatic discovery of overrepresented schemas with direct interactions and sequence motifs. Using Y2H with filtering to address quality of the data set.
- Roland Krause
Counting occurrences of labelled subgraphs, brute force approach using Netgrep.
- Roland Krause
take each network, randomize it, count how often the annotation occurs in the randomized network and compute the overrepresentation term
- Ruchira S. Datta
Estimate false discovery rate, look at features with FDR < 0.05
- Roland Krause
count how often the score occurs in the random graph, to estimate the false discovery rate
- Ruchira S. Datta
when making the random graph with the pair schemas, preserve degree
- Ruchira S. Datta
for triplet schemas, also preserve annotation
- Ruchira S. Datta
For 3-vertex schemas, need to preserve pairwise annotation counts.
- Roland Krause
for higher order schemas, also preserve triplets
- Ruchira S. Datta
used Stub Wiring from Uri Alon's group for randomizing the pairwise schema
- Ruchira S. Datta
no known algorithm for randomizing uniformly, need to approximate
- Ruchira S. Datta
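The randomization for pair schemas described above might be sketched as follows. This is an illustration under stated assumptions, not the speakers' implementation: the overrepresentation term is not defined in the notes, so the sketch simply reports the observed count next to the mean count over randomized networks, and all function names and the toy network are hypothetical. Only the degree-preserving case (pair schemas) is covered; preserving pairwise annotation counts for triplets is not attempted.

```python
import random


def degrees(edges):
    """Node -> degree for an undirected edge list."""
    deg = {}
    for u, v in edges:
        deg[u] = deg.get(u, 0) + 1
        deg[v] = deg.get(v, 0) + 1
    return deg


def stub_wiring(edges, rng, max_tries=10_000):
    """Degree-preserving randomization: cut every edge into two stubs,
    shuffle the stubs, and re-pair them. Attempts that create a self-loop
    or a duplicate edge are rejected and retried."""
    stubs = [node for edge in edges for node in edge]
    for _ in range(max_tries):
        rng.shuffle(stubs)
        pairs = [tuple(sorted(stubs[i:i + 2])) for i in range(0, len(stubs), 2)]
        if all(u != v for u, v in pairs) and len(set(pairs)) == len(pairs):
            return pairs
    raise RuntimeError("no simple graph found; increase max_tries")


def count_pair_schema(edges, labels, a, b):
    """Count interactions whose two proteins carry annotations a and b."""
    wanted = {a, b}
    return sum(1 for u, v in edges if {labels[u], labels[v]} == wanted)


def observed_vs_random(edges, labels, a, b, n_random=200, seed=0):
    """Observed schema count and its mean over randomized networks."""
    rng = random.Random(seed)
    observed = count_pair_schema(edges, labels, a, b)
    random_counts = [count_pair_schema(stub_wiring(edges, rng), labels, a, b)
                     for _ in range(n_random)]
    return observed, sum(random_counts) / n_random


# Hypothetical toy network: every interaction links a kinase to an SH2 protein.
toy_edges = [("k1", "s1"), ("k2", "s2"), ("k3", "s3"), ("k1", "s2")]
toy_labels = {"k1": "kinase", "k2": "kinase", "k3": "kinase",
              "s1": "SH2", "s2": "SH2", "s3": "SH2"}
observed, mean_random = observed_vs_random(toy_edges, toy_labels, "kinase", "SH2")
```

Rejecting whole attempts is only practical for small or sparse networks; for a full interactome, repeated edge swaps would be the more usual way to get a degree-preserving random graph.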
Graph of the pairwise schema network.
- Roland Krause
For triplets, nodes are themselves pairwise schemas.
- Roland Krause
Hubs in the graph are ras, kinase and [...]
- Roland Krause
cross-interactomics: repeated the process in human
- Ruchira S. Datta
Example of the DUP family in yeast, and pair comparison between human and yeast.
- Roland Krause
Human-specific schemas, some consist of domains that also exist in yeast.
- Roland Krause
Q. Robustness of the resulting networks. A. Removed 5%, found the network to be stable. Different results in different organisms.
- Roland Krause
Q. Use of subgraph sampling? A. Can use any topology for searches.
- Roland Krause
# Not sure I understand the question and answer but they do.
- Roland Krause
Q. (Ruchira) Human-yeast network did you check orthology? A. Used PFAM annotation, did not worry about orthology. Some motifs in human are from the expansion.
- Roland Krause
Numbering all possible splice variant sites, assign ASCII symbol to each site, "alternative splicing code" that captures all differences between alternative splicing variation
- sebi
ESTs can be truncated, and still splicing events can be detected as a sub-structure
- sebi
"Bubbles" are cyclic graphs, complete events of splicing. Can construct bubble hierarchies for very complex splicing loci.
- sebi
Workflow: Gene annotation -> database of events -> visualization with ASTALAVISTA; optionally add RNA-Seq data for expression levels of events (even simulated data from the Flux Capacitator)
- sebi
Cohen et al, 2008 [not a very useful reference]: nominalizations are more difficult to handle than verbs, but can yield higher precision
- Michael Kuhn
use in-house tool to do annotation of nominalizations
- Michael Kuhn
alternations are very diverse, contrary to previous prediction that there would only be a limited number of alternations in scientific literature (sub-language model)
- Michael Kuhn
Much of biological experiments can fall under an ISA (Investigation-Study-Assay) structure.
- Allyson Lister
You should then use three types of standards: syntax (FuGE, ISA-TAB etc), semantics, and scope. MIBBI is all about scope.
- Allyson Lister
Why do we care about standards? Data exchange, comprehensibility, and scope for reuse.
- Allyson Lister
"Metaprojects": FuGE, OBI, ISA-TAB - draw together many different domains and present in structure/semantics useful across all.
- Allyson Lister
When the independent MI projects overlap, arbitrary decisions on wording and substructuring make integration difficult. This makes it hard to take parts of different guidelines - not very modular. This is what MIBBI helps with.
- Allyson Lister
MIBBI promotes gradual integration of checklists.
- Allyson Lister
The modular nature of molecular cell biology can be seen in today's large networks. A broad overview over the different, complex networks.
- Roland Krause
Many biological entities can be seen as networks, e.g. proteins and genes, measured by DNA binding, physical and genetic interactions, which can be combined into heterogeneous networks.
- Roland Krause
Edges are not born equal, we need edge-weights, easy in gene expression networks. SAGA and SWI/SNF example. Assign local weights to rank edges in the network
- Roland Krause
Data sets are not born equal. Small-scale data sets are more reliable than global approaches.
- Roland Krause
# Not sure I agree fully here, but he is merely motivating the weights approach.
- Roland Krause
Some data sets are trusted more than others.
- Roland Krause
Different scores for global and local weights can be combined.
- Roland Krause
Everything is interconnected, no way to interpret. Needs to be dissected.
- Roland Krause
Cliques, hubs, sets of neighbors can be used to find the modules. Graph clustering methods are unsupervised approaches, using MCL or betweenness centrality clustering.
- Roland Krause
Functional enrichment can be studied using miRNAs, KEGG, Reactome, GO, TFBS.
- Roland Krause
The statistical perspective can be computed by hypergeometric tests, could be normalized with the number of proteins in the module.
- Roland Krause
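The hypergeometric enrichment test mentioned above can be written down in a few lines. This is a generic sketch of the standard upper-tail test, not the speaker's code; the parameter names are hypothetical, and it does not include the normalization by module size that the note mentions as an option.

```python
from math import comb


def hypergeom_enrichment_p(population, annotated, module_size, hits):
    """P(X >= hits) when drawing module_size proteins without replacement
    from `population` proteins, `annotated` of which carry the term."""
    upper = min(annotated, module_size)
    total = comb(population, module_size)
    tail = sum(comb(annotated, k) * comb(population - annotated, module_size - k)
               for k in range(hits, upper + 1))
    return tail / total
```

For example, with 10 proteins of which 4 carry a GO term, a module of 3 proteins containing at least 2 annotated ones has p = (36 + 4) / 120 = 1/3, so it would not count as enriched.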
Graph partitioning to detect dense regions, then enrichment, only highlight those with high scores.
- Roland Krause
Analyze public data set from yeast, breaking down into hub based modules.
- Roland Krause
Most algorithms at the moment don't allow pseudoknots
- Cass Johnston
Minimum Free Energy approaches: Mfold, RNAfold etc. But many structures close to MFE
- Cass Johnston
Maximizing expected accuracy CONTRAfold etc.
- Cass Johnston
CentroidFold (their algorithm) is an MEA tool. Performs better than RNAfold, Sfold, CONTRAfold... (not sure what the test set was)
- Cass Johnston
Using homology to further improve accuracy of structure prediction (previous approaches: RNAalifold, McCaskill)
- Cass Johnston
Sankoff sequence/structure alignment of sets of homologous sequences plus MEA. Computationally unfeasible.
- Cass Johnston
Approximate the Sankoff method such that it is practical to run the method even for long RNA sequences
- Cass Johnston
Compared CentroidAliFold to other state of the art methods. Outperforms conventional secondary structure prediction (ie. MFE-based) and outperforms everything except RAF (comparable) for approaches using homology too.
- Cass Johnston
Method uses Nussinov-style dynamic programming to predict secondary structure. Maximises the sum of base pairing probabilities in the predicted secondary structure.
- Cass Johnston
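A rough sketch of the Nussinov-style recursion described above, following only the wording of the note: pick a nested (pseudoknot-free) set of base pairs that maximizes the sum of pairing probabilities. The published method may weight the probabilities differently; the function name, the `min_loop` parameter and the example probabilities are hypothetical, and in practice the probabilities would come from a partition-function model rather than being typed in.

```python
def max_pair_probability_structure(n, prob, min_loop=3):
    """Nussinov-style DP over a sequence of length n.

    prob: dict mapping (i, j) with i < j (0-based) to a pairing probability.
    min_loop: minimum number of unpaired bases enclosed by a pair.
    Returns (best score, sorted list of chosen pairs).
    """
    best = [[0.0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            score = max(best[i + 1][j], best[i][j - 1])   # i or j unpaired
            score = max(score, best[i + 1][j - 1] + prob.get((i, j), 0.0))
            for k in range(i + 1, j):                     # bifurcation
                score = max(score, best[i][k] + best[k + 1][j])
            best[i][j] = score

    # Traceback: replay the recursion to recover the chosen pairs.
    pairs, stack = [], [(0, n - 1)]
    while stack:
        i, j = stack.pop()
        if j - i <= min_loop:
            continue
        if best[i][j] == best[i + 1][j]:
            stack.append((i + 1, j))
        elif best[i][j] == best[i][j - 1]:
            stack.append((i, j - 1))
        elif best[i][j] == best[i + 1][j - 1] + prob.get((i, j), 0.0):
            pairs.append((i, j))
            stack.append((i + 1, j - 1))
        else:
            for k in range(i + 1, j):
                if best[i][j] == best[i][k] + best[k + 1][j]:
                    stack.extend([(i, k), (k + 1, j)])
                    break
    return (best[0][n - 1] if n else 0.0), sorted(pairs)


# Hypothetical probabilities: a three-pair stem competes with one strong pair.
example = {(0, 9): 0.9, (1, 8): 0.8, (2, 7): 0.7, (0, 5): 0.95}
score, pairs = max_pair_probability_structure(10, example)
```

In the example the single most probable pair (0, 5) is left out, because the nested stem (0, 9), (1, 8), (2, 7) has a larger total. The recursion is cubic in sequence length, the same order as the classic Nussinov algorithm.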
Question: Algorithm tested on structural RNAs, can it be adapted to handle mRNAs etc with more flexible structures? Answer: Possible, but non-trivial
- Cass Johnston
an exciting story, downstream of the bulk of computational biology in the medical field
- Ruchira S. Datta
process flow usually ends with finding and optimizing potential drug targets
- Ruchira S. Datta
Start when the drugs are available on the marketplace and they support personalized medicine, and which drugs to give to AIDS patients.
- Allyson Lister
Personalized medicine, they start when the drugs are in the market place
- Diego M. Riaño-Pachón
in this case, support difficult decision of a doctor: what drug to give to the AIDS patient
- Ruchira S. Datta
one new drug blocks attachments of virus to cell; another blocks fusion; 17 drugs block reverse transcription; 1 drug blocks integration; 8 block maturation
- Ruchira S. Datta
moving target: over 10 million virus particles turned over per day per patient
- Ruchira S. Datta
A drug may be efficient against the wild type, but not against mutants.
- Gabriele Sales
wild type viruses are most fit under natural condition; drug will be very effective on wild type but will very quickly select for resistance
- Ruchira S. Datta
Always going to be minority variants resistant to the drug which will escape drug treatment
- Oliver Hofmann
reverse transcriptase is a bad copier, enabling variation every time the virus replicates
- Ruchira S. Datta
There is no drug targeting all mutants.
- Gabriele Sales
Hence drug cocktails that catch all variants. Doesn't work, best we can do is postpone the onset
- Oliver Hofmann
therefore need drug cocktail that catches all of them, but this is utopia and doesn't happen; there is no drug therapy that works forever
- Ruchira S. Datta
HAART: highly active anti-retroviral therapy; administer at least two drugs of different classes (targeting different proteins, working in different ways)
- Ruchira S. Datta
number of viral RNA copies in the blood is a major clinical indicator
- Peter Menzel
therapy is effective for some time, until new strain develops that is resistant
- Ruchira S. Datta
50 copies is the current limit of detection for blood tests (unclear per what?)
- Oliver Hofmann
Detection limit: 50 copies of the virus.
- Gabriele Sales
this is the main question in treating patients, and is very difficult
- Ruchira S. Datta
so far only viral genome, not host genome, is being considered
- Ruchira S. Datta
people have built mutation tables: synopsis of global clinical experience of how virus responds to treatment
- Ruchira S. Datta
Mutation tables: collection of responses of the virus to various treatments.
- Gabriele Sales
In the past, they've built mutation tables - global collection of clinical experience
- Allyson Lister
but if there are too many mutations in the table, won't be able to administer therapy to any patient--every patient will have some of these
- Ruchira S. Datta
Tables limited because mutations are not acting independently.
- Gabriele Sales
mutation tables do not carry enough information..
- Peter Menzel
interdependencies cannot be captured by mutation tables; need rule-based expert systems
- Ruchira S. Datta
Mutations act in the context of the remaining genome (and the host genome)
- Oliver Hofmann
Mutation table ignores the context of mutations and synergies
- Diego M. Riaño-Pachón
unfortunately, the medical community calls these "algorithms"
- Ruchira S. Datta
virologists ask: "Is this kind of resistance analysis objective?" "Can we not let the clinical data speak for themselves?" i.e., circumvent political process of decisionmaking of what goes in the tables
- Ruchira S. Datta
Started by building a clinical database.
- Gabriele Sales
Then the computational biologists in his group enter the picture, at the request of the MDs
- Diego M. Riaño-Pachón
Using a linear SVM for regression: one line per drug, showing the estimated resistance factor, its normalization as a Z-score, and the scored mutations.
- Allyson Lister
use z-scores, as absolute values of resistance factors are not comparable btw drugs
- Ruchira S. Datta
Each drug with an estimated resistance factor, Z-score (for comparative purposes) and a list of scored mutations based on their weight
- Oliver Hofmann
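A rough sketch of how a linear model with scored mutations and a Z-score could fit together, as described in the notes above. It is not the speaker's system: the mutation names, weights and reference values are hypothetical, and the linear weights stand in for whatever the trained SVM regression actually produced. A negative weight plays the role of a resensitizing mutation.

```python
from statistics import mean, stdev


def predicted_log_resistance(mutations, weights, bias=0.0):
    """Linear model: bias plus the weight of every mutation that is present."""
    return bias + sum(weights.get(m, 0.0) for m in mutations)


def z_score(value, reference_values):
    """Express a prediction relative to a reference distribution for one drug."""
    return (value - mean(reference_values)) / stdev(reference_values)


# Hypothetical per-drug weights (not real data); mutC lowers predicted resistance.
drug_weights = {"mutA": 1.2, "mutB": 0.5, "mutC": -0.7}
# Hypothetical predictions for a reference population on the same drug.
reference = [1.0, 2.0, 3.0]

patient_value = predicted_log_resistance({"mutA", "mutB", "mutC"}, drug_weights)
patient_z = z_score(patient_value, reference)
# The scored mutations, strongest contribution to resistance first.
ranked = sorted(drug_weights.items(), key=lambda item: item[1], reverse=True)
```

Because each drug gets its own reference distribution, the Z-scores are comparable across drugs even when the raw resistance factors are not.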
this difficult patient is full of mutations, has resistance to every known drug per the mutation table
- Ruchira S. Datta
"out-therapy" - doctors say positively they can't help him any more
- Ruchira S. Datta
but they saw that some of these mutations actually resensitize!
- Ruchira S. Datta
Some mutations which confer resistance to some drugs (e.g. 76V in the anecdotal example) actually confer re-sensitisation to others and therefore have a positive effect. Couldn't have been found with mutation tables!
- Allyson Lister
Give one drug to retain the re-sensitisation mutation, add a second drug to exploit the re-sensitisation effect
- Oliver Hofmann
one mutation conferring resistance to two drugs resensitized the virus to two other drugs
- Diego M. Riaño-Pachón
this could not have been found via the mutation table; the patient was on the recommended therapy from March 2003 until April 2009 and blood was clear of virus
- Ruchira S. Datta
natural next question: predict in what direction the virus will evolve under a given drug therapy
- Ruchira S. Datta
simulated by fair mutations, but the virus does not mutate by flipping a fair coin, it chooses useful mutations (!) don't know how it does that
- Ruchira S. Datta
virus follows specific mutational paths into resistance
- Ruchira S. Datta
('chooses' is probably not the right word for this :) )
- Oliver Hofmann
I have a spare Vasa museum ticket from a friend who unfortunately can't make it :( His misfortune may be your gain. I'm in the keynote, left hand side on the aisle by the power cables. Find me there before I leave the keynote and the ticket's yours!
get 10 inferred new pathway members, sometimes see that the genes are just missing annotations
- Michael Kuhn
can also look for putative regulators
- Michael Kuhn
find Cyclin H as potent OxPhos regulator, validate experimentally
- Michael Kuhn
technical caveat: nonparametric estimation of background requires careful permutation strategy, Breitling R et al, 2008, PLoS Genetics
- Michael Kuhn
Thanks for the live-blogging Michael! Your notes made me realize I never included a reference to the PLoS Genetics paper describing the work. It is here: http://www.ncbi.nlm.nih.gov/pubmed...
- Andrew Su
Components such as R, Bioconductor; GUIs with collaborative views; Scripting (R/Python/Ruby); stateless web services; NFS/FTP/S3 storage; cluster/grid support
- Oliver Hofmann
The R Virtualization is like a mini-desktop.
- Allyson Lister
Server side spreadsheet sync'd on all clients.
- Gabriele Sales
Typically of the form GGGACTAAGGGACTTCCCACTTGG
- Roland Krause
Will form spontaneously, have role in transcriptional control and telomeres.
- Roland Krause
Are these patterns really stable? It's the first indicator of a functional role. Melting temp will be a proxy for stability
- Allyson Lister
Overrepresented in promoter regions [Huppert and Balasubramanian, 2007]
- Roland Krause
Melting temperature can be predicted and experimentally verified. It's low throughput though, rules are limited, complicated non-linear relationships.
- Roland Krause
Gaussian process (GP) regression with different error rates across the sequence.
- Roland Krause
Gives the posterior distribution of function values given a training set.
- Roland Krause
Needs a covariance function (kernel), a likelihood model and hyperparameters.
- Roland Krause
Product ansatz to construct a joint covariance function of concentration and sequence.
- Roland Krause
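To make the ingredients listed above concrete (covariance function, likelihood noise, hyperparameters, product of a sequence kernel and a concentration kernel), here is a small pure-Python sketch of GP regression. It is only an illustration under assumptions: the squared-exponential kernels, the numeric "sequence features", the zero-mean prior and all hyperparameter values are made up and are not taken from the talk.

```python
import math


def rbf(x, y, length):
    """Squared-exponential kernel on two equal-length feature vectors."""
    return math.exp(-sum((a - b) ** 2 for a, b in zip(x, y)) / (2.0 * length ** 2))


def product_kernel(p, q, seq_length=1.0, conc_length=1.0):
    """Joint covariance as a product: k_seq(features) * k_conc(concentration)."""
    (seq_p, conc_p), (seq_q, conc_q) = p, q
    return rbf(seq_p, seq_q, seq_length) * rbf([conc_p], [conc_q], conc_length)


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def gp_posterior(train_x, train_y, test_x, noise=1e-6, kernel=product_kernel):
    """Posterior mean and variance of a zero-mean GP at one test input."""
    n = len(train_x)
    k = [[kernel(train_x[i], train_x[j]) + (noise if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    k_star = [kernel(test_x, x) for x in train_x]
    alpha = solve(k, train_y)   # K^-1 y
    v = solve(k, k_star)        # K^-1 k*
    mean = sum(ks * a for ks, a in zip(k_star, alpha))
    var = kernel(test_x, test_x) - sum(ks * vi for ks, vi in zip(k_star, v))
    return mean, var


# Hypothetical inputs: ((sequence features), concentration) -> centred melting temperature.
train_x = [((3.0, 1.0), 0.1), ((4.0, 2.0), 0.5)]
train_y = [-5.0, 10.0]
near_mean, near_var = gp_posterior(train_x, train_y, train_x[0])
far_mean, far_var = gp_posterior(train_x, train_y, ((100.0, 100.0), 0.1))
```

Close to the training data the posterior variance shrinks towards the noise level; far from it the prediction falls back to the prior, which is where the wide error bars come from.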
With a 50/50 training split, the predictions (with the error bars) always overlap with the "truth" line, sometimes with a large uncertainty. Everything is predicted within 10 degrees C.
- Allyson Lister
The relevance of the features for the hyperparameters can be shown, one of the length parameters is most important.
- Roland Krause
Genome-wide GQ prediction in human identifies 359,548 candidate sequences.
- Roland Krause
60% is in the 10° Tm range (which is pretty good).
- Roland Krause
Are they functional and in promoter regions?
- Roland Krause
Quadruplexes are overrepresented in the promoter regions by an order of magnitude compared with anywhere else
- Allyson Lister
Weak hint; not much more can be expected from only 260 examples.
- Roland Krause
CG dinucleotide content in HG: 1%, expected: 4.5%
- Marcel Martin
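For reference, a small sketch of how an observed versus expected CG dinucleotide fraction can be computed for any sequence. The expected value here assumes C and G occur independently (product of the two base frequencies), which is presumably how an "expected" figure like the one above is derived; the function name is made up for illustration.

```python
def cpg_observed_expected(seq):
    """Return (observed CG dinucleotide fraction, expected fraction under independence)."""
    seq = seq.upper()
    n = len(seq)
    if n < 2:
        return 0.0, 0.0
    # Observed: share of the n-1 overlapping dinucleotides that are exactly "CG".
    observed = sum(1 for i in range(n - 1) if seq[i:i + 2] == "CG") / (n - 1)
    # Expected: frequency of C times frequency of G.
    expected = (seq.count("C") / n) * (seq.count("G") / n)
    return observed, expected


obs, exp = cpg_observed_expected("ACGT")
```

An observed/expected ratio well below 1 over the whole genome, with local regions near or above it, is the usual signature used to flag CpG islands.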
CpG islands: regions on DNA that contain many CpGs. 28000 islands annotated in HG. almost all of them are near gene promoters
- Marcel Martin
mDIP: methyl-DNA immunoprecipitation assay, similar to ChIP-chip. 244k DNA methylation array
- Marcel Martin
island methylation score (IMS): average signal over all probes mapped to the island. bimodal distribution. housekeeping genes are methylated (i.e., on one side of the distribution)
- Marcel Martin
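The score described above (the average signal over the probes mapped to a region) is simple to compute once a probe-to-region mapping exists. A minimal sketch, with hypothetical probe and region identifiers and made-up signal values:

```python
from collections import defaultdict
from statistics import mean


def methylation_scores(probe_signals, probe_to_region):
    """Average the array signal of all probes mapped to each region."""
    by_region = defaultdict(list)
    for probe, signal in probe_signals.items():
        region = probe_to_region.get(probe)
        if region is not None:  # probes outside any region are ignored
            by_region[region].append(signal)
    return {region: mean(values) for region, values in by_region.items()}


# Hypothetical data: p4 maps to no region and is skipped.
signals = {"p1": 1.0, "p2": 3.0, "p3": -2.0, "p4": 5.0}
mapping = {"p1": "r1", "p2": "r1", "p3": "r2"}
scores = methylation_scores(signals, mapping)
```

A histogram of these per-region scores is where a bimodal distribution would show up.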
approx 15 samples (different tissues). almost all are not methylated (~70%)
- Marcel Martin
Nature: Sp1 elements protect a CpG island from de novo methylation, Michael Brandeis et al, Nature 371, September 1994
- Marcel Martin
designed a new tiling array that covers all predicted UMRs
- Marcel Martin
conclusions: 4400 predicted regions were confirmed as UMRs. 923 of the UMRs are placed near known TSS. no one-to-one correspondence between CpG islands and nonmethylated regions. also: yes, there is tissue-specific methylation (didn't go into detail)
- Marcel Martin
e.g., "Consider a spherical cow..." A farmer hires a physicist to help with milk production, who takes half a year to produce a paper beginning thusly.
- Ruchira S. Datta
complexity of data and complexity of models: non-identifiable models still have robust properties
- Ruchira S. Datta
See e.g. Chen et al, Molecular Systems Biology 5:239, 2009
- Ruchira S. Datta
given only the order relations between the model parameters, can we provide robust first and second order solutions?
- Ruchira S. Datta
biological systems are hierarchical and multiscale (an observation, not a theorem)
- Ruchira S. Datta
structure: functional modules, motifs; scales: time scales, concentration scales
- Ruchira S. Datta
this makes it possible to neglect small quantities in favor of larger ones, given proper theory
- Ruchira S. Datta
asymptotic approximations in chemical kinetics: quasi-equilibrium (fast reactions), quasi-steady state, ..., quasistationary
- Ruchira S. Datta
enzymatic catalysis in quasiequilibrium vs quasistationary approximations gives very different results
- Ruchira S. Datta
rate limiting step: steady state rate is determined by slowest reaction in the chain
- Ruchira S. Datta
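A quick numerical illustration of the rate-limiting-step statement, for the simplest case I can think of: an irreversible cycle of monomolecular reactions A1 -> A2 -> ... -> An -> A1 with rate constants k_i and fixed total concentration b. At steady state every step carries the same flux w = k_i * c_i, so w = b / sum(1/k_i). The rate constants below are made up, and the cycle is an assumption chosen for illustration, not necessarily the example used in the talk.

```python
def cycle_stationary_rate(rate_constants, total=1.0):
    """Steady-state flux of an irreversible monomolecular cycle: b / sum(1/k_i)."""
    return total / sum(1.0 / k for k in rate_constants)


def cycle_stationary_concentrations(rate_constants, total=1.0):
    """Steady-state concentrations c_i = w / k_i (they sum to the total)."""
    w = cycle_stationary_rate(rate_constants, total)
    return [w / k for k in rate_constants]


# Well-separated constants: the slowest step (k = 1) dominates.
ks = [1000.0, 1.0, 500.0]
rate = cycle_stationary_rate(ks)
concs = cycle_stationary_concentrations(ks)
```

With well-separated constants the flux is close to the smallest k_i times b, and almost all of the mass piles up in front of the slowest step.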
what is the equivalent for a complex network?
- Ruchira S. Datta
dominant dynamical system (DDS): auxiliary minimal dynamical system which gives the main asymptotic terms of the stationary state and relaxation in terms of well separated time scales
- Ruchira S. Datta
monomolecular networks with time separation can be solved without exact knowledge of kinetic rates
- Ruchira S. Datta
Theorem: for such systems, the eigenvectors have only -1, 0, 1 components. These are determined only by the network topology and the ordering of the parameters.
- Ruchira S. Datta
DDS for linear networks: cycle gluing; cut into cycles and contract until have noncyclic system, then solve that
- Ruchira S. Datta
what about non-linear systems with non-monomolecular reactions?
- Ruchira S. Datta
if one concentration changes much more slowly, treat it as a parameter
- Ruchira S. Datta
1) identify linear or pseudo-linear subsystems; 2) neglect small quantities (use idempotent algebra)
- Ruchira S. Datta
model reduction preserves dynamics; need to identify critical and non-critical model parameters
- Ruchira S. Datta
characteristic functions are ratios of monomials in the initial parameters
- Ruchira S. Datta
cryoEM has become a standard tool for structural characterization of large protein complexes
- Anne Tuukkanen
Complex modeling using EM maps and minimizing an objective function. The objective function includes terms for geometric complementarity, a fitting score and a term for envelope penetration.
- Anne Tuukkanen
@Andrew - I have a ticket you can have for free - I've just come into possession of it. Meet at Victoria hall just as coffee starts? It'll be empty and you'll be able to see me
- Allyson Lister