A critical yet sober review of the arsenic life forms. I want to see arsenic DNA purified and an NMR spectrum taken, and I bet against arsenic taking the place of phosphate.
- Roland Krause
Most methods for prediction are genomic or sequence based but not structure based.
- Roland Krause
Protein docking: finding and scoring interaction poses between two proteins. Hard problem to distinguish interactors from non-interactors. Use of the Weng benchmark set of 65 complexes. Measure distribution of scores.
- Roland Krause
Build a decoy set and assume the decoys do not interact with the benchmark set. Some examples show separation, but the approach fails for several. 36 complexes rank better than 80% of the decoys.
- Roland Krause
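A minimal sketch of how a native complex could be ranked against a decoy set. The scores are made up, and the assumption that a higher docking score is better is mine, not from the talk.

```python
def fraction_of_decoys_beaten(native_score, decoy_scores):
    """Fraction of decoy scores strictly below the native score (assumes higher = better)."""
    beaten = sum(1 for s in decoy_scores if s < native_score)
    return beaten / len(decoy_scores)


# Hypothetical docking scores for one benchmark complex and its decoys.
decoys = [1.2, 0.8, 2.5, 1.9, 0.4, 3.1, 1.1, 0.9, 1.6, 2.2]
native = 2.4
print(fraction_of_decoys_beaten(native, decoys))
```

Counting how many benchmark complexes exceed a chosen fraction would give a summary like the one quoted above.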
Where on the interactors are docking solutions generated? Some AA are involved more frequently than others.
- Roland Krause
Still a way to go before predictions are possible, but a signal is present.
- Roland Krause
The problem with methods that try to optimize discrimination between natives and a set of decoys is that they usually find problems in the docking software / energy function. I.e. you'll discover that decoys have very weird electrostatics distribution etc... A good set of non-interactors would be key to develop this field further
- Nir London
scoring is indeed a bottleneck in docking. however there is a signal found in this work, it can be very helpful both for docking and prediction of interactions to understand where this signal comes from.
- Dina Schneidman
You think you have enough chalk for the task?
- Roland Krause
I was thinking of limiting it to the people who go all the way and publish their flawed results ... I *might* have enough chalk then
- Lars Juhl Jensen
The Wall of Shame should include the journals that "review" and publish those flawed results...
- Shirley Wu
I agree Shirley ... but I guess it would have to be done by someone who doesn't want to publish in those journals ;-)
- Lars Juhl Jensen
mentioned this before but might have been lost: if ISCB has to do something to support this crucial activity: let us know!
- Burkhard Rost
OK, then let's meet for lunch tomorrow. Please spread the word. It'll be easiest if we meet at 12:50 at the hallway leading to the ballrooms on the 3rd floor. Burkhard: Thanks for the support, we'll get back to you.
- Roland Krause
@Burkhard: most people that sat next to me had no idea about this room or about FF, despite the prominent slide and the links from the ISMB webpage. Perhaps if you show a screenshot before a keynote, awareness will increase?
- Mickey Kosloff
Afraid the horrible wireless connection made (live) blogging nearly impossible for many sessions. Was going to cover the late breaking research session today from 201, but no luck.
- Oliver Hofmann
Must have missed you. Got there a couple of minutes late and didn't see anyone.
- Mickey Kosloff
from iPod
Hmm, let's try to meet at the reception. It'll be a search in 2D space so we should find each other in finite time. I will be in room 302 for the remainder of the afternoon.
- Roland Krause
What about another projector projecting the feed, in talks by some daring presenters?
- Barb Bryant
I won't be able to make the reception. My 3 ideas are: separate (secure) wireless for bloggers. Incentives (e.g. Priority for plos-cb postcard publications). Wider campaign to recruit micro bloggers before ismb.
- Mickey Kosloff
from iPod
1. projection of the feed: don't get the reason. 2. wireless 4 blog: good idea, 3. plos-cb postcard slot: will talk to phil/plos, 4. other incentives: what you want?
- Burkhard Rost
Reason for wireless4blog is most likely due to low speed here
- arne
It's not a bandwidth problem, I had throughputs in the MB/s range. More likely, many of our computers have problems in these mixed settings with "smart" handling of receiver power etc. In Vienna, I had a clever tech guy who helped to disentangle these issues with my network card. What we (all) would really need is some sort of true hacker with proper equipment, but I have no good idea how to recruit such a person.
- Roland Krause
Can bottleneck be between local server and ISP ?
- Mickey Kosloff
from iPod
Supertree methods can be used to build trees from diverse trees on incongruent species sets and diverse data, e.g. morphological and phylogenetic data.
- Roland Krause
How to find the best representation of several trees. Distances between trees with different taxa.
- Roland Krause
Many reasons to obtain conflicting trees, some have legitimate reasons (hybridization etc) which can be modeled as phylogenetic networks.
- Roland Krause
Clusters are subsets of leaves, basically a hypothesis that at least one tree contains such a clade.
- Roland Krause
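To make the notion of clusters concrete, here is a small sketch that collects the leaf set below every internal node of a rooted tree. The nested-tuple representation and the taxon names are illustrative assumptions; this is not the CASS algorithm itself.

```python
def clusters(tree):
    """Return the clusters (leaf sets below each internal node) of a rooted tree.

    The tree is a nested tuple; leaves are strings.
    """
    found = set()

    def leaves_below(node):
        if isinstance(node, str):
            return frozenset([node])
        leaves = frozenset().union(*(leaves_below(child) for child in node))
        found.add(leaves)
        return leaves

    leaves_below(tree)
    return found


# Two hypothetical gene trees on the same taxa with conflicting topologies.
tree_a = (("rice", "maize"), ("wheat", "barley"))
tree_b = ((("rice", "maize"), "wheat"), "barley")

# The union is the input to a cluster-based network method:
# each cluster occurs as a clade in at least one tree.
all_clusters = clusters(tree_a) | clusters(tree_b)
```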
Clusters lose the topological information. Lets the user specify data that are trusted.
- Roland Krause
Minimize the number of reticulations, motivated by parsimony but not by evolution.
- Roland Krause
The CASS algorithm attempts to produce such networks and does so well (60 slides of math not shown)
- Roland Krause
Tested on different subsets obtained from a Poaceae grass data set (PWG 2001, Smidt 2003). Comparison to HybridInterleave, PIRN, Galled Trees and Cluster Network.
- Roland Krause
CASS works well in practice. Integrated in Dendroscope. Running times slightly longer than competitors.
- Roland Krause
Different lengths for a gene - total transcript length, 5'UTR length, introns, etc
- Roland Krause
Expression levels are collapsed to a single value by ranking each condition/tissue into 3 categories, then averaging and rounding the resulting ranks.
- Roland Krause
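One way to read this collapsing step, as a sketch: within each tissue, genes are ranked and binned into three equal-sized categories, and each gene's categories are then averaged across tissues and rounded. The equal-sized binning and the round-half-up rule are assumptions, not details given in the talk.

```python
def tercile_bins(values):
    """Assign each value a category 1 (low), 2 (mid) or 3 (high) by its rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    bins = [0] * len(values)
    for rank, idx in enumerate(order):
        bins[idx] = 1 + (3 * rank) // len(values)
    return bins


def collapse_expression(matrix):
    """matrix[t][g] is the expression of gene g in tissue t.

    Returns one rounded average category per gene.
    """
    per_tissue = [tercile_bins(row) for row in matrix]
    n_tissues = len(matrix)
    n_genes = len(matrix[0])
    collapsed = []
    for g in range(n_genes):
        mean_bin = sum(b[g] for b in per_tissue) / n_tissues
        collapsed.append(int(mean_bin + 0.5))  # round half up
    return collapsed


# Hypothetical expression values: 3 tissues (rows) x 3 genes (columns).
expression = [[1, 5, 9], [2, 6, 7], [3, 1, 8]]
print(collapse_expression(expression))
```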
link dynamics: new interaction maps to neofunctionalization, loss of interaction to subfunctionalization
- Roland Krause
He and Zhang 2005 calculated neofunc. The older the duplication, the more interactors.
- Roland Krause
Wagner 2001/2003: Ancestral protein would self-interact and form expected interaction. If no self-interaction, de novo interaction can be assumed
- Roland Krause
Binding domain in yeast-two hybrid is a dimer. Self-interacting proteins should not be observed.
- Roland Krause
Modeling neofunctionalization in a theoretical network.
- Roland Krause
Highlights the importance of self-interactions in network evolution.
- Roland Krause
motivation is to understand genetic basis of human diseases
- Dawei lin
Genetic basis of human diseases - important disease mechanisms and bio pathways remain unidentified
- Venkata P. Satagopam
gap in knowledge of human disease biology contributes to high failure rates in drug development
- Dawei lin
Why understand genetic mechanisms? (1) Important mechanisms remain unidentified (2) Gaps in knowledge cause the high failure rate in drug development
- arne
It will be a long way to know if the two motivating hypotheses are true
- Dawei lin
one of the most extensive research efforts on T2D. It scanned 100k people for 10 yrs
- Dawei lin
10 years later 50% progressed to have the disease
- Dawei lin
10 years of diabetes research - the outcome is - 50% of people with good lifestyle improved
- Venkata P. Satagopam
lifestyle has a bigger impact than Metformin
- Dawei lin
Diabetes study with 10-year follow-up of diabetes incidence and weight loss, "T2D". Randomized into treatments: lifestyle, metformin, placebo. Best drug makes relatively little difference in incidence; lifestyle intervention is better than drug but still doesn't help a whole lot.
- Barb Bryant
best prevention was extensive lifestyle changes (50% -> 40% incidence)
- Mickey Kosloff
Diabetes is not only a matter of life style
- arne
success rate in the current pharma industry is <5% of molecules entering clinical trials
- Venkata P. Satagopam
key attributes of genetic mapping: (1) unbiased by prior assumptions about pathways (2) saturation mutagenesis reveals pathways
- Dawei lin
many mutants -> reveals coherence of pathways
- Ted Laderas
These days we have other methods that are unbiased like expression profiling, but genetic mapping has some unique characteristics relative to these (he’ll explain in a minute).
- Barb Bryant
Drosophila mutations initially looked random; years later, almost all turned out to be related to pathways.
- Dawei lin
bottleneck is functional determination - biochemical approaches
- Ted Laderas
A lot of current knowledge can be traced back to genetic mapping
- Dawei lin
A slide based on Glazier et al., Science 2002
- Dawei lin
genetic mapping of human single gene disorders ...over 15 years Botstein paper in 1980, first genetic map in 1985 ....
- Venkata P. Satagopam
It took 10 years to find a marker for Huntington's disease
- Dawei lin
Once you find a linked region from genetic mapping, it still takes a long time to find the specific gene responsible.
- Barb Bryant
in the 1990's the idea was that common diseases were caused by rare mutations with large effects
- arne
"Chromosome shlepping" - Eric Lander's term for the identification of the very gene in some genomic region.
- Roland Krause
It is robust for finding Mendelian diseases but not common diseases
- Dawei lin
another approach: population genetics - QTL approach
- Ted Laderas
phenotypic variation is often continuous and may involve variation in many genes
- Dawei lin
Galton invented regression analysis to analyze the measuring of phenotypic data (heights of parents and offspring).
- Roland Krause
The biometric unit --- almost nothing was Mendelian
- arne
Most traits are continuously variable
- Ted Laderas
Francis Galton was a cousin of Darwin. Darwin didn’t explain the source of variation. Galton focused on this; he measured the heights of parents and their offspring, and found a relationship. He invented regression analysis to draw the line. The slope of the line is related to the heritability of the trait.
- Barb Bryant
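As an illustration of the regression idea, a least-squares line can be fitted to parent and offspring heights in a few lines. The numbers below are invented for the example and are not Galton's data.

```python
def least_squares_line(xs, ys):
    """Return (slope, intercept) of the ordinary least-squares line y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    return slope, mean_y - slope * mean_x


# Hypothetical mid-parent and offspring heights (inches).
parent_heights = [60.0, 64.0, 68.0, 72.0]
offspring_heights = [64.0, 66.0, 68.0, 70.0]
slope, intercept = least_squares_line(parent_heights, offspring_heights)
```

A slope below 1, as in this toy data, means offspring heights sit closer to the population mean than their parents' heights do.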
It was studied by the cousin of Darwin, Francis Galton (1885)
- Dawei lin
phenotypic variation is often continuous ... some history ... Francis Galton (1885), Ronald Fisher (1918), Hermann Muller (1920)
- Venkata P. Satagopam
This gave rise to the biometric movement – measure every living thing. Traits were related to genetic relatedness; and it wasn’t Mendelian. This led to the biometric-Mendelian debate.
- Barb Bryant
Ronald Fisher was actually a geneticist, who also invented the p-value and Fisher's exact test
- Dawei lin
Ronald Fisher (the one with the exact test) was also a geneticist.
- Roland Krause
Solved by assuming that phenotype often is an effect of several Mendelian genes.
- arne
Fisher: individual genes are mendelian, effects of genes additive
- Ted Laderas
Hermann Muller 1920 (Nobel Prize for X-ray induced mutations). PhD thesis not Mendelian trait, but truncate wing. Wasn’t Mendelian. Did genetic mapping.
- Barb Bryant
Hermann Muller decided to use broken wing of fruit fly to study non-Mendelian diseases
- Dawei lin
Muller 1920 paper: 4 chromosomes in fly – 3 contain genes that influence the trait truncate wing. Muller wrote about implications for human traits, like psychological traits. Said that traits were going to be too complicated. Said you could figure out by looking at population, but not looking at Mendelian inheritance in families.
- Barb Bryant
Muller 1920 suggested that such traits need to be studied at the population level.
- Dawei lin
mendelian fallacy - sub-populations are easily divisible in terms of risk
- Ted Laderas
Prediction will only be useful if there is an intervention that you would not use without the prediction. Otherwise, you should use the intervention anyway.
- Roland Krause
Huntington will not be a representative example - for most diseases/people identified risk will be <<100% even with full genetic information
- Mickey Kosloff
Cautionary tale - PSA prediction results in over-treatment, hasn't been shown that people live longer because of test
- Mickey Kosloff
Very cautious about PSA - no improvements on the mortality but many operations performed.
- Roland Krause
genetics offers a path to discover the underlying biology of human diseases; the great value will derive from pathophysiology and treatment
- Venkata P. Satagopam
When grouping mutations into pathways, up to 85% of GBM have a mutation in the most important pathways, while individual genes are down to a few %
- arne
Each oncogene may have relatively low frequency across patients; but when you group genes across pathways, a pathway may explain a large fraction of patients with a given type of cancer.
- Barb Bryant
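A toy sketch of the gene-level versus pathway-level view: a patient counts as altered for a pathway if any of its genes is mutated. Patient IDs and gene names are placeholders, not data from the talk.

```python
def altered_fraction(patient_mutations, genes):
    """Fraction of patients with a mutation in at least one of the given genes."""
    genes = set(genes)
    hit = sum(1 for mutated in patient_mutations.values() if genes & set(mutated))
    return hit / len(patient_mutations)


# Hypothetical cohort: each patient maps to the genes mutated in the tumor.
cohort = {
    "p1": ["GENE_A"],
    "p2": ["GENE_B"],
    "p3": ["GENE_C"],
    "p4": [],
    "p5": ["GENE_A", "GENE_D"],
}
pathway = ["GENE_A", "GENE_B", "GENE_C"]  # hypothetical pathway membership

per_gene = {g: altered_fraction(cohort, [g]) for g in pathway}
per_pathway = altered_fraction(cohort, pathway)
```

Each gene alone covers only a minority of this toy cohort, while the pathway as a whole covers most of it.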
can see a change in pathway activation between primary tumor and mets
- Mickey Kosloff
Dominant alterations change between cancer types and states.
- Roland Krause
GBM: copy number is rare (and noisier) Ovarian: more regular and higher
- arne
profiles of copy number variations differ between types of cancers
- Mickey Kosloff
Metastatic tumor samples have more copy number changes than primary tumors. Not surprising. But maybe primary samples with more copy number changes than others are more likely to metastasize? Generally, better outcome with fewer somatic copy number changes.
- Barb Bryant
BRCA1 and BRCA2 mutations convey germline inherited cancer risk
- Barb Bryant
These genes act in the homologous repair pathway. Half of all patients have mutations in some homologous repair pathway gene.
- Barb Bryant
and more generally, homologous repair genes are altered in > 50% of ovarian cancer
- Mickey Kosloff
Tumor suppressor genes can be inactivated in various ways: germline mutation, somatic mutation, epigenetic silencing, etc.
- Barb Bryant
There are drugs under development that might work particularly well in patients with defects in this particular pathway.
- Barb Bryant
Cancer genomics portal: www.cbio.mskcc.org/cancergenomics
- Barb Bryant
Instead of going through all the models that are possible, you derive statistical properties across a set of good models for each of the Wij weights in the model.
- Barb Bryant
This is sort of like partition functions in statistical physics
- Barb Bryant
after step 1 - generation of probability distributions then step 2- decimation
- Shannon McWeeney
So you have a probability distribution for each Wij, which represents the interaction between element i and element j. I'm not really getting how you "update" these probability distributions in the iterative steps. I do understand that at the end you take the most "certain" (narrowest) distribution and fix its value (some Wij) at the most probable value, then update all the other Wij's given this fixation. And so on. To get your final model in a sort of greedy fashion.
- Barb Bryant
And by the way, the underlying model is a simple differential equation sort of thing: change of one variable xi is a sigmoidal function of weighted (Wij) sum of all variables xj, less a decay term.
- Barb Bryant
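The model form described above can be written as dx_i/dt = sigmoid(sum_j W_ij * x_j) - decay_i * x_i. Below is a rough simulation sketch using simple Euler steps. The logistic form of the sigmoid, the per-variable decay rates and the example weights are assumptions for illustration; the talk did not specify them.

```python
import math


def sigmoid(z):
    """Logistic function (one possible choice of sigmoid; an assumption)."""
    return 1.0 / (1.0 + math.exp(-z))


def simulate(W, x0, decay, dt=0.01, steps=2000):
    """Euler integration of dx_i/dt = sigmoid(sum_j W[i][j] * x_j) - decay[i] * x_i."""
    x = list(x0)
    n = len(x)
    for _ in range(steps):
        dx = [
            sigmoid(sum(W[i][j] * x[j] for j in range(n))) - decay[i] * x[i]
            for i in range(n)
        ]
        x = [x[i] + dt * dx[i] for i in range(n)]
    return x


# Hypothetical two-node network: node 0 activates node 1, node 1 represses node 0.
W = [[0.0, -2.0], [2.0, 0.0]]
final_state = simulate(W, x0=[0.0, 0.0], decay=[1.0, 1.0])
```

Inferring the network then amounts to finding the W_ij values whose simulated responses match the measured perturbation data.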
Question: Interacting networks tend to be modular, with strongly-interacting subnetworks that interact weakly with each other. ...
- Barb Bryant
Chris: Is the modular approach really useful in confronting the data? [Is that what he said?]
- Barb Bryant
Question: can you get at causal relationships?
- Barb Bryant
Chris: yes - if the network model allows you to predict correctly the result of a particular perturbation applied to a particular node, then you can simulate using that model.
- Barb Bryant
Question: with a big network, how many experiments will you need to model?
- Barb Bryant
Chris: Good question. Could use an entropy measure. Help us figure this out. Help us design the experiments. It's important because of the costs of experiment. This is going to be broadly applicable in cell biology.
- Barb Bryant
bb - he said one should see if approach is useful by confronting with real data
- Shannon McWeeney
from BuddyFeed
Chris gets at the difference between a model that tells a story and a model that is truly predictive.
- Barb Bryant
Question: yes, but, what are the semantics of the graph? What kinds of interaction? Answer: The semantics are in the mathematics of your model.
- Barb Bryant
Question: mean field approach is interesting. Compared to Monte Carlo approach, you are assuming some decoupling. Loss of posterior coupling between weights - is that an issue?
- Barb Bryant
Chris: If you look at a coupled system overall, the extent to which the algorithms work depends on correlations within the system. Long-range (in terms of network distance) correlations are problematic. There are some clever approaches to handle some of this. Mentions non-ergodic space; deal with parts of space separately or iteratively.
- Barb Bryant
HMM based, states map to dinucleotides and therefore overlap.
- Roland Krause
Colors are not modeled, insufficient data to deduce the sequence.
- Roland Krause
Application of Forward-Backward algorithm gives distribution at each position.
- Roland Krause
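For reference, a compact sketch of the Forward-Backward algorithm on a toy two-state HMM. The states, symbols and probabilities are placeholders; the model in the talk uses dinucleotide states and color/letter emissions, which are not reproduced here. No scaling is applied, so this version is only suitable for short sequences.

```python
def forward_backward(obs, states, start, trans, emit):
    """Return the posterior distribution over states at each position of obs."""
    n = len(obs)

    # Forward pass: probability of the prefix ending in state s at position t.
    fwd = [{} for _ in range(n)]
    for s in states:
        fwd[0][s] = start[s] * emit[s][obs[0]]
    for t in range(1, n):
        for s in states:
            fwd[t][s] = emit[s][obs[t]] * sum(fwd[t - 1][p] * trans[p][s] for p in states)

    # Backward pass: probability of the suffix after t, given state s at position t.
    bwd = [{} for _ in range(n)]
    for s in states:
        bwd[n - 1][s] = 1.0
    for t in range(n - 2, -1, -1):
        for s in states:
            bwd[t][s] = sum(trans[s][q] * emit[q][obs[t + 1]] * bwd[t + 1][q] for q in states)

    total = sum(fwd[n - 1][s] for s in states)
    return [{s: fwd[t][s] * bwd[t][s] / total for s in states} for t in range(n)]


# Hypothetical toy model.
states = ["A", "B"]
start = {"A": 0.5, "B": 0.5}
trans = {"A": {"A": 0.7, "B": 0.3}, "B": {"A": 0.3, "B": 0.7}}
emit = {"A": {"x": 0.9, "y": 0.1}, "B": {"x": 0.2, "y": 0.8}}
posteriors = forward_backward(["x", "x", "y"], states, start, trans, emit)
```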
Extension to indels and heterogeneous SNPs. Add several gap characters (by color), increasing to 1600 states. Not so problematic due to sparse transition probabilities.
- Roland Krause
Support quality values and variable error rates for emissions.
- Roland Krause
Performance similar to individual color space or letter space data. Major improvements with mixed data.
- Roland Krause
Differential equation modeling of various aspects of Notch signaling, incl. binding, localization, translation, transcription. First order models for formation of biomolecules.
- Roland Krause
Bistability in the Notch-Hes1 network. The switching points can be determined; the system undergoes hysteresis.
- Roland Krause
Provides the cell with noise filtering.
- Roland Krause
Response to a transient Delta signal. Signal can be short and high or continuous to switch the response on.
- Roland Krause
Sensitivity to model parameters. Loss of bistability turns oncogenes on.
- Roland Krause
Some parameters are not critical. Key parameter is repressive constant of Hes1. Systems start oscillating at some point. Bridged by a brief monostable state.
- Roland Krause
Nice bridge between the biological and a complex model, well presented.
- Roland Krause
Manolis Kellis: Systems level view of transcription. Regulatory networks across species using conservation. Effects on top of nucleosome positioning: The histone code leads to a multitude of combinations.
- Roland Krause
[...] Signatures of transcription factor binding and nucleosomes in different cell lines. Dips in chromatin signal hint at TF sites, associated with conservation. Many cell type specific dips.
- Roland Krause
William Stafford Noble: Segmentation for chromatin states. Very general talk with lots of colorful plots and no formulas (regrettably if you ask me).
- Roland Krause
Chris Bock: Biomarker development from epigenomics
- Roland Krause
Epigenetic aberrations that lead to cancer could be reversed; a handful of examples are in the clinic.
- Roland Krause
Epigenetic biomarkers are detectable earlier than genetic changes and possibly from blood.
- Roland Krause
Showcase example SEPT9 promoter methylation (diagnostic). MGMT promoter methylation as therapy selection. LINE repeat meth to monitor effect of demeth drugs.
- Roland Krause
Search for biomarkers promising. Bioinformatics challenge in distinguishing tissues, not mechanistic inferences. A variety of technologies exist. Four selected for this study: MeDIP, MethylCap, RRBS (sequencing based, bisulfite) and Infinium (adapted microarray, bisulfite).
- Roland Krause
Benchmarking: Good agreement between bisulfite methods. Enrichment methods display sequence biases with low correlations. Repeats show spurious hits.
- Roland Krause
Linear models can correct for sequence biases.
- Roland Krause
Differentially methylated regions detection using Fisher's exact test.
- Roland Krause
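A self-contained sketch of a two-sided Fisher's exact test on a 2x2 table of methylated versus unmethylated read counts in two samples. The read counts are invented, and how the talk's pipeline actually builds its tables is not stated; this only shows the test itself.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def prob(x):
        # Hypergeometric probability of x in the top-left cell with fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_observed = prob(a)
    low, high = max(0, col1 - row2), min(row1, col1)
    # Sum all tables that are at most as likely as the observed one
    # (small tolerance guards against floating-point ties).
    return sum(
        prob(x) for x in range(low, high + 1) if prob(x) <= p_observed * (1 + 1e-9)
    )


# Hypothetical region: methylated / unmethylated reads in sample 1 and sample 2.
p_value = fisher_exact_two_sided(18, 2, 5, 15)
```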
Developed in the process of high-throughput screening at the Broad Institute. 300,000 compounds are screened (somehow) to produce a ranked list, a distribution across a response variable.
- Roland Krause
Find threshold to verify the top n compounds on the list.
- Roland Krause
How many hits should be sent to confirmatory experiments?
- Roland Krause
Most commonly, people simply guess, often arbitrary and unfair. FDR is typically preferred but it's argued that it's just as arbitrary.
- Roland Krause
Two flaws of FDR: You need to pick an FDR cutoff, and FDR assumes that everything different from the negative controls is interesting, which is often not the case in compound testing.
- Roland Krause
Key question: What is the most profitable number of confirmed activities to produce? Think supply-demand curves.
- Roland Krause
Determine the supply cost (how many positives are produced) and the demand curve by the cost of finding one more active compound.
- Roland Krause
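A rough sketch of the marginal-cost idea, not the speaker's actual method: walk down the ranked list and keep sending compounds to confirmation while the expected cost of finding one more active stays below what an active is worth. The confirmation probabilities, the per-test cost and the value per active are all hypothetical inputs.

```python
def compounds_to_confirm(confirm_probs, cost_per_test, value_per_active):
    """Number of top-ranked compounds worth sending to confirmatory testing.

    confirm_probs: estimated confirmation probability per compound, in rank order
    (assumed non-increasing). The expected cost of one more active at a given rank
    is cost_per_test / probability.
    """
    n = 0
    for p in confirm_probs:
        if p <= 0 or cost_per_test / p > value_per_active:
            break
        n += 1
    return n


# Hypothetical ranked list: confirmation probability drops with rank.
probs = [0.9, 0.5, 0.2, 0.05]
n_confirm = compounds_to_confirm(probs, cost_per_test=100.0, value_per_active=1000.0)
```

The cutoff moves with the budget: a cheaper confirmatory assay or a more valuable target pushes the threshold further down the list.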
(Unpublished work in scaffold discovery)
- Roland Krause
Q: Assumptions on distributions. A: Fewer assumptions than FDR. Only assumption is that there is some signal in the original test.
- Roland Krause
Q: Power comes from the option to go back to the screen. Not easily possible in microarray experiments. A: Testing happens in MA analysis and he recommends using similar verifying settings there, too.
- Roland Krause
Q: How to find the demand curve? A: Should only be controlled by the costs (funding, importance). Does not need to be explicitly modeled.
- Roland Krause
Main improvement is to tie follow up experiments to real costs rather than some rather abstract statistic.
- Roland Krause
# The late breaking talks seem to be very interesting, room 305 is packed with ~80 people.
- Roland Krause
# Moving the LBR track to 201; constantly been packed before
- Oliver Hofmann
Truncated proteins might interfere with physiological function (dominant negative). The cell removes such transcripts through nonsense-mediated decay (NMD).
- Roland Krause
From Chris Sander's lab at Sloan-Kettering. Small RNAs regulate gene expression. Argument of this talk: a low degradation rate will result in large changes, a high degradation rate in small changes after RNAi.
- Roland Krause
not all about the site - some transcripts "not-targetable" - Krueger et al., Oligonucleotides 2007
- Shannon McWeeney
from BuddyFeed
siRNA rules have been designed focussing on the target site so far.
- Roland Krause
Theoretical model for high turnover mRNAs. More difficult to perturb. depends on pre-existing turnover and independent of transcription rate
- Shannon McWeeney
from BuddyFeed
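A back-of-the-envelope version of this argument, assuming first-order kinetics and that the siRNA simply adds an extra degradation rate k_si (both assumptions, not the talk's exact model). At steady state m = k_tx / k_deg; with siRNA m' = k_tx / (k_deg + k_si), so the remaining fraction is k_deg / (k_deg + k_si) and the transcription rate cancels out.

```python
def steady_state(k_tx, k_deg):
    """Steady-state mRNA level for dm/dt = k_tx - k_deg * m."""
    return k_tx / k_deg


def remaining_fraction(k_deg, k_si):
    """Fraction of mRNA left after adding siRNA-induced degradation k_si."""
    return k_deg / (k_deg + k_si)


k_si = 0.4  # hypothetical siRNA-induced degradation rate (per hour)

stable = remaining_fraction(k_deg=0.1, k_si=k_si)    # slow-turnover transcript
unstable = remaining_fraction(k_deg=2.0, k_si=k_si)  # high-turnover transcript
```

In this toy setting the stable transcript is knocked down far more strongly than the high-turnover one, whatever the transcription rate.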
mRNA half life varies between minutes and days. Will they respond differently to RNAi?
- Roland Krause
Validation in a simple reporter system, stabilization via (AUUU)n extension.
- Roland Krause
Evaluation in HeLa cells. siRNA targeting high-turnover transcripts is weak. Results not large but significant. Sequence property will explain more. Improved correlation with multiple siRNAs for the same gene.
- Roland Krause
Nice talk, clear hypothesis and conclusion.
- Roland Krause
Thanks for the heads-up - I may have looked at this as a course recommendation, from the title.
- Richard Badge
from Nambu
Ouch indeed. Not explaining the use of extreme value distributions in BLAST is a major oversight in a book about statistical bioinformatics. The focus on statistical analysis of microarray data is also highly unfortunate considering that microarrays are rapidly becoming obsolete due to advances in sequencing technologies.
- Lars Juhl Jensen
The really bad thing is that your criticism could be extended to a large number of books on the market. OK, Sturgeon's law will apply, but I find the situation with textbooks for bioinformatics particularly poor. Any suggestions beyond the Durbin?
- Roland Krause
@Jeremy, suggestions on a good book (intro) to the field?
- Jamie McQuay
Great title though. I want to read this book written by somebody else
- Dave Lunt
from iPhone
I will let you know when I find a decent statistical bioinformatics book. I haven't done a comprehensive search.
- Jeremy Leipzig
Author wrangling among researchers: who belongs on the title page - and who doesn't - SPIEGEL ONLINE - Nachrichten - UniSPIEGEL - http://www.spiegel.de/unispie...
Author order is simply a relevance criterion like Impact Factor, page impressions and Google rank, and it has similar problems. People who are at home in the respective field know how to deal with it. The article is quite cleanly researched, but it fails to mention that many journals such as PLoS have already introduced statements of each author's contribution.
- Roland Krause