Roland Krause › Likes

Keynote: Michael Ashburner - From sequences to ontologies - adventures in informatics
"The father of ontologies in biology" speaks without slides. - Roland Krause
One well turned phrase is worth a thousand power points.... - Shannon McWeeney
Almost 50 years after he started his undergrad in Cambridge in geology to venture into paleontology. His developing interest in zoology was not matched by the respective department. Was asked to be the chairman of the department. - Roland Krause
PhD on Drosophila in Cambridge, PostDoc at Caltech. - Roland Krause
Stresses the importance of a godfather for young scientists, as a role model etc. - Roland Krause
No problems in funding in the 60s and early 70s. - Roland Krause
Six months in Bruce Alberts' lab. # Not so easy to keep up with all the great people that Michael Ashburner worked with - Roland Krause
No knowledge of repetitive DNA, formulation of the C-value paradox. - Roland Krause
Work on Drosophila alcohol dehydrogenase spanning 20 years. - Roland Krause
Sequencing of ADH of 4 species using radionucleotides led to a PhD and a Nature paper. Almost no software available, major hardware incompatibility. Only one ARPAnet node in Europe, at the University of London. - Roland Krause
Initially 120 baud bandwidth. - Roland Krause
1983, first version of the EMBL database, came on magnetic tape 60kB. First thing: Print and read annotations. - Roland Krause
at that time size is 60kb - Venkata P. Satagopam
No relational integrity, lots of integrity. Moaning about it led to a position on the advisory board. - Roland Krause
Promoter of the establishment of the EBI at the Cambridge site. No genomics at the EMBL in the 80s. Raised 30 M GBP to convince the EMBL council. - Roland Krause
Gopher! - Roland Krause
Flybase establishment. Built in Sybase, output in files, distribution via Gopher. Later, contact with Amos Bairoch and Expasy led to use of a webserver. - Roland Krause
AceDB memories. - Roland Krause
Hierarchical, structured language use in FlyBase, extension to other model organisms. - Roland Krause
Presented in 1997 at the ISMB in Greece. - Roland Krause
Whitepaper for the 1998 ISMB in Montreal. Lots of resistance from various groups. - Roland Krause
Gene Ontology sealed in 1999, support from the yeast and mouse databases. - Roland Krause
Incyte had a patent on controlled vocabulary. (Utterance not reproducible here) - Roland Krause
Drosophila genome project, finalized in 6 months, genome annotation jamboree. - Roland Krause
HL20: Philip Kim - Bringing order to protein disorder through comparative genomics and genetic interactions
Bellay et al., collaboration with Chad Myers and Gary Bader. - Roland Krause
Many proteins contain disordered regions, which are difficult to study and tend to be neglected. The focus is usually on domains, but interdomain regions contain many relevant binding motifs. - Roland Krause
General systematic survey of disordered interactions. Disorder is correlated with genetic interaction degree; this holds for other data sets and has been known for years. - Roland Krause
gene interaction network as way to define functional disorder - Shannon McWeeney
Paradox: disordered proteins evolve fast, disordered hubs are conserved at the gene level. - Roland Krause
Many disordered regions are conserved but not their content. - Roland Krause
detect level of conservation in regions characterized by high level of disorder - Shannon McWeeney
use of disorder conservation score and sequence conservation score - allow elucidation of 3 classes of disorder - Shannon McWeeney
3 classes: constrained disorder (residue); flexible disorder (residue); non-conserved disorder (residue) - Shannon McWeeney
different functions with each of these classes - Shannon McWeeney
Constrained disorder in the chaperone Hsp90 forms a loop on the surface. Might work on the refolding of other proteins. - Roland Krause
cancer driver mutations are enriched in constrained disorder - Shannon McWeeney
Q: Does your work suggest that with the usual tools we are not able to detect conservation of disordered regions? A: Yes. There are different properties of disorder. We look at more data types. - Roland Krause
Q: Hubs vs non-hubs? Previous speakers showed a correlation with multiple functions. Is there a correlation? A: It's basically the same thing. Q: The floppy disordered regions allow it to do more things? A: Hmm, if you have more disordered regions, you have more opportunity. - Roland Krause
Q: Strong signal in the location of the protein, context important. A: Not explicitly looked at. - Roland Krause
Q: Definition of disorder: Is there a minimum number of AAs to define disorder? A: Might be a technical issue. We use DISOPRED2. Some artifacts come from sliding-window approaches, which need a minimal window. - Roland Krause
HL16: Insuk Lee - Predicting genetic modifier loci using functional gene networks
Changed title to "Finding missing inheritability using functional gene networks" - Barb Bryant
This is based on Lee et al, "Predicting genetic modifier loci..." and Lee et al "Prioritizing candidate disease genes by network-based boosting..." - Barb Bryant
GWAS is currently the most powerful approach for studying complex traits. 862 GWAS papers have been published, reporting more than 4306 trait-associated SNPs. - Barb Bryant
How much trait can we explain by these SNPs? Nature Feb 2011 paper: ranging from 12 to 60% of heritability of traits. - Barb Bryant
Two reasons for missing heritability: (1) epistasis (non-additive genetic effects), and (2) low statistical power to detect the weak genetic penetrance of each SNP. - Barb Bryant
Gene-trait mapping using gene networks to explain more. - Roland Krause
Another approach: gene-trait association mapping. Construct functional gene networks using a Bayesian framework - Fraser & Marcotte, Nat Gen 2004; Lee et al Science 2004. - Barb Bryant
Combine different types of relationships to get better networks. - Barb Bryant
They look to see whether genes of interest are highly connected within the network. If so, then the network is a good model for the genes, or for the phenotype that the genes are involved in. - Barb Bryant
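A minimal sketch of one way to check whether a gene set is "highly connected" in a network: count the edges that fall inside the set and compare against random gene sets of the same size. This is not necessarily the measure used in the talk; the function names, the toy network and the permutation test are assumptions made for illustration.

```python
import random


def within_set_edges(edges, genes):
    """Count network edges whose two endpoints both lie in the gene set."""
    genes = set(genes)
    return sum(1 for a, b in edges if a in genes and b in genes)


def connectivity_pvalue(edges, genes, n_random=1000, seed=0):
    """Empirical p-value: how often does a random gene set of the same size
    have at least as many internal edges as the set of interest?"""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    all_genes = sorted({g for edge in edges for g in edge})
    size = len(set(genes))
    observed = within_set_edges(edges, genes)
    hits = 0
    for _ in range(n_random):
        sample = rng.sample(all_genes, size)
        if within_set_edges(edges, sample) >= observed:
            hits += 1
    # Add-one correction keeps the estimate away from exactly zero.
    return (hits + 1) / (n_random + 1)


# Hypothetical toy network: A, B, C form a tight triangle, the rest is a chain.
toy_edges = [("A", "B"), ("B", "C"), ("A", "C"),
             ("C", "D"), ("D", "E"), ("E", "F"), ("F", "G")]
print(within_set_edges(toy_edges, {"A", "B", "C"}))
print(connectivity_pvalue(toy_edges, {"A", "B", "C"}))
```

A small p-value suggests the network models the gene set (or its phenotype) well; in practice the random sets would often be matched for degree as well.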
They tested by looking at 100 knock-out (KO) phenotypes from the literature - McGary et al Genome Biology 2007 - Barb Bryant
So this is yeast - how does this work in animal and plant, with many different tissue and cell types? - Barb Bryant
WormNet: works pretty well even if you ignore tissue type. Also AraNet, HumanNet (Genome Biology 2011). - Barb Bryant
Back to missing heritability. - Barb Bryant
They predicted about 174 new Rb suppressors using 5 known ones as seeds. Then they tested loss-of-function phenotype using RNAi, and validated 16 new suppressors. This is in worm. - Barb Bryant
Hypothesis is that known disease gene modifiers that are well connected are likely to point to other modifiers for the same disease gene. Lee, Lehner et al Genome Research 2010. - Barb Bryant
They tested the top 3 genes with high connectivity in worm. They see a 7-fold enrichment compared to the original semi-random screen. - Barb Bryant
On to the second problem: low statistical power. With multiple testing, you need lots of data. Many of the minor contributors are below the significance threshold. - Barb Bryant
Can a network help? - Barb Bryant
Lee, Blom, et al Genome Research 2011. - Barb Bryant
If the genes that are close to significant belong to the same pathways, we will rescue them. We want them to be in the same pathways as the definitely-significant genes. Use HumanNet to look for "soft" guilt-by-association. - Barb Bryant
They looked at the Wellcome Trust Case Control Consortium. They validated boosted genes by comparing to genes that were later validated with a larger sample size. Examples include STAT3 and JAK2 in Crohn's disease, and GRB2 and SHC1 in gastric ulcer healing. - Barb Bryant
I don't have a clear understanding about how boosting compares to just relaxing the P-value threshold in the first place -- does he explain that? I guess he must. - Barb Bryant
Another example in type 2 diabetes... - Barb Bryant
A trait is a complex system composed of many genes. To model the system, first collect massive amounts of heterogeneous biological data, and create a "social network" of genes. From here, we get networks and can carry out the analyses described above. - Barb Bryant
I wonder if they are cataloguing the results of their analyses across many diseases with known disease genes. - Barb Bryant
PT17: Regev Schweiger - Generative Probabilistic Models for Protein-Protein Interaction Networks – The Biclique Perspective
Introduction by Rob Russell: The subject area Networks was second in terms of submitted papers. - Roland Krause
Proteins don't work alone. TAP, Y2H interactomes in yeast. Databases. - Roland Krause
(Hairball) - Roland Krause
Introduction to random graphs (Erdős-Rényi, geometric) - Roland Krause
Degree distribution and clustering are not described by the above models. Alternatives are preferential attachment (PA) and duplication-divergence (DD). - Roland Krause
How to test a graph model against a real-world network? Degree distributions, network motifs and dense subgraphs are used for comparison. - Roland Krause
Contribution: maximal bicliques. Bicliques are naturally related to the DD model - Roland Krause
Introduced for web graphs, found in yeast interactome. - Roland Krause
DIP set, plot all bicliques in heat map. Large range. - Roland Krause
Generate many model networks, count the maximal bicliques in each, and compare to the real network by the sum of differences of logs. - Roland Krause
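A rough sketch of the comparison described above, assuming the biclique counts are stored per biclique size (m, n). The notes only say "sum of difference of logs"; the absolute value, the +1 pseudocount for sizes missing from one network, and all names and numbers here are assumptions for illustration.

```python
import math


def biclique_distance(real_counts, model_counts):
    """Sum over biclique sizes of |log(real + 1) - log(model + 1)|."""
    sizes = set(real_counts) | set(model_counts)
    return sum(
        abs(math.log(real_counts.get(s, 0) + 1)
            - math.log(model_counts.get(s, 0) + 1))
        for s in sizes
    )


def best_model(real_counts, candidates):
    """Return the name of the candidate whose counts are closest to the real network."""
    return min(candidates,
               key=lambda name: biclique_distance(real_counts, candidates[name]))


# Hypothetical counts of maximal bicliques keyed by size (m, n).
real = {(2, 2): 120, (2, 3): 40, (3, 3): 5}
candidates = {
    "PA": {(2, 2): 15, (2, 3): 2},
    "DD": {(2, 2): 100, (2, 3): 30, (3, 3): 3},
}
print(best_model(real, candidates))
```

Scanning a model's parameter range then amounts to adding one candidate per parameter setting and keeping the minimum.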
Should become independent of parameters: scan the range of parameters and select the best for the real data. - Roland Krause
DD is similar, larger cliques, outperforms PA. Seeds are important. - Roland Krause
More simulations, best fit with inverse geometric random seed model. DD still outperforms PA. - Roland Krause
Most of the bicliques are in the seed of the PA. In DD, it's only 5.1%. Removing the seed bicliques shows stronger effects. - Roland Krause
Q: Influence of abundance? A: Should be independent. - Roland Krause
Q: Have you tested other measures? A: Yes (not presented). DD gives best results. - Roland Krause
Q: Did you use only Y2H data? A: Yes, should be investigated more. - Roland Krause
Q: Single large-scale experiments rather than DIP? A: No. - Roland Krause
Q&A (Rob Russell) Biological relevance? The DD network model makes more sense. [...] - Roland Krause
Q: Inclusion of smaller cliques, overcounting. Term is not right as nodes are connected with one clique? A: Hmm, yes, should be named differently. - Roland Krause
Q: Sequence similarity? A: Others have used this to come up with binding motifs. - Roland Krause
Q: Known quality of PPI - disassortativity of hubs, other interaction networks such as TFs? A: No, tested it a bit, few organisms have larger networks, TF networks are directed. - Roland Krause
Q: Distribution of the counts, is the average a good measure? A: It's different for each model and set of parameters, variance rather small. - Roland Krause
There's a poster. - Roland Krause
Keynote: Janet Thornton - The Evolution of Enzyme Mechanisms and Functional Diversity
10 year Keynote for ECCB - Shannon McWeeney
Special call-out to Elixir session today at 2:30 Hall F2 - Shannon McWeeney
Trying to understand life from molecules to systems - Venkata P. Satagopam
She's a "data junkie" -- everything depends on having your data properly organized and being able to extract information from it. - Barb Bryant
Most of our information is still at the parts level, with emerging data on interactions, reactions and pathways - Barb Bryant
at EBI - data doubling every 5 months; 12 petabytes of storage currently - Shannon McWeeney
EBI contains presently 12 petabytes of data - Venkata P. Satagopam
We need to look not only at proteins but also at the small molecules, the metabolites. - Barb Bryant
Plants have way more metabolites than we do. - Barb Bryant
Cheminformatics is older but smaller than bioinformatics; largely confined to industry. The tools are not freely available, with notable exceptions. - Barb Bryant
Differences between the proteome and the metabolome, e.g. no evolution and hierarchical structure of metabolites. - Roland Krause
"Way back in the 90s" they were trying to define the reactome - the reactions necessary for life. - Barb Bryant
From the proteome and the metabolome to the reactome: How many reactions are necessary for life? - Roland Krause
Enzymes are an important part of biological molecular reactions - Venkata P. Satagopam
Enzymes are called by name and EC number. - Roland Krause
Handling the reactions computationally is a challenge - Venkata P. Satagopam
Predicting enzyme function automatically: most powerful and most popular method is to recognize a homologue and transfer functional annotation. - Vangelis Simeonidis
EC numbers explained: they conform to the following format: C.SC.SSC.SN - Vangelis Simeonidis
The classification of enzymes is four-part: class, subclass, sub-subclass, serial number (typically the substrate) - Roland Krause
where: C = Class, SC = Sub-class, SSC = Sub-subclass, SN = Serial number - Vangelis Simeonidis
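A small sketch of the C.SC.SSC.SN format in code: parsing an EC number into its four levels and counting how many leading levels two numbers share. The names `ECNumber`, `parse_ec` and `shared_levels` are made up for this example; the "-" placeholder for unspecified levels follows the usual way partial EC numbers are written (e.g. 6.1.-.-).

```python
from typing import NamedTuple, Optional


class ECNumber(NamedTuple):
    enzyme_class: int            # C
    subclass: Optional[int]      # SC
    sub_subclass: Optional[int]  # SSC
    serial: Optional[int]        # SN


def parse_ec(text: str) -> ECNumber:
    """Parse strings such as 'EC 1.1.1.1' or '6.1.-.-' ('-' = unspecified level)."""
    text = text.strip()
    if text.upper().startswith("EC"):
        text = text[2:].strip()
    parts = text.split(".")
    if len(parts) != 4:
        raise ValueError("expected four dot-separated levels: " + text)
    values = [None if p == "-" else int(p) for p in parts]
    if values[0] is None:
        raise ValueError("the class level must be specified: " + text)
    return ECNumber(*values)


def shared_levels(a: ECNumber, b: ECNumber) -> int:
    """Number of leading levels (0 to 4) on which two EC numbers agree."""
    shared = 0
    for x, y in zip(a, b):
        if x is None or x != y:
            break
        shared += 1
    return shared


adh = parse_ec("EC 1.1.1.1")  # alcohol dehydrogenase
print(adh)
print(shared_levels(adh, parse_ec("1.1.1.2")))
```

Two enzymes that share three levels differ only in the serial number, which is typically the substrate.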
EC numbers do not capture the mechanism of the enzyme. - Vangelis Simeonidis
Capture only the chemical level, no biological dependence such as co-factors - Roland Krause
There is no one to one relationship between EC numbers and protein families - Venkata P. Satagopam
The reactome contains 4154 reactions - Venkata P. Satagopam
They wanted to build tools that would handle the actual chemistry. - Barb Bryant
There has been a lot of work in the past 10 years on tools to handle the chemistry. Includes Kanehisa 2004, Gasteiger 2008, Aires-de-Sousa 2008, Schomburg 2010. Unfortunately, most of the software isn't freely available, and only tackles part of the problem. - Barb Bryant
There is a huge literature on comparing small molecules to each other. So that's well covered. - Barb Bryant
They also needed to map the atoms from each side of the equation to each other: atom-atom mapping. This works by matching the largest common moiety first, and iterating. The Mesa (?) database of about 300 reactions is a gold standard to check the quality of the mapping. - Barb Bryant
You need to be able to compare reactions to each other - reaction similarity. - Barb Bryant
To describe the changes in the bonds that take place, you use the Dugundji-Ugi model -- you make a matrix showing the bonds for reactants and products; subtracting the matrices gives you the reaction matrix. - Barb Bryant
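A minimal sketch of the matrix subtraction described above, using H2 + Cl2 -> 2 HCl as a toy reaction. It assumes the atoms are already mapped between both sides (same atom order in both matrices) and stores only bond orders off the diagonal; the helper names and atom labels are invented for this example.

```python
def bond_matrix(atoms, bonds):
    """Symmetric matrix of bond orders; bonds is a list of (atom, atom, order)."""
    index = {atom: i for i, atom in enumerate(atoms)}
    matrix = [[0] * len(atoms) for _ in atoms]
    for a, b, order in bonds:
        i, j = index[a], index[b]
        matrix[i][j] = matrix[j][i] = order
    return matrix


def reaction_matrix(reactants, products):
    """Products minus reactants: +n marks bonds formed, -n marks bonds broken."""
    return [
        [p - r for r, p in zip(r_row, p_row)]
        for r_row, p_row in zip(reactants, products)
    ]


atoms = ["H1", "H2", "Cl1", "Cl2"]
before = bond_matrix(atoms, [("H1", "H2", 1), ("Cl1", "Cl2", 1)])  # H2 + Cl2
after = bond_matrix(atoms, [("H1", "Cl1", 1), ("H2", "Cl2", 1)])   # 2 HCl
for row in reaction_matrix(before, after):
    print(row)
```

The nonzero entries of the result are exactly the bond changes, which is what makes reactions comparable to each other.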
EC-BLAST created by Syed Asad Rahman; it allows you to compare reactions by bond similarity, reaction centre similarity or substrate structure similarity. - Barb Bryant
Chemicals have several fingerprints: bond change, structure, stereo fingerprint - Venkata P. Satagopam
(See KillerApp talk I think Tues 11:45am) - Barb Bryant
CDK (Chemistry Development Kit), free software - Venkata P. Satagopam
They looked into redefining the enzyme classification system. - Barb Bryant
Ligases are in principle simple; most 6.1s are aminoacyl-tRNA synthetases - Venkata P. Satagopam
The EC-BLAST-server (URL above) is in closed beta. - Roland Krause
Compared two reactions using Tanimoto coefficient - Venkata P. Satagopam
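For reference, the Tanimoto coefficient on two feature sets is the size of the intersection divided by the size of the union. A short sketch, assuming each reaction is represented as a set of fingerprint features; the feature strings below are hypothetical and not EC-BLAST's actual encoding.

```python
def tanimoto(a, b):
    """Tanimoto (Jaccard) coefficient of two feature sets: |A & B| / |A | B|."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 1.0  # assumption: two empty fingerprints count as identical
    return len(a & b) / len(union)


# Hypothetical bond-change fingerprints of two reactions.
reaction_1 = {"C-O formed", "O-H cleaved"}
reaction_2 = {"C-O formed", "C-N cleaved"}
print(tanimoto(reaction_1, reaction_2))
```

The value runs from 0 (nothing shared) to 1 (identical fingerprints), so a full pairwise comparison yields the kind of similarity matrix that can be drawn as a heatmap.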
"This heatmap might look good to you, to me it looks fantastic!" Similarity between substrates is now close the EC classification. Differences might be based on the EC classification. - Roland Krause
FunTree - Understanding enzyme families and evolution Poster #Z06 - Venkata P. Satagopam
Why are some structures capable of so many different enzymatic functions? Which are the residues that led to change of function? - Roland Krause
Examples from the Phosphatidylinositol-Phosphodiesterase-Superfamily, a multi-domain protein family. - Roland Krause
They looked at the multi-domain architecture of the phosphatidylinositol-phosphodiesterase superfamily. Adding new domains doesn't add enzyme function to members of this family. - Barb Bryant
One needs to understand the evolution to better understand the EC classification - Venkata P. Satagopam
The tree constructed from structure has three main groups. Branches of the tree are distinguished by differences in substrate, product, presence of a metal co-factor, or mechanism. - Barb Bryant
Matrix showing how frequently there are evolutionary changes within and between classes. Evolution tends to create new enzymes within the same class, having the same mechanism but changing the substrate or product. - Barb Bryant
Most of the enzyme evolution happens at the last subclass level - Venkata P. Satagopam
Question from the floor: is this an opportunity to abandon the EC classification method and move on to a better one? Answer: no. The EC structure is very sensible. Also, it is powerful because everybody uses it. Also, in the first class we examined, it matches pretty well to the similarity measure we developed. - Barb Bryant
# Best keynote so far - Roland Krause
Question: sometimes you have a huge protein to carry out a single small reaction. Have you noticed any clues to why this happens? A: We have some thoughts related to protein function. First, most proteins are multi-functional. They interact with other proteins and do other sorts of things. Secondly, some of the substrates are quite large. We have a sort of domino theory of enzyme... - Barb Bryant
Keynote: Olga Troyanskaya - Integrating computation and experiments for a molecular-level understanding of human disease
Olga is not here but will have talk via video and then live Q&A remotely - Shannon McWeeney
Introduction by Alfonso Valencia. Olga Troyanskaya cannot be here because of a new family member of hers. (# Congratulations) - Roland Krause
Her lab takes large-scale molecular biology datasets and develops pathway level models - Barb Bryant
Tissue- and developmental-stage-specificity is important. - Barb Bryant
Example: DNA damage repair with many different types of interactions. - Roland Krause
Pathway connections between proteins meaning correlated or connected function as opposed to protein-protein or regulatory interactions. Connection means confidence that the two proteins are working together to accomplish some biological function. - Barb Bryant
Bayesian networks for data integration. - Roland Krause
The graph is context-specific. - Barb Bryant
A case study of mitochondria: Yeasts unlike humans can live without mitochondria. - Roland Krause
Do these networks discover novel biology? Case study of mitochondria. - Barb Bryant
Goal is to see if we can find proteins previously unknown to be involved in mitochondrial function, and test those predictions by knocking out the gene and looking at phenotype. - Barb Bryant
Two iterations result in finding all predictions that can be tested with a single knock-out. - Barb Bryant
Went from 106 to 350 proteins known to be involved in mitochondrial biogenesis, in a few months. - Barb Bryant
109 of these predictions are completely novel; 135 more do have prior literature evidence but hadn't yet made it into Gene Ontology. - Barb Bryant
Instead of this computational candidate approach, they could have done genome-wide knock-out, and tested with that assay; it would have taken them 8 years, so this is a huge time savings (and cost!) - Barb Bryant
More than 50% of the genes have an ortholog in human. - Roland Krause
The newly annotated genes that they find tend to show a quantitative phenotype as opposed to being necessary for any respiration. These are more likely to be relevant to human disease, I think because the ones that are strictly necessary would be lethal mutations... - Barb Bryant
Computational predictions from large collection of genomic data can be accurate despite incomplete or misleading old standards. - Vangelis Simeonidis
So that was yeast, but you can also take an approach of looking directly at human data. - Barb Bryant
Their system allows you to ask what diseases a particular gene is involved in. - Barb Bryant
They use 650 datasets that include 30,000 conditions. I assume this is mainly gene expression profiling data. - Barb Bryant
These predictions can be tested. Hilary Kohler (sp?) at Princeton has tested 7 of the predictions; 6/7 confirmed. - Barb Bryant
Tissue specific gene expression -- in worm, this has been carefully elucidated; see WormBase. - Barb Bryant
With Coleen Murphy at Princeton - taking out whole worm expression and figure out tissue-specific expression from it. ? - Barb Bryant
They do genetic perturbation of genes in the untranslated response pathway and make some interesting findings that were not previously known. - Barb Bryant
I am confused about the finding that they can predict tissue specificity even if they take that tissue out of the compendium. Did I get that right? How can you predict the tissue if you have zero examples of it? - Barb Bryant
# Not sure what the UTR pathway was here either. - Roland Krause
Now moving on to a human example: kidney disease - Barb Bryant
Damage to glomerular filter causes disease. - Barb Bryant
There is no way to microdissect podocytes; at most one can dissect glomerular filters. A collaborator at MI, Kretzler, has some gene expression profiles of this tissue. They're going to try and predict podocyte expression. - Barb Bryant
They take the expression compendium and positive and negative examples of mixed samples (various cell lineages) that include podocytes, I think. - Barb Bryant
They can refer to other datasets like mouse, in situ staining, and so on. - Barb Bryant
They want podocyte-specific genes that are enriched for clinically relevant genes. - Barb Bryant
DACH1 expressed in the human kidney, also expressed in the homologous organ in fish. - Roland Krause
She now talks about follow-up work, unpublished, on some specific genes. - Barb Bryant
One way to assess the predictions is to look at the glomerular filtration rate (GFR) which reflects kidney function; the expression of the predicted genes does relate to GFR. So the predictions are clinically relevant. - Barb Bryant
HOW are these proteins involved in the process? What are these genes doing in the podocytes or in the kidney? (she asks) - Barb Bryant
She looks at the genes in a network built based on what we already know about gene and protein relationships. - Barb Bryant
She shows the brain network and genes known to be involved in Alzheimer's disease. - Barb Bryant
Moving on to the topic of finding genotype-disease associations with functional genomics. This is like GWAS done with genetic data, but here with functional genomics. - Barb Bryant
Input: tissue-specific functional relationship networks. + genes involved in specific phenotypes (from mouse data, Jackson Labs). Then an SVM classifier tries to find new genes with evidence of connection to different phenotypes. - Barb Bryant
This works. And it definitely helps to be considering tissue specificity instead of just using global networks. - Barb Bryant
Let's look at a specific prediction: bone mineral density. - Barb Bryant
There are 20 GWAS loci but these only explain 3% of heritability. ! - Barb Bryant
Two genes: Timp2 and Abcg8 are in the top 100 predictions; they don't overlap with other previous studies, and there is a KO model in mouse. - Barb Bryant
So they looked at the bone density phenotype in these knockout (KO) mice. - Barb Bryant
Male mice do have a significant reduction in BMD. (One graph looks like an increase - what am I missing?) - Barb Bryant
With Hess & Huttenhower, infer physical, genetic, regulatory and functional networks, from the functional genomics data. Specific interaction types (like phosphorylation) are hard to predict because of the small amount of gold standard data. - Barb Bryant
They have an ontology of interaction types; this hierarchical relation can improve their systems. - Barb Bryant
I've lost track of what the input data is here; the output predictions are specific types of relationships between pairs of proteins. - Barb Bryant
Example of JAG1, looking for its targets; it is known to be a mediator of bone metastasis in breast cancer. They made a prediction of one target that they hope to confirm. - Barb Bryant
Another example in the NOD-like receptor signaling pathway. They held out some data and showed that they could regenerate it. They also had a novel prediction of an inhibitory relationship, which is consistent with some indirect experimental evidence in the literature, and relevant to a disease. - Barb Bryant
Summary: computational analyses of diverse large datasets, especially using tissue-specific information and modeling, can help pinpoint disease genes and processes that have been missed by other techniques like GWAS. - Barb Bryant
We need to link the micro-level biochemical events with physiological-level events like blood pressure. We also want to do this in the context of individual genomes - personalized medicine - to guide appropriate treatment for each patient. - Barb Bryant
Acknowledgements: Huttenhower (now at Harvard). Chad Myers, David Hess, Matthew Hibbs. Maria Chikina - worm. Casey Greene - podocytes. Yuanfang Guan - collaborating with Jackson Labs. Chris Park -- pathways work. Many more. - Barb Bryant
Open source C++ library, highly optimized, developed with Huttenhower, useful for repeating the analyses or applying the tools to other datasets. - Barb Bryant
Q: Are the edges in tissue specific networks? A: Both edges and nodes are tissue specific. - Roland Krause
HL08: Andrew Bordner - Universal epitope prediction for class II MHC
Introduction to Class II MHC. Binds 15-25 residue long peptides, more conserved backbone. Possibly easier to predict. - Roland Krause
Highly polymorphic, each has different peptide preferences, 4 genes, alpha and beta (MHCII) - Roland Krause
Epitope prediction: Two general types, pattern finding in sequence and structure-modeling based. - Roland Krause
Pattern-based methods are fast and require only known binding peptides. Cannot be generalized to dissimilar types. - Roland Krause
Regularized Thermodynamic Average sequence-based prediction. - Roland Krause
HL07: Yanay Ofran - Survival of the Friendly - the Importance of Protein-Protein Interactions in the Evolution of Bacterial Genomes
Lateral gene transfer - how can moved parts work in other systems and even confer a selective advantage? - Roland Krause
Complexity hypothesis: In order to be beneficial for a new protein it cannot have many interactions. - Roland Krause
Do you have to be a lone wolf to integrate successfully? - Roland Krause
Genes undergoing LGT were found to be less connected. - Roland Krause
The study presented is a new large scale study inspecting the binding interfaces. Developed a method to identify interfaces. - Roland Krause
# Missed the punchline of the presentation due to WLAN problems. Hmrg. - Roland Krause
HL03: Carlo Cannistraci - From revealing new insights into Human Tissue Development to Minimum Curvilinearity
32 human tissues x 1321 transcription factors - Roland Krause
Clustering finds mesoderm, ectoderm, endoderm. - Roland Krause
HL02: Jacques Colinge - The Central Human Proteome
Which proteins are commonly expressed in human? - Roland Krause
7 cell lines, 1D gels, 50 bands, Orbitrap MS-MS - Roland Krause
Membrane proteins underrepresented, no other biases for characteristics such as pI, MW found. Abundant proteins. - Roland Krause
Overlap of 45% with Su, PNAS 2002 transcriptome study. Similar processes enriched. - Roland Krause
Human protein atlas (Ponten, MSB, 2009), overlap of 40%, mass sensitivity limited. - Roland Krause
Proteins found are generally well conserved. More interestingly, the genes are exon-rich. - Roland Krause
Increased number of interactions. - Roland Krause
The central proteome is central in the interactome (by centrality measures). - Roland Krause
83% enzymes, 77% primary metabolism, significant enrichment in drug targets (176 from DrugBank) - Roland Krause
Specialized functions of the core proteome interwoven by connectors. - Roland Krause
10% of the core proteome are poorly annotated. - Roland Krause
Significant overlap with viral host factors. - Roland Krause
Data publicly available. - Roland Krause
Q: Map of the central proteome against housekeeping genes? - Roland Krause
A: No, not directly but other works defined abundant genes as such. - Roland Krause
Q: Kinases as cancer targets, relevance in the data set? A: 40 kinases in the data set. Overlap with cancer much bigger than only kinases. - Roland Krause
Q: Both expression levels and abundance levels. Correlation? A: Poor overlap between the two suggests expression levels are little correlated. [..] - Roland Krause
Q: Have you looked at recent sequencing studies for transcriptomics [not only Su et al.]? A: Studies by Chris Burge report 10,000 genes, far too many to compare because copy number might be very low, comparison very difficult. Unclear functional relevance. - Roland Krause
Q: Centrality. Have you looked at the complexome? A: You recognize proteasome, spliceosome. Q: How many are in complexes? A: Two thirds, there are not many complexes that work alone. - Roland Krause
Q: Are poorly annotated proteins underrepresented? A: Depends on the number of PubMed abstracts that make a protein underrepresented. - Roland Krause
Keynote: Bonnie Berger - Computational biology in the 21st century: making sense out of massive data
ISMB/ECCB 2011 kicked off, Michal Linial is introducing the first keynote speaker - Venkata P. Satagopam
algorithmic challenges to increasingly massive amounts of data - how to avoid situation becoming intractable? - Shannon McWeeney
Berger showing a graph where data are growing faster than computational power can handle (MIPS vs. bases / day) - Iddo Friedberg
10 fold increase in sequencing vs doubling in computing capacity - Shannon McWeeney
Not just a bigger cloud -- > need better algorithms - Iddo Friedberg
3 challenges areas - compression, signal from noise, patterns across species - Shannon McWeeney
Need to exploit fact that "new" data is similar - utilize redundancy "compressive genomics" - Shannon McWeeney
Work directly on compressed data rather than compress --> decompress - Iddo Friedberg
use case from fly genomes - Shannon McWeeney
Compression accelerated blast caBLAST - Shannon McWeeney
Redundancy in genomics can be exploited --> CaBLAST. Works on compressed data. Size of compressed DB is proportional to the size of non-redundant data - Iddo Friedberg
coarse analysis on compressed data - refined analysis on relevant regions - Shannon McWeeney
Run time much faster - Iddo Friedberg
never have to uncompress - potential for huge gains - Shannon McWeeney
use case 2: signal from noise (medical genomics) - Shannon McWeeney
Can't find cablast online... - Iddo Friedberg
Berger moving to NCBI GEO and medical transcriptomics - Iddo Friedberg
Indexing GEO using UMLS -- Unified Medical Language System from NLM - Iddo Friedberg
UMLS is an ontology of medical concepts. - Iddo Friedberg
Concept enrichment in UMLS, tool: Concordia - Iddo Friedberg
Using UMLS and Concordia to analyze tumor origin - Iddo Friedberg
Thanks for the coverage! - Ruchira S. Datta
Lab has several interesting tools like IsoRank, IsoRankN, Struct2Net, RNAicut, Mangoose and more - Venkata P. Satagopam
(network connection is bit bad) - Venkata P. Satagopam
Of course I agree, Cytoscape -
...the writer's block meme rears its invisible head once again... - Noel O'Boyle
Abhishek Tiwari
Rethinking the scientific method : The New Yorker -
LBR11: Mark Wass - Towards the prediction of protein interaction partners using physical docking
1-10% of protein-protein interactions have been identified. Need for prediction. - Roland Krause
Most methods for prediction are genomic or sequence based but not structure based. - Roland Krause
Protein docking: finding and scoring interaction poses between two proteins. Hard problem to distinguish interactors from non-interactors. Use of the Weng benchmark set of 65 complexes. Measure distribution of scores. - Roland Krause
Build a decoy set and assume the decoys do not interact with the benchmark set. Some examples show separation, but it fails for several. 36 complexes rank better than 80% of the decoys. - Roland Krause
Where on the interactors are docking solutions generated? Some AA are involved more frequently than others. - Roland Krause
Still a way to go before predictions are possible, but a signal is present. - Roland Krause
The problem with methods that try to optimize discrimination between natives and a set of decoys is that they usually find problems in the docking software / energy function. I.e. you'll discover that decoys have very weird electrostatics distribution etc... A good set of non-interactors would be key to develop this field further - Nir London
scoring is indeed a bottleneck in docking. however there is a signal found in this work, it can be very helpful both for docking and prediction of interactions to understand where this signal comes from. - Dina Schneidman
Keynote: George Church - BI/O: Reading and Writing Genomes
George Church has developed an amazing amount of technology. - Barb Bryant
I always wonder whether he gets any sleep at all. - Dawei lin
Who is the introducer? - Dawei lin
michal linial, if I'm not wrong (which I was, need new glasses) - arne
The My First DNA sequencer reference: - Shannon McWeeney
First challenge on computational interpretation and integration: personal genomes =stem cell epigenome + mC environments + traits. - Dawei lin
Olga Troyanskaya - Barb Bryant
Cost of drugs goes up linearly; cost of sequencing is dropping exponentially - Barb Bryant
40,000-fold price drop in 4 years - Dawei lin
CGI price for genome is $1500/year? - Dawei lin
In 2005 we abandoned a monopolistic capillary electrophoresis; instead we have a couple and now 21 different technologies for sequencing. Resulted in a jump in rate of change of sequencing capacity - Barb Bryant
He thinks that many of the sequencing companies will find a niche :) - arne
Cost of personal genome: 2007: $57M; 2009 $1500, for 40-fold coverage. - Barb Bryant
Close to the $1000 genome - arne
(+ $100,000 interpretation cost?) (he doesn't really think that) - Barb Bryant
Drmanac et al Science Jan 2010 - Dawei lin
Sidetrack: One friend said when he started his PhD it took 6 months to sequence a bacterium and 6-60 months to analyse it. Now it takes 6 minutes to sequence it and still 6-60 months to analyze it. - arne
limitation is several hundred nm in scale on chip (positively charged molecules on hydrophobic background) - Dawei lin
7% of the human genome is missing so far because of technical challenges - Dawei lin
trio genomics information (father, mother, child) is increasingly important in genomics research - Dawei lin
From open access Sequences to Bio-Fab - arne
One of the 21 sequencing technologies is open-access. Reads and writes DNA with light. - Barb Bryant
2nd-gen synthesis ($500 per 15 Mbp) - arne
Second-generation synthesis - four different kinds of technologies. - Barb Bryant
Next Gen synthesis: off chips $500 15Mbp - Dawei lin
Tian et al 2004 Nature - arne
The work started around 2003 - Dawei lin - arne
personal genome, 3M alleles -> immunology + microbiome -> trait - Dawei lin
Issues of personal identification from genomic data. Informed consent as one solution. - Barb Bryant
Have 16,000 volunteers for Personal Genome Project so far; 100,000 target. - Barb Bryant
Claims that ~1800 genes are highly predictive and medically actionable. - Barb Bryant
They are individually rare but collectively common, at the 10% level - Dawei lin
Example of the Madsen family with two diseases. Found causative alleles - 4 total (2 from each parent). - Barb Bryant - Dawei lin
Each time we find a scary allele in a person, it could be a sequencing error; it could be a problem with the literature. - Barb Bryant
found a dozen cases where the literature got the allele sequence wrong - Dawei lin
The oldest volunteer for PGP is 96.7 years old - Dawei lin
Q: Are these genomes available ? - arne
Circulating tumor, pathogen, fetal, and immune cells. - Barb Bryant
Microbe vs Immunome - arne
If you want to look for a microorganism in a body, you can either look directly for the microbe, or look for the body's reaction. - Barb Bryant
immune test is to focus on response to exposure. - Dawei lin
Sequencing after vaccination - response is maximum after 7 days - arne
Generating human tissue from pluripotent stem cells - Barb Bryant
The Economist 20-May-2010 cover - Dawei lin
Genome engineering - Barb Bryant
E.g., change the genetic code -- for resistance to pathogens, new amino acids, and something else. - Barb Bryant
You have to do this safely. - Barb Bryant
For $400M, Dupont made 27 changes to the 4.6 Mbp E. coli, to make a chemical. - Barb Bryant
Another application: bio-petroleum from microbes. - Barb Bryant
Identify enzymes that synthesize alkane. Many cyanobacteria made trace amounts; others made none. Did genome sequence "subtraction" to find which genes were in the former. Isolated & tested these genes. Overproduced them; it worked. Green chemistry. - Barb Bryant
Multiplex Automated Genome Engineering (MAGE)... - Barb Bryant
Church's own genome is available: - Christiaan Klijn
So: subtract my genome from Church's, then overproduce those genes --> TOTAL BRILLIANCE! - Barb Bryant
Example of freeing up a codon by changing those codons to a different one. - Barb Bryant
Is this not just the analysis, not the sequence? (or did I miss a link) - arne
See the 'Datasets' header -> you can get 500k Affy data as well as exome - Christiaan Klijn
Metabolic engineering example. Historically, you'd get obsessed with one step in the pathway and overproduce one enzyme. But then you'd get product inhibition, or the product might be toxic. - Barb Bryant
Would be nice with a map to the reference genome as well, but guess that can be done - arne
DNA Nanostructures: (DNA origami). Proposes a combination of DNA and proteins. - arne
DNA nanostructures help solve structures of membrane proteins. - Barb Bryant
First practical application: Made a long rod that was stiffer than other DNA. Used in NMR for membrane proteins (Cooooll idea but, it has been tried with proteins before) - arne
caDNAno is a software tool that is freely available - Dawei lin
Time for questions. - arne
Special Public Lecture: Dr. Robert Weinberg - Cancer Stem Cells and the Evolution of Malignancy
Shows picture of stages of cancer progression (ref Vogelstein, colon); poses the question of how metastasis occurs -- does this involve genetic or epigenetic changes? - Barb Bryant
Tan Ince cultured two kinds of normal human mammary epithelial cells. He transformed them with oncogenes, resulting in different types of tumors. - Barb Bryant
Concludes that the nature of the normal cell of origin is a strong determinant of the phenotype of the primary tumor, and whether it metastasizes. The playing field is tilted in the beginning. - Barb Bryant
Posits tumor-generating cells. - Barb Bryant
Self-renewing stem cells produce either more stem cells or transit amplifying cells which in turn lead to post-mitotic differentiated cells. Only the self-renewing stem cell could seed a new tumor. - Barb Bryant
invasion-metastasis cascade - Barb Bryant
How do cancer cells acquire all of these capabilities (invasion, intravasation, transport, metastasis...)? Are there additional mutations required? Is it epigenetic? - Barb Bryant
epithelial-mesenchymal transition -- cells on the perimeter of the tumor are mesenchymal. This may be due to signals from the surrounding stroma. - Barb Bryant
There are probably 1000 proteins that shift in EMT. Various transcription factors (TFs) induce EMTs. - Barb Bryant
EMT program highly complex and occurs normally during development. - Mickey Kosloff from iPod
It seems likely that most of the invasion-metastasis program can happen without need for additional mutations; rather use signaling from microenvironment. - Barb Bryant
P. Gupta transformed human primary melanocytes (pigmentation in the skin) with a cocktail of oncogenes. Found that in contrast to transformed epithelial cells, there was much higher likelihood of metastasis. Again, cell of origin is important in future behavior. - Barb Bryant
One TF, Slug, was found to enable melanoma metastasis. (Even though the primary tumors grew a little faster.) - Barb Bryant
Another TF, FOXC2, when expressed in epithelial cells induces migration and invasion. A subset of breast cancers have high levels of nuclear FOXC2, and these are more aggressive breast cancers. - Barb Bryant
Speculates that different networks of EMT-inducing factors might program metastasis in different cell types. - Barb Bryant
Stem cells identified by high CD44 and low CD24. (CD's are markers on cell surface which can be assayed fairly easily.) - Barb Bryant
There are various ways to make cells acquire stem cell characteristics. - Barb Bryant
Mentions Kornelia Polyak. There are stem-like cells in primary human breast samples. The stem cell program in normal human mammary gland is coopted by cancer cells. - Barb Bryant
More proof that EMT creates stem cells. - Barb Bryant
Most current chemotherapies preferentially kill non-cancer-stem-cells. The remaining stem cells can repopulate the tumor and are often more resistant to therapies. - Barb Bryant
Gupta & Onder tested CSCs and non-CSCs with a bunch of drugs. There are some CSC-targeted agents (Salinomycin, Abamectin). Of 16,000 compounds only about a dozen preferentially killed CSCs as opposed to non-CSCs. Many were the other way round. - Barb Bryant
This probably won't be the "answer". Christine Chaffer noticed that there were some floating cells in 2D cultured human mammary epithelial cells. She grew these up; these look more like CSCs. - Barb Bryant
Interestingly, she found that non-CSCs could generate CSCs. - Barb Bryant
Hm, isn't this kind of pouring cold water on the excitement about CSCs as drug targets? Or maybe you have to target both CSCs and non-CSCs simultaneously. - Barb Bryant
yup - Barb Bryant
Q: cancer biologists like to study druggable genome. But transcription factors seem most important. A: expression of TFs is controlled by cytoplasmic factors. Might want to go after those. Drugging the TF itself might be hard, but the signaling pathways might be more druggable. - Barb Bryant
Q: has it been shown that change in the two forms of cadherins match the change in CD expression, and are these correlated with morphology? A: I showed that: CD44 high cells shut down E-cadherin; they express vimentin, and other mesenchymal markers. I don't know whether CD44 is useful for non-mammary epithelial tissues. - Barb Bryant
Q: So do normal non-SCs generate SCs? A: Yes. Same differences as in cancer. - Barb Bryant
Spontaneous de-differentiation into SCs. Interesting phenomenon. - Steve Chervitz Trutane
HL40: Martin Vingron - Histone modification levels are predictive for gene expression
See PNAS 2010 Feb 16. - Barb Bryant
Using histone modification data in Barski et al and Wang et al and one other. - Barb Bryant
Li, Carey, Workman 2007 Cell 128:707 review. Tabulates distribution of modifications along the genes. - Barb Bryant
Some are activating; some repressing; some either. Occur in various locations along the gene. - Barb Bryant
They want to go from histone modification vector to transcription level. - Barb Bryant
Zhao lab data: 38 histone modifications. Look in promoter region. - Barb Bryant
-2000 to +2000 around TSS. Introduce a pseudocount alpha: x = log(N + alpha). - Barb Bryant
Standard linear regression. - Barb Bryant
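A minimal sketch of the model described in the comments above: log-transform tag counts with a pseudocount, then fit ordinary least squares against log expression. Everything here is an assumption for illustration: the data are synthetic, the pseudocount value is arbitrary, and only one modification is used as predictor (the talk used many).

```python
import math
import random

random.seed(0)
ALPHA = 1.0  # pseudocount; the value used in the talk was not given


def transform(count, alpha=ALPHA):
    """x = log(N + alpha); the pseudocount keeps zero counts finite."""
    return math.log(count + alpha)


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x with a single predictor."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def pearson(us, vs):
    """Pearson correlation between two equally long lists."""
    n = len(us)
    mu, mv = sum(us) / n, sum(vs) / n
    cov = sum((u - mu) * (v - mv) for u, v in zip(us, vs))
    su = math.sqrt(sum((u - mu) ** 2 for u in us))
    sv = math.sqrt(sum((v - mv) ** 2 for v in vs))
    return cov / (su * sv)


# Synthetic data: tag counts N in the promoter window and a noisy log expression
counts = [random.randint(0, 200) for _ in range(300)]
xs = [transform(c) for c in counts]
ys = [0.5 + 1.2 * x + random.gauss(0, 0.2) for x in xs]

a, b = fit_line(xs, ys)
pred = [a + b * x for x in xs]
r = pearson(pred, ys)
print(f"intercept={a:.2f} slope={b:.2f} r={r:.2f}")
```

Extending this to a vector of modifications means replacing `fit_line` with a multiple regression; the transform stays the same.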
Graph of predicted vs measured log expression values. High correlation. In fact, it looks bimodal. - Barb Bryant
Looked at where the information resides. Tried using single, pair or 3 modifications and see how close can get to optimal prediction. - Barb Bryant
Most predictive modifications are H4K20me1, H3K27ac, H3K79me1, H2BK5ac - Barb Bryant
Shows a histogram of promoters with respect to CpG ratio. - Barb Bryant
Which modifications are informative depends on CpG content. CpG-rich: H4K20me1, and to some extent H3K27ac and H2BK5ac. CpG-depleted: H3K4me3, H3K79me1. - Barb Bryant
Predicting in other cell types. Cui et al 2009: ChIP-seq data for CD36+ and CD133+ T cells. 10 histone mods (H3K4me1/3, H3K27me1/3, H3K36me3, H4K20me1, H3K9me1/3, H2A.Z) + gene expression data. - Barb Bryant
Trained a model for those 10 mods on initial dataset, then tested on the new data. Prediction still pretty accurate. However, these cell types are quite closely related. So looked at genes that are differentially expressed between the cells. Even there, the prediction is still pretty good. - Barb Bryant
H4K20me1 and H3K79me1 associated with elongation; H3K4me3, H3K27ac, H2BK5ac are associated with the TSS. - Barb Bryant
Summarizes: expression and histone modification levels are quantifiably related. This holds across cell lines - Barb Bryant
Common prejudice: Low-CpG promoters drive tissue specific genes. - Barb Bryant
PT43: Emmanuel Douzery - SUPERTRIPLETS: A triplet-based supertree approach to phylogenomics
Supertree methods can be used to build trees from diverse trees on incongruent species sets and diverse data, e.g. morphological and phylogenetic data. - Roland Krause
How to find the best representation of several trees. Distances between trees with different taxa. - Roland Krause
PT41: Steven Kelk - Phylogenetic Networks Do not Need to Be Complex: Using Fewer Reticulations to Represent Conflicting Clusters
Many reasons to obtain conflicting trees, some have legitimate reasons (hybridization etc) which can be modeled as phylogenetic networks. - Roland Krause
Clusters are subsets of leaves, basically a hypothesis that at least one tree contains such a clade. - Roland Krause
Clusters lose the topological information. Lets the user specify data that are trusted. - Roland Krause
Minimize the number of reticulations, motivated by parsimony but not by evolution. - Roland Krause
Algorithm CASS attempts to produce such networks and does so well (60 slides of math not shown) - Roland Krause
Tested on different subsets obtained from a Poaceae grass data set (PWG 2001, Smidt 2003). Comparison to HybridInterleave, PIRN, Galled Trees and Cluster Network. - Roland Krause
CASS works well in practice. Integrated in Dendroscope. Running times slightly longer than competitors. - Roland Krause
HL35: Liran Carmel - A universal relationship between gene compactness and expression level in multicellular eukaryotes
From Eugene Koonin's group: - Roland Krause
Different lengths for a gene - total transcript length, 5'UTR length, introns, etc - Roland Krause
Expression levels are collapsed to a single value by ranking per condition/tissue into 3 categories, then averaging and rounding the resulting rank. - Roland Krause
First observed in C. elegans - Roland Krause
Found in other organisms and for several length measures and different strategies for expression. - Roland Krause
20 nucleotides are transcribed in one second - Xinwei Han
one nucleotide needs two ATPs - Xinwei Han
Intergenic length and compactness are related. - Roland Krause
Plants have opposite trend - Xinwei Han
Exception in plants: highly expressed genes are least compact. - Roland Krause
No good explanation for the outlier, usually selection and genomic design are given. - Roland Krause
New explanation: the relationship is not monotonic but peaked. - Roland Krause
Segmented regression recovers the relationship easily. - Roland Krause
Use of mixed gamma distributions to estimate noise and real expression level by tissue. - Roland Krause
Selection rather than genomic design shapes the length distributions. - Roland Krause
Q: Relation of intron length and alternative splicing. A: Probably not. - Roland Krause
Q&A: Intron length changes more pronounced than exon lengths. - Roland Krause
Q: Specific tissues? A: Interesting but not yet analyzed. - Roland Krause
HL33: Todd Gibson - Neofunctionalization in interaction network evolution
Fate of duplicated genes: non-, neo- and subfunctionalization. - Roland Krause
In networks of interacting proteins duplication leads to duplication of interaction. - Roland Krause
function = interaction - Roland Krause
link dynamics: new interaction maps to neofunctionalization, loss of interaction to subfunctionalization - Roland Krause
He and Zhang 2005 calculated neofunc. The older the duplication, the more interactors. - Roland Krause
Wagner 2001/2003: Ancestral protein would self-interact and form expected interaction. If no self-interaction, de novo interaction can be assumed - Roland Krause
Binding domain in yeast-two hybrid is a dimer. Self-interacting proteins should not be observed. - Roland Krause
Modeling neofunctionalization in a theoretical network. - Roland Krause
Highlights the importance of self-interactions in network evolution. - Roland Krause
HL32: Ernest Fraenkel - Network models for understanding what 'omic data really mean.
Omic data don't mean what you think - Roland Krause
The answer is 42 - arne from iPhone
Generally little overlap between different experimental screens. - Roland Krause
Studies 156 perturbations mapped on networks - arne from iPhone
ChIP-chip data and protein-protein interaction data. Transcripts and proteins are separate entities. - arne from iPhone
Hitting central nodes. - arne from iPhone
HL30: Rohith Srivas - Genome-Wide Association Data Reveal a Global Map of Genetic Interactions among Protein Complexes
use of co-localization filter to assist in interpretation - Shannon McWeeney from BuddyFeed
Keynote: David Altshuler - Genomic Variation and the Inherited Basis of Common Disease
Altshuler is an expert on diabetes type II. - Dawei lin
It is said that he is also a good dancer. - Dawei lin
Tap, ballroom, or tango? - Ted Laderas
Slide dancing - Dawei lin
motivation is to understand genetic basis of human diseases - Dawei lin
Genetic basis of human diseases - important disease mechanisms and bio pathways remain unidentified - Venkata P. Satagopam
gap in knowledge of human disease biology contributes to high failure rates in drug development - Dawei lin
Why understand genetic mechanisms? (1) Important mechanisms remain unidentified (2) Gaps in knowledge cause the high failure rate in drug development - arne
It will be a long way to know if the two motivating hypotheses are true - Dawei lin
One of the major studies on T2D. It scanned 100k people for 10 yrs - Dawei lin
10 years later 50% progressed to have the disease - Dawei lin
10 years of diabetes research - the outcome is - 50% of people with good lifestyle improved - Venkata P. Satagopam
lifestyle has a bigger impact than Metformin - Dawei lin
Diabetes study with 10-year follow-up of diabetes incidence and weight loss, "T2D". Randomized into treatments: lifestyle, metformin, placebo. Best drug makes relatively little difference in incidence; lifestyle intervention is better than drug but still doesn't help a whole lot. - Barb Bryant
best prevention was extensive lifestyle changes (50% -> 40% incidence) - Mickey Kosloff
Diabetes is not only a matter of life style - arne
success rate in current pharma industry is <5% of molecules entering clinical trials - Venkata P. Satagopam
This is bad !! - arne
mentions well known number of >95% failure rate of new compounds - Mickey Kosloff
Because 40% of people still got the disease after the lifestyle change, it seems that people do not know the course of the disease - Dawei lin
Genetic mapping started in 1913 - Dawei lin
genetic map came in 1913 - Venkata P. Satagopam
Morgan and Sturtevant 1913 - arne
emphasizes he advocates a geneticist's approach (rather than a genomic approach) - Mickey Kosloff
And tells you to skip undergraduate work if you have something better to do - arne
key attributes of genetic mapping - unbiased by prior assumptions about pathways - Venkata P. Satagopam
saturation mutagenesis reveals pathways - Venkata P. Satagopam
key attributes of genetic mapping: (1) unbiased by prior assumptions about pathways (2) saturation mutagenesis reveals pathways - Dawei lin
many mutants -> reveals coherence of pathways - Ted Laderas
These days we have other methods that are unbiased like expression profiling, but genetic mapping has some unique characteristics relative to these (he’ll explain in a minute). - Barb Bryant
Drosophila mutations initially looked random; years later almost all of them turned out to be related to pathways. - Dawei lin
bottleneck is functional determination - biochemical approaches - Ted Laderas
A lot of current knowledge can track back to genetic mapping - Dawei lin
Botstein and Fink Science 1988 .... - Venkata P. Satagopam
A slide based on Glazier et al, Science 2002 - Dawei lin
genetic mapping of human single gene disorders ...over 15 years Botstein paper in 1980, first genetic map in 1985 .... - Venkata P. Satagopam
It took 10 years to find the marker for Huntington disease - Dawei lin
Once you find a linked region from genetic mapping, it still takes a long time to find the specific gene responsible. - Barb Bryant
in the 1990's the idea was that common diseases were caused by rare mutations with large effects - arne
"Chromosome shlepping" - Eric Lander's term for the identification of the very gene in some genomic region. - Roland Krause
It is robust for finding Mendelian diseases but not common diseases - Dawei lin
another approach: population genetics - QTL approach - Ted Laderas
phenotypic variation is often continuous and may involve variation in many genes - Dawei lin
Galton invented regression analysis to analyze the measuring of phenotypic data (heights of parents and offspring). - Roland Krause
The biometric unit --- almost nothing was Mendelian - arne
Most traits are continuously variable - Ted Laderas
Francis Galton was a cousin of Darwin. Darwin didn’t explain the source of variation. Galton focused on this; he measured the heights of parents and their offspring, and found a relationship. He invented regression analysis to draw the line. The slope of the line is related to the heritability of the trait. - Barb Bryant
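A small sketch of the Galton-style regression mentioned above, on synthetic data. Regressing offspring height on the average of the two parents gives a slope that is a textbook estimate of heritability. The population mean, spreads, and the slope of 0.6 used to generate the data are assumptions for illustration, not Galton's numbers.

```python
import random

random.seed(1)
SLOPE_USED = 0.6  # assumed value, used only to generate the synthetic data
MEAN_CM = 170.0   # assumed population mean height in cm


def ols_slope(xs, ys):
    """Least-squares slope of ys regressed on xs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx


# Synthetic midparent heights (mean of mother and father) and offspring heights
midparent = [random.gauss(MEAN_CM, 6.0) for _ in range(2000)]
offspring = [
    MEAN_CM + SLOPE_USED * (m - MEAN_CM) + random.gauss(0.0, 4.0)
    for m in midparent
]

b = ols_slope(midparent, offspring)
print(f"estimated slope: {b:.2f}")
```

A slope below 1 means children of very tall or very short parents are, on average, closer to the population mean than their parents: regression toward the mean.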
It was studied by the cousin of Darwin, Francis Galton (1885) - Dawei lin
phenotypic variation is often continuous ... some history ... Francis Galton (1885), Ronald Fisher (1918), Hermann Muller (1920) - Venkata P. Satagopam
This gave rise to the biometric movement – measure every living thing. Traits were related to genetic relatedness; and it wasn’t Mendelian. This led to the biometric-Mendelian debate. - Barb Bryant
Ronald Fisher was actually a geneticist, who also invented the p-value and Fisher's exact test - Dawei lin
Ronald Fisher (the one with the exact test) was also a geneticist. - Roland Krause
Solved by assuming that phenotype often is an effect of several Mendelian genes. - arne
Fisher: individual genes are mendelian, effects of genes additive - Ted Laderas
Hermann Muller 1920 (Nobel Prize for X-ray induced mutations). PhD thesis not Mendelian trait, but truncate wing. Wasn’t Mendelian. Did genetic mapping. - Barb Bryant
Hermann Muller decided to use broken wing of fruit fly to study non-Mendelian diseases - Dawei lin
Muller 1920 paper: 4 chromosomes in fly – 3 contain genes that influence the trait truncate wing. Muller wrote about implications for human traits, like psychological traits. Said that traits were going to be too complicated. Said you could figure out by looking at population, but not looking at Mendelian inheritance in families. - Barb Bryant
Muller 1920 suggested that such studies needed to be done on a population. - Dawei lin
Muller: Truncate wing - 3 genes influence effect of phenotype - Ted Laderas
Muller's thesis included the notion of surveying complex phenotypes in the population rather than families. - Roland Krause
Muller: traits are too complex to observe in families, but can observe in population - Ted Laderas
characterizing and cataloguing human seq variation took a decade of work, i.e. the International HapMap Project - Venkata P. Satagopam
Another decade-long failure: the candidate gene approach. Instead, we need a genome-wide, unbiased approach. - Barb Bryant
Testing candidate genes was not successful. Only 10-20 successes. - Dawei lin
779 GWA published for 148 traits - Mickey Kosloff
outcome - 779 published GWA for 148 traits - Venkata P. Satagopam
For common diseases, GWA was needed - Ted Laderas
but "correlation does not imply causality" - Mickey Kosloff
There have been 779 genome-wide association studies (or regions/genes found?) for 148 traits, with p < 5x10^-8 - Barb Bryant
"correlation does not imply causality" .... - Venkata P. Satagopam
But correlation does not imply causality. - Barb Bryant
The reasons of "Correlation does not imply causality": irreproducibility, lack of randomization, confounding, arrow of time. - Dawei lin
If you can't randomize the experiment you can never prove causality as opposed to just being correlated to the underlying cause. - Barb Bryant
FF lag results in all these duplicate posts - Mickey Kosloff
a lot of efforts are on finding correlation between rare variation and diseases - Dawei lin
rare variation is defined as having <5% frequency in the population - Dawei lin
95% of variations are already present in the database - arne
Identified 50 regions that are associated with T2D - arne
within the next few years ... the role of rare and less common variants will be characterized in a variety of diseases - Venkata P. Satagopam
next topic - can we obtain new insights into the basis of disease? - Venkata P. Satagopam
one example - sickle cell anemia - Venkata P. Satagopam
Sankaran et al Science 2008 - Venkata P. Satagopam
Lettre et al PNAS 2008 - Venkata P. Satagopam
Uda et al PNAS 2008 - Venkata P. Satagopam
Crohn's disease: 15 years, no idea what was happening. Now many genes and 3 pathways are identified to be relevant. - Dawei lin
96 loci explain ~25% of cholesterol levels - Mickey Kosloff
Lipid GWAS found 60 loci that were previously unknown. Some of the positives are drug targets already. - Dawei lin
Global lipids consortium, forthcoming Nature paper (Nature paper is mentioned about 20 times !!!) - arne
is there a way to automate validation/function determination? - Ted Laderas
prediction -- will prediction prove useful? This depends on the clinical testing and the genetic test - Venkata P. Satagopam
prediction will be useful when there's a proven intervention - Mickey Kosloff
BRCA1/2 risk for cancer as an example - Mickey Kosloff
seq tech will increase the reach of genetic methods - Venkata P. Satagopam
mendelian fallacy - sub-populations are easily divisible in terms of risk - Ted Laderas
Prediction will only be useful if there is an intervention that you would not use without the prediction. Otherwise, you should use the intervention anyway. - Roland Krause
Huntington will not be a representative example - for most diseases/people identified risk will be <<100% even with full genetic information - Mickey Kosloff
Cautionary tale - PSA prediction results in over-treatment, hasn't been shown that people live longer because of test - Mickey Kosloff
Very cautious about PSA - no improvements on the mortality but many operations performed. - Roland Krause
genetics offers a path to discover the underlying biology of human diseases; the great value will derive from pathophysiology and treatment - Venkata P. Satagopam
Keynote: Chris Sander - Systems Biology of Cancer Cells
An interview with Chris Sander ... - Venkata P. Satagopam
Kabsch and Sander paper - over 6000 citations - - Shannon McWeeney
Note the subliminal message in the announcement slide - Iddo Friedberg from Android
Prediction by transparency - no computation necessary story - Shannon McWeeney
Awards should be shared: People working with Chris includes: Burkhard Rost, Alfonso Valencia, Liisa Holm and many more - arne
Announcement of unpublished and new work. A good trend at this ISMB. - Roland Krause
Cancer genome atlas: TCGA - arne
Mapping of molecular alterations (copy number variation) to 200 glioblastoma samples. - Roland Krause
Difference between patients is huge - arne
extract network, find relevant modules. - Roland Krause
illustration of netbox algorithm - Shannon McWeeney
When grouping mutations into pathways, up to 85% of GBM have a mutation in the most important pathways, while individual genes are down to a few % - arne
Each oncogene may have relatively low frequency across patients; but when you group genes across pathways, a pathway may explain a large fraction of patients with a given type of cancer. - Barb Bryant
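A toy illustration of the gene-versus-pathway point above. The patients, gene names and pathway memberships are all hypothetical placeholders, not TCGA data; the sketch only shows how per-gene frequencies can stay low while a pathway covers most patients.

```python
# Hypothetical data: which genes are altered in each patient
mutations = {
    "P1": {"geneA"},
    "P2": {"geneB"},
    "P3": {"geneC"},
    "P4": {"geneD"},
    "P5": {"geneA", "geneE"},
    "P6": {"geneE"},
    "P7": set(),
    "P8": {"geneB"},
    "P9": {"geneF"},
    "P10": {"geneC"},
}

# Hypothetical pathway membership
pathways = {
    "pathway_1": {"geneA", "geneB", "geneC", "geneD"},
    "pathway_2": {"geneE", "geneF"},
}

n_patients = len(mutations)

# Fraction of patients carrying an alteration in each individual gene
all_genes = set().union(*mutations.values())
gene_freq = {
    g: sum(g in altered for altered in mutations.values()) / n_patients
    for g in sorted(all_genes)
}

# Fraction of patients with at least one altered gene in each pathway
pathway_freq = {
    name: sum(bool(altered & members) for altered in mutations.values()) / n_patients
    for name, members in pathways.items()
}

print(gene_freq)
print(pathway_freq)
```

In this made-up cohort no single gene is altered in more than 20% of patients, yet pathway_1 is hit in 70% of them.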
"Network pharmacology" - Barb Bryant
can see a change in pathway activation between primary tumor and mets - Mickey Kosloff
Dominant alterations change between cancer types and states. - Roland Krause
GBM: copy number is rare (and noisier) Ovarian: more regular and higher - arne
profiles of copy number variations differ between types of cancers - Mickey Kosloff
Metastatic tumor samples have more copy number changes than primary tumors. Not surprising. But maybe primary samples with more copy number changes than others are more likely to metastasize? Generally, better outcome with fewer somatic copy number changes. - Barb Bryant
BRCA1 and BRCA2 mutations convey germline inherited cancer risk - Barb Bryant
These genes act in the homologous repair pathway. Half of all patients have mutations in some homologous repair pathway gene. - Barb Bryant
and more generally, homologous repair genes are altered in > 50% of ovarian cancer - Mickey Kosloff
Tumor suppressor genes can be inactivated in various ways: germline mutation, somatic mutation, epigenetic silencing, etc. - Barb Bryant
There are drugs under development that might work particularly well in patients with defects in this particular pathway. - Barb Bryant
Cancer genomics portal: - Barb Bryant
Topic shift: now, perturbation cell biology. "and belief propagation". (eh?) - Barb Bryant
Perturbation Cell Biology - arne
In recent past, says Chris, you make a few perturbations: overexpress or knock down a gene; inhibit with a compound, etc. - Barb Bryant
use network inference algorithms - Mickey Kosloff
goal = predictive models for therapy - Mickey Kosloff
with only 200 datapoints -> derive validated (known) pathways - Mickey Kosloff
Prediction of networks does not scale to larger networks - arne
Large data generation with the number of perturbations > number of proteins. - Roland Krause
Still prohibitively large number of networks even for small number of nodes. - Roland Krause
Use statistical physics methods to tackle combinatorial explosion of possible networks. - Barb Bryant
Inference using belief propagation known from statistical physics. - Roland Krause
Ah, here is where "belief" comes in. Network inference using belief propagation. Reference Riccardo Zecchina et al. - Barb Bryant
Instead of going through all the models that are possible, you derive statistical properties across a set of good models for each of the Wij weights in the model. - Barb Bryant
This is sort of like partition functions in statistical physics - Barb Bryant
evolving work on Wij (transition from Nelander et al 2008) - Shannon McWeeney
Cavity approach - optimize locally on global background iteratively cover all local cavities - Shannon McWeeney
Mm, this is rather opaque to me. - Barb Bryant
"Let me give you some intuition about how this all works." Yes, I'd like that. - Barb Bryant
Nice results on toy experiment - constraints from 10 experiments with 5 interactions (the nodes W in factor graph). - Shannon McWeeney
Almost looks too good - arne from iPhone
after step 1 - generation of probability distributions then step 2- decimation - Shannon McWeeney
So you have a probability distribution for each Wij, which represents the interaction between element i and element j. I'm not really getting how you "update" these probability distributions in the iterative steps. I do understand that at the end you take the most "certain" (narrowest) distribution and fix its value (some Wij) at the most probable value, then update all the other Wij's given this fixation. And so on. To get your final model in a sort of greedy fashion. - Barb Bryant
And by the way, the underlying model is a simple differential equation sort of thing: change of one variable xi is a sigmoidal function of weighted (Wij) sum of all variables xj, less a decay term. - Barb Bryant
thanks for the summary bb - Michael Jones
Mike! - Barb Bryant
Mentions bunches of other stuff in passing. Like bioPAX: paper in press. - Barb Bryant
bioPAX is community project on pathways, ontology, and exchange format. - Barb Bryant
"no science without people; science for the people; ask good questions" - Shannon McWeeney - arne from iPhone
Ask good questions !!!!! - arne from iPhone
Question: Interacting network tend to be modular, with strongly-interacting subnetworks that interact weakly with each other. ... - Barb Bryant
Chris: Is the modular approach really useful in confronting the data? [Is that what he said?] - Barb Bryant
Question: can you get at causal relationships? - Barb Bryant
Chris: yes - if the network model allows you to predict correctly the result of a particular perturbation applied to a particular node, then you can simulate using that model. - Barb Bryant
Question: with a big network, how many experiments will you need to model? - Barb Bryant
Chris: Good question. Could use an entropy measure. Help us figure this out. Help us design the experiments. It's important because of the costs of experiment. This is going to be broadly applicable in cell biology. - Barb Bryant
bb - he said one should see if approach is useful by confronting with real data - Shannon McWeeney from BuddyFeed
Ah, thx - Barb Bryant
Chris gets at the difference between a model that tells a story and a model that is truly predictive. - Barb Bryant
Question: yes, but, what are the semantics of the graph? What kinds of interaction? Answer: The semantics are in the mathematics of your model. - Barb Bryant
Question: mean field approach is interesting. Compared to Monte Carlo approach, you are assuming some decoupling. Loss of posterior coupling between weights - is that an issue? - Barb Bryant
Chris: If you look at a coupled system overall, the extent to which the algorithms work depends on correlations within the system. Long-range (in terms of network distance) correlations are problematic. There are some clever approaches to handle some of this. Mentions non-ergodic space; deal with parts of space separately or iteratively. - Barb Bryant
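The model described in the notes above (change of one variable xi is a sigmoidal function of the weighted sum of all xj, less a decay term) can be sketched as a small simulation. This is only an illustration under stated assumptions: the logistic form of the sigmoid, the forward-Euler integrator, the function names, and all numeric values are hypothetical and are not taken from the talk.

```python
import math


def sigmoid(z):
    # Logistic function; the talk notes only say "sigmoidal", so this form is assumed.
    return 1.0 / (1.0 + math.exp(-z))


def simulate(W, x0, decay, dt=0.01, steps=1000):
    """Forward-Euler integration of
        dx_i/dt = sigmoid(sum_j W[i][j] * x_j) - decay[i] * x_i
    W[i][j] is the (hypothetical) weight of the effect of node j on node i."""
    x = list(x0)
    n = len(x)
    for _ in range(steps):
        drive = [sigmoid(sum(W[i][j] * x[j] for j in range(n))) for i in range(n)]
        x = [x[i] + dt * (drive[i] - decay[i] * x[i]) for i in range(n)]
    return x


# Toy two-node network (made-up weights): node 0 activates node 1,
# node 1 inhibits node 0.
W = [[0.0, -2.0],
     [2.0, 0.0]]
print(simulate(W, x0=[0.1, 0.1], decay=[1.0, 1.0]))
```

A perturbation experiment in this picture amounts to changing an initial value or clamping a node and re-running the simulation; inferring the Wij from such experiments is the hard part that the belief propagation discussion addresses.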
PT31: Adrian V. - VARiD: A Variation Detection Framework for Colorspace and Letterspace platforms
Color space allowing to distinguish SNPs from sequencing errors. - Roland Krause
HMM based, states map to dinucleotides and therefore overlap. - Roland Krause
Colors are not modeled, insufficient data to deduce the sequence. - Roland Krause
Application of Forward-Backward algorithm gives distribution at each position. - Roland Krause
Extension to indels and heterozygous SNPs. Add several gap characters (by color), increasing to 1600 states. Not so problematic due to sparse transition probabilities. - Roland Krause
Support quality values and variable error rates for emissions. - Roland Krause
Performance similar to individual color space or letter space data. Major improvements with mixed data. - Roland Krause
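Why color space helps separate SNPs from sequencing errors can be shown with a short sketch. It assumes the usual two-base encoding in which each color is the XOR of the 2-bit codes of adjacent bases (A=0, C=1, G=2, T=3); the sequences and helper names are made up for illustration, and this is not VARiD's HMM.

```python
CODE = {"A": 0, "C": 1, "G": 2, "T": 3}
BASE = "ACGT"


def to_colors(seq):
    # One color per overlapping dinucleotide: XOR of the two base codes.
    return [CODE[a] ^ CODE[b] for a, b in zip(seq, seq[1:])]


def decode(first_base, colors):
    # Rebuild the letter sequence from a known first base and the colors.
    bases = [CODE[first_base]]
    for c in colors:
        bases.append(bases[-1] ^ c)
    return "".join(BASE[b] for b in bases)


def diff_positions(c1, c2):
    return [i for i, (a, b) in enumerate(zip(c1, c2)) if a != b]


ref = "ACGGTCA"
snp = "ACGTTCA"  # single base change at index 3 (G -> T)

ref_colors = to_colors(ref)
snp_colors = to_colors(snp)

# An isolated SNP changes the two adjacent colors that overlap the base.
print(diff_positions(ref_colors, snp_colors))

# A single miscalled color changes only one color, and decoding it
# naively corrupts every base downstream of the error.
err_colors = list(ref_colors)
err_colors[2] ^= 1
print(decode("A", err_colors))
```

Because the dinucleotides overlap, a true variant leaves a consistent two-color signature, while a lone color mismatch is more likely a sequencing error.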
PT29: Jared Simpson - Efficient construction of an assembly string graph using the FM-index
Nice introduction to Burrows-Wheeler transform, suffix arrays and FM-index for read assembly. - Roland Krause
Major improvement on memory usage. - Roland Krause
Expects the practical differences to other aligners primarily in the low coverage regions. - Roland Krause
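For readers unfamiliar with the data structures mentioned, here is a minimal sketch of a Burrows-Wheeler transform and FM-index backward search for counting exact matches. It is a naive teaching version (the suffix array is built by sorting whole suffixes and the occurrence table is stored uncompressed), so it shows none of the memory savings from the talk and does not build a string graph; all names are hypothetical.

```python
def build_fm_index(text):
    """Return (bwt, C, occ) for text, with '$' appended as terminator."""
    text += "$"
    # Naive suffix array: fine for a demo, far too slow for real read sets.
    sa = sorted(range(len(text)), key=lambda i: text[i:])
    # BWT: the character preceding each sorted suffix (wraps to '$' for i = 0).
    bwt = "".join(text[i - 1] for i in sa)

    counts = {}
    for ch in text:
        counts[ch] = counts.get(ch, 0) + 1

    # C[c]: number of characters in text strictly smaller than c.
    C, total = {}, 0
    for ch in sorted(counts):
        C[ch] = total
        total += counts[ch]

    # occ[c][i]: number of occurrences of c in bwt[:i].
    occ = {ch: [0] * (len(bwt) + 1) for ch in counts}
    for i, ch in enumerate(bwt):
        for c in occ:
            occ[c][i + 1] = occ[c][i] + (1 if c == ch else 0)
    return bwt, C, occ


def count_matches(pattern, bwt, C, occ):
    """Backward search: count exact occurrences of pattern in the text."""
    lo, hi = 0, len(bwt)  # half-open interval of suffix-array rows
    for ch in reversed(pattern):
        if ch not in C:
            return 0
        lo = C[ch] + occ[ch][lo]
        hi = C[ch] + occ[ch][hi]
        if lo >= hi:
            return 0
    return hi - lo


bwt, C, occ = build_fm_index("GATTACATTA")
print(bwt, count_matches("TTA", bwt, C, occ))
```

The same backward search, run over an index of all reads, is the kind of primitive that lets overlaps between reads be found without loading every read pair into memory.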
LBR12: Joseph Mellor - Interrogating Genetic Interaction Networks with High-Capacity Sequencing
Genetic interactions: two genes contribute jointly to a phenotype. 12 million interactions in yeast would need to be tested. - Roland Krause
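The 12 million figure is consistent with counting unordered gene pairs, n(n-1)/2. The sketch below does that arithmetic and inverts it; reading the figure as a pair count is an assumption, and the gene number it implies is inferred here rather than stated in the notes.

```python
import math


def n_pairs(n_genes):
    # Number of unordered pairs of distinct genes.
    return n_genes * (n_genes - 1) // 2


def genes_for_pairs(pairs):
    # Smallest gene count whose number of pairs reaches the given total.
    n = (1 + math.isqrt(1 + 8 * pairs)) // 2
    while n_pairs(n) < pairs:
        n += 1
    return n


print(genes_for_pairs(12_000_000))  # roughly 4,900 genes
print(n_pairs(4900))
```

The quadratic growth is the point: doubling the number of genes roughly quadruples the number of pairwise tests, which is what motivates a high-capacity sequencing readout.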
HL20: Smita Agrawal - Computational Models of the Notch Network Elucidate Mechanisms of Context-dependent Signaling
Presentation of a PLoS CB publication. - Roland Krause
Differential equation modeling of various aspects of Notch signaling, incl. binding, localization, translation, transcription. First order models for formation of biomolecules. - Roland Krause
Bistability in the Notch-Hes1 network. The switching points can be determined, undergoes hysteresis. - Roland Krause
Provides the cell with noise filtering. - Roland Krause
Response to a transient Delta signal. Signal can be short and high or continuous to switch the response on. - Roland Krause
Sensitivity to model parameters. Loss of bistability turns oncogenes on. - Roland Krause
Some parameters are not critical. Key parameter is repressive constant of Hes1. Systems start oscillating at some point. Bridged by a brief monostable state. - Roland Krause
Nice bridge between the biological and a complex model, well presented. - Roland Krause
Keynote: Svante Pääbo - Analyses of Pleistocene Genomes
This will probably be a very interesting talk. Just can't wait. - Tomasz Puton
Not just interesting, but most likely great. Svante is a fantastic speaker - arne
If you’re interested in human history, the genome is a great source of information. To reconstruct history, we compare sequences of people (and other species) living today. We use models of how DNA changes over time to understand the differences that exist today. This is an indirect way to study history, because we are reconstructing from the present what we think has happened in the past. - Barb Bryant
specimens are highly contaminated, .... - Venkata P. Satagopam
mtDNA - advantage of many copies per cell - Mickey Kosloff
original work from 1984 on Egyptian mummy - Shannon McWeeney
Replacement (out of africa theory) vs assimilation (i.e. geneflow from modern humans) - arne
mtDNA is extracted from a specimen from neanderthal - Venkata P. Satagopam
Started with the original neanderthal specimen - arne
The variation in human population origins before the split (as measured by mtDNA) of modern and neanderthals - arne
extract dna from skull, skip PCR and directly sequence - Mickey Kosloff
only 3.5% actually from neanderthal genome - Shannon McWeeney
Average length 50 nucleotides - arne
Vindija Cave, Croatia .... 3 bones - Venkata P. Satagopam
only about 3.5% of dna came from human - Mickey Kosloff
3 billion fragments - again most from bacteria - Shannon McWeeney
most dna is bacterial contaminants - Mickey Kosloff
avg genome cover is 1.5X - Venkata P. Satagopam
most DNA extracted is female; look at Y chrom % as contaminant - Ted Laderas
Three female samples (and therefore Y chromosome contamination can be used to calculate noise). Total risk of contamination is below 1% - arne
at any particular position - 1% chance contamination (broken down by source - 3 measures) - Shannon McWeeney
consistent nucleotide chemical changes at 5' and 3' ends - Mickey Kosloff
try to correct by alignments to human and chimpanzee genomes - Mickey Kosloff
Details on bioinformatics and alignment issues (led by Ed Green) can be found in Science paper - Shannon McWeeney
55% chance of seeing a position covered by at least 1 read - Ted Laderas
Divergence to human reference genome: 12% for Neanderthal; highest among humans is in San, 10% - arne
typical european (French) 8% divergence to human reference compared with 12% in neanderthal - Shannon McWeeney
78 amino acid substitutions ... a catalog of novel fixed features in the human genome - Venkata P. Satagopam
But this number will change - arne
novel fixed features in human genome - 78 aa substitutions (in paper) - now down to 50 - Shannon McWeeney
Three out of six proteins with 2 changes are skin expressed - arne
next focused on SNPs - Mickey Kosloff
detection of selective sweeps - look for SNPs in human, chimps, neanderthals - regions where neanderthal looks all ancestral. - Shannon McWeeney
S vs cM plot - visual inspection for widest spread - Shannon McWeeney
Most extreme case in THADA, transport and diabetes related - arne
Thada is risk allele for type 2 diabetes - implications for metabolism - Shannon McWeeney
detection of insertion in intron in Thada (not fixed in humans as initially thought in paper) - Shannon McWeeney
3-4% in Europe have the neanderthal version (and are protected against Diabetes Type II) - arne
interesting follow-up research here - positive selection yet cost with risk allele - Shannon McWeeney
RUNX2: Mutations cause CCD (Cleidocranial dysplasia) - arne
annotation of others associated with autism and other diseases including CCD - Shannon McWeeney
CCD of interest due to skull morphology phenotype - Shannon McWeeney
Now comes the most surprising result. - arne
focusing on - Interbreeding with modern humans? - Venkata P. Satagopam
Work by Rasmus Nielsen - arne
Is Craig Venter a "fully modern human" ? - arne
analysis of self-identified neanderthals who write to Svante - predominantly men. - Shannon McWeeney
Comparisons to genomes of humans from different continents suggest interbreeding occurred in the Middle East, before geographic expansion - Mickey Kosloff
:) - arne
45% men who are neandertals, 1% women are neandertals.... - Venkata P. Satagopam
future 10-20x coverage of genome - Mickey Kosloff
Future: (i) Better coverage (10-20x coverage) (ii) Functional analyses of candidate genes Exemplified by FoxP2 - arne
next topic - functional analysis of genes - foxp2 - Venkata P. Satagopam
FoxP2 is the same in human and neanderthal. - arne
hope to identify back-mutations in humans; cheaper to find these people because of low cost of sequencing - Ted Laderas
easier to check phenotypes in mice - Mickey Kosloff
Human FoxP2 in mouse: The mouse can not speak ! Large scale phenotype study (323 phenotypic traits). -> More cautious in a novel area (stays close to the wall). No difference after 3 minutes. Second phenotype: Altered vocalization !!! - arne
323 phenotypic traits ... studied .. - Venkata P. Satagopam
movement more cautious in humanized mice - Venkata P. Satagopam
next one is altered vocalization - Venkata P. Satagopam
Enard et al Cell 2009 - Venkata P. Satagopam
mice with human foxp2 grew longer neurons - Mickey Kosloff
Other hominid forms........ - arne
Denisova individual 1 Myears (400 diffs in mtDNA) - arne
very good keynote - Mickey Kosloff