The 6th Annual RECOMB Satellite on Regulatory Genomics, the 5th Annual RECOMB Satellite on Systems Biology, and the 4th Annual DREAM reverse engineering challenges will be held jointly at MIT on Dec 2-6, 2009. The meeting is hosted by the MIT Computational Biology group, the Broad Institute and the Computer Science and Artificial Intelligence Lab. See http://compbio.mit.edu/recombs... for details.
Evolutionary tree as input, existing species in leaves, extinct internal nodes, encoding 1 if protein is encoded by genome, 0 otherwise. MP, ML approaches summary.
- Oliver Hofmann
MP for binary alphabets originally (Fitch 71), extended countless times (non-binary alphabets for one)
- Oliver Hofmann
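For readers who want the parsimony step spelled out: a minimal sketch of Fitch-style maximum parsimony on a 0/1 presence/absence alphabet. The tree shape, species names and profile are hypothetical, and this is only the bottom-up counting pass, not the method presented in the talk.

```python
def fitch(tree, leaf_states):
    """Return (candidate_states, min_changes) for a rooted binary tree.

    A tree is either a leaf name (str) or a (left, right) tuple.
    """
    if isinstance(tree, str):
        return {leaf_states[tree]}, 0
    left_states, left_changes = fitch(tree[0], leaf_states)
    right_states, right_changes = fitch(tree[1], leaf_states)
    shared = left_states & right_states
    if shared:
        # Children agree on at least one state: no extra change needed here.
        return shared, left_changes + right_changes
    # Children disagree: keep both options and count one gain/loss event.
    return left_states | right_states, left_changes + right_changes + 1


# Hypothetical presence (1) / absence (0) profile of one protein family.
tree = (("sp1", "sp2"), ("sp3", "sp4"))
profile = {"sp1": 1, "sp2": 1, "sp3": 0, "sp4": 0}
root_states, changes = fitch(tree, profile)
print(root_states, changes)
```

The returned set lists the equally parsimonious states at the root; a second top-down pass would be needed to fix the internal node labels.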
Co-evolution: tendency of functionally and physically interacting proteins to co-evolve; can also be used to predict interaction
- Oliver Hofmann
Correlative patterns of evolution of sets of orthologs
- Oliver Hofmann
Infer the ancestral sequences based on the concept of co-evolution
- Oliver Hofmann
MP with standard label (1/0 encoded, non-encoded), [find trees that best explain co-evolution. Do not check mail while trying to follow.. sorry!]
- Oliver Hofmann
Example on 95 organisms, 4873 ortholog families and 10,000 co-evolution edges based on protein interactions, network proximity and more. Tree edge inferred using PAML, co-evolution weighted by co-occurrence
- Oliver Hofmann
Compare leaf missing value reconstruction of ACE with ML/MP
- Oliver Hofmann
Chromosome to large chunks, chunks, building blocks and oligos [her words :) ]
- Oliver Hofmann
Putting it back together by ligation, restriction digestion and ligation (on the chunk/large chunk scale)
- Oliver Hofmann
Requires careful screen and placement of (palindromic) restriction enzyme sites to create overhangs
- Oliver Hofmann
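A small sketch of the screening step mentioned above: checking that a recognition site is palindromic (equal to its own reverse complement) and locating its occurrences in a sequence. The example sequence is made up; the cost model and placement constraints from the talk are not covered.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def is_palindromic(site):
    # A palindromic site reads the same on both strands.
    return site == reverse_complement(site)


def find_sites(sequence, site):
    """Return 0-based start positions of every (possibly overlapping) match."""
    positions = []
    start = sequence.find(site)
    while start != -1:
        positions.append(start)
        start = sequence.find(site, start + 1)
    return positions


print(is_palindromic("GAATTC"))                    # EcoRI recognition site
print(find_sites("AAGAATTCAAGAATTC", "GAATTC"))    # hypothetical sequence
```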
Additional constraints: placement, no good idea on impact in introns, non-coding regions. Stick to coding regions as impact on codon change better understood. Prioritize non-essential genes, factor in cost of the enzyme (!), number of genes affected, etc
- Oliver Hofmann
Chunks of 10kb, restriction enzymes with average frequency (every 100kb, every 30kb) -- utilize modularity by combination of enzymes so that combined cutting sites are useful (nested overlapping sub-problem)
- Oliver Hofmann
Manual design: month of expert effort, yet average cost of 7.6 for chromosome 9R. Solution of the algorithm in 5 minutes, average cost of 2.1 (i.e., a much better solution given the defined constraints)
- Oliver Hofmann
Hurdles: from images to measurements, and onwards to knowledge
- Oliver Hofmann
(their solution to first hurdle: CellProfiler)
- Oliver Hofmann
Screens with thousands of samples combined with thousands of drugs, RNAi, identify 'hits' (that sample that is different)
- Oliver Hofmann
Difficulty scales: phenotype complexity, rareness
- Oliver Hofmann
Address phenotype by large-scale machine learning, rareness by combining data
- Oliver Hofmann
Use simple methods and few assumptions, leverage data size
- Oliver Hofmann
T47D cells unstimulated vs heregulin stimulus (breast cancer associated)
- Oliver Hofmann
Label 300 cell images manually curated for the desired phenotype. Use machine learning to screen for other, more subtle phenotypes
- Oliver Hofmann
Exploit experiment structure: three replicates, one negative two positive samples, check for motility. But only a small subset in the positive control exhibit change in motility, not enough for a classifier to pick up
- Oliver Hofmann
(45% to 55% cells with motility phenotype)
- Oliver Hofmann
Solution: use all cells during training instead of just a few hundred. Kernel matrix grows quadratically with training set size. Create a low-dimensional space (spanned by random Fourier bases, a few hundred dimensions) to approximate an RBF kernel (Rahimi and Recht, NIPS 2007)
- Oliver Hofmann
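A minimal sketch of the random Fourier feature idea, assuming a Gaussian kernel of the form exp(-gamma * ||x - y||^2). The dimensions, gamma and the two sample points are hypothetical; the point is only that an inner product in the random feature space approximates the kernel value, so a linear method can then be trained on millions of cells.

```python
import math
import random


def make_random_features(dim_in, dim_out, gamma, seed=0):
    """Build a map z(x) with z(x).z(y) ~= exp(-gamma * ||x - y||^2)."""
    rng = random.Random(seed)
    scale = math.sqrt(2.0 * gamma)  # std of the Gaussian frequency samples
    weights = [[rng.gauss(0.0, scale) for _ in range(dim_in)]
               for _ in range(dim_out)]
    offsets = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(dim_out)]
    norm = math.sqrt(2.0 / dim_out)

    def transform(x):
        return [norm * math.cos(sum(w_i * x_i for w_i, x_i in zip(w, x)) + b)
                for w, b in zip(weights, offsets)]

    return transform


def rbf_kernel(x, y, gamma):
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))


transform = make_random_features(dim_in=3, dim_out=2000, gamma=0.5)
x, y = [0.1, 0.4, -0.3], [0.5, -0.2, 0.0]
approx = sum(a * b for a, b in zip(transform(x), transform(y)))
exact = rbf_kernel(x, y, gamma=0.5)
print(exact, approx)
```

The approximation error shrinks roughly with the square root of the number of random features, which is why a few hundred dimensions can already be useful.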
7.6 million training cells with 130 measurements, mapped to a 250-dimensional random feature space to train Fisher's linear discriminant
- Oliver Hofmann
The random feature classifier is as good as the hand-trained human classifier
- Oliver Hofmann
(still, accuracy in AUC far from perfect)
- Oliver Hofmann
Aim is not to score single cells but whole samples, which consist of hundreds of cells
- Oliver Hofmann
Instead of hard-classifying cells, give each cell a probability of being stimulated based on its position on the histogram (soft labeling), and create a probability density function of the cells in the sample. Addresses uncertainty in classifying individual cells
- Oliver Hofmann
Likelihood for any ratio of cells that a sample was stimulated
- Oliver Hofmann
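One way to read the per-sample scoring, sketched under strong assumptions: each cell has a one-dimensional classifier score, the score densities of unstimulated and stimulated cells are known (here two hypothetical Gaussians), and a well is a mixture with an unknown stimulated fraction. The log-likelihood is then evaluated for any fraction. This is an illustration, not the speaker's actual model.

```python
import math


def gauss_pdf(x, mu, sigma):
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))


def well_log_likelihood(scores, fraction, p_off, p_on):
    """Log-likelihood of a well if `fraction` of its cells respond."""
    return sum(math.log(fraction * p_on(s) + (1.0 - fraction) * p_off(s))
               for s in scores)


def best_fraction(scores, p_off, p_on, steps=101):
    grid = [i / (steps - 1) for i in range(steps)]
    return max(grid, key=lambda f: well_log_likelihood(scores, f, p_off, p_on))


# Hypothetical score densities for unstimulated and stimulated cells.
p_off = lambda s: gauss_pdf(s, 0.0, 1.0)
p_on = lambda s: gauss_pdf(s, 1.0, 1.0)

print(best_fraction([-0.2, 0.1, 0.3, -0.4], p_off, p_on))
print(best_fraction([0.9, 1.2, 1.1, 0.8], p_off, p_on))
```

Because every cell contributes a soft term, no single cell has to be classified confidently for the well-level estimate to be sharp.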
Per-well accuracy much improved over the per-cell accuracy
- Oliver Hofmann
Allows identification of subtle, complex cellular phenotypes without human training, can screen for phenotypes 'invisible' to humans, and avoids premature thresholding and classification
- Oliver Hofmann
Systems biology needs a "third leg" to stand on that takes systems seriously: computer science.
- Oliver Hofmann
Offers concepts, mathematical techniques but most importantly a perspective on how to understand how objects interact to create a system behaviour
- Oliver Hofmann
(Network image with hundreds of components)
- Oliver Hofmann
10^19 molecular species that can generate a path for EGFR to Erk via Ras and Raf/Mek/Erk. Entirely too many differential equations
- Oliver Hofmann
Origins of complexity: post-translational modification, complex formation .. millions of states, "and the protein has not left the membrane yet". Problem of independence or concurrency, generated by the modularity
- Oliver Hofmann
(Balance between A and A-B-A, pathways via A-B or B-A independently)
- Oliver Hofmann
Concurrency implies you get to A-B-A faster than if you had only one pathway (sequential method)
- Oliver Hofmann
Change in dynamics, but also in equilibrium: more binders that 'suck up mass', scales with the number of concurrent paths
- Oliver Hofmann
What happens if you increase the concentration of the scaffold (B) rather than protein A? Chances of finding _some_ B increases. If it is _too_ large the problem becomes finding the _same_ B (otherwise left with two intermediates A-B, but no final A-B-A product).
- Oliver Hofmann
Equilibrium 'jamming' in concurrent systems
- Oliver Hofmann
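The scaffold effect can be sketched at equilibrium with a toy model: scaffold B has two identical, independent sites for A with the same dissociation constant. Those assumptions and all concentrations are hypothetical. The amount of complete A-B-A complex first rises and then falls as more B is added.

```python
def free_ligand(a_total, b_total, kd, iterations=200):
    """Solve a + 2 * b_total * a / (kd + a) = a_total for free A by bisection."""
    low, high = 0.0, a_total
    for _ in range(iterations):
        mid = 0.5 * (low + high)
        bound = 2.0 * b_total * mid / (kd + mid)
        if mid + bound > a_total:
            high = mid
        else:
            low = mid
    return 0.5 * (low + high)


def full_complex(a_total, b_total, kd):
    """Equilibrium concentration of scaffolds with both sites occupied."""
    a = free_ligand(a_total, b_total, kd)
    occupancy = a / (kd + a)          # per-site occupancy
    return b_total * occupancy ** 2   # independent sites: both occupied


for b_total in (0.1, 5.0, 500.0):
    print(b_total, full_complex(a_total=10.0, b_total=b_total, kd=0.1))
```

With too much scaffold, most A molecules sit alone on different B molecules, which is the 'finding the same B' problem from the notes.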
Concurrent assembly of 'rings' as another example of dynamic jamming
- Oliver Hofmann
A-B bind, at that point C can bind to B and form a ring (triangle) to A. Driven by interaction energy (assume favorable positional energy). Second bond (interaction energy) for free as C is already positioned.
- Oliver Hofmann
Alternative path: C binds to A on the A-B complex. Or A-C followed by B. Or...
- Oliver Hofmann
First step of each pathway uses up proteins required in the second step of other pathways. Dimers form quickly, soaking up all monomers, and the system is stuck as no monomers are left for the triangle formation (until there is a dimer dissociation)
- Oliver Hofmann
Energy plateau, then slow increase towards stable triangles
- Oliver Hofmann
Combinatorial complexity at the level of trimers (followed by hairball network image)
- Oliver Hofmann
2^n + n molecular species, n2^(n-1) assembly reactions
- Oliver Hofmann
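One reading that reproduces these counts, offered as an assumption rather than the speaker's definition: a scaffold with n independent sites, each binding its own ligand. Enumerating occupancy states gives 2^n scaffold species plus n free ligands, and one binding reaction for every state with a given site still empty.

```python
from itertools import product


def enumerate_scaffold_system(n):
    """Count species and binding reactions for a scaffold with n independent sites."""
    states = list(product((0, 1), repeat=n))      # occupancy pattern per scaffold
    species = len(states) + n                     # scaffold states + free ligands
    reactions = [(state, site) for state in states
                 for site in range(n) if state[site] == 0]
    return species, len(reactions)


for n in range(1, 7):
    print(n, enumerate_scaffold_system(n))
```

A rule-based description would replace the whole reaction list with n rules of the form "ligand i binds site i if it is free".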
Keep complexity implicit with rules: Rule B can 'pick up' / bind any A. Specify linear rules exploiting independence
- Oliver Hofmann
Rules as patterns that match or do not match a graph. Induce reactions in a mixture of species.
- Oliver Hofmann
Advantage: its resolution is tunable to different levels of complexity
- Oliver Hofmann
Rule expresses the local context required for an action. Expresses what is known empirically (or assumed); ideally the rule should express conditions necessary and sufficient for an action (different from what an experimentalist knows)
- Oliver Hofmann
Units of dynamics are patterns, not molecular species. Exact coarse-graining translating the rule system into an appropriate system of differential equations (Feret, Danos et al, PNAS 2009)
- Oliver Hofmann
Dynamic interpretation of the whole yeast interactome
- Oliver Hofmann
Thousands of complexes differ from cell to cell, or cell at time t1 and t2
- Oliver Hofmann
"Causality landscapes on networks". So what if there are so many events that limit independence (compartments, phosphorylation, ...)
- Oliver Hofmann
Assembly of a proteasome is not independent of all the other proteins competing for the same binding sites
- Oliver Hofmann
To understand why a system is architected in a given way requires understanding the kind of problems it is solving
- Oliver Hofmann
Processes, complexes etc might have a given shape precisely to prevent interference
- Oliver Hofmann
Decomposition of signal processing circuits into molecular devices (the engineering view)
- Oliver Hofmann
vs cognitive view that accepts network plasticity (pathways induced by the signals they process, networks emerge in competition for shared components, and interference is pervasive and essential). Control vs integration
- Oliver Hofmann
Feynman quote: What I cannot create I do not understand [tip of the hat to the morning keynote]
- Oliver Hofmann
Engineering active in prokaryotes, not in metazoa. The reverse statement -- what we do not understand we cannot create
- Oliver Hofmann
Goal: predict gene expression given sequence and TF concentrations. Even assuming you can calculate the thermodynamics and know the occupancy, the effect on the gene still cannot be quantified
- Oliver Hofmann
CRMs: cis-regulatory modules / enhancers. Short DNA segments, clusters of binding sites, act at a distance independent of orientation, independent of each other
- Oliver Hofmann
Again differential display system of the Drosophila blastoderm (segmentation)
- Oliver Hofmann
Summary of the eve (even-skipped) genomic initiation. Different enhancers with (partial) stripe specificity
- Oliver Hofmann
Stripe 2 enhancer alone: produces stripe 2. Just proximal region no expression. Combine and stripe 2 and 7 appear
- Oliver Hofmann
Juxtaposing two enhancers gives a novel pattern. Again a non-additive behaviour
- Oliver Hofmann
Quantitative expression data required to understand quantitative changes, method to calculate repressive and activating effects
- Oliver Hofmann
Quantitative Trans-Factor dataset: stained and scanned embryos (6 minute time slices), align, classify
- Oliver Hofmann
No thermodynamic model for the PPI between TF, cellular machinery.
- Oliver Hofmann
Feedforward model of transcription that depends on TF concentration. Uses fractional DNA occupancy by DNA binding factors, adaptor factors, binding site adaptor factors to determine transcription rate. Transcript number proportional to rate
- Oliver Hofmann
Model with quenching, competition components
- Oliver Hofmann
Competition distance dependent; quenching to introduce the notion that multiple binding sites are needed
- Oliver Hofmann
Applied to stripe 7 problem, model 7 transcription factors. Model correctly predicts non-additive behaviour
- Oliver Hofmann
Introduce stripe 2 enhancers from different Drosophila species, still correctly create stripe 2. Model of the combined enhancers results in a prediction matching the data despite sequence divergence
- Oliver Hofmann
[Skipping part of the notes trying to coordinate / steer folks to the museum in case someone wants to take over :) ]
- Oliver Hofmann
Recap of the segmentation patterns (maternal genes, gap genes, pair rule genes, segment polarity genes)
- Oliver Hofmann
Usually modeled as a 1D stripe using proteins
- Oliver Hofmann
Use the 2D surface embedded in 3D space, use protein and mRNA information
- Oliver Hofmann
Data from BDTNP: 2D data for Hb, Kr, Gt, Kni from 0-50 minutes. Input is Bicoid, Caudal, Tailless, Huckebein
- Oliver Hofmann
Using reaction-diffusion equations modeling diffusion and degradation of mRNA, transcriptional activators and repressors, followed by a non-linear function modeling the saturation of transcription and protein translation. Also includes Hunchback dimerization at different concentrations of its regulators
- Oliver Hofmann
(a non-linear transcriptional regulation)
- Oliver Hofmann
Simulate diffusion (discrete Laplace-Beltrami operator on the surface, error about 1%)
- Oliver Hofmann
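To make the diffusion term concrete, a heavily simplified 1D sketch: one explicit Euler step of diffusion, linear degradation and a production term on a line of cells with no-flux ends. The talk uses a discrete Laplace-Beltrami operator on the 2D embryo surface; the grid, rates and time step here are hypothetical.

```python
def diffusion_step(conc, diffusion, decay, production, dt, dx):
    """One explicit Euler step of dc/dt = D * d2c/dx2 - decay * c + production."""
    n = len(conc)
    updated = []
    for i in range(n):
        left = conc[i - 1] if i > 0 else conc[i]        # no-flux boundary
        right = conc[i + 1] if i < n - 1 else conc[i]   # no-flux boundary
        laplacian = (left - 2.0 * conc[i] + right) / dx ** 2
        rate = diffusion * laplacian - decay * conc[i] + production[i]
        updated.append(conc[i] + dt * rate)
    return updated


profile = [0.0, 0.0, 1.0, 0.0, 0.0]
no_source = [0.0] * len(profile)
for _ in range(10):
    profile = diffusion_step(profile, diffusion=1.0, decay=0.0,
                             production=no_source, dt=0.1, dx=1.0)
print(profile)
```

The explicit scheme is only stable for small steps (roughly D * dt / dx^2 at or below 0.5 in 1D), which is one reason real models tend to use more careful solvers.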
61 real parameters, 250k data points. Global optimization with CMA-ES (evolution strategy), and... [missed the second step...]
- Oliver Hofmann
Visualization of best fit solution for mRNA shows position dependency, problems anterior for GT, other genes with good fit over time course
- Oliver Hofmann
Discrepancy due to missing genes involved in the process (Ems, btd)? Not all enhancers covered.
- Oliver Hofmann
Validate model with gap gene mutants (Kni or Hb null mutations). Good results for Kni, miss one shift for Hb
- Oliver Hofmann
Remaining problem of the high dimensional non-linear optimization, missing genes and the (simplistic?) transcriptional regulation model
- Oliver Hofmann
Wagner (2003): does selection mold molecular networks? (Feedforward motifs, etc by chance?)
- Oliver Hofmann
Belief that the signaling phenotype reflects evolution
- Oliver Hofmann
Crucial ability for cells to recognize external states by different internal activity levels
- Oliver Hofmann
Active signaling system vs inactive ground state (signal off by default); spurious activations should be dampened rather than amplified
- Oliver Hofmann
Maintaining stable ground state favors and penalizes certain signaling network structures
- Oliver Hofmann
Perturbations in signal-off case with different effects depending on network structure
- Oliver Hofmann
Generic ODE model to formulate arbitrary activation patterns. Activation externally or by another internal species (generic)
- Oliver Hofmann
Stability of the steady state (with all activities at zero as the ground state)?
- Oliver Hofmann
Feedback family (sets of strongly connected components) sole determinant
- Oliver Hofmann
Use loop-less unweighted network to study topological effects alone
- Oliver Hofmann
Concept Kinetic ground state robustness (GSR) [sorry, no details on that one...]
- Oliver Hofmann
Design principles: networks should be acyclic; if there are cycles (feedback loops) they should not interlock; if interlocked for dynamic reasons, the dominant component should be small and have large girth
- Oliver Hofmann
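The feedback structure referred to above can be extracted from a directed activation graph by finding strongly connected components. A minimal sketch using Kosaraju's two-pass search; the example network is hypothetical, and the robustness measure itself is not reproduced here.

```python
def strongly_connected_components(graph):
    """Kosaraju's algorithm on a dict mapping node -> list of successors."""
    nodes = set(graph) | {t for targets in graph.values() for t in targets}
    order, seen = [], set()

    def visit(node):
        seen.add(node)
        for target in graph.get(node, ()):
            if target not in seen:
                visit(target)
        order.append(node)  # record finishing order

    for node in sorted(nodes):
        if node not in seen:
            visit(node)

    reverse = {node: [] for node in nodes}
    for source, targets in graph.items():
        for target in targets:
            reverse[target].append(source)

    components, assigned = [], set()

    def collect(node, component):
        assigned.add(node)
        component.add(node)
        for source in reverse[node]:
            if source not in assigned:
                collect(source, component)

    for node in reversed(order):
        if node not in assigned:
            component = set()
            collect(node, component)
            components.append(component)
    return components


def feedback_components(graph):
    """Components that contain at least one cycle (including self-loops)."""
    return [c for c in strongly_connected_components(graph)
            if len(c) > 1 or any(n in graph.get(n, ()) for n in c)]


network = {"S": ["A"], "A": ["B"], "B": ["A", "C"], "C": []}
print(feedback_components(network))
```

An acyclic network returns an empty list, which corresponds to the first design principle in the notes.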
Way to plot graph properties (connectivity, path lengths etc) vs robustness
- Oliver Hofmann
Abstract models help to understand network topology evolution
- Oliver Hofmann
Quantify -> Compare under genetic perturbation -> Reverse engineer
- Oliver Hofmann
Galactose network of S.cerevisiae as model system
- Oliver Hofmann
Summary of galactose import & metabolism
- Oliver Hofmann
Core regulatory mechanism or loop with GAL4, GAL80, GAL3, GAL6, multiple positive/negative feedback loops
- Oliver Hofmann
Yeast strain with Gal10 promoter reporter (positively correlated with overall model system activity). Grow in raffinose, galactose medium; transfer to new medium that has raffinose and percentage of galactose (no need to use Galactose, but available). Wait N hours and assay on the cell level after FACS
- Oliver Hofmann
Cells without galactose at N hour stage with low fluorescence, at 0.03% 100 times the reporter activity. At 0.0066% a bimodal distribution (off or on at lower level)
- Oliver Hofmann
Narrow range of gal with multiple subpopulations and sharp transitions. On/off activity _levels_ depends on Gal concentration
- Oliver Hofmann
Question: some cells in the middle - in transit or just undecided?
- Oliver Hofmann
Stochastic bifurcation structure; as a function of the input level, what is the number and location of modes of the distribution, what fraction of cells is near each mode, and how high is the variability at each mode?
- Oliver Hofmann
Poor functional conservation at the gene or protein level, but biological processes are well conserved
- Oliver Hofmann
Is there an intermediate metagene level that explains the conservation of functional data?
- Oliver Hofmann
1) Most interactions might be spurious, with only core interactions being conserved. 2) conservation of network occurs on a different level
- Oliver Hofmann
Interaction data comparison (cerevisiae, pombe, c elegans, drosophila)
- Oliver Hofmann
Spearman correlation of expression data, similar for PPI, Genetic Interactions, paralogs by Blast, and GO similarity score
- Oliver Hofmann
Conservation rates for edge-edge comparison: 20% at expression level 60% PPI, 10% GI, 24% GO
- Oliver Hofmann
Create combined weighted network for each species, partition using Markov clustering (MCL) to create modules or metagenes
- Oliver Hofmann
Distinguish between within-module edges and between-module edges
- Oliver Hofmann
Within more highly conserved (about 2-fold) than between
- Oliver Hofmann
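The within-module vs between-module comparison can be sketched as a simple tally. Assumptions: edges of the second species are already mapped to the first species' gene names via orthology, and every gene has a module label. All gene names and modules below are made up.

```python
def conservation_rates(edges_a, edges_b, module_of):
    """Fraction of species-A edges also present in species B, split by edge type."""
    conserved = {frozenset(edge) for edge in edges_b}   # undirected edges
    counts = {"within": [0, 0], "between": [0, 0]}      # [conserved, total]
    for u, v in edges_a:
        kind = "within" if module_of[u] == module_of[v] else "between"
        counts[kind][1] += 1
        if frozenset((u, v)) in conserved:
            counts[kind][0] += 1
    return {kind: (hit / total if total else float("nan"))
            for kind, (hit, total) in counts.items()}


# Hypothetical toy data.
module_of = {"a": 1, "b": 1, "c": 2, "d": 2}
edges_a = [("a", "b"), ("b", "c"), ("c", "d"), ("a", "d")]
edges_b = [("b", "a"), ("c", "d"), ("a", "d")]
print(conservation_rates(edges_a, edges_b, module_of))
```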
Results robust even when 40% of edges are removed
- Oliver Hofmann
Condition specific networks with better conservation than the general case
- Oliver Hofmann
Comparison to coding regions with higher conservation than overall genome; functional (internal) module interactions more conserved than overall conservation
- Oliver Hofmann
Property of cancer cells (high genomic instability). Distinction between driver mutations (cancer related) and others
- Oliver Hofmann
arrayCGH common approach to quantify DNA copy number
- Oliver Hofmann
Standard pipeline: segment profile, call significant gains and losses, cluster samples and identify frequent variants. Mistakes and biases propagate forward.
- Oliver Hofmann
Non-pipeline approach based on max-margin clustering. Linear function that partitions samples into two groups. Algorithm determines the function and labels the samples
- Oliver Hofmann
Key emerging problem: relating observed genomic variation to phenotypic variation
- Oliver Hofmann
Need that magic model that transforms a gene perturbation query into a phenotype. What form should these models take?
- Oliver Hofmann
Two observations. 1: Simplest disease (lethality) not tied to the protein but to the machine to which the protein belongs. Machine matters, not the protein. 2: Protein levels may be more conserved evolutionarily than mRNA levels (Schrimpf, PLoS Biol 2009)
- Oliver Hofmann
Models should benefit from consideration of proteins and interactions across evolutionary distance species
- Oliver Hofmann
Guilt by association models for the black box? Functional protein networks with probabilistic edges indicating involvement in similar biological processes
- Oliver Hofmann
Learned from PPI, genetics, functional genomics...
- Oliver Hofmann
Static networks, but work surprisingly well. Set of published examples from yeast, worm, plants; identified novel genes that change worm lifespan, reverse tumors, ...
- Oliver Hofmann
Scales up to mammals (Gray, Nature Cell Biology 2009 for fuzzy mice with developmental defects)
- Oliver Hofmann
Second model: Phenologs (aka orthologous phenotypes)
- Oliver Hofmann
Simpler predictors that compare traits directly
- Oliver Hofmann
Plenty of systematic measurements of model organisms; a different matching problem though: what is the worm collection of breast cancer genes?
- Oliver Hofmann
Identify overlapping sets of orthologous genes such that each gene in a given set gives rise to the same phenotype in that organism
- Oliver Hofmann
Provides a prediction if a gene was tested in one organism, not the other, and belongs to the orthologous set
- Oliver Hofmann
High incidence map of C elegans maps to human breast and ovarian cancers. Overlap with genes that lead to high incidence of male progeny (9 human, 13 worm, 3 overlap, remaining C elegans become predictive)
- Oliver Hofmann
Systematic discovery of phenologs in human, yeast, C elegans, mouse, test against permutation sets
- Oliver Hofmann
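A common way to score the overlap between two phenotype gene sets is a hypergeometric tail probability. The notes only mention permutation tests, so treat this as an assumed stand-in. The set sizes 9, 13 and 3 come from the example above; the total number of shared orthologs (2000) is a hypothetical placeholder.

```python
from math import comb


def overlap_p_value(total, size_a, size_b, overlap):
    """P(at least `overlap` shared genes) when drawing size_b of `total` at random."""
    upper = min(size_a, size_b)
    hits = sum(comb(size_a, k) * comb(total - size_a, size_b - k)
               for k in range(overlap, upper + 1))
    return hits / comb(total, size_b)


# 9 human genes, 13 worm genes, 3 shared; 2000 orthologs is an assumed universe.
print(overlap_p_value(total=2000, size_a=9, size_b=13, overlap=3))
```

The result depends strongly on the size of the ortholog universe, so the printed value should not be read as a figure from the talk.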
Lots of built in controls (human cataracts to mouse cataracts), but plenty of novel ones.
- Oliver Hofmann
Yeast lovastatin sensitivity predicts mammalian angiogenesis defects
- Oliver Hofmann
Out of 62 predictors 3 in the literature, 59 tested in frog development, 5 additional confirmed by in situ/knockdown. 33x higher discovery rate than expected by chance.
- Oliver Hofmann
SOX13 knockdown removes veins; same effect in human cells (umbilical vein endothelial cell angiogenesis assays)
- Oliver Hofmann
Detected in an organism that has no blood system (or veins...)
- Oliver Hofmann
Phenologs more strongly interconnected in protein networks and capture molecular subsystems of proteins, evolutionary conserved systems of proteins relevant to the given disease
- Oliver Hofmann
Combine this with the black box model 1 approach (networks)
- Oliver Hofmann
Neural tube birth defect genes confirmed by phenologs and protein networks; predictor abnormal cilia morphology in worms
- Oliver Hofmann
Extend to network neighbours of predictors, 2/2 validate
- Oliver Hofmann
So.. what about the plant model of human disease?
- Oliver Hofmann
23k gene-phenotype associations in Arabidopsis
- Oliver Hofmann
Shade avoidance defect (...) to ear development defects in mammals
- Oliver Hofmann
Gravitropism defect (no response to gravity) to Waardenburg syndrome (2-5% of human deafness)
- Oliver Hofmann
Waardenburg: Defect in neural crest cell population in developing embryo. Genes involved in gravitropism in plants should affect neural crest. One confirmed by literature, two others tested (SEC23IP localizes to NC and induces defects when knocked down in frog embryos)
- Oliver Hofmann
Protein abundances conserved, stoichiometries likely to be conserved too
- Oliver Hofmann
Use redefined SCC measure in an association-analysis approach (combinatorial search)
- Oliver Hofmann
Validation in 3 lung cancer data sets and the combined data set
- Oliver Hofmann
Statistical significance against random permutation test (gene pairs and phenotypes); yield 88 significant size-3 patterns
- Oliver Hofmann
About half could not have been found in full-space DC approaches
- Oliver Hofmann
Ten-gene subspace DC pattern, not correlated in 92 cancer patients (90%), co-expressed in 63% of normal patients. Enriched with TNF-a/NFKb pathways
- Oliver Hofmann
Can be used to study demographic, genetic differences in each class
- Oliver Hofmann
Input: set of known knockout effect pairs, challenge: predicting new effect. Previous work: SPINE (Ourfali 2007), Physical network models (Yeang 2004)
- Oliver Hofmann
Assumption that each PPI has activation or repression effect
- Oliver Hofmann
[Somewhat difficult to describe the current graphical network model. Outline of the physical network and knockout effects to determine +/- effects]
- Oliver Hofmann
Consistent physical networks: no paths with different signs between genes
- Oliver Hofmann
Two groups model derived from consistent physical network
- Oliver Hofmann
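A sketch of the consistency check as I understood it: treat the network as an undirected graph with +1 (activation) and -1 (repression) edges, and try to split genes into two groups so that positive edges stay within a group and negative edges cross groups. If that fails, some pair of genes is connected by paths of different sign. The gene names are hypothetical and this is not the published algorithm.

```python
from collections import deque


def two_groups(nodes, signed_edges):
    """Return a node -> group (0/1) mapping, or None if the signs are inconsistent."""
    adjacency = {node: [] for node in nodes}
    for u, v, sign in signed_edges:
        adjacency[u].append((v, sign))
        adjacency[v].append((u, sign))
    group = {}
    for start in nodes:
        if start in group:
            continue
        group[start] = 0
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v, sign in adjacency[u]:
                expected = group[u] if sign > 0 else 1 - group[u]
                if v not in group:
                    group[v] = expected
                    queue.append(v)
                elif group[v] != expected:
                    return None  # two paths with different overall sign
    return group


genes = ["A", "B", "C"]
print(two_groups(genes, [("A", "B", 1), ("B", "C", -1), ("A", "C", -1)]))
print(two_groups(genes, [("A", "B", 1), ("B", "C", 1), ("A", "C", -1)]))
```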
Model can explain 83% of KO effects. Very high cover rate, can decide on a prediction for 95% of the experiments
- Oliver Hofmann
Annotate interaction between A and B based on their predicted KO effect, assume sign of interaction is identical. Tested on two signaling pathways (filamentous growth, high osmolarity glycerol). Accuracy 72% vs 64%, coverage 67% vs 13% in SPINE
- Oliver Hofmann
Annotation creates consistent physical network
- Oliver Hofmann
Causal inference rather than association the aim, distinguish causal and reactive models (work by Schadt)
- Oliver Hofmann
Yang et al, Nature Genetics 2009. Networks facilitate the identification of causal genes
- Oliver Hofmann
Framework for data integration graphic probabilistic models to integrate Omics data, enhanced by literature. Integration improved network quality (based on comparison with existing Knockout data, GO terms)
- Oliver Hofmann
Leonardson, Hum Mol Genet 2009: human blood expression time series during a fasting/meal/postprandial cycle vs fasting
- Oliver Hofmann
Gene signature analysis (expression and connectivity)
- Oliver Hofmann
[Must be tired.. having the hardest time following this one. Someone want to take over?]
- Oliver Hofmann
Granger causality to identify causality in (long) time series. Combine the individual short time series into one long one (after removing individual scale variation)
- Oliver Hofmann
Identifies Per1 as a key causal regulator in food response
- Oliver Hofmann
Differences in time lags and other data types difficult to model this way; switch to dynamic Bayesian networks
- Oliver Hofmann
Picture of a bike as a model of human diseases: stuff that can go wrong, and how do components overlap. No handle bars, no steering.. but also no brakes.
- Oliver Hofmann
Focus on disease genes vs disease mutations? OMIM with disease genes, higher probability that a random mutation in a disease gene will result in disease
- Oliver Hofmann
Focus on mendelian, germ line inherited diseases
- Oliver Hofmann
Parallel pathways vs duplicate genes and their (different?) contribution to organism robustness
- Oliver Hofmann
Presence of duplicate significantly decreases probability of disease phenotype measured by conditional probability of a disease gene on the distance to the most similar gene
- Oliver Hofmann
Genes with 90% sequence identity homologs 3 times less likely to harbor disease genes than those with remote homologs
- Oliver Hofmann
Parallel pathways do not contribute to robustness against deleterious mutations
- Oliver Hofmann
Mapping disease genes on PPI (Feldman, PNAS 2008), similar work from the Vidal lab:
- Oliver Hofmann
intermediate PPI connectivity (degree) -> highest probability of _germ-line_ disease mutations
- Oliver Hofmann
(need to be important, but not _that_ important)
- Oliver Hofmann
Same is true for intermediate tissue distribution [would seem to make sense: no house-keeping genes, no genes essential for a vital tissue]
- Oliver Hofmann
About 1/3 of disease genes are pleiotropic. Example of single gene with three (E.C.) functions or through different biological process (each domain involved in different process). Pleiotropy depends on the biological processes, NOT the number of different functions (within one process)
- Oliver Hofmann
Rational design requires solid prior knowledge. Model network function (PPI-Regulatory-Signaling-Metabolic): a trade-off between accuracy and scale
- Oliver Hofmann
Kinetic models at one extreme, topological analysis at the other. Constraint-based models chosen here as a balanced approach
- Oliver Hofmann
Developed in Palsson lab, UCSD. Predict metabolic reaction rates under steady state constraints
- Oliver Hofmann
Usual constraints: mass balance, capacity, thermodynamic constraints, ...
- Oliver Hofmann
Rely on selection pressure to drive the network towards maximized biomass production rate
- Oliver Hofmann
Modify network such that metabolite of interest is maximized as a result of biomass optimization (as a by-product)
- Oliver Hofmann
OptKnock optimization problem (searches the knockout space that maximizes cellular objective along with metabolic production aim)
- Oliver Hofmann
Tends to be overly optimistic (alternative paths); RobustKnock (here) tries to maximize the minimal guaranteed chemical production rate
- Oliver Hofmann
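The difference between the optimistic and the guaranteed objective can be shown on a toy example without a linear programming solver. Everything here is hypothetical: the network is reduced to a handful of alternative routes, each with a fixed biomass and product rate, and a knockout simply removes every route that uses the deleted reaction. Real constraint-based models optimize over a continuous flux space instead.

```python
from itertools import combinations

# Hypothetical routes: (reactions used, biomass rate, product rate).
ROUTES = [
    ({"r1", "r2"}, 1.0, 0.0),
    ({"r1", "r3"}, 0.8, 0.0),
    ({"r1", "r4"}, 0.8, 0.5),
    ({"r5"}, 0.6, 0.4),
]


def product_range(routes, knockouts):
    """(min, max) product rate among biomass-optimal routes after the knockouts."""
    alive = [route for route in routes if not (route[0] & knockouts)]
    if not alive:
        return None  # nothing left: the knockout is lethal in this toy model
    best_biomass = max(biomass for _, biomass, _ in alive)
    products = [p for _, biomass, p in alive if biomass == best_biomass]
    return min(products), max(products)


def best_knockout(routes, reactions, size, robust):
    """Pick the knockout set maximizing the worst case (robust) or best case product."""
    best = None
    for knockout in combinations(sorted(reactions), size):
        outcome = product_range(routes, set(knockout))
        if outcome is None:
            continue
        score = outcome[0] if robust else outcome[1]
        if best is None or score > best[0]:
            best = (score, knockout)
    return best


reactions = {"r1", "r2", "r3", "r4", "r5"}
print(best_knockout(ROUTES, reactions, size=1, robust=False))  # optimistic choice
print(best_knockout(ROUTES, reactions, size=1, robust=True))   # guaranteed choice
```

In this toy case the optimistic objective picks a knockout whose biomass-optimal alternatives include one with no product at all, while the max-min objective picks a knockout with a lower but guaranteed rate.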
[Details on the math formulation of the optimization problem, should be in the paper]
- Oliver Hofmann
Showcases sample results for different cases, including triple-knockouts in Ethanol (bio-fuel) production
- Oliver Hofmann
Improved solution space boundaries for RobustKnock vs OptKnock
- Oliver Hofmann
Ongoing work to develop microbe metabolic phenotypes, ...
- Oliver Hofmann
Focus on mouse embryonic stem cells (ES)
- Oliver Hofmann
Self-renew, differentiate (in vivo or vitro) into around 120 cell types including the germ line
- Oliver Hofmann
Set of known key TFs and pathways (Oct4/Sox2 etc), or inhibition of Erk1/2 and GSK3. Nanog levels may define two different states of pluripotency
- Oliver Hofmann
Nanog part of an auto-regulatory circuit maintaining pluripotency
- Oliver Hofmann
Knock down Nanog, or Sox2 or Oct4, and one gets blocks of gene expression changes that are superimposable (i.e., same target genes)
- Oliver Hofmann
See Lemischka review in Nature Reviews Molecular Cell Biology
- Oliver Hofmann
Measure network dynamics during changes in cell fate a requirement. Technical limitations restricted to snapshots at one potential regulatory level
- Oliver Hofmann
Recap of the epigenetic molecular and temporal cell landscape
- Oliver Hofmann
How to convert the snapshots to movies: Lu et al, Nature 463 (2009)
- Oliver Hofmann
ES cell lines to tune Nanog expression levels. Vector with Nanog shRNA, inducible promoter. Depend on doxycycline for Nanog expression, tight control of levels (endogenous Nanog knocked down, replaced by new version)
- Oliver Hofmann
New publication (last week's nature?): Trace histone marks, Pol II, RNA, 1600 proteins by MassSpec. Day 0, Day 1/3/5 of ES cells
- Oliver Hofmann
"Sobering results": if anything an anti-correlation on the expression level; activity of encoded proteins cannot be determined for this sample
- Oliver Hofmann
Representation of high dimensional data sets: three dimensional heatmaps, interactive
- Oliver Hofmann
Views on all known Nanog, Polycomb targets. Group them by RNA, Protein levels etc. Same option for gene sets (all TFs, drill down to targets of TFs)
- Oliver Hofmann
Nanog snapshot (Wang, Nature 2006, Orkin lab) converted to a movie, split up into different time points
- Oliver Hofmann
Stem cells: the movie -- coming soon (to the lab website)
- Oliver Hofmann
Databases and software to follow (GATE), interfacing prior knowledge. Superimpose existing targets and observe how they relate to the experimental data
- Oliver Hofmann
What happens when pulsing Nanog? Turn off, turn on a bit later (the theory of nanog levels and different pluripotency states) -- do you recover the pluripotent state?
- Oliver Hofmann
nanog changes not reflected in changes of other pluripotency markers, work in progress at the single cell level
- Oliver Hofmann
9 markers in 200 different ES cells monitored by qRT-PCR. Some with very narrow distribution (Oct4, Sox2); Nanog with widely distributed expression levels
- Oliver Hofmann
[Love the visualization / interaction system, hopefully available at some point]
- Oliver Hofmann
Nanog and Esrrb interacting at different regulatory levels. What happens when Esrrb is taken away (same approach as in Nanog)?
- Oliver Hofmann
Strong discordance between mRNA/protein levels
- Oliver Hofmann
Monitor landscape after Esrrb down-regulation
- Oliver Hofmann
Expand pipeline to additional data types, conditions
- Oliver Hofmann
Extended prior knowledge system as a predictive tool used to plot expression level changes after taking away Nanog or Esrrb: while closely linked to each other (auto-regulation, PPI) strong changes in target fold changes
- Oliver Hofmann
Functional validation by comparing against published RNAi hits
- Oliver Hofmann
Transfect Nanog-GFP cells with shRNA, after 24h treat with RA+, follow with FACS
- Oliver Hofmann
Knockdown of gene required for neuronal differentiation vs knockdown of gene required for self-renewal observable by reaction to RA stimulus
- Oliver Hofmann
shRNA against 350 usual suspects. Hit Oct4, reporter goes away rapidly; hitting components of RAR prolongs the time in which Nanog is expressed
- Oliver Hofmann
Rank targets by relative promoter construct expression
- Oliver Hofmann
Identify SWI/SNF as a potential switch to dismantle the pluripotency network
- Oliver Hofmann
.. underlies incomplete penetrance in multicellular development
- Oliver Hofmann
Why are individuals different? Excluding genetic differences, what about random variation (e.g., in clones)?
- Oliver Hofmann
(Random) Variation in gene expression leads to cell-to-cell variation in mRNA and protein number, examples from bacteria and yeast. How variable are multi-cellular organisms, how reliable is development?
- Oliver Hofmann
Massive variation of gene expression between cells (Raj et al, 2006)
- Oliver Hofmann
WT C.elegans with robust development (cell lineage fate)
- Oliver Hofmann
Some fractions of mutants still reveal wild type in the case of incomplete penetrance
- Oliver Hofmann
[Err. Slide text barely readable. Gene names likely to be off...]
- Oliver Hofmann
Track single mRNA in situ in individual embryos (quantitatively)
- Oliver Hofmann
Focus on elt-2. Consistent in WT. In mutants, late embryos generally do not express elt-2, but there are rare embryos carrying the mutation that still show almost WT-level expression
- Oliver Hofmann
Track development stage by nuclei count; track elt-2, end-1, end-3, med-1/2; plot number of RNAs
- Oliver Hofmann
med-1/2 practically gone. Only remaining connection to elt-2 is end-1, which is very heterogeneous in the mutant compared to WT. end-1 is a controller of elt-2: sometimes there is enough RNA/protein to switch on elt-2, and the required level can be measured in transcript counts
- Oliver Hofmann
Lower threshold of elt-2 expression results in lower penetrance of mutant phenotype
- Oliver Hofmann
T2D with genetic and environmental component
- Oliver Hofmann
Identify the predisposing genetic factors in diabetes prone vs resistant mice (B6 vs 129)
- Oliver Hofmann
Similar at 6 weeks, very different phenotype after six months of high lipid diet
- Oliver Hofmann
Use gene network enrichment analysis. Couple PPI and differentially expressed genes, identify high-scoring subnetworks by simulated annealing (Liu et al 2007, PLoS Genetics), characterize networks
- Oliver Hofmann
Run the algorithm multiple times, count the number of subnets in which each gene appears, permute sample-to-phenotype labels and repeat, compare real tallies to background to obtain p-values/z-scores
- Oliver Hofmann
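[A minimal sketch of the tally-versus-background comparison described above, not the speaker's implementation. The function names, the population standard deviation and the +1 pseudocount in the empirical p-value are my own assumptions; the subnetwork search itself (simulated annealing) is assumed to have run already and is not shown.]

```python
from statistics import mean, pstdev

def gene_tallies(subnets):
    """Count in how many subnetworks each gene appears."""
    tallies = {}
    for subnet in subnets:
        for gene in set(subnet):  # count a gene once per subnetwork
            tallies[gene] = tallies.get(gene, 0) + 1
    return tallies

def z_and_p(real_tally, background_tallies):
    """Compare one gene's real tally to its tallies from label-permuted runs."""
    mu = mean(background_tallies)
    sd = pstdev(background_tallies)
    if sd > 0:
        z = (real_tally - mu) / sd
    else:
        # Degenerate background: no spread at all
        z = 0.0 if real_tally == mu else float("inf")
    # Empirical p-value; the +1 pseudocount (an assumption) avoids p == 0
    exceed = sum(1 for b in background_tallies if b >= real_tally)
    p = (exceed + 1) / (len(background_tallies) + 1)
    return z, p

# Hypothetical usage: three real runs, five permuted-label tallies for gene "a"
real = gene_tallies([["a", "b"], ["a"], ["a", "c"]])
z, p = z_and_p(real["a"], [1, 0, 1, 0, 1])
```

With so few permutations the empirical p-value is coarse; the pseudocount bounds it below by 1/(number of permutations + 1).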
Combine multiple biologically related experiments using dependent tests (LJ Wei, Biometrika 1985)
- Oliver Hofmann
Strongest difference is in adipose tissue. Inflammatory genes differ in adipose tissue of both old and young mice, supported by GSEA, despite identical metabolic profiles
- Oliver Hofmann
Extent of inflammation response increases with age (moderate to strong differential expression)
- Oliver Hofmann
Confirmed inflammatory biomarkers by RT-PCR. Marker neighbours (of MCP1, CD45, Thy1) follow similar trends
- Oliver Hofmann
Intrinsic strain differences in inflammatory cell populations, B6 with more inflammatory cells than 129
- Oliver Hofmann
Same trend with dietary fat. Uncertain whether this is the same mechanism as the age effect
- Oliver Hofmann
[@Golnaz: That's quick -- thanks for the pointers!]
- Oliver Hofmann
Brief recap of chemotaxis in E. coli, tumble movement, adaptation and sensitivity
- Oliver Hofmann
Assays in capillary tubes, swarm plates
- Oliver Hofmann
Switch to graphical notations for networks (SBGN, standardized notation). Still no fully executable semantics, concurrency, independency
- Oliver Hofmann
Use a state-based method (Statecharts, Harel 1987), which extends classical state machines with hierarchy, orthogonal states, communication and history
- Oliver Hofmann
Abstraction (clustering) allows higher-level representations
- Oliver Hofmann
Two-tier formalism, link high-level representation with lower level details (inter/intracellular); models fully executable
- Oliver Hofmann
High level states: tumbling, growth states, run states, (Surviving group, metabolism group, flagellum group, ...)
- Oliver Hofmann
Dynamic flux balance analysis, update of constraints at set intervals in the Rhapsody simulation environment
- Oliver Hofmann
StochSim (Morton-Firth 1998) as stochastic simulator
- Oliver Hofmann
Model system of LNCaP prostate cell lines (treated/untreated)
- Oliver Hofmann
Overlap between measured gene expression and protein changes is small (13 out of 347, 70 differentially expressed genes/proteins). Partially due to a static look at a complex dynamic system
- Oliver Hofmann
Key signaling proteins frequently hidden from high-throughput assays (but reflected in changes of their targets)
- Oliver Hofmann
Use topological scoring to identify hidden regulators (enrichment of the set of direct, indirect targets of protein of interest)
- Oliver Hofmann
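[The notes do not record the exact topological score used in the talk. As a stand-in, here is a plain hypergeometric enrichment test: is the set of (direct or indirect) targets of a candidate regulator over-represented among the differentially expressed genes? The function name and the counts in the usage line are hypothetical.]

```python
from math import comb

def target_enrichment_p(n_universe, n_targets, n_hits, n_overlap):
    """P(X >= n_overlap) for X ~ Hypergeometric.

    n_universe: all genes in the network
    n_targets:  targets of the candidate regulator
    n_hits:     differentially expressed genes
    n_overlap:  differentially expressed genes that are also targets
    """
    max_k = min(n_targets, n_hits)
    total = comb(n_universe, n_hits)
    tail = sum(
        comb(n_targets, k) * comb(n_universe - n_targets, n_hits - k)
        for k in range(n_overlap, max_k + 1)
    )
    return tail / total

# Hypothetical counts: 40 of 200 hits fall among 300 targets in a 5000-gene network
p = target_enrichment_p(5000, 300, 200, 40)
```

A regulator that is itself invisible in the assay can still get a small p-value here, because only its targets need to be measured.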
Usual problems: complex, inter-connected networks. Requires a score to identify targets which are 'more directly related' to a protein of interest
- Oliver Hofmann
Find networks that are _specific_ to a regulator influence (exclude networks that have competitive regulators)
- Oliver Hofmann
Workflow: topological scoring of up-regulated genes and up-regulated proteins (MetaCore framework), comparative functional analysis to identify mechanisms with the most support and functional analysis
- Oliver Hofmann
Intersection between predicted set of regulators 52% (gene/protein)
- Oliver Hofmann
Growth factor signaling as the second most significant cascade in both data sets (G1-S regulation), not directly related to AR mechanism (shown by removing AR targets from enrichment analysis)
- Oliver Hofmann
Overlap in this network increases to 76%, even higher in subsets (IGF signaling part)
- Oliver Hofmann
Scored TF set, discrepancies between expression and protein levels
- Oliver Hofmann
Growth factor signaling might represent a second phase response not directly related to androgen stimulus
- Oliver Hofmann
Antibiotic resistance as a public health problem: spread of resistant strains vs decrease in new drugs
- Oliver Hofmann
Antibiotics usually studied by their primary target interaction. Reactive oxygen species (ROS, kill the bacteria, not just slow growth), try to sensitize to ROS
- Oliver Hofmann
Aerobic growth is a ROS balancing act (basal production vs scavenging capacity, above that death); results in a dose response to the drug
- Oliver Hofmann
Rather than increasing the drug dose, decrease the gap between basal production and scavenging capacity
- Oliver Hofmann
Focus on increasing the basal production rate, a metabolic engineering problem
- Oliver Hofmann
Problem: ROS not a stoichiometric product, standard tools not applicable
- Oliver Hofmann
In addition the problem of unknown ROS source; enzymes that can produce ROS span the entire metabolism
- Oliver Hofmann
Complex engineering problem. Recap of Flux Balance Analysis (FBA). Mass Balance / steady state / thermodynamic constraints and optimization problem
- Oliver Hofmann
Useful for knockout performance but does not model opportunistic ROS production
- Oliver Hofmann
"Outfit" FBA. Find enzymes with ROS-generating cofactors, assume opportunistic production of ROS is proportional to the flux of metabolites through these enzymes (FAD, Quinone, Fe-S)
- Oliver Hofmann
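[A toy sketch of the proportionality assumption above, not the speaker's model. It assumes flux distributions have already been computed by FBA elsewhere; the reaction names, per-reaction weights and flux values are made up for illustration.]

```python
def ros_proxy(fluxes, ros_weights):
    """ROS production proxy: weighted sum of absolute flux through
    reactions whose enzymes carry ROS-generating cofactors."""
    return sum(w * abs(fluxes.get(rxn, 0.0)) for rxn, w in ros_weights.items())

def rank_perturbations(wild_type, perturbed, ros_weights):
    """Rank perturbations by predicted ROS gain over wild type (largest first)."""
    base = ros_proxy(wild_type, ros_weights)
    gains = {
        name: ros_proxy(flux, ros_weights) - base
        for name, flux in perturbed.items()
    }
    return sorted(gains.items(), key=lambda kv: kv[1], reverse=True)

# Hypothetical weights (e.g. per FAD, quinone or Fe-S enzyme) and fluxes
weights = {"R1": 1.0, "R2": 0.5}
wild_type = {"R1": 2.0, "R2": 4.0}
knockouts = {
    "ko_A": {"R1": 3.0, "R2": 4.0},   # reroutes more flux through R1
    "ko_B": {"R1": 1.0, "R2": -2.0},  # reversible reaction running backwards
}
ranking = rank_perturbations(wild_type, knockouts, weights)
```

Absolute flux is used so that a reversible reaction contributes regardless of direction; that choice is also an assumption.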
Find metabolic network perturbations that increase ROS
- Oliver Hofmann
2300 reactions for 12000 genes, 16000 metabolites, ROS reactome of 130 reactions. Assume minimal glucose media. Top predictions tested by directly adding H2O2 and testing how much is needed to hit the scavenging capacity
- Oliver Hofmann
3/5 predictions with increased H2O2 sensitivity (multiple logs of percent survival change)
- Oliver Hofmann
Negative controls (as the production of ROS is a black box), picked 5 non-essential genes. Experimentally no change in sensitivity.
- Oliver Hofmann
Also tested for superoxide, still 25x increase in sensitivity, less than 2x in negative controls
- Oliver Hofmann
Targets include Cytochrome bo3, Gly Dehydrogenase, Succ dehydrogenase
- Oliver Hofmann
Test predictions against 'cidal antibiotics. Ampicillin with increased sensitivity, now testing Gentamicin, Norfloxacin
- Oliver Hofmann
Probabilistic inference of pathway activities
- Oliver Hofmann
Expression (array) profiling as disease classifiers or progression estimators
- Oliver Hofmann
Gene marker discovery: identification of differentially expressed genes challenging (small sample size vs size of feature set, inherent noise and heterogeneity)
- Oliver Hofmann
Doubts on the usefulness of classifiers built on individual gene markers
- Oliver Hofmann
Breast Cancer Metastasis data set (USA Dataset, Wang et al 2005; Netherlands Data set, van't Veer 2002). Each study with about 70 genes, three of which are shared. Low performance on cross-dataset comparison
- Oliver Hofmann
How to design more robust classifiers for reproducible results?
- Oliver Hofmann
Analyze at the level of functional modules (overcome the independent selection of marker genes)
- Oliver Hofmann
Switch to pathway markers, joint analysis of expression levels of functionally related genes [I was wondering whether this was going to be GSEA, but they combine expression information]
- Oliver Hofmann
Methods required to summarize pathway member gene expression levels. Previous results indicate that pathway comparison is more robust and provides insights into functional biology
- Oliver Hofmann
Widely used methods: mean/median level, magnitude of first component of PCA, CORG (condition-responsive genes) uses mean of differentially expressed genes
- Oliver Hofmann
Improve by using a probabilistic approach to measure pathway activity and to identify the best markers
- Oliver Hofmann
Assume expression level of gene x has different distributions in different phenotypes, calculate log-likelihood ratio between phenotypes given the expression level g(x); compare overall LLR as the sum of ratios of all pathway members
- Oliver Hofmann
Combines support from each gene in the pathway (naive bayes model) [Assumes conditional independence of genes in a pathway?]
- Oliver Hofmann
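[A minimal sketch of the summed log-likelihood-ratio idea from the last two notes. The Gaussian form of the per-gene distributions is my assumption (the notes do not say which distribution was used), as are the function names and toy parameters.]

```python
import math
from statistics import mean, stdev

def gaussian_logpdf(x, mu, sigma):
    """Log density of a normal distribution N(mu, sigma^2) at x."""
    return -0.5 * math.log(2 * math.pi * sigma ** 2) - (x - mu) ** 2 / (2 * sigma ** 2)

def fit_gene(values_pheno1, values_pheno2):
    """Estimate (mean, sd) of one gene's expression in each phenotype."""
    return (
        (mean(values_pheno1), stdev(values_pheno1)),
        (mean(values_pheno2), stdev(values_pheno2)),
    )

def pathway_activity(sample, params):
    """Sum of per-gene log-likelihood ratios (phenotype 1 vs phenotype 2).

    sample: gene -> expression level in one sample
    params: gene -> ((mu1, sd1), (mu2, sd2)) for the pathway members
    Summing treats genes as conditionally independent given the phenotype.
    """
    total = 0.0
    for gene, ((mu1, sd1), (mu2, sd2)) in params.items():
        x = sample[gene]
        total += gaussian_logpdf(x, mu1, sd1) - gaussian_logpdf(x, mu2, sd2)
    return total

# Toy pathway: geneA separates the phenotypes, geneB does not
params = {"geneA": ((0.0, 1.0), (2.0, 1.0)), "geneB": ((1.0, 1.0), (1.0, 1.0))}
score = pathway_activity({"geneA": 0.0, "geneB": 1.3}, params)
```

A positive score favours phenotype 1, a negative score phenotype 2; uninformative genes such as geneB contribute zero.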
Compare to USA, Netherlands data set using MSigDB pathways, high t-test score indicates marker is effective in phenotype discrimination. Improved over mean/median, CORG
- Oliver Hofmann
Rank markers based on t-score, discriminative power of top P% markers in second data set. Significantly higher t-score compared to CORG et al, particularly for top marker selection
- Oliver Hofmann
Probabilistic representation of networks. Standard methods for undirected graphs: UG inference, graphical lasso, scale for small sample sizes.
- Oliver Hofmann
Scaling problem for DAGs given the sample size. Feedback relationships (cycles) require directed cycles but do even worse given the sample sizes
- Oliver Hofmann
Aim: directed inference for realistic sample sizes, assumption: each expression trait has a perturbation and each perturbation only has one directed edge
- Oliver Hofmann
If that is the case the undirected graph can be used to infer a directed cyclic graph
- Oliver Hofmann
Motivation: sample of n individuals from a population with m genetic loci, with phenotype p (gene expression products). As long as the changes are far away from each other assume they only have a cis-eQTL effect
- Oliver Hofmann
Identify cis-eQTLs (SNPs and expression products) in population studies and use as input for the described method
- Oliver Hofmann
[I am going to skip the notes on the isomorphism proof...]
- Oliver Hofmann
Combine perturbation, mutation information, infer undirected graph from the data, use significant non-zero partial correlations from the undirected graph to infer the directed regulatory network
- Oliver Hofmann
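[To make "non-zero partial correlation" concrete: a small sketch of the first-order partial correlation between two traits given a third. This only illustrates the quantity; it is not the inference method from the talk, and the toy vectors are made up.]

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def partial_corr(x, y, z):
    """Correlation of x and y after removing the linear effect of z."""
    rxy, rxz, ryz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (rxy - rxz * ryz) / math.sqrt((1 - rxz ** 2) * (1 - ryz ** 2))

# Toy example: x and y are both driven by z plus independent noise
z = [1, 1, -1, -1]
x = [2, 0, 0, -2]   # z + [1, -1, 1, -1]
y = [2, 0, -2, 0]   # z + [1, -1, -1, 1]
marginal = pearson(x, y)             # x and y look correlated
conditional = partial_corr(x, y, z)  # but not once z is accounted for
```

In the undirected graph, an edge between x and y would be kept only if the partial correlation is significantly non-zero; here the marginal correlation is explained away by z.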
Preliminary results from HapMap (270 individuals, 4 populations) along with expression information from four immortalized lymphoblastoid cell lines
- Oliver Hofmann
Survey of identified small networks with consistent biological function (cell cycle regulation, etc)
- Oliver Hofmann