Recomb-Sat/DREAM 08

For coverage of the 5th Annual RECOMB Satellite on Regulatory Genomics, the 4th Annual RECOMB Satellite on Systems Biology, and the 3rd Annual DREAM reverse engineering challenges. http://compbio.mit.edu/recombs...
Roland Krause
David J Reiss: High-resolution transcriptome analysis reveals prevalent regulatory logic interspersed within prokaryotic coding sequences
Halobacterium salinarum is explored with high-resolution tiling arrays. - Roland Krause
GX time course analysis and ChIP-chip studies. - Roland Krause
75% of all transcripts and 5% additional detected, 20% have a 5'UTR, most have a long, trailing 3'UTR and overlaps. Putative ncRNAs with no confirming peptides, several antisense transposases. Conditional operons are seen often. Resolution of 700 conditions. - Roland Krause
Neil would have certainly liked this ... - Roland Krause
Roland Krause
The write-up of the NYAS has appeared. Several of the keynote talks are now available in higher quality. - http://www.nyas.org/ebriefr...
Roland Krause
The videos of the conference have been posted. - http://compbio.mit.edu/recombs...
Roland Krause
Keynote Uri Alon: On the evolution of modularity
A modular system can be separated into units that perform almost independently - Roland Krause
Most networks we can imagine are not modular - Roland Krause
Optimal systems are not modular - Roland Krause
Biology is modular, we would not understand it without modules - Roland Krause
Network motifs exist - Roland Krause
Evolutionary models use computers to evolve networks towards a goal. Initial population - select - mutate - reproduce, select etc. Systems evolved in this way (neural networks, digital systems) tend not to be modular and show no network motifs. - Roland Krause
How can evolution produce modularity and network motifs? - Roland Krause
A toy problem of circuits with NAND gates is evolved by rewiring. The desired output is (X XOR Y) AND (Z OR W) as a fitness goal. The solution is typically small and non-modular. - Roland Krause
Engineers (software e.) use modules, simple systems that work and can be re-used, including re-wiring. - Roland Krause
Goals change over time. Switch a goal every 20 generations with common sub-problems: modularly varying goals. Organisms will evolve to a state in which they can re-wire quickly to respond to goals. They display a modular structure. - Roland Krause
The circuits discover the hidden modularity of the goals. They also exhibit network motifs. - Roland Krause
Scales well with problem size. - Roland Krause
What happens when goals become fixed? The modularity is lost. Modular solutions use additional gates and are larger. - Roland Krause
Modular systems (with switching goals) evolve faster than fixed-goal systems. The speed-up of modular systems increases as a power law. - Roland Krause
The extinction 60 million years ago left an early mouse-like mammal that evolved quickly into whales, primates etc., much quicker than a computer simulation would predict. - Roland Krause
Evolution under constant environments is typically slow. Modular solutions work well in a fitness landscape - not simulated annealing but changing goals. - Roland Krause
Modules (goals) exist on organismal, cellular and protein levels. If the goal is constant, the system should not be modular. Ribosomal proteins have a fixed goal and are therefore not modular (Yanai & Lancet, Trends in Genetics) - Roland Krause
Parter, Kashtan, Alon, BMC Evol Biol 2007. Bacterial metabolic networks are modular for complex environments, simple for parasites. - Roland Krause
Great talk, sitting in a lecture theater on Saturday at 9am never was so useful. - Roland Krause
Q: Speed up for multi-cellular organisms? Does each cell type face different goals. A: Proteins in eukaryotes are more modular. And yes. - Roland Krause
Q: Application to yeasts - which growth conditions are modular? A. Sugars might vary, TCA cycle is constant. Modules for metabolizing different sugars. - Roland Krause
Q. Do modern organisms lose modularity? Should conserved elements be more or less modular? A: Not explored yet, interesting field of research. - Roland Krause
Q: Chemostat data, 1000 generations are possible. Do we see loss of modularity? A: 1000 generations will only see one or two modules decreasing. Chemostat not explored. E. coli and Buchnera show the difference more clearly through the loss of modules. - Roland Krause
Q: Relationship between goals and modules: frequency of goal switching? A: A large range of goal switching times shows modularity. - Roland Krause
A recent article on the matter, which includes some of the aspects of the talk (and much more) http://www.ploscompbiol.org/article... - Roland Krause
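The toy problem from the notes above (NAND circuits evolved by rewiring, with the goal switched every 20 generations) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the speaker's simulation: the second goal `goal_g2` is hypothetical (the notes only give the first goal), the circuit is restricted to feed-forward wiring, and a simple accept-if-not-worse hill climber stands in for the population-based select/mutate/reproduce loop.

```python
import itertools
import random

def goal_g1(x, y, z, w):
    # Fitness goal quoted in the notes: (X XOR Y) AND (Z OR W)
    return (x ^ y) & (z | w)

def goal_g2(x, y, z, w):
    # Hypothetical second goal sharing the same sub-problems (assumption)
    return (x ^ y) | (z | w)

def evaluate(circuit, inputs):
    """Evaluate a feed-forward NAND circuit.

    Signals 0-3 are the inputs X, Y, Z, W; gate i reads two earlier
    signals and writes signal 4 + i. The last gate is the output.
    """
    signals = list(inputs)
    for a, b in circuit:
        signals.append(1 - (signals[a] & signals[b]))  # NAND
    return signals[-1]

def fitness(circuit, goal):
    """Fraction of the 16 input combinations the circuit gets right."""
    cases = list(itertools.product((0, 1), repeat=4))
    hits = sum(evaluate(circuit, c) == goal(*c) for c in cases)
    return hits / len(cases)

def mutate(circuit, rng):
    """Rewire one input of one randomly chosen gate to an earlier signal."""
    new = list(circuit)
    i = rng.randrange(len(new))
    a, b = new[i]
    src = rng.randrange(4 + i)
    new[i] = (src, b) if rng.random() < 0.5 else (a, src)
    return new

def evolve(n_gates=12, generations=200, switch_every=20, seed=0):
    """Hill-climb under modularly varying goals (goal switches periodically)."""
    rng = random.Random(seed)
    goals = [goal_g1, goal_g2]
    circuit = [(rng.randrange(4 + i), rng.randrange(4 + i)) for i in range(n_gates)]
    history = []
    for gen in range(generations):
        goal = goals[(gen // switch_every) % len(goals)]
        candidate = mutate(circuit, rng)
        if fitness(candidate, goal) >= fitness(circuit, goal):
            circuit = candidate
        history.append(fitness(circuit, goal))
    return circuit, history

# A hand-built modular solution for goal_g1: an XOR module (4 gates),
# an OR module (3 gates) and an AND module (2 gates).
HANDMADE = [(0, 1), (0, 4), (1, 4), (5, 6),   # X XOR Y -> signal 7
            (2, 2), (3, 3), (8, 9),           # Z OR W  -> signal 10
            (7, 10), (11, 11)]                # AND     -> signal 12
```

`fitness(HANDMADE, goal_g1)` scores the hand-built modular circuit, and `evolve()` returns the final wiring together with the per-generation fitness trace, so the drop in fitness at each goal switch can be inspected.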
Roland Krause
Keynote Timothy R. Hughes: Protein-nucleic acid interaction mapping
The last talk of the conference (the Hughes 2004-Hughes) - Roland Krause
"Cells don't use comparative genomes, Chips or motif searching." - Roland Krause
Topic: Nucleosomes vs DNA - Roland Krause
Full genome for yeast available from tiling arrays (Lee ... Hughes 2007) - Roland Krause
Build a linear model that predicts nucleosome occupancy with TFBS; DNA structural data etc. - Roland Krause
Lasso model analysis identifies TFBS and some DNA struc. Weights are almost always negative, exclusion dominant element. General TFs with short motifs. Default is to occupy. - Roland Krause
Test occupancy in vitro: does mutation of a TF have the expected effects? Mix chicken histones and dsDNA and use Solexa for ID. GCGC motif found very often. - Roland Krause
How much do all 150-mers weigh? - Roland Krause
In vitro and in vivo correlate (R^2 0.74) - Roland Krause
GC content explains most of the variation, about half of it. - Roland Krause
C. elegans and human data by other data sets. TFs in human might exclude more regions. - Roland Krause
Do TF mutants change nucleosome occupancy (NO)? This can be shown for Abf1 and Reb1; other TFs show smaller effects (smaller number of genes) - Roland Krause
Rsc3 and Rsc30 motifs - Roland Krause
Rsc3 contributes to nucleosome landscape - Roland Krause
A worthy final keynote with interesting science and a highly spirited and entertaining speaker. - Roland Krause
Many thanks for the coverage of the meeting! - Thomas Lemberger
I have my doubts whether the coverage was useful for all talks, it was quite tough to follow some of the highly condensed 12 minute presentations of complex subjects. One should probably focus on commenting the abstract as it already holds most of the important information but that would require a different set up. Well, let's see what we can do about it in the future. - Roland Krause
yep, thanks for the coverage it was very useful. - Pedro Beltrao
Roland Krause
Keynote Chris Burge: Global patterns in tissue- and factor-specific RNA processing
Posttranscriptional regulation affects the majority of genes. What is the context dependence of RNA processing? How do splice forms differ between individuals? How do they react to RNAi? - Roland Krause
mRNA-seq (Illumina) is the enabling technology. Tissues are mixed and are not pure. Data set that was generated from 9 tissues and 5 cell lines and cerebellum cortex samples from 9 individuals. 435M reads were generated. - Roland Krause
Looking at individual genes shows read positions that map to the exon-intron structure, including tissue-specific AS. Confirmation with Affy arrays. - Roland Krause
Eight common alternative transcript events like skipped exon, retained intron, etc. can be identified by mRNA-seq. - Roland Krause
Fisher exact test can be used to determine tissue specificity, FDR set at 5%. - Roland Krause
35% skipped exons, 15% A5SS, 16% A3SS, mutually exclusive exons at 4%, alternative first exon at 13%, last exon 8%, tandem 3'UTR 7%, retained intron 1%. All of them are tissue specific at ~65%. - Roland Krause
Small difference between tissue or individual differences is surprising, tissues still more diverse. - Roland Krause
Splicing codes govern constitutive and alternative splicing, reviewed in Wang and Cooper. - Roland Krause
A subset of AS events shows switch-like regulation - percent spliced gives a score. Some genes switch only in e.g. heart. (Xing and Lee 2005) - Roland Krause
Often a switched exon changes the reading frame and results in an aberrant transcript. - Roland Krause
Higher conservation of switch-like tandem 3'UTRs. - Roland Krause
Some motifs are conserved in 3'UTRs, not in introns. Additional roles for binding factors such as Quaking and Fox-1/2. - Roland Krause
Q: Detecting selection differences in splice sites, is there enough data (Manolis Kellis) A. Not huge differences, we would like to have more data. - Roland Krause
Q: Junctions are important with mRNA-seq, what's the coverage required? A: A million reads mapped uniquely to a junction(?) - Roland Krause
Q: Properties of switch-like exons? A: Skipped exons compared to mutually exclusive exons - the latter are more tissue specific and have similar length, probably related to protein structure/function. - Roland Krause
Q. Functional characteristic of switch-like exons. A: GO enrichment possibly in signal transduction but not enough data yet. - Roland Krause
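The two quantities mentioned in the notes above, a "percent spliced" score per exon and a Fisher exact test for tissue specificity, can be sketched as follows. This is only an illustration under assumptions: the function names are hypothetical, read counts are assumed to be split into exon-including and exon-skipping reads, and the two-sided p-value uses the common "sum of all tables no more likely than the observed one" convention. The 5% FDR step is not shown.

```python
from math import comb

def percent_spliced_in(inclusion_reads, exclusion_reads):
    """Fraction of reads supporting inclusion of the exon (None if no reads)."""
    total = inclusion_reads + exclusion_reads
    if total == 0:
        return None
    return inclusion_reads / total

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact test for the 2x2 table [[a, b], [c, d]].

    Example layout (assumed): rows are two tissues, columns are
    inclusion reads and exclusion reads for one exon.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of seeing x in the top-left cell
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    p_obs = prob(a)
    # Sum the probabilities of all tables at most as likely as the observed one
    total = sum(p for p in (prob(x) for x in range(lo, hi + 1))
                if p <= p_obs * (1 + 1e-9))
    return min(1.0, total)
```

For example, `fisher_exact_two_sided(8, 2, 1, 5)` compares an exon with 8 inclusion and 2 exclusion reads in one tissue against 1 inclusion and 5 exclusion reads in another.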
Roland Krause
Scot A. Wolfe: A systematic characterization of Drosophila transcription factors via a bacterial one-hybrid system
The hunt for all TFs and TFBS in Drosophila with a one-hybrid system by KO of omega factor (of RNAP), introduce a fusion protein with omega. Noyes et al NAR 2008 - Roland Krause
Characterize all 84 homeodomains (finished), and TFs in anterior-posterior patterning - Roland Krause
Noyes, Cell, 2008, 11 groups of homeodomains recognition motifs. Correlation with specificity of TFBS and recognition motifs in conserved pos in homeodomain proteins, some single AA changes - Roland Krause
Build a code to understand the binding, modeling TF-protein contacts - Roland Krause
AP-patterning similarly, all data available at http://veda.cs.uiuc.edu/cgi-bin... - Roland Krause
Q: Compare one-hybrid and SELEX? A: One-hybrid advantages: only one round, in vivo, mimics the eukaryotic system. - Roland Krause
Q: Are PWMs a good model? Data set big enough. A: Not explored, probably yes. - Roland Krause
Roland Krause
Leping Li: GADEM: A genetic algorithm guided formation of spaced dyads coupled with an EM algorithm for motif discovery
Will focus on one aspect of the algorithm. - Roland Krause
Approaches to find binding sites: word enumeration, ranking. Deterministic. Alternative: statistical, estimate the parameters of a model with PWM, Gibbs sampler and EM - Roland Krause
How to find informative start for stats approach? Circular problem - Roland Krause
MEME converts every subsequence into a starting model. - Roland Krause
For large data sets there are many starting models, try different lengths etc. - Roland Krause
This approach: cover a variety of lengths and computationally examine a small subset only. - Roland Krause
Start with spaced dyads, w1 spacer w2. Count all 3- to 6-mers. Rank them in groups. Short enriched words are likely to be involved in a motif. - Roland Krause
Convert spaced dyads to a starting PWM, run EM, scan for binding sites using the optimized PWM. - Roland Krause
Find the best spaced dyads using genetic algorithms. - Roland Krause
Tested on simulated ChIP data, compared well with MEME, identified 8 out of 30 motifs (?) - Roland Krause
Results fast, even if the complete run takes many days to compute you get access to initial results. - Roland Krause
Q. If your educated guess starts on a local optimum, how do you ensure the maximum is reached? A. Apparently not a practical problem. - Roland Krause
Q: Use of conservation information? A. Not considered. - Roland Krause
An extension of MEME approach, applicable for larger number of sequences. - Roland Krause
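The first steps described in the notes above (count all 3- to 6-mers, form a spaced dyad w1-spacer-w2, convert it to a starting PWM, scan for sites) can be sketched as below. This is a rough illustration, not GADEM itself: the genetic algorithm that selects dyads and the EM refinement are left out, and the function names and the `core_weight` value used to soften the dyad into a PWM are assumptions made for the example.

```python
from collections import Counter

ALPHABET = "ACGT"

def count_kmers(sequences, k_min=3, k_max=6):
    """Count every k-mer with k_min <= k <= k_max across the sequences."""
    counts = Counter()
    for seq in sequences:
        for k in range(k_min, k_max + 1):
            for start in range(len(seq) - k + 1):
                counts[seq[start:start + k]] += 1
    return counts

def dyad_to_pwm(w1, spacer, w2, core_weight=0.7):
    """Turn a spaced dyad (w1, spacer length, w2) into a starting PWM.

    core_weight is a hypothetical choice: the dyad letter gets that
    probability, the other three bases share the rest, and spacer
    columns are uniform.
    """
    other = (1.0 - core_weight) / 3.0

    def word_columns(word):
        return [{b: (core_weight if b == base else other) for b in ALPHABET}
                for base in word]

    spacer_columns = [{b: 0.25 for b in ALPHABET} for _ in range(spacer)]
    return word_columns(w1) + spacer_columns + word_columns(w2)

def site_probability(pwm, site):
    """Probability of a site under the PWM (unknown bases count as 0.25)."""
    p = 1.0
    for column, base in zip(pwm, site):
        p *= column.get(base, 0.25)
    return p

def best_site(pwm, seq):
    """Return (start, probability) of the best-scoring window in seq.

    Assumes seq is at least as long as the PWM.
    """
    width = len(pwm)
    scored = [(site_probability(pwm, seq[s:s + width]), s)
              for s in range(len(seq) - width + 1)]
    score, start = max(scored)
    return start, score
```

In a fuller pipeline the most frequent words from `count_kmers(...).most_common()` would supply candidate `w1` and `w2`, and the PWM from `dyad_to_pwm` would be the starting point for EM rather than the final model.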
Roland Krause
Xuejing Li: Learning regulatory motifs from gene expression trajectories using graph-regularized partial least squares regression
The first talk of the last session. "Learning graph-mer motifs that predict GX trajectories in development." - Roland Krause
From seq to GX in time series in C. elegans development. - Roland Krause
Clustering is not useful for time series, rather go from promoter sequence (X) to expression (Y). Motif vector of k-mers. OLS cannot be applied due to dimensionality problem - Roland Krause
Partial least squares with dimension reduction, subsequent OLS regression by maximizing the covariance - Roland Krause
Define a graph on k-mers after setting most components to zero (sparsity constraint). - Roland Krause
9000 genes in 12 time points, scan for 6- and 7-mers, additional gene expression data in sperm/oocyte. - Roland Krause
Top 50 k-mers on sperm/oocyte data: identify interconnected regions with MCODE, produce PSSMs, recover two known sperm motifs. Time series data (?) identifies high-GC motifs, conserved in C. briggsae. - Roland Krause
A challenging project, I would wonder how it compares to other approaches that go from seq to GX. - Roland Krause
Q: Role of TFs, not present in model? A: Time series highly similar, difficult to extract. - Roland Krause
Roland Krause
Harmen J. Bussemaker: Linkage analysis of inferred transcription factor activity reveals regulatory networks
Static, genomic information and dynamic cell state. A TF-centric point of view should make it possible to explain GX. From sequence to affinity, concentration and occupancy. How to quantify affinity? Change in Kd from PSAM. Estimate from ChIP experiments. - Roland Krause
How to estimate GX from TF? Problem of post-TL modification of TFs. - Roland Krause
Promoters have many binding sites, deconvolve computationally to a single site (see Foat, Tepper, Bussemaker NAR). - Roland Krause
Model hidden TF activities, examples for successful studies in vitro. - Roland Krause
Data source: genetic variation in yeasts (Rockman, Kruglyak) - Roland Krause
eQTLs as in Jansen Nat Rev Genetics - Roland Krause
Existing approaches: eQTL hotspots or co-expression of groups. - Roland Krause
New approach: PSAMs from 123 TFs (MacIsaac 2006) - Roland Krause
Infer activity and compare different TF activities. - Roland Krause
aQTL results: Hap1 linked to itself, Swi5 post-TL modification - Roland Krause
I'd say "great work" if he would not compete with the people next door. - Roland Krause
Q: Wow! What are the thresholds? A: More than two steps eliminate statistical power. - Roland Krause
Q: Overlap with GX and aQTL? A: Several examples, an older version had little overlap, might be orthogonal approaches. - Roland Krause
One of the novel, important contributions to this conference. Earmark. - Roland Krause
Roland Krause
D. J. Verlaan: Targeted screening of cis-regulatory variation in human haplotypes
The hunt for disease causing regulatory SNPs by analyzing SNPs in non-coding regions. - Roland Krause
Allelic expression, intronic and exonic SNPs mapped to haplotypes for 1103 genes. - Roland Krause
Pooled 55 HapMap individuals (gDNA vs cDNA), Sanger seq of SNPs of interest. 2532 SNPs, 757 genes show AE, 34% had 2 SNPs. - Roland Krause
Validation, increased for genes with 2 or more SNPs - Roland Krause
Map of allelic expression. - Roland Krause
24 genes associated with disease validated. - Roland Krause
454 seq for 300 targets of allelic frequency correlates well with Sanger sequencing, showing that NG machines can be used for AE. - Roland Krause
Roland Krause
Jeffrey A. Rosenfeld: Repression at the single-nucleosome level
Focus on a particular aspect, H3K9me3 (K9), of a large scale study of histone modification - Solexa data from Barski et al (I cannot identify the reference) - Roland Krause
5% of human genes are enriched in K9 modification - Roland Krause
L1 repeats are highly enriched in K9, long regions typically found near centromeres. - Roland Krause
Q: Bias in sequencing. A: Clear differences with other types of histone mods despite possible biases. - Roland Krause
Roland Krause
Keynote Leona D. Samson: Genomic predictors of interindividual differences in response to DNA damaging agents.
Recapitulating the motivations and ideas leading to the publication in Genes and Development. http://www.pubmedcentral.nih.gov/article... - Roland Krause
Q: Why was p53 not identified? A: Possibly linked to its function as TF (low expression?) - Roland Krause
The 24 cell lines are anonymized, so no way of linking outcomes to different groups of people. (Source: http://ccr.coriell.org/Section...) - Roland Krause
Roland Krause
Keynote Eddy Rubin: Genome Wide Set of Human Enhancers
A catalog of genes exists, a catalog for regulatory sequences is missing. - Roland Krause
Deletion of a Sonic hedgehog enhancer results in limbless mice, similar conditions apply in human. - Roland Krause
Comparative genomics projects identify enhancers, which can be taken to the mouse with reporter genes. This is however slow and "not very genomic". - Roland Krause
Tested 1000 enhancers, ~47% confirmed. - Roland Krause
Three different limb enhancers with different localization patterns with temporal and spatial specificity. - Roland Krause
Most enhancers found are involved in brain development in mice. - Roland Krause
A ChIP-seq approach using co-activators specific to enhancers promises specific seqs. - Roland Krause
Mice at e11.5 are dissected into midbrain, forebrain and limbs and screened; reproducible peaks are tested in the mouse model. - Roland Krause
ChIP-seq predictions are more successful at >80% with p300. Specific for the tissues tested. - Roland Krause
Relation between binding site and expression of neighboring genes can be explored by Affy chips at e11.5 midbrain tissue, showing enrichment of genes. - Roland Krause
Larger studies with more tissues are possible due to NG seq capabilities. Comparative genomics approaches identified only few enhancers with limited success. - Roland Krause
Many genes were known in Gat6, all of them recovered. - Roland Krause
Predictive power is increasing, regulatory sequences can be identified and understood in the near future. - Roland Krause
What makes us human (no picture of Bush next to the ape this time)? - Roland Krause
Regulatory sequences, non-coding regions relevant to species. Exploration of accelerated sequences in human. One region with 26 mutations in 600 bp not conserved in human but across other metazoan species. - Roland Krause
Expression is located in the thumb and might be important to its development. Possible adaptation after coming down from the trees. - Roland Krause
Evolution of a Pax9 binding site might regulate the downstream gene Pbx2. - Roland Krause
Q. Conservation of p300 binding sites. A: Not very. - Roland Krause
Q: Regulatory regions are re-used but not the limb enhancers described in the talk. A: Missing data to answer. 60% specificity? - Roland Krause
Q: Human SNPs in the limb regions? A: None observed as of now. - Roland Krause
Q: Have you tested the length of the enhancer - 600 bp? A: Played with the length, unclear results, typically take longer regions to be on the safe side. - Roland Krause
Q: p300 should be crucial to the genome. A: It is amongst the important ones but the real reason is the availability of a clean antibody. - Roland Krause
Roland Krause
Yedael Y. Waldman: Gene Translation Efficiency in Healthy and Cancerous Human Tissues
Codon bias in human is not well understood. First large scale tissue specific analysis. - Roland Krause
tRNA pool adjusted to the adult, not the fetal stage. Large differences between tissues. tRNA pool is nearly optimal. - Roland Krause
Solid groundwork, we discussed this at length in the group meeting recently. Another publication to earmark. - Roland Krause
A correlation between TE and cancer should be expected because of strong bias observed in organisms with high growth rate. However, no such connection is observed. Instead, the tRNA pool appears to be changed (preliminary results).(?) - Roland Krause
Q. TE: are tRNA pools adapting to the bias or vice versa? A: Probably easier to adapt the sequences. - Roland Krause
Q: Ribosomal genes are typically biased. Did you do a GO analysis? A: No enrichment found (?) - Roland Krause
Roland Krause
Angela H. DePace: A cellular resolution atlas of gene expression in Drosophila pseudoobscura reveals interspecies variation in embryonic patterning.
How do enhancers work? We need a unified framework for TFs, CR and GX. - Roland Krause
We can go from sequence to protein structure and the impact of SNPs but for regulatory regions, we have a fundamental hurdle (Problem 1). We have a quantitative description for protein function, which is not available for expression patterns (Problem 2). We cannot map in-situ images to a whole embryo expression. - Roland Krause
We want cellular description for a system that we know. - Roland Krause
High resolution at cellular level of the blastoderm. Determine the embryo age in 10 minutes bins by microscopy. - Roland Krause
Cylindrical projections of the embryo show different expression patterns between fly species, e.g. Hb and Kr, but others are similar (ftz) - Roland Krause
We can find corresponding cells in different species using reciprocal best hits. - Roland Krause
Cells between species map well to middle regions, differences in the head and tail regions. - Roland Krause
Q: How meaningful is it to compare cells between species? A: We use expression data to identify the same, pseudoobscura has fewer cells. Cannot correct for problems in timing. - Roland Krause
Q: Will this translate to phenotypic differences? A: Difficult to track in this stage. Some places to look in the anterior, can be tracked into later stages of the development. Hard experiments. - Roland Krause
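The "reciprocal best hits" idea from the notes above, applied to cells rather than genes, can be sketched as follows. This is a minimal illustration under assumptions: each cell is represented by a hypothetical vector of expression values, and squared Euclidean distance stands in for whatever similarity measure was actually used.

```python
def squared_distance(u, v):
    """Squared Euclidean distance between two expression vectors."""
    return sum((x - y) ** 2 for x, y in zip(u, v))

def best_hits(queries, targets):
    """Map each query cell id to the id of its closest target cell."""
    return {
        qid: min(targets, key=lambda tid: squared_distance(qvec, targets[tid]))
        for qid, qvec in queries.items()
    }

def reciprocal_best_hits(cells_a, cells_b):
    """Pairs (a, b) where b is a's best hit and a is b's best hit."""
    a_to_b = best_hits(cells_a, cells_b)
    b_to_a = best_hits(cells_b, cells_a)
    return sorted((a, b) for a, b in a_to_b.items() if b_to_a[b] == a)

# Toy data: species A has three cells, species B only two, so one cell
# in A cannot have a reciprocal partner.
cells_a = {"a1": (1.0, 0.0), "a2": (0.0, 1.0), "a3": (0.8, 0.3)}
cells_b = {"b1": (1.0, 0.1), "b2": (0.1, 1.0)}
pairs = reciprocal_best_hits(cells_a, cells_b)
```

Because a pair is kept only when the match holds in both directions, cells without a clear counterpart in the other species (as expected when one species has fewer cells) simply drop out.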
Roland Krause
Jeffrey H. Chuang: Comparative Analysis of Enhancers and Regulatory Motifs for Gene Expression in the Vertebrate Brain
Abundant human-zebrafish conserved non-coding elements (CNE) integrated with in-situ images. Large distance, everything identical will be important, short developmental time in fish, translucent embryos. Study of the forebrain with a GFP assay using transposons. - Roland Krause
Multiple time points can be tested. 101 CNE tested in 3 years by a single person. Sequence context should be able to give the function of a given CNE, which is confirmed by examples. 79 CNEs, 20 drive expression in the forebrain. Context very useful and relevant. - Roland Krause
cneViewer (in Bioinformatics http://bioinformatics.oxfordjournals.org/cgi...) can be used to assess tissue specificity. - Roland Krause
CompGenomics approaches can answer questions such as change over time. - Roland Krause
CNE motifs involved in forebrain development, identified from different motif searching algorithms. - Roland Krause
cneBrowser: A database of verified CNEs. http://bioinformatics.bc.edu/chuangl... - Roland Krause
Roland Krause
Erwin Frise: Mining embryonic expression images reveals novel developmental pathway components
Tomancak 2002, 2007 Genome Biology data - Roland Krause
Functionally related genes are expressed in similar spatial regions not reflected in microarrays. - Roland Krause
Images are difficult to compare due to shapes and microscopy artifacts. - Roland Krause
Virtualized expression patterns are analyzed in an automated framework that fits an ellipse to the embryo and overlays it with a mesh of 311 triangles, which can be transformed into a microarray-like matrix. - Roland Krause
80,000 images available, 5700 in stage 4-6, no expression 4400, redundancy 2700; keeping only distinct patterns, the filters end up with 600 images. - Roland Krause
Uses APC clustering (Science 315:972) - Roland Krause
Intersection with different data, e.g. syn-lethals. - Roland Krause
Identification of interacting genes using Markov Random Fields. - Roland Krause
(examples: tinman, snail, hkb) - Roland Krause
Novel in this work is the triangular meshes. - Roland Krause
Roland Krause
Guy Zinman: Constrained clustering for cross species analysis of gene expression data
Cross species regularly done for static networks (PPI, sequence similarities) - Roland Krause
Gene expression has a dynamic layer, is more difficult to compare and is noisier and continuous. The first approach to data analysis is often clustering. Overcome the noise by using expression and similarity data to reward orthology. - Roland Krause
The rewarded clustering represents the confidence in the similarity. - Roland Krause
Core genes: conserved in sequence and expression - Roland Krause
Comparison of three yeast species, treated with fluconazole. Finds core and divergent clusters and identifies anti-correlated TFs. - Roland Krause
Some orphan clusters show new motifs. ABC transporters display divergence. Hypothesis: SG in sterol rich environment, CG in sterol rich environment. - Roland Krause
Immune response data, infected with influenza and tuberculosis, clusters show interesting GO properties. Again, core genes involved in immune response are identified. Mouse does not respond to A/PR infection: mice die too quickly and do not properly activate the innate immune response. - Roland Krause
Species divergence: Similarity between species higher than between different infectious stimuli. Differences in penetration and recognition. - Roland Krause
Nice little analysis framework - Roland Krause
Q: Bonus for orthologs - introduction of bias for large, similar families? A: Probably not, use of different thresholds. - Roland Krause
Q: Source code available online? A: Website for immune response. Something general will hopefully come out - Roland Krause
Roland Krause
Xin He: Combining Transcription Factor Binding Site Clustering and Evolutionary Conservation for Predicting cis-Regulatory Modules
Computational prediction of CRMs by clustering of binding sites and inter-species conservation are to be combined into a probabilistic model. - Roland Krause
Binding site clustering by HMM and probabilistic models of CRM evolution. - Roland Krause
TFBS evolution follows Halpern-Bruno model. Also need to describe background model, indel, gain and loss. - Roland Krause
EMMA can do both alignment and scoring, comparison to LAGAN - Roland Krause
Multi-species model called STEMMA - Roland Krause
Pretty comprehensive model, quite some interest in the audience despite the last talk before the poster session. - Roland Krause
Roland Krause
Ron Shamir: Discovering transcriptional modules by combined analysis of expression profiles and regulatory sequences
The classical motif finding problem analyses a co-regulated gene-set to obtain overrepresented motif. A new algorithm, called Amadeus will be presented, including different data sources and conservation, etc. It performs reliably and fast. - Roland Krause
Goal: Find motifs and co-regulation in one step. Bypass the two-step approach. - Roland Krause
Allegro, a novel algorithm, discretizes the expression patterns. The CWM model is non-parametric and robust. - Roland Krause
Motifs create a PWM, expression creates a conditional weight matrix (CWM), comparison, motif enrichment p-value. Avoid overfitting with a cross-validation-like scheme. - Roland Krause
Large data-set of the human cell cycle or yeast osmotic shock. Identifies motifs even when only relevant in few conditions. - Roland Krause
Additional data set: 3'UTR analysis for human stem cells. No significant motifs with classical measures, novel measures using binned scores find 3 plausible motifs. - Roland Krause
Q: Distance from TSS in algorithms? A: Typically 1000bp but depends on question. Can be set by user. - Roland Krause
Q. How to manage convergence? A: Scoring a large number of k-mers and moving to a PWM, finally an EM process. - Roland Krause
Roland Krause
Keynote Bing Ren: Chromatin Signatures of Transcriptional Enhancers
Regulatory elements of the non-coding elements of the human genome. - Roland Krause
Distal regulatory elements play important roles in cell specific gene regulation. - Roland Krause
Analysis focuses on promoters (+/-200 of TSS) but are we oversimplifying? - Roland Krause
Enhancers need to be considered, as is to be shown in this presentation. Banerji (1981) identified the SV40 enhancer, upstream, downstream and within introns. Other examples include eve in Drosophila, shown to be specific for tissues. - Roland Krause
Models of enhancer function include co-activators; upon activation, a loop is formed to bring the enhancer close to the promoter. Many co-activators remodel the chromatin (p300, CBP, ...). Histone tail modification suggested as a crucial developmental element. Term epigenetics for this feature not agreed upon by everyone. - Roland Krause
Mechanisms of activation by structural change or interaction of additional components (code). - Roland Krause
Chromatin signatures are dynamic. - Roland Krause
Cell models and ChIP-chip identify methylation and acetylation in the ENCODE pilot study. Use of tiling arrays shows enrichment profiles for each modification. - Roland Krause
Do all promoters show the same modifications? Cluster analysis shows common modifications. - Roland Krause
p300 co-activator, sequence specific. 74 putative enhancers are correlated with different modifications than promoters, e.g. H3K4Me1 - Roland Krause
Prediction of promoters and enhancers available at sens > 80% and spec > 90%, some validated experimentally. - Roland Krause
Now genome wide for H3K4me1/3 in different cell lines, identifies many putative enhancers. - Roland Krause
Validation by reporter assays confirms 7 out of 9 predictions. - Roland Krause
Small overlap between HeLa and K562 cells. - Roland Krause
Pluripotent ES cells differentiation determined by chromatin structure. - Roland Krause
Mapping enhancers in ES cells (unpublished data omitted) - Roland Krause
Known enhancers identified, correlation with changes in gene expression in differentiation. - Roland Krause
Q: How to map target genes to enhancers? A. Use insulators such as CTCF. - Roland Krause
Q. Estimate for the total number of enhancers in human genome. A: We don't see saturation yet, much more than the number of genes. - Roland Krause
Q: Classes of enhancers that do not require p300? A: Many others surely exist and are relevant. Available antibodies are limiting. - Roland Krause
Q: Enhancers in gene deserts? A: No clear-cut answer, correlation exists but there are clusters of enhancers in deserts involved in diseases. - Roland Krause
Q: Evidence for CR with TA., mechanisms. A: Transcription and CR are closely located and work together. Extensive studies, complex problem. - Roland Krause
Interesting QA session, there are certainly more questions, showing the interest in the subject. - Roland Krause
Roland Krause
Neil Clark: Inferring transcription factor targets from gene expression changes and predicted promoter occupancy
Applied to ChIP-seq and gene expression data from stem cells. Relate gene expression change and TF binding to obtain an average rank. Select FDR threshold. - Roland Krause
Use PWM to estimate the probability to a region (http://genomebiology.com/2005...) - Roland Krause
Presenter's name is Clarke, not Clark, sorry. - Roland Krause
Little correlation between Pho4 predicted binding and gene expression overall, but about 20 genes show correlation. - Roland Krause
102 PWMs, 1600 gene expression data, finds sensible clusters. Find union of inferred targets. (Young data) - Roland Krause
Autoregulatory and feed forward loops motifs are overrepresented in the inferred network. - Roland Krause
Roland Krause
Todd Wasson: Transcriptional regulation and DNA occupancy
Regulation is exercised by bound TFs not just the presence of a site. - Roland Krause
We need to understand the binding, in particular nucleosome occupancy, mutually exclusive with TF binding. Many configurations are possible, static and binary views are oversimplistic. - Roland Krause
A simple, probabilistic model with concentrations is introduced. A large number of sites are present, regions of degeneracy, weak sites. Thresholds are not an ideal solution. - Roland Krause
The pipeline includes motifs, concentration, Boltzmann chains, posterior decoding. Examples in yeast show that the system can show that TF pos affect nucleosome positioning. - Roland Krause
Decoding changes when Gal4 is introduced in the GAL10, GAL1 region. Nucleosomes change TF positioning. - Roland Krause
Concentrations, modeled as weights, affect TF binding. Explicit competition is modeled, example includes PHO4, PHO2, NRG1 and nucleosomes. - Roland Krause
Implicit co-operativity between PHO2 and PHO4 is a side effect of the nucleosome occupancy. - Roland Krause
Q: How to model chromatin remodeling? A: Could be applied as post-processing of the method, not done yet. - Roland Krause
Q: Relative differences, energy differences not accounted for. A: No. - Roland Krause
Clear talk, effective treatment. - Roland Krause
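The idea of explicit competition between binders with concentration-dependent weights can be illustrated with a small forward-backward calculation over all non-overlapping configurations. This is a sketch under stated assumptions, not the method from the talk: the function name, the footprints and all weights are hypothetical, and each weight stands for concentration times a Boltzmann factor.

```python
def start_probabilities(length, binders):
    """Posterior probability that each binder starts at each position.

    `binders` maps a name to (footprint, weights). weights[i] is the statistical
    weight of the binder covering positions i .. i + footprint - 1.
    An unbound position has weight 1.
    """
    # forward[i]: summed weight of all configurations of the first i positions
    forward = [0.0] * (length + 1)
    forward[0] = 1.0
    for i in range(1, length + 1):
        total = forward[i - 1]  # position i - 1 left unbound
        for footprint, weights in binders.values():
            start = i - footprint
            if start >= 0:
                total += forward[start] * weights[start]
        forward[i] = total

    # backward[i]: summed weight of all configurations of positions i .. length - 1
    backward = [0.0] * (length + 1)
    backward[length] = 1.0
    for i in range(length - 1, -1, -1):
        total = backward[i + 1]
        for footprint, weights in binders.values():
            if i + footprint <= length:
                total += weights[i] * backward[i + footprint]
        backward[i] = total

    z = forward[length]  # partition function
    return {
        name: [
            forward[i] * weights[i] * backward[i + footprint] / z
            if i + footprint <= length else 0.0
            for i in range(length)
        ]
        for name, (footprint, weights) in binders.items()
    }


# Hypothetical toy region: one strong TF site at position 4, uniform nucleosome weight.
length = 12
tf_weights = [0.1] * length
tf_weights[4] = 20.0
tf_only = start_probabilities(length, {"TF": (3, tf_weights)})
with_nucleosome = start_probabilities(
    length, {"TF": (3, tf_weights), "nucleosome": (6, [5.0] * length)}
)
```

In this toy region the probability of the TF sitting on its strong site drops once nucleosomes compete for the same DNA, without any threshold on site strength.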
Roland Krause
Jason Ernst: Integrating Multiple Evidence Sources to Predict Transcription Factor Binding Across the Human Genome
Limits of knowledge for TFBS despite databases and HT approaches. - Roland Krause
Principled ways of integrating different sources of evidence are required given the plethora of approaches and parameters. - Roland Krause
Learn empirical prior of TF on general features using genome position, no motif information. Different data sources used for training using different cell lines and technologies. - Roland Krause
For each data set use a logistic regression classifier with positive and negative examples. CV ROCs all much better than random. 29 input features to the prior, histone modification features contribute most. - Roland Krause
Combine with motif agreement from a PWM by multiplication with the average prior. - Roland Krause
CV without the test TF gives good AUCs for the model. - Roland Krause
Orthogonal validation with E2F2, beating motif scanning approaches. - Roland Krause
Q: Mix of static and dynamic features - does it make sense to include dynamic info? A: Should not be a problem due to multiple data sources. - Roland Krause
Q: p53 is different from the other data A: David Haussler's PNAS paper explains. - Roland Krause
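The combination of a position-based prior with a motif score can be sketched as below. This is a minimal illustration under assumptions: the feature values, coefficients and the two-column PWM are hypothetical, and the talk's 29 features and training procedure are not reproduced.

```python
import math


def logistic_prior(features, coefficients, intercept):
    """Prior probability of binding at a position from general (non-motif) features."""
    z = intercept + sum(c * x for c, x in zip(coefficients, features))
    return 1.0 / (1.0 + math.exp(-z))


def pwm_score(site, pwm):
    """Motif agreement: product of the per-position base probabilities."""
    score = 1.0
    for base, column in zip(site, pwm):
        score *= column[base]
    return score


def combined_score(site, features, pwm, coefficients, intercept):
    """Multiply the prior by the motif score, as described in the notes above."""
    return logistic_prior(features, coefficients, intercept) * pwm_score(site, pwm)


# Hypothetical two-position PWM and two positional features (e.g. histone mark signals).
pwm = [
    {"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.6, "G": 0.2, "T": 0.1},
]
coefficients = [1.5, 0.5]
intercept = -1.0
open_region = combined_score("AC", [2.0, 1.0], pwm, coefficients, intercept)
closed_region = combined_score("AC", [0.0, 0.0], pwm, coefficients, intercept)
```

The same motif match scores higher in a region whose features favor binding, which is the point of adding a prior to plain motif scanning.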
Roland Krause
Raja Jothi: Genome-wide identification of in vivo protein-DNA binding sites from ChIP-Seq Data
How to obtain exact TFBS from NG sequence data? - Roland Krause
What is the average length of a DNA fragment? Can be estimated by pairing fragment lengths on different strands. - Roland Krause
SISSRs algorithm uses overlapping windows counting sense and anti-sense tags to obtain a profile where counts cancel each other out to obtain candidate regions. Counts need to exceed background and additional constraints. - Roland Krause
Re-analysis for CTCF, NRSF/REST and STAT1 published data found more binding sites (~fold). Motif searches to validate the findings are favorable. - Roland Krause
Finding the nearest canonical binding site shows good resolution. Why is there a difference in the first place? Resolution of the tags is limited. - Roland Krause
Different motifs bound in close proximity, expansion to 200bp window. - Roland Krause
Tag density related to conservation to motif and to binding strength (full or half motif) - Roland Krause
Noise in ChIP-seq data might lead to additional sites, play with window sizes to remove the problem. - Roland Krause
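The window-based profile of sense and anti-sense tags can be sketched as follows. This is a simplified illustration, not the published SISSRs code: the function names, window size and tag positions are assumptions, and the background and additional constraints mentioned above are omitted.

```python
def net_tag_profile(sense_starts, antisense_starts, region_length, window, step):
    """Sense minus anti-sense tag counts in overlapping windows along a region."""
    profile = []
    for left in range(0, region_length - window + 1, step):
        right = left + window
        sense = sum(left <= p < right for p in sense_starts)
        antisense = sum(left <= p < right for p in antisense_starts)
        profile.append((left, sense - antisense))
    return profile


def candidate_sites(profile, window):
    """Positions where the net count flips from positive to negative.

    The estimate is the midpoint between the start of the last positive window
    and the end of the first negative window that follows it.
    """
    sites = []
    last_positive = None
    for left, net in profile:
        if net > 0:
            last_positive = left
        elif net < 0 and last_positive is not None:
            sites.append((last_positive + left + window) // 2)
            last_positive = None
    return sites


# Hypothetical tags around a binding site near position 100:
# sense tags pile up upstream, anti-sense tags downstream.
sense = [45, 60, 72, 85, 95]
antisense = [105, 118, 130, 142, 155]
profile = net_tag_profile(sense, antisense, region_length=200, window=20, step=10)
sites = candidate_sites(profile, window=20)
```

With these toy tags the sign change of the net count falls at position 100, between the two tag clusters.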
Roland Krause
Guo-Cheng Yuan: Targeted recruitment of histone modifications in humans predicted by genomic sequences
Specific modification in ES cells, e.g. http://www.nature.com/nature... - Roland Krause
Evidence for DNA sequence properties (not short motifs) linked to CR. - Roland Krause
How to build a model? Periodic patterns are described in http://www.nature.com/nature... , the predictions can be improved. - Roland Krause
N-score model with wavelet energy for periodicity, stepwise logistic regression. - Roland Krause
Comparison with Segal's model shows differences in previous work. - Roland Krause
Prediction of histone modification sites. - Roland Krause
Patterns for prediction and ChIP-seq agree for most modifications for promoters and enhancers. H3K4me1 peaks around TSS. - Roland Krause
Similarity between histone modification marks: correlation at 0.88. - Roland Krause
Similar sequences have little overlap in ChIP data for H3K27me3 and H3K4me2/me3, explained by feedback loop. - Roland Krause
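How wavelet energy picks up periodic sequence patterns can be shown with a plain Haar transform. This is only a sketch: the choice of the Haar wavelet, the base-indicator signal and the toy sequence are assumptions, and it does not reproduce the features of the N-score model.

```python
import math


def haar_energies(signal):
    """Energy of the Haar detail coefficients per scale, finest scale first.

    Returns (energies, approximation_energy). The signal length must be a
    power of two.
    """
    n = len(signal)
    if n == 0 or n & (n - 1):
        raise ValueError("signal length must be a power of two")
    approx = list(signal)
    energies = []
    while len(approx) > 1:
        pairs = list(zip(approx[0::2], approx[1::2]))
        details = [(a - b) / math.sqrt(2) for a, b in pairs]
        approx = [(a + b) / math.sqrt(2) for a, b in pairs]
        energies.append(sum(d * d for d in details))
    return energies, approx[0] ** 2


# Hypothetical sequence with a period of four bases; the signal marks each A.
sequence = "AATTAATTAATTAATT"
signal = [1.0 if base == "A" else 0.0 for base in sequence]
energies, approximation_energy = haar_energies(signal)
```

For this sequence all detail energy sits at the second scale, which matches its four-base period; per-scale energies of this kind could then serve as inputs to a regression model.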
Roland Krause
Steven Chen: Integrating Biological Knowledge with Gene Expression Profiles for Survival Prediction of Cancer
Goal: Model with predictive power. Problem: High dimensionality - Roland Krause
Use of gene sets instead of single genes - number is smaller and already contains hooks for interpretation. - Roland Krause
Proposition of supervised PCA for dimension reduction by using genes most associated with outcome with standard linear regression. Determine thresholds to select subsets. - Roland Krause
Cross-validation, fit PCA model, calculate LRT. - Roland Krause
Group genes by GO or KEGG etc. Used k-means and Gap statistic to extract super genes (representatives of cluster). - Roland Krause
Cox regression. - Roland Krause
Survival prediction method. Two stage approach by SPCA and application of models to supergenes. - Roland Krause
Evaluation by LRT p-value, R^2 statistics and time-dependent ROC on two breast cancer data sets (Wang 2005, Miller 2005). ROC at 5-year follow-up shows a slight but consistent increase for the gene set approach over single gene methods, incl. Lasso. Gene sets outperform other methods at most time points. - Roland Krause
Plot of set "apoptosis", conclusion. - Roland Krause
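The supervised PCA step can be sketched in plain Python as below. This is a simplified illustration under assumptions: a continuous outcome stands in for survival time, Pearson correlation stands in for the univariate regression screen, the first component is found by power iteration, and all gene names and values are hypothetical. The Cox regression and the gene set grouping are not shown.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equally long lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def supervised_pc(expression, outcome, threshold, iterations=200):
    """Screen genes by |correlation| with the outcome, then return the selected
    genes and the per-sample scores on their first principal component."""
    selected = [gene for gene, values in expression.items()
                if abs(pearson(values, outcome)) >= threshold]
    if not selected:
        raise ValueError("no gene passes the threshold")
    centered = []
    for gene in selected:
        values = expression[gene]
        mean = sum(values) / len(values)
        centered.append([v - mean for v in values])
    samples = range(len(outcome))

    def project(loading):
        return [sum(w * row[s] for w, row in zip(loading, centered)) for s in samples]

    # Power iteration for the leading eigenvector of the gene covariance matrix.
    loading = [1.0 / math.sqrt(len(selected))] * len(selected)
    for _ in range(iterations):
        scores = project(loading)
        updated = [sum(row[s] * scores[s] for s in samples) for row in centered]
        norm = math.sqrt(sum(v * v for v in updated))
        loading = [v / norm for v in updated]
    return selected, project(loading)


# Hypothetical toy data: six samples, four genes.
outcome = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
expression = {
    "g1": [1.0, 2.1, 2.9, 4.2, 5.1, 5.8],
    "g2": [2.0, 4.1, 6.2, 7.9, 10.1, 12.0],
    "g3": [3.0, 1.0, 4.0, 1.0, 5.0, 2.0],
    "g4": [6.1, 4.9, 4.2, 2.8, 2.1, 0.9],
}
selected, scores = supervised_pc(expression, outcome, threshold=0.8)
```

The resulting `scores` are a single per-sample covariate that could be passed on to a survival model in place of thousands of individual genes.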
Roland Krause
Hedi Peterson: Regulatory Networks Pertinent To Self-Renewal In Human Embryonic Stem Cells
Core ES networks are known and strongly conserved. - Roland Krause
Regulation by Oct4. Genes bound according to ChIP-chip and regulated according to RNAi; some are not bound but still regulated by Oct4, probably via additional TFs. The latter are the target of the study. - Roland Krause
Different ChIP-chip data sets, incl. the Boyer set. Look for genes not bound but still down- or upregulated. Look for similar patterns. 500 genes up, GO categories and miRNAs as expected. 200 down, mainly mitochondrial. - Roland Krause
Focus on upregulated genes; identify the TFs using Transfac profiles. Some are very frequent, e.g. AP-2, SP1 etc. (Analysis with http://nar.oxfordjournals.org/cgi... ) - Roland Krause
Sox2 involved in regulatory pathways. Interconnected targets. Hairball (light) - Roland Krause
Q: Experimental validation? A: Work in progress. - Roland Krause
Q: Very frequent TFs found, relevance? A: Modules show different frequencies. - Roland Krause
Q: Replicates, confidence. A: Three replicates in ChIP data, not for gene expression data. Q: Comparison of different ChIP data. A: Some factors differ. - Roland Krause
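The selection of genes that respond to Oct4 without being bound by it amounts to a set difference. A minimal sketch, with hypothetical gene identifiers and a hypothetical function name:

```python
def indirect_targets(bound, upregulated, downregulated):
    """Genes that change expression but are not bound in the ChIP-chip data."""
    bound = set(bound)
    return {
        "up": sorted(set(upregulated) - bound),
        "down": sorted(set(downregulated) - bound),
    }


# Hypothetical identifiers, not genes from the study.
bound = ["gene1", "gene2", "gene5"]
upregulated = ["gene2", "gene3", "gene4"]
downregulated = ["gene5", "gene6"]
targets = indirect_targets(bound, upregulated, downregulated)
```

The "up" list is the group whose promoters would then be scanned for other TF profiles.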
Roland Krause
Jason G. Mezey: Bayesian structural equation model (SEM) identification of regulatory networks
Mapping genetic loci and network discovery - Roland Krause
Find eQTLs. Find networks for expression. - Roland Krause
Structural equation models. Directed edges among different genes. Normal error. Problem with cycles, no ML solution. Problem of equivalence classes. - Roland Krause
Solution to the problems: an equivalence operator that allows re-scaled orthogonal transformations. - Roland Krause
Application: Affy 500K genotypes, eQTLs. Single marker analysis. - Roland Krause
Application: Pathway discovery in Phase II HapMap individuals. (Stranger Nat Genetics, 2007) - Roland Krause
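For orientation, a linear structural equation model of the kind commonly used for expression networks with eQTLs can be written as below. The notation is an assumption for illustration and may differ from the speaker's: y is the vector of expression levels, B holds the directed edges among genes (zero diagonal), x the genotypes, F the eQTL effects and e the normal error.

```latex
\begin{aligned}
y &= B\,y + F\,x + e, \qquad e \sim \mathcal{N}(0, \Sigma),\\
y &= (I - B)^{-1}\,(F\,x + e) \quad \text{when } I - B \text{ is invertible.}
\end{aligned}
```

Cycles correspond to a B that cannot be made triangular by reordering the genes, which is where the identification problems mentioned above arise.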