"A proposed author ID system is gaining widespread support, and could help lay the foundation for an academic-reward system less heavily tied to publications and citations."
- Pedro Beltrao
This looks closer to reality than ever. 23 organizations supporting the idea with plans to have working code within 6 months based on Thomson Reuters' ResearcherID.
- Pedro Beltrao
From Keio University. Two recent prime ministers graduated from there. Newly elected prime minister Hatoyama is not a Keio graduate, very sorry.
- Ruchira S. Datta
Morita is in charge of Institute for Advanced Biosciences at Keio U, focusing only on data-driven systems biology. (Or quantitative systems biology, or multi-omic systems biology, or integrative systems biology, they all mean the same.)
- Ruchira S. Datta
10 years ago, everyone was talking about genome, transcriptome, proteome; no one about metabolome.
- Ruchira S. Datta
They use CE-MS: capillary electrophoresis - mass spectrometry. Can get peaks at organic acids, amino acids, sugars.
- Ruchira S. Datta
Couldn't use for biological samples, as they are too complex--would get hundreds or thousands of peaks, which overlap with each other and are just a mess.
- Ruchira S. Datta
So, use mass spec. Then have two dimensional information: migration time from CE and m/z from mass spec. Thus have good resolution of peaks.
- Ruchira S. Datta
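A minimal sketch of why the second dimension helps: two compounds that co-migrate in CE can still be told apart by m/z. The compound names, peak values, tolerances, and the helpers `match_by_time_only` and `match_peak` are all illustrative assumptions, not the group's actual software or data.

```python
from typing import List, NamedTuple, Optional


class Peak(NamedTuple):
    name: str
    migration_min: float  # CE migration time in minutes (made-up value)
    mz: float             # mass-to-charge ratio from the mass spectrometer


# Hypothetical reference library; names and numbers are placeholders.
LIBRARY = [
    Peak("compound_A", 8.20, 90.055),
    Peak("compound_B", 8.22, 106.050),  # co-migrates with A, differs in m/z
    Peak("compound_C", 11.50, 90.056),  # nearly the same m/z as A, migrates later
]


def match_by_time_only(migration_min: float, time_tol: float = 0.1) -> List[str]:
    """One-dimensional lookup: every library entry near this migration time."""
    return [p.name for p in LIBRARY
            if abs(p.migration_min - migration_min) <= time_tol]


def match_peak(migration_min: float, mz: float,
               time_tol: float = 0.1, mz_tol: float = 0.01) -> Optional[str]:
    """Two-dimensional lookup: the unique entry within both tolerances, else None."""
    hits = [p.name for p in LIBRARY
            if abs(p.migration_min - migration_min) <= time_tol
            and abs(p.mz - mz) <= mz_tol]
    return hits[0] if len(hits) == 1 else None
```

With migration time alone, a peak at 8.21 min is ambiguous between two library entries; adding m/z picks out exactly one.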
Upgraded to CE-TOFMS, more accurate and higher sensitivity.
- Ruchira S. Datta
Have large list of metabolites and anions they can identify.
- Ruchira S. Datta
Apply to E. coli systems biology, SCIENCE vol 316 (2007)
- Ruchira S. Datta
Disruptome: measure growth rate under different conditions of different single deletion mutants from single deletion mutant library of the Keio Collection, Baba et al Mol. Sys. Bio (2006)
- Ruchira S. Datta
For 306 genes, no single deletion mutants, presumably because they are essential genes.
- Ruchira S. Datta
For wild type and each single deletion mutant: metabolome (CE/TOF-MS: 541 compounds), proteome (enzymes in metabolic interactions), transcriptome, and fluxome using standard assays. Show multi-omics data together.
- Ruchira S. Datta
Observe reversed flux in knockout, trying to compensate for knocked out reaction. Similarly, see accumulation of substrates of disrupted enzymes. But these are very localized, and most metabolites remain unchanged.
- Ruchira S. Datta
KO: see only small changes in mRNAs, enzymes, and metabolites. Do see large changes in mRNAs and enzymes due to environmental conditions. So E. coli has very robust metabolism.
- Ruchira S. Datta
Next: red blood cells. Theor. Bio and Medical Modeling (2005)
- Ruchira S. Datta
Red blood cell in hypoxia versus normal condition. (Artery <=> vein). Many differences.
- Ruchira S. Datta
Low oxygen: deoxyHb increases and binds to Band3, enzymes are instead released from Band3, glycolysis is activated to generate ATP.
- Ruchira S. Datta
Three enzymes block glycolysis pathway normally, but in hypoxia block is removed.
- Ruchira S. Datta
Metabolomic biomarker found: Acetaminophen induced hepatotoxicity, J. Biol. Chem. 281 (2006)
- Ruchira S. Datta
Baran et al BMC Bioinformatics Differential display software: gave mouse acetaminophen, measured mouse liver, at 1h, 2h, 4h, 6h, 12h, 24h. Found good biomarker.
- Ruchira S. Datta
Therefore, using 20 patients and 20 controls, find biomarkers for various diseases: cancer, Alzheimer's, mental illness.
- Ruchira S. Datta
Metabolome analysis of colon & stomach cancer, Hirayama et al Cancer Research 2009
- Ruchira S. Datta
Cancer cells reside in glucose limited condition always, but somehow ATP and energy charge is the same as normal. So, cancer cells may have different energy metabolism than normal tissue.
- Ruchira S. Datta
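The talk did not spell out how energy charge is computed. A minimal sketch, assuming the standard adenylate energy charge definition, (ATP + 0.5 ADP) / (ATP + ADP + AMP); the function name and the example concentrations are illustrative only.

```python
def energy_charge(atp: float, adp: float, amp: float) -> float:
    """Adenylate energy charge: 1.0 when all adenylate is ATP, 0.0 when all is AMP."""
    total = atp + adp + amp
    if total <= 0:
        raise ValueError("total adenylate concentration must be positive")
    return (atp + 0.5 * adp) / total


# Arbitrary concentrations in the same units; only the ratios matter.
example = energy_charge(atp=2.0, adp=1.0, amp=1.0)
```

Because the value is a ratio, two tissues with very different absolute ATP levels can still share the same energy charge.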
Put pancreatic cancer cells under glucose-deprived *and* oxygen-deprived conditions, did metabolome analysis. Found accumulation of succinate, only in these conditions.
- Ruchira S. Datta
Cancer cell may be using fumarate respiration as an "emergency" energy production machinery. No previous report of this in human cells, only in Ascaris--a parasitic worm that exists in glucose-limited, oxygen-limited condition. TCA cycle goes in opposite direction, accumulating succinate.
- Ruchira S. Datta
Fumarate respiration may be a crucial energy generator in nutrient-limited tumors => this could be a novel therapeutic target by existing anti-parasitic drugs.
- Ruchira S. Datta
Institute is in Tsuruoka, 1hr flight from Tokyo where most other campuses of Keio U. are. Have "metabolome factory". You must see! Please come!
- Ruchira S. Datta
15 mins from the beach, 30 mins from the mountains, famous hot springs, national treasures
- Ruchira S. Datta
Most important facility: the incubator of ideas: a hot tub, available 24/7!
- Ruchira S. Datta
What to do with large models of pathways once we obtain them? Just put them in textbooks, or use them in some systematic way as well.
- Ruchira S. Datta
Many possible uses, e.g., network evolutionary comparison / cross-species module identification. Here will focus on alignment of physical and genetic networks, and moving from genome-wide association studies (GWAS) to network-wide "pathway" association (NWAS).
- Ruchira S. Datta
E.g., Charlie Boone, others assembling large networks of synthetic lethals and epistatic interactions in model species. Also see Tong et al Science 2001. Synthetic lethality is an example of a genetic interaction. Nevan Krogan and Boone also quantitate: get a number and a sign to describe the interaction.
- Ruchira S. Datta
E.g., quantitate colony size (area) of wild-type (WT), delete a: a_delta, delete b: b_delta. Expected colony size of double mutant is the product a_delta * b_delta. Smaller colony than expected suggests negative interaction, bigger colony suggests positive interaction. Many other models have been proposed. The mode under this product model is neutral interaction, as expected.
- Ruchira S. Datta
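The product model above can be sketched as follows, assuming colony sizes are first normalized to wild type (WT = 1.0). The function names, the tolerance `tol`, and the example numbers are assumptions for illustration, not values from the talk.

```python
def interaction_score(w_a: float, w_b: float, w_ab: float) -> float:
    """Observed double-mutant size minus the product-model expectation."""
    return w_ab - w_a * w_b


def classify(score: float, tol: float = 0.05) -> str:
    """Label the interaction; `tol` is an arbitrary noise band."""
    if score < -tol:
        return "negative"
    if score > tol:
        return "positive"
    return "neutral"


# Single mutants at 80% and 50% of WT size give an expected double mutant of 40%.
neutral = classify(interaction_score(0.8, 0.5, 0.40))
sick = classify(interaction_score(0.8, 0.5, 0.10))      # far smaller than expected
buffered = classify(interaction_score(0.8, 0.5, 0.75))  # larger than expected
```

A synthetic lethal pair is the extreme negative case, where the double mutant does not grow at all.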
Question: how do these genes interact? One hypothesis: physically. But in fact very little overlap of genetic and physical interaction networks. Instead, genetic and physical interactions are orthogonal. Physical interactions within modules/complexes, genetic interactions among them. Kelley Nature Biotech 2005:
- Ruchira S. Datta
E.g., dynactin complex connected genetically to prefoldin complex, connected genetically to kinetochore.
- Ruchira S. Datta
Bandyopadhyay et al PLoS Comp Bio 2008: Apply the same procedure where nodes are entire clusters of physical interactions. Edges are entire bundles of genetic interactions.
- Ruchira S. Datta
Krogan and Ideker have NIGMS grant for comparison of genetic interaction networks between budding and fission yeasts (S. cerevisiae and S. pombe). Roguev et al Science 322: 405 (2008)
- Ruchira S. Datta
Now can have "synthetic lethal" at the level of complexes/modules rather than individual genes.
- Ruchira S. Datta
Next: NWAS: Network-based approaches to identify genetic interactions in gene association studies. With Nevan Krogan (UCSF), Richard Karp (UCB), Aude Guénolé & Haico van Attikum (LUMC), Greg Hannum & Rohith Srivas (UCSD).
- Ruchira S. Datta
David Goldstein in NYTimes: A Dissenting Voice as the Genome Is Sifted to Fight Disease. "There is absolutely no question," he said, "that for the whole hope of personalized medicine, the news has been just about as bleak as it could be." Can explain only a few percent of the genetic component of most common diseases.
- Ruchira S. Datta
Genome-wide association studies: type individuals at millions of polymorphic genetic sites (DNA markers or SNPs). Think of genome as line. Find SNPs that correlate with trait of interest. Plot genetic marker/individual matrix, have trait measure for each individual.
- Ruchira S. Datta
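A toy version of the single-marker scan described above: a marker/individual matrix, one trait value per individual, and a correlation score per marker. The genotype coding (0/1/2 copies of an allele), the made-up data, and the use of plain Pearson correlation are assumptions for illustration; real studies use proper association tests and far more individuals.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either vector is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def scan(genotypes, trait):
    """Correlate every marker column with the trait."""
    n_markers = len(genotypes[0])
    return [pearson([row[j] for row in genotypes], trait)
            for j in range(n_markers)]


# Rows are individuals, columns are markers (made-up values).
genotypes = [
    [0, 1, 2],
    [1, 1, 0],
    [2, 0, 1],
    [0, 2, 2],
    [1, 0, 0],
    [2, 1, 1],
]
trait = [1.0, 2.1, 2.9, 1.1, 2.0, 3.0]

scores = scan(genotypes, trait)
best_marker = max(range(len(scores)), key=lambda j: abs(scores[j]))
```

In this toy data the trait tracks the first marker almost linearly, so it ranks first; with six individuals the other two columns still show sizeable spurious correlation, which is the statistical power problem raised next.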
Problems: statistical power. Chance of observing spurious interactions is high. How to interpret biologically? The marker may not be within a gene, and there may be hundreds of genes nearby linked in. Finally, have complex traits with many marker interactions.
- Ruchira S. Datta
If introduce pairs of markers, then have quadratic multiple testing problem; problem of statistical power is even worse.
- Ruchira S. Datta
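To put numbers on the quadratic blow-up: a short sketch that counts marker pairs and applies a Bonferroni correction. Bonferroni is used here only as a familiar illustration; the talk did not say which correction was applied, and the marker count and alpha are assumptions.

```python
from math import comb


def pairwise_tests(n_markers: int) -> int:
    """Number of unordered marker pairs, n * (n - 1) / 2."""
    return comb(n_markers, 2)


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test p-value cutoff that keeps the family-wise error rate at alpha."""
    return alpha / n_tests


n_markers = 1_000_000
single_cutoff = bonferroni_threshold(0.05, n_markers)
pair_cutoff = bonferroni_threshold(0.05, pairwise_tests(n_markers))
```

A million markers give about 5 x 10^11 pairs, so the per-test cutoff drops from 5 x 10^-8 to roughly 10^-13.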
Genetic interactions occur frequently in GWAS, but they are impossible to find. Marker-marker interactions are very difficult to identify in GWAS data due to the lack of statistical power.
- Ruchira S. Datta
Treat interactions as interactions between intervals of SNPs, rather than individual ones. (Similar idea to linkage disequilibrium.) Then look for interval-interval interactions instead. Bring in the protein interaction network to separate the wheat from the chaff.
- Ruchira S. Datta
GWAS genetic interactions also run between physical networks and pathways. Now the genetic interactions are the ones revealed by the GWAS, unlike in the first part of the talk where they were synthetic lethals. E.g., ADH1,PRC1 has GWAS genetic interactions with *almost every* part of RNA-Polymerase I. Now *this* is very unlikely to have occurred by chance.
- Ruchira S. Datta
Can construct higher level maps of GWAS genetic interactions, between complexes, with bundles of interactions as links between them.
- Ruchira S. Datta
Are these validatable or reproducible using gene-by-gene approach of e.g., Krogan, Boone?
- Ruchira S. Datta
Closing thoughts: prior mechanistic knowledge, represented by a scaffold of biomolecular interactions may be the key to making sense of gene association studies. Conversely, GWAS provide a rapid and high coverage method to map genetic interaction networks at large scale. This talk was merely a proof of principle.
- Ruchira S. Datta
Peter Karp: genotype-phenotype? A: Know something about heritability from previous studies, e.g., that 50% of a particular disease is genetic, but right now may be able to explain only 2%. Would like to explain more, and if possible also model environment.
- Ruchira S. Datta
Q: Goldstein's complaint may have been based on making the threshold for what is a prediction very high, may need to put it much lower. A: Yes, that's what was done, needed to increase statistical power. Approach by John Storey (sp). Integrating with protein interaction network allows us to go much further down in list while staying above the noise.
- Ruchira S. Datta
would be nice to have it as an OpenID as well ;) and have it really open, not owned/authored by a for-profit organization..
- Yaroslav Nikolaev
OpenID+. Having a site that (a) functions as OpenID provider, (b) contains information about you (e.g. department, contact details) that _you_ are in control of (i.e. edit/hide), (c) can autogenerate your publication list, and (d) allows you to manually add other contributions to the advancement of science (e.g. open source projects). We'd need the backing of one or more major publishers, but stranger things have happened. Can't we set something up like that?
- Jan Aerts
Sounds like a good idea. One ID to rule them all....
- Allyson Lister
+1 Jan. Between OpenID and the auto-publication-list generators at places like Nature Network and BioMedExperts, it seems like most of the necessary functionality exists, just not in one place.
- Bill Hooker
There are initial investigations being made (certainly within the field of publishing and the library community) towards institutional identifiers which may well be easier to handle than trying to do the individual author identifiers.
- Jill O'Neill
But institutional identifiers alone will not work. I've moved quite a few times and saw that people still try to contact me on the email address from two jobs back, because that was the email of the "corresponding author".
- Jan Aerts
Would it help if journals suggested to authors that they include their OpenID with their address details, if they have one? That should be pretty easy to do.
- Maxine
Maxine: Yes ! That would be great ! It would be nice to see that OpenID just like we can see the DOI of the paper ! This would motivate the other publishers to do this !
- Pierre Lindenbaum
Maxine: Yes, yes, yes! That would be absolutely brilliant! Are Rafael, Simon, and Peter (or Bora) about? This might actually work!
- Cameron Neylon
I'm not a guru about OpenID. Can it be then used later to find the publications/geoloc/social networks ?
- Pierre Lindenbaum
In the medium term I agree with Maxine: let journals suggest to authors to include their OpenID. But I'm also with Deepak's comment in the "related entry": we should separate our author ID from our general online identity. I'm still brooding on how this all could be incorporated into a system where you as a researcher can update your scientific contributions yourself in a central place...
- Jan Aerts
Pierre: I suppose you still need a central website/database, like researcherID.com for example (I know: not open and stuff...). Ideally you'd log in using an OpenID which would also be your researcherID. Even better: the system could function as an OpenID provider itself (that would keep your scientific identity separate from your general online identity). But the website would then have all functionality to find publications/geoloc/social networks. Am I (a) kicking in open doors or (b) making no sense?
- Jan Aerts
Jan: That makes sense. Of course it would be great if the NCBI could be this OpenId provider (well, at least for the biologists... )
- Pierre Lindenbaum
Pierre: NCBI could indeed be an OpenID provider, but it should be limited to that. We need a separate entity doing the publication/geo... functionality. This is important enough that it should be the core function of the entity providing it. Also: would be nice if we could add contributions like "have helped in discussion about blabla on FriendFeed" :-). (Or is that "distracted discussion from blabla")
- Jan Aerts
All you really need is a unique identifier - it could be an openid or it could be a random string. The advantage of openid is that it acts as a pointer to a service which treats you as a resource. Services can then connect that to any other information that is available. The other advantage of openid is that the provider is completely irrelevant - it can be anybody from the journal to NCBI to an institution to a third party. You're never tied into one provider.
- Cameron Neylon
I'll say there was an interesting meeting early this year sponsored by CNI to bring publishers, A&I vendors (like Thomson Scientific), library reps (including OCLC and Library of Congress), and others with interest in this to talk about it. OpenID was mentioned but many publishers and vendors already have their own (internal and not eager to share) identification systems. I'm not sure if anything definite came out of that meeting unfortunately (and I was there).
- Sarah
Maxine, if you get this proposal rolling, your name will be legion :)
- Neil Saunders
Sarah - quite a few of these points were made in the EMBO piece at the link. In fact, probably the article is a report arising from that meeting - though there is not much information of that sort, or about the author, there (ironically!). I will ask about the "display openID" and get back to you - will not be instant because one person is away until new year, but I won't forget.
- Maxine
BTW there has been a lot of discussion on this in Nature over the years too - since 2006 when I started the author blog I have attempted to capture the discussion there, see: http://blogs.nature.com/nautilu... (includes Raf's correspondence in fact).
- Maxine
@Chris : in my view, this central repository (CrossRef/NCBI?) would associate this ID with a FOAF file containing all the information you want to publicly release: your interests, your web accounts, your contacts, your publications....
- Pierre Lindenbaum
Yes, sounds good Pierre. According to the EMBO article at the link and various others, one issue is all the world's registration systems recognising the ID. Other issues, also. As we mentioned in another thread very recently, I am following up on this and it is on the agenda of a wider discussion about authorship and related issues that is going on between various journals - I will keep people posted with what I hear.
- Maxine
One major problem with setting up UAIDs seems to be the identification of a single provider of these IDs, and the monopoly that would result from it. So I feel like asking a provocative question: does one really need to have only one UAID provider ? When nucleotide databases were started, new sequences were communicated either to EMBL or Genbank, or even to other, more specialised,...
- Etienne Joly
Work published in Cell, August 21, 2009 (after 2 years)
- Ruchira S. Datta
Boltzmann visited Stanford in 1905. Gave first prescription about the protein structure, i.e., solving the folding problem: integrate the Boltzmann function. However, need to know entropy and enthalpy in excruciating detail.
- Ruchira S. Datta
Also, proteins are not random polypeptides: they have been selected so they can fold quickly. Thus the energy landscape is a funneled landscape.
- Ruchira S. Datta
So need to go back and forth between Boltzmann (physics) and Darwin (evolution).
- Ruchira S. Datta
Function is an associated property of selection. Issue 1: correct level of selection--gene(protein)? cell? organism? group? species? Issue 2: history vs function, i.e., historical accidents vs functional necessities.
- Ruchira S. Datta
E.g., serine protease S1 family has 1500 known examples. E.g., trypsin, chymotrypsin. >1039 papers on this family in 2007. Binds serine of another protein in small pocket and cleaves it.
- Ruchira S. Datta
Serine proteases function: catalysis and specificity. Hydrolysis of Suc-Ala-Ala-Pro-X-AMC. Chymotrypsin specific for X=Phe, Trypsin specific for X=Lys. But experimenters mutated trypsin to have specificity like chymotrypsin.
- Ruchira S. Datta
Proteins subject to new Moore laws: number of available sequences increasing exponentially.
- Ruchira S. Datta
History is reflected in sequence. If proteins are in same tree, will have historical and functional correlations reflected in the alignment.
- Ruchira S. Datta
Can measure conservation. Conservation at each position can be measured e.g. by relative entropy. Conservation is heterogeneously distributed. The most conserved positions are in the core or at the functional interface.
- Ruchira S. Datta
Could approximate conservation by checking whether amino acid is the most frequent or not.
- Ruchira S. Datta
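Both conservation measures mentioned above can be sketched on a toy alignment. Assumptions to flag: a uniform background over the 20 amino acids (published methods use measured background frequencies), no gaps in the alignment, and made-up sequences; the helper names are illustrative.

```python
from collections import Counter
from math import log

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def column(alignment, i):
    """Residues found at alignment position i, one per sequence."""
    return [seq[i] for seq in alignment]


def relative_entropy(col, background=None):
    """Kullback-Leibler divergence of observed frequencies from the background."""
    if background is None:
        # Assumption: uniform background over the 20 standard amino acids.
        background = {a: 1.0 / len(AMINO_ACIDS) for a in AMINO_ACIDS}
    n = len(col)
    return sum((c / n) * log((c / n) / background[a])
               for a, c in Counter(col).items())


def is_consensus(col):
    """Binary approximation: 1 where the residue is the column's most frequent one."""
    top = Counter(col).most_common(1)[0][0]
    return [1 if a == top else 0 for a in col]


# Made-up alignment: position 0 is invariant, position 3 is fully variable.
alignment = ["ACDE", "ACDF", "ACEG", "ADFH"]
conservation = [relative_entropy(column(alignment, i)) for i in range(4)]
```

An invariant column scores ln(20), about 3.0, under the uniform background, and the score falls as the column becomes more mixed.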
Ranganathan: statistical coupling analysis (SCA). Look at conservation at *two* positions. Have covariance matrix and thus can check correlated conservation.
- Ruchira S. Datta
Strong correlations are sparse and heterogeneously distributed.
- Ruchira S. Datta
Diagonalize the matrix. Look at the eigenvalues i.e., perform spectral analysis of correlations. This is how correlations are usually studied. E.g., physicists studied correlations between stocks' time series in the S&P 500. Potters, Bouchaud, et al. found that the leading eigenvectors corresponded to distinct sectors of the economy: transportation, paper, drug manufacturing, etc.
- Ruchira S. Datta
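A stripped-down sketch of the idea: binarize the alignment by consensus, build the position-by-position covariance matrix, and extract its leading eigenvector. This is deliberately not the published SCA weighting. The binarization, the toy alignment, and the use of power iteration instead of a full diagonalization are all simplifying assumptions.

```python
from collections import Counter
from math import sqrt


def binarize(alignment):
    """x[s][i] = 1 if sequence s has the most frequent residue at position i."""
    length = len(alignment[0])
    consensus = [Counter(seq[i] for seq in alignment).most_common(1)[0][0]
                 for i in range(length)]
    return [[1 if seq[i] == consensus[i] else 0 for i in range(length)]
            for seq in alignment]


def covariance(x):
    """Position-by-position covariance of the binarized alignment."""
    n, length = len(x), len(x[0])
    mean = [sum(row[i] for row in x) / n for i in range(length)]
    return [[sum((row[i] - mean[i]) * (row[j] - mean[j]) for row in x) / n
             for j in range(length)] for i in range(length)]


def top_eigenpair(c, iters=500):
    """Largest eigenvalue and its eigenvector by power iteration."""
    length = len(c)
    v = [1.0 + 0.01 * i for i in range(length)]  # arbitrary start vector
    for _ in range(iters):
        w = [sum(c[i][j] * v[j] for j in range(length)) for i in range(length)]
        norm = sqrt(sum(t * t for t in w))
        if norm == 0.0:
            return 0.0, v
        v = [t / norm for t in w]
    cv = [sum(c[i][j] * v[j] for j in range(length)) for i in range(length)]
    return sum(a * b for a, b in zip(v, cv)), v


# Made-up alignment: positions 0 and 1 always change together,
# position 2 is invariant, position 3 varies independently.
alignment = ["AAKC", "AAKD", "AAKC", "GGKD", "GGKC", "AAKD"]
eigenvalue, eigenvector = top_eigenpair(covariance(binarize(alignment)))
```

The leading eigenvector puts equal weight on the two co-varying positions and none on the others, which is the toy analogue of a "sector".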
Correspondence need not be one to one, e.g., a linear combination of eigenvectors could correspond to a financial sector.
- Ruchira S. Datta
This analysis (PCA, not sure why he's not using this term) defines 2-3 sectors of the serine protease family.
- Ruchira S. Datta
Are the sectors statistically independent? To test this, check correlation entropy: subtract the sum of the entropies for each sector, from the entropy of the two taken together. Found sectors are statistically quasi-independent.
- Ruchira S. Datta
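The independence test above can be sketched with empirical entropies. The sign convention S(A) + S(B) - S(A,B), which is zero for independent sectors and positive otherwise, is an assumption, as is reducing each sector to one discrete state per sequence; the state vectors are made up.

```python
from collections import Counter
from math import log


def entropy(states):
    """Empirical Shannon entropy (in nats) of a list of hashable states."""
    n = len(states)
    return -sum((c / n) * log(c / n) for c in Counter(states).values())


def correlation_entropy(sector_a, sector_b):
    """S(A) + S(B) - S(A,B): zero when the two sectors vary independently."""
    joint = list(zip(sector_a, sector_b))
    return entropy(sector_a) + entropy(sector_b) - entropy(joint)


# One state per sequence for each sector (made-up values).
a = [0, 0, 1, 1]
b = [0, 1, 0, 1]   # every combination with `a` occurs equally often
independent = correlation_entropy(a, b)
fully_coupled = correlation_entropy(a, a)
```

"Quasi-independent" would then mean a value that is small relative to the individual sector entropies.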
Project the sectors onto the 3D structure? Find that sectors are contiguous 3d substructures, which do not follow the secondary structure.
- Ruchira S. Datta
Each of them occupies about 10% of the total sequence.
- Ruchira S. Datta
found one of the sectors involved in catalytic site, another in the specificity swap, the third unknown from literature
- Ruchira S. Datta
Halabi did alanine scanning mutagenesis of the rat trypsin. Plotted and found residues in one sector changed specificity without stability, whereas another changed stability without specificity.
- Ruchira S. Datta
a very cool result for me is the 3d correspondence and some of the hints on their importance to function
- Pedro Beltrao
Found strong cooperative epistasis using double mutants.
- Ruchira S. Datta
Thus sectors have quasi-independent functions.
- Ruchira S. Datta
Different sectors can separate: different subfamilies; or vertebrates and non-vertebrates; or enzymatic and non-enzymatic. Thus sectors are selected by seemingly independent evolutionary pressures giving quasi-independent functions.
- Ruchira S. Datta
Thus besides the primary, secondary, and tertiary structure of a protein, can think of sectors as the functional structure.
- Ruchira S. Datta
Functional sectors seem to be selected quasi-independently, i.e., as far as "levels of selection" we should go even below the level of protein. They were able to separate the historical vs functional correlations.
- Ruchira S. Datta
Want to extend to other proteins, and protein interactions. Further work on mathematics behind correlations: best weights? how many examples needed? could have "pseudo-sectors" due to sequence weighting issues, coming from historical rather than functional correlations. What is the physics behind these 3d substructures? How many sectors per protein? Do they need to be independent?
- Ruchira S. Datta
one of the follow ups they are thinking about is similar analysis on protein interactions. that should be interesting
- Pedro Beltrao
Have looked at some other protein families, e.g., SH2, SH3. Another group in Rockefeller has found sectors in another family. Interpreting the sectors must be done case-by-case.
- Ruchira S. Datta
Would like to extend to pathways, but often don't have well-aligned sequences for all members of the pathway.
- Ruchira S. Datta
Why is stability of serine proteases different in vertebrates vs non-vertebrates? A lot of work to be done.
- Ruchira S. Datta
i wonder if they could use distance in protein 3d space to weight the correlations instead of conservation
- Pedro Beltrao
I spent all day trying to implement SCA in python. They have some explaining to do. Some formulas are rather mysterious, and files that they say are available on their website, are not. The results don't look too good.
- Bosco Ho
Other coevolution methods routinely take 3d/structural considerations as a way to benchmark the inferred correlations
- Wladimir Labeikovsky
neutralism and selectionism ... this is going to be an overview talk so i expect that is going to be mostly about this review http://www.nature.com/nrg...
- Pedro Beltrao
arguments for and against the neutral theory , mostly a review so far
- Pedro Beltrao
genomic data indicates that most changes are fixed due to positive selection while molecular phenotypic data indicate that neutral variation is important for the origin of new phenotypes.
- Pedro Beltrao
slightly disappointing, just an overview of the concept described in the review mentioned above.
- Pedro Beltrao
intelligent therapeutics .... describing modules for sensing computing and performing actions. leading up to their favorite tools, RNA
- Pedro Beltrao
RNA parts, RNA devices and systems .. the technologies , requires models to make predictions for the designs.
- Pedro Beltrao
design rules that affect miRNA processing by drosha (under review) , these allow them to think about how to integrate a sensor in a way that it will change the processing and effect of the targeting miRNA
- Pedro Beltrao
they can use a small ligand to control the processing efficiency of the miRNA and therefore the in vivo effect of the miRNA,
- Pedro Beltrao
example application - t cell engineering, compound controlling t cell proliferation (paper under review) experiments done in cells and in mouse
- Pedro Beltrao
introduction to two component systems
- Pedro Beltrao
there is typically a one to one correspondence between histidine kinase and targets ... how is specificity maintained ? evolution by duplication and divergence. can we learn enough to create new pathways ?
- Pedro Beltrao
an advantage of looking at 2 component systems is that the possible targets are easy to identify from sequence analysis of the genome
- Pedro Beltrao
they can experimentally determine the specificity of the targets ... they have a strong kinetic preference in vitro. a kinase picks the right target by molecular recognition ... instead of things like docking co-expression etc
- Pedro Beltrao
if most of the specificity is regulated by residues in both kinases and targets, looking at covariance of sites in both should help to predict sites
- Pedro Beltrao
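a rough sketch of what "covariance of sites in both" could look like: score every kinase column against every target column across paired sequences and rank the pairs. mutual information is my stand-in for the covariance measure (the talk did not name one), and the paired toy alignments are made up.

```python
from collections import Counter
from math import log


def entropy(symbols):
    """Empirical Shannon entropy (in nats) of a list of symbols."""
    n = len(symbols)
    return -sum((c / n) * log(c / n) for c in Counter(symbols).values())


def mutual_information(col_a, col_b):
    """How much knowing one column tells you about the other."""
    return entropy(col_a) + entropy(col_b) - entropy(list(zip(col_a, col_b)))


def rank_site_pairs(kinases, targets):
    """Score all (kinase site, target site) pairs; kinases[k] pairs with targets[k]."""
    scores = []
    for i in range(len(kinases[0])):
        col_k = [seq[i] for seq in kinases]
        for j in range(len(targets[0])):
            col_t = [seq[j] for seq in targets]
            scores.append((mutual_information(col_k, col_t), i, j))
    return sorted(scores, key=lambda s: s[0], reverse=True)


# Made-up cognate pairs: kinase site 0 switches A/G exactly when target site 0 switches D/R.
kinases = ["AKT", "AKS", "GKT", "GKS"]
targets = ["DE", "DE", "RE", "RE"]
ranked = rank_site_pairs(kinases, targets)
top_score, top_kinase_site, top_target_site = ranked[0]
```

the top-ranked pair is the candidate specificity contact; invariant or unrelated columns score zero.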
unpublished results .. a structure of a specific kinase-target pair ... the residues predicted to be important for specificity are in fact in the interface
- Pedro Beltrao
they could use this information to change the specificity of kinases and show examples of in vivo changes in pathways. again, in vivo specificity is mostly determined by kinase specificity .. which is very different from what is currently thought to happen in more complex species ... http://www.pubmedcentral.nih.gov/article...
- Pedro Beltrao
cool work, showing possible trajectories of the 3 mutant required to change specificity. there is at least one route where the 2 mutant is more promiscuous, more likely route for evolutionary change
- Pedro Beltrao
they are looking now at duplication events and trying to think how the specificity changes happened
- Pedro Beltrao
some kinases have the target domain as part of the same protein ... these kinases have lower specificity very likely because the target co exists in the same protein. these have much lower signal for co evolution in sequence probably because the specificity is determined by co localization
- Pedro Beltrao
>450 preferentially conserved targets per miRNA
- Pedro Beltrao
only a small fraction has conservation of the 3prime compensatory sites
- Pedro Beltrao
looking at the 54 mammalian specific miRNA gave fewer significantly conserved sites. making prediction for these is harder.
- Pedro Beltrao
these are predictions ... how to study experimentally these predictions ? mRNA changes, IP SEQ site conservation, reporter assays .. none give info at the protein level
- Pedro Beltrao
mouse KO for miR 223 ... SILAC experiment to look with MS for changes with the Gygi lab
- Pedro Beltrao
used motif enrichment to look at motifs in the rna for the proteins with changes
- Pedro Beltrao
translational changes correlate well with transcriptional changes but most changes are small .. suggesting that there might be selection for precise levels of individual proteins
- Pedro Beltrao
18 to 33% messages with predicted sites respond to loss of miRNA. conserved sites are more likely to respond but most changes are for sites that are not conserved
- Pedro Beltrao
multiple sites in the same message act independently unless they are very close together (cooperativity ? interesting)
- Pedro Beltrao
they could use this information to look at the efficiency of repression looking at context around the sites
- Pedro Beltrao
the context information allows them to make site predictions that are of similar quality to those using conservation. (this should allow them to do divergence studies)
- Pedro Beltrao
1997-2000: Palsson joined UCSD, lab members Christophe Schilling and Jeremy Edwards started making metabolic networks for E. coli, H. influenzae, H. pylori.
- Ruchira S. Datta
Iman Famili & Jochen Förster, 2001-2003: Eukaryotes.
- Ruchira S. Datta
Many lab members: RECON 1: first human metabolic reconstruction (2007).
- Ruchira S. Datta
for metabolic networks we now have genotype/phenotype maps
- Pedro Beltrao
Network reconstruction is a BiGG knowledge base, encoding chemistry into a mathematical format. Birth of genome-scale (metabolic) systems biology. Now have mechanistic basis for the genotype-phenotype relationship at genome scale.
- Ruchira S. Datta
how do we build metabolic networks ?
- Pedro Beltrao
Unlike physics 100 years ago, need to account for dual causation (history vs function)
- Ruchira S. Datta
2. Network reconstruction: building the genotype-phenotype relationship.
- Ruchira S. Datta
Check metastructure of genomes. M matrix (stoichiometric), E (expression) matrix, and O (operon) matrix, then integrate them.
- Ruchira S. Datta
"Meta-structure" of E. coli genome: higher level than operon structure.
- Ruchira S. Datta
Want to know where genes are, transcription factors, transcription start sites, promoters, etc.
- Ruchira S. Datta
Collate 4 'omics datatypes on a genome scale.
- Ruchira S. Datta
Have high-quality ChIP-chip data. 250bp resolution peaks. Have expression profiling using tiled arrays. Solexa for first ~30bp of transcripts. 1bp resolution. Proteomics by mass-spec.
- Ruchira S. Datta
the genome, chip chip data , transcription data, proteomics
- Pedro Beltrao
Thus have multiple genome-scale measurements of different kinds along the whole genome.
- Ruchira S. Datta
doing this for several bacteria ... done in e.coli
- Pedro Beltrao
Found >100 new transcripts. Some of them are quite small, candidate small RNAs.
- Ruchira S. Datta
35% of operons have multiple start sites with multiple active in given condition
- Pedro Beltrao
Integrate the four kinds of data to characterize different aspects of a module.
- Ruchira S. Datta
looks like the aim will be to go for a full model of e.coli
- Pedro Beltrao
M matrix is nice as it's just binary, so no errors in the matrix.
- Ruchira S. Datta
4-step process for metabolism: 1. Draft reconstruction, 2. Curated reconstruction, 3. Genome-scale metabolic model, 4. Validation and iterative improvement. Then ready for use as platform for design and discovery.
- Ruchira S. Datta
60-step process will be detailed in Nature Protocols.
- Ruchira S. Datta
Have exponential growth in available reconstructions and their uses.
- Ruchira S. Datta
by 2008 around 90 apps based on E.Coli metabolic reconstruction
- Attila Csordas
Had reconstruction jamborees. Conceived in 2006. Yeast metabolism: Nature Biotech Nov 2008; salmonella has just been submitted, human Recon 1.1 in June 2009, Recon 2.0 in March 2010, will have Staph. and TB.
- Ruchira S. Datta
the metabolic reconstruction is well established ... what about the transcriptional network
- Pedro Beltrao
Lots of ChIP-chip data shows role of metastructure.
- Ruchira S. Datta
4-step process for O matrix is still under development.
- Ruchira S. Datta
different transcriptional units depending on condition
- Pedro Beltrao
Merge into ME matrix: Metabolism and Expression. In particular, will contain all antibiotic targets.
- Ruchira S. Datta
Putting it all together. See Feist et al, Nature Rev Micro 02/2009
- Ruchira S. Datta
Markus Covert has done this for Mycoplasma genitalium, and has simulator which he will present at ICSB.
- Ruchira S. Datta
Inclusion of protein structure: Thermotoga maritima. Forthcoming in Science (2009). Put together in integrated reconstruction. Can see how folds travel through the pathways and infer how they might have come up through gene duplication.
- Ruchira S. Datta
upcoming paper in science thermotoga maritima structures for many proteins ( approx 100) that were studied in the context of the metabolic network
- Pedro Beltrao
If drug has off-target binding site (=> side effect), one of the few ways to analyze that is with reconstruction: perturb in two places at once.
- Ruchira S. Datta
Toll-like Receptors in human macrophages has appeared. Large reconstruction w/ ~950 reactions. These reconstructions can be merged to make scalable models.
- Ruchira S. Datta
i really like this connection between structure function at a large network level
- Pedro Beltrao
3. Biological sciences in the era of systems biology.
- Ruchira S. Datta
Can replace expression profile with computed metabolic functions to analyze drug response phenotypes.
- Ruchira S. Datta
studying drug candidates using these network models and gene expression changes
- Pedro Beltrao
Lots of metabolomics data, e.g., Rabinowitz at Princeton. Use with multi-scale kinetic models.
- Ruchira S. Datta
Exo-metabolomics: easy to measure concentrations of metabolites *outside* of cell. If genetic defect leads to difference in secreted metabolite, have candidate biomarker.
- Ruchira S. Datta
Gap-filling: systematically finding missing parts. PNAS 103:17480 2006 Use model to find positive growth environments not explained by the model. Use model to hypothesize what is missing. Use bioinformatics to find it.
- Ruchira S. Datta
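A minimal sketch of the gap-filling idea above, assuming a simple reachability test in place of the flux-based growth simulation a real reconstruction would use. The reaction names, metabolites, and the `universal` candidate database are hypothetical toy values, not taken from the talk.

```python
def producible(nutrients, reactions):
    """Expand the set of metabolites reachable from the given nutrients."""
    have = set(nutrients)
    changed = True
    while changed:
        changed = False
        for substrates, products in reactions.values():
            if set(substrates) <= have and not set(products) <= have:
                have |= set(products)
                changed = True
    return have


def grows(nutrients, reactions, biomass):
    """Predict growth if every biomass component can be produced."""
    return set(biomass) <= producible(nutrients, reactions)


def gap_fill_candidates(nutrients, model, universal, biomass):
    """Single reactions from `universal` whose addition rescues predicted growth."""
    hits = []
    for name, reaction in universal.items():
        if name in model:
            continue
        trial = dict(model)
        trial[name] = reaction
        if grows(nutrients, trial, biomass):
            hits.append(name)
    return sorted(hits)


# Hypothetical toy model: growth on glucose is observed, but the model has a gap.
MODEL = {
    "R1": (("glc",), ("g6p",)),
    "R3": (("pyr",), ("biomass",)),
}
UNIVERSAL = {
    "R2": (("g6p",), ("pyr",)),   # fills the gap
    "R4": (("lac",), ("pyr",)),   # needs lactate, which is not supplied
}
NUTRIENTS = {"glc"}
BIOMASS = {"biomass"}

CANDIDATES = gap_fill_candidates(NUTRIENTS, MODEL, UNIVERSAL, BIOMASS)
```

Each candidate is only a hypothesis about what is missing; the notes describe the follow-up step as using bioinformatics to find a gene for it.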
Understanding complex biological processes, e.g., adaptive laboratory evolution to optimal phenotypes. E.g., Nature 420(6912) (2002), and Nat Genet 36(10) (2004). Now it's possible to resequence the genomes and find all the mutations that appeared during the adaptive evolution. We can infer their causality using allelic replacement.
- Ruchira S. Datta
Unexpected: mutations to global regulators and RNAP are more frequent than mutations to specific transcription factors.
- Ruchira S. Datta
Mutations in RNAP are in-frame deletions in the jaw region of the enzyme where it binds to the DNA. Studied w/ R. Landick, U. Wisconsin: leaves initiation site much faster. Adaptive mutations in jaw region have consistent effect on its kinetics.
- Ruchira S. Datta
Systems Level Metabolic Engineering: Current Opinion in Biotech, 2008, 19:454-460. From random mutations to targeted mutations, to system-level.
- Ruchira S. Datta
Review: Feist and Palsson Nature Biotech 2008. Another one forthcoming in Molec. Sys. Biol. 2009, which includes work on communities.
- Ruchira S. Datta
the small scale modelling/design principles people must be dying by the end of this talk ;)
- Pedro Beltrao
Q: Axiom of mass conservation in cell? The cell grows? A: There is mass balance.
- Ruchira S. Datta
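One way to read the "mass balance" answer, assuming the usual steady-state framing of constraint-based models (the notes do not spell it out): for a stoichiometric matrix S and a flux vector v, every internal metabolite must satisfy S·v = 0. The three-reaction network below is a hypothetical toy example.

```python
def net_production(S, v):
    """Net production rate of each metabolite: one row of S times v."""
    return [sum(s_ij * v_j for s_ij, v_j in zip(row, v)) for row in S]


def is_mass_balanced(S, v, tol=1e-9):
    """True if no metabolite accumulates or is depleted at these fluxes."""
    return all(abs(rate) <= tol for rate in net_production(S, v))


# Rows: metabolites A, B. Columns: uptake (-> A), conversion (A -> B), export (B ->).
S = [
    [1, -1, 0],
    [0, 1, -1],
]

BALANCED = is_mass_balanced(S, [2, 2, 2])       # all fluxes match
UNBALANCED = net_production(S, [2, 1, 1])       # A accumulates
```

In this picture growth does not violate mass balance: biomass formation is just another reaction that drains balanced precursors.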
Q: What about non-model organisms? 99% of microbial universe is non-culturable. What would you do given only a genome? A: There are very few model organisms known in great detail and we tend to extrapolate from them. Metabolic maps for e.g. Geobacter showed that even though the organism was poorly known, the reconstruction was surprisingly useful.
- Ruchira S. Datta
an interesting question about how applicable is network reconstruction to novel bacteria that cannot be grown in the lab
- Pedro Beltrao
One of the unknown frontiers is how to deal with communities. Now we can sequence the entire metagenome, and culture some of the cells. We can subtract those genomes from the metagenome and make a synthetic community. There's a lot of talk at the federal level in this country on how to deal with communities effectively.
- Ruchira S. Datta
Q: How to enhance production of specific compounds? A: Metabolic reconstruction allows computation of good genetic manipulations. In many cases these are not intuitive ahead of time. Can predict what genes to introduce and delete for a desired phenotype, and predict whether it is evolutionarily stable and its growth.
- Ruchira S. Datta
Q: Have mostly been talking at single cell level; how about tissues and organs? A: Human reconstruction has been tailored based on tissue-specific profiles, though not curated yet. So should be able to get organelle-specific and cell-specific models. No model yet has >1 cell type. Unpublished model of liver also models adipose tissue and muscle.
- Ruchira S. Datta
Q: How to assess completeness and correctness of reconstructed network? A: E. coli has about 4400 genes in the genomes, but 1200 of those genes are in the metabolic reconstruction. 95% have biochemical data associated, so high-quality reconstruction. Assess completeness using gap analysis. Conversely, look at metabolomic data: metabolites in reconstruction? 20-30 metabolites were absent...
- Ruchira S. Datta
Thinks 90% of metabolic functions of E. coli are in the reconstruction.
- Ruchira S. Datta
grand challenge of understanding signal transduction in mammalian cells: >= 200 unique cell types, >= 1000 signalling proteins in each cell type (with some overlap).
- Ruchira S. Datta
how modular are signalling pathways ?
- Pedro Beltrao
have concept of modularity: signaling components are part of pathway and process modules
- Ruchira S. Datta
Signalling initiated through cell-matrix interactions, cell-cell interactions, hormone/GF receptors, nutrient/stress, or cell checkpoints. Via signaling components, these affect many processes, including differentiation and proliferation emphasized here.
- Ruchira S. Datta
transducing inputs: cAMP/cGMP, IP3/Ca2+, PI3K, Ras/MAPK, Wnt, STAT; transducing outputs: migration control module, G1/S restriction point.
- Ruchira S. Datta
# of functions controlled is very variable in different cell types, e.g., stem cells vs fully differentiated cell.
- Ruchira S. Datta
Time course of signaling propagation can be described by feedback which modulates. Only a few types of relationships. Cf. Brandman, Meyer, Science, 2008, and Brandman et al, Science, 2005. E.g., negative feedback, transient generator, adaptive system, positive feedback, bistable switch.
- Ruchira S. Datta
how cells respond to inputs ... he claims that there are very few types of input/output functions
- Pedro Beltrao
To dissect complex cell signalling systems, need to use cell readouts and perturbations. E.g. for cell readouts, live biosensors (hard to create ones that are inert in biology of cell), or antibodies. Cell readouts have advantage of single cell data. Perturbations might be using siRNA, genetic; chemical, light.
- Ruchira S. Datta
this simplicity of input/output functions is created by complex signalling interaction networks, and understanding these networks requires technological developments in cell imaging and perturbations
- Pedro Beltrao
Meyer's lab likes live-cell fluorescent biosensors.
- Ruchira S. Datta
history of sensing imaging tools in one slide ... gfp etc
- Pedro Beltrao
Many contributors to live-cell fluorescent signaling tools, starting with 1980 Tsien indicator of Ca2+, and Shimomura, Chalfie, Tsien GFP (last year's Nobel prize). Last 10 years: cAMP sensor FRET sensor, RAS sensor, protein kinase biosensor, cell cycle biosensor. Still hard to find biosensor for particular setting.
- Ruchira S. Datta
Would like to be able to rapidly perturb signaling processes. Use synthetic biology to engineer activation domains.
- Ruchira S. Datta
a lesson in buzzwords :), using synthetic biology to develop rapid perturbations for systems analysis
- Pedro Beltrao
Fluorescently monitor autonomous polarization in uniform chemoattractant of HL60 neutrophil cells. Regular pathway fMetLeuPhe receptor -> PI3K -> polarization, positively feeding back to PI3K. Instead, induced PI3K activation uniformly using p85-peptide-FKBP.
- Ruchira S. Datta
Use siRNAs to sample many perturbations of the circuit.
- Ruchira S. Datta
Made siRNA library targeting the signaling proteome.
- Ruchira S. Datta
Important decision in the life of a somatic cell: to divide or not to divide. Must divide to maintain/regenerate tissue, but uncontrolled growth is cancer. So cell cycle must be carefully controlled. Also, mammalian cells have 100000 possible ORC's (origins of replication), but activating many simultaneously can also lead to cancer.
- Ruchira S. Datta
Have limited model of a restriction point for cell cycle entry, but would like to know what signalling components are involved, and whether it is all-or-nothing or based on a bistable switch.
- Ruchira S. Datta
Able to link different cell cycle phases with proteins using siRNA analysis.
- Ruchira S. Datta
using sets of siRNA of specific function to do more targeted analysis.
- Pedro Beltrao
Found parallel rather than sequential mechanism for CDK4 activation & DNA replication. Suggests that CDK4 activation is part of positive feedback.
- Ruchira S. Datta
Could CDK4 activation have bistable switch mechanism? Plotted relative Rb-phosphorylation vs. relative Hoechst DNA stain. Checked fraction of cells exceeding Rb-p level at timepoints hours after serum addition. Found pattern characteristic of bistable switch.
- Ruchira S. Datta
DNA replication is ultrasensitive to CDK4 activation. This would be needed to suppress basal replication activity in somatic cells.
- Ruchira S. Datta
Degradation of p21, an inhibitor of CDK4 and CDK2, is part of a positive feedback loop.
- Ruchira S. Datta
Single cell analysis, comparing p21 concentration with Rb-phosphorylation, shows that p21 degradation is a limiting step for CDK4 activation.
- Ruchira S. Datta
Requires a ligase at the last step of the replication fork. May ensure that the whole system is working.
- Ruchira S. Datta
The siRNA perturbation analysis shows tight correlation, indicating integral role of p21 in the feedback system. Interlinked positive feedback engages over an 8 hour window.
- Ruchira S. Datta
some interesting feedbacks in cell cycle control but he is going too fast for me :) i will wait for the paper
- Pedro Beltrao
Have differential equation model that replicates experimental data. Rate balance plot shows there is an all-or-none decision to enter the cell cycle. Behavior changes at robust bifurcation in the model, indicating a "restriction point" for cell cycle entry.
- Ruchira S. Datta
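A rough sketch of what a rate-balance analysis looks like, assuming a generic one-variable positive-feedback model (Hill-type production plus basal synthesis, linear removal). This is not the speaker's model: the functional form and every parameter value here are illustrative assumptions. Steady states are where production equals removal, so counting sign changes of the net rate shows whether the system is monostable or bistable.

```python
def net_rate(x, n, basal=0.02, vmax=1.0, K=0.5, k_deg=1.0):
    """Production minus removal for activity level x (hypothetical parameters)."""
    production = basal + vmax * x**n / (K**n + x**n)
    removal = k_deg * x
    return production - removal


def count_steady_states(n, x_max=2.0, steps=2000):
    """Count sign changes of the net rate on a grid over [0, x_max]."""
    count = 0
    previous = net_rate(0.0, n)
    for i in range(1, steps + 1):
        current = net_rate(i * x_max / steps, n)
        if previous * current < 0:
            count += 1
        previous = current
    return count


COOPERATIVE = count_steady_states(n=4)      # steep feedback
NON_COOPERATIVE = count_steady_states(n=1)  # shallow feedback
```

With the steep feedback the curves cross three times: a low and a high stable state separated by an unstable threshold, which is the all-or-none behaviour the notes describe. With the shallow feedback there is a single crossing and no switch.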
Interesting module, the CDK ramp module, as have very slow ramp over several hours.
- Ruchira S. Datta
predicted > 10 000 000 fold increase in DNA rep rate for 6 fold CDK4 increase?
- Attila Csordas
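A back-of-the-envelope check on the numbers in the question above, assuming the response follows a simple power law (output proportional to input^n) well below saturation. The notes do not say what functional form the model uses, so the exponent below is only an apparent one.

```python
import math


def apparent_hill_exponent(input_fold, output_fold):
    """Exponent n such that input_fold ** n == output_fold."""
    return math.log(output_fold) / math.log(input_fold)


# 6-fold CDK4 increase giving a 10,000,000-fold increase in replication rate.
N_APPARENT = apparent_hill_exponent(6, 10_000_000)
```

Under that assumption the exponent comes out close to 9, which is what "ultrasensitive" would have to mean here for basal CDK4 activity to leave replication essentially off.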
Growth factors trigger collective migration of cells in endothelial monolayers. This enables development, wound healing, and cancer. Migrating cells coordinate with each other as they move. Endothelial monolayer integrity is crucial to maintain the barrier between the circulating blood and tissues.
- Ruchira S. Datta
Used spring-loaded 96 well scratcher to make many scratches quickly.
- Ruchira S. Datta
Migrate quite fast while in intact monolayer, almost as fast as when free.
- Ruchira S. Datta
they can measure in 96 plates the effect of siRNA in migration
- Pedro Beltrao
Assigned genes to putative control modules using PCA
- Ruchira S. Datta
Separate modules for cell motility, directed migration, and coordination of cells within the sheet.
- Ruchira S. Datta
autonomous basal polarization and motility function of cell motility module is independent of growth factors
- Ruchira S. Datta
directed migration module directs pioneer behavior: induces extension into cell-free space
- Ruchira S. Datta
coordination keeps cells within the monolayer together
- Ruchira S. Datta
using their observations to build models of cell motility, nice combination
- Pedro Beltrao
model predicts role of lateral drag in sheet integrity. Collective flow behavior requires drag steering. E.g., drag steering helps close small endothelial lesions. Growth factors not needed for this.
- Ruchira S. Datta
These functional modules are much easier to separate than signaling modules. Combinations of sub-modules create emerging systems behaviors.
- Ruchira S. Datta
Potentially interesting, but at the moment I get to wait ~2 min after which I receive an "Internal Server Error"
- Lars Juhl Jensen
It may be the entire Internet is collapsing due to massive DOS on twitter. Or not...
- Ricardo Vidal
I installed it without problems. Do you want me to send you the XPI?
- Paulo Nuin
It installed just fine. The problem is that half of the time when I want it to process an abstract, it crashes server-side. Maybe the server is overloaded? I'll try again tomorrow.
- Lars Juhl Jensen
The curse of systems biology: you will be a jack of all trades, rather than a master of one.
- Allyson Lister
How does one automatically model systems and pathways from a variety of data
- Oliver Hofmann
Ideker et al.: Ann Rev Genomics Hum Genetics 2001 - his PhD work (Systems Biology: A new approach to decoding life).
- Allyson Lister
Given basic knowledge of a pathway measure the global response to systematic perturbations, compare predictions and observations and determine the goodness of fit, revise and repeat
- Oliver Hofmann
Ideker (Science 2001), GAL metabolic flow as an example
- Oliver Hofmann
Use systematic interaction data (Protein-DNA, PPI, biochemical reactions) to zoom out from a well-understood small pathway to a more global view
- Oliver Hofmann
The final figure of that Science manuscript, he feels, launched his career.
- Allyson Lister
Change to the 'modern era'. Querying large biological networks for active modules, colouring graphs with expression states, enzyme activity, any kind of activity. Automatically extract subnetworks / modules from the global 'hairball'
- Oliver Hofmann
Interaction Database Dump, aka "Hairballs", which aren't good for a whole lot.
- Allyson Lister
Projection of siRNA phenotypes on a HIV protein interaction network (human-human and human-HIV) to identify active modules
- Oliver Hofmann
For what @Oliver said directly above: (Konig et al. Cell 2008).
- Allyson Lister
A working map for moving network biology to the clinic: does not have to be complete and can tolerate false positives. Innovation in how to assemble the map (what pieces / input)
- Oliver Hofmann
Key problem: moving past the pretty pictures and interpreting them
- Oliver Hofmann
What follows is a very nice slide on the timeline of both biological sequence comparison and biological network comparison. Can't write that! See @Diego above for citation for this
- Allyson Lister
Trey thinks there are better things out there now than PathBLAST and networkBLAST.
- Allyson Lister
Genetic interactions form a distinct type of network maps. Comparison across network types instead of across species
- Oliver Hofmann
Here, there exists a genetic interaction between gene A and B if phenotype of mutant a is OK, mutant b is OK, and mutant ab is sick. (Tong et al. Science 2001)
- Allyson Lister
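The definition above can be written as a small predicate. This is a sketch only: it assumes fitness is scored on a 0 to 1 scale relative to wild type, and the `healthy` cutoff of 0.8 is a hypothetical threshold, not one from the cited work.

```python
def is_synthetic_sick(fitness_a, fitness_b, fitness_ab, healthy=0.8):
    """True if both single mutants are OK but the double mutant is sick."""
    singles_ok = fitness_a >= healthy and fitness_b >= healthy
    double_sick = fitness_ab < healthy
    return singles_ok and double_sick


# Hypothetical relative fitness values (wild type = 1.0).
INTERACTING = is_synthetic_sick(1.0, 0.95, 0.3)    # a OK, b OK, ab sick
SINGLE_DEFECT = is_synthetic_sick(1.0, 0.4, 0.3)   # b is already sick alone
NO_EFFECT = is_synthetic_sick(1.0, 1.0, 0.9)       # double mutant is fine
```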
Identify biological pathways / physical interactions that support genetic interactions
- Oliver Hofmann
Kelley and Ideker Nat Biotech 2005 worked on systematic identification of parallel pathway relations.
- Allyson Lister
Genetic interactions run 'between' clusters of physical interactions, not within them
- Oliver Hofmann
Regions are enriched for stress genes. Binding of factors to the region seems to be condition-specific
- Oliver Hofmann
Factors being sequestered away from their target genes under certain conditions?
- Oliver Hofmann
Network-based disease diagnosis and prognosis
- Oliver Hofmann
Changes to a breast cancer metastasis classifier using additional features. (Imperfect) diagnostic sets of genes have been identified previously, ROC at around 0.65
- Oliver Hofmann
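Reading "ROC at around 0.65" as the area under the ROC curve (an assumption; the note does not say), the number is the probability that a randomly chosen metastatic sample scores higher than a randomly chosen non-metastatic one. A minimal sketch with made-up scores:

```python
def auc(positive_scores, negative_scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count half."""
    wins = 0.0
    for p in positive_scores:
        for q in negative_scores:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))


# Hypothetical classifier scores, not data from the talk.
EXAMPLE_AUC = auc([0.9, 0.6, 0.4], [0.7, 0.3, 0.2])
```

On this scale 0.5 is chance and 1.0 is perfect separation, so 0.65 is only a modest improvement over guessing.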
Lack of robustness across studies (overlap between three classifier sets not larger than random expected overlap)
- Oliver Hofmann
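To judge whether an overlap between two gene signatures is "larger than random", one common baseline is the hypergeometric distribution: draw two sets of sizes m and n at random from the same pool of genes and ask how often they share at least k genes. A sketch under that assumption, with small made-up set sizes (the notes do not give the actual sizes):

```python
from math import comb


def expected_overlap(m, n, total):
    """Mean number of shared genes between two random sets of sizes m and n."""
    return m * n / total


def prob_overlap_at_least(k, m, n, total):
    """Hypergeometric tail: P(overlap >= k) for two random sets."""
    upper = min(m, n)
    ways = sum(comb(m, i) * comb(total - m, n - i) for i in range(k, upper + 1))
    return ways / comb(total, n)


# Hypothetical toy numbers: 10 genes in total, signatures of 3 and 4 genes.
TOY_EXPECTED = expected_overlap(3, 4, 10)
TOY_TAIL = prob_overlap_at_least(1, 3, 4, 10)
```

An observed overlap is only interesting if this tail probability is small for the real signature sizes and gene pool.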
Overlay with PPI data to identify informative subnetworks (Chuang, Mol Sys Biol 2007)
- Oliver Hofmann
Taylor et al. 2009 Nat Biotech, better approach recommended by Trey
- Diego M. Riaño-Pachón
One thing I have noticed is that in any movie, TV show, etc. where anything odd happens, the English are always portrayed as standing in place saying something like "This can't be happening" or some stupid thing like that, when it clearly IS happening. Is this some sort of national trait? I like to think that if, say, the dead rise and turn on the living, I would fight back + run to safety FIRST and worry about the ontological implications after.
- Neal Jansons
Laugh, sure. But your pounds sterling end up in a California bank account. ;-)
- Chris Baskind
My cousin dl'ed the Moron Test and tried to get me to take it. I told him he'd already failed because he was a moron for buying it in the first place. He was unamused. I wasn't.
- tinypants - Hagitha of FF
To be fair, he's kind of a tool, so it's less a reflection of the app and more a reflection of how much I want to throw him off the nearest high-rise.
- tinypants - Hagitha of FF
I notice that the U.S. is no longer a nation of morons -- for the last few days, "The Sims 3" is #1, so we're now a nation of replicants.
- Stephen Mack
got this after my first intelligent use of ff filtered search. populist USA likes to feign ignorance and UK likes to superficial awareness
- Lane Rapp
One of the easy criticisms to make about Twitter Data is that equivalent information should be extracted semantically, by analyzing the natural language of user tweets, instead of by users or applications injecting data manually. Ideally, this would be possible, but unfortunately it's not. The technology industry is a long way from truly useful natural language processing, and furthermore, not all types of data can be extracted in that way.
- Pierre Lindenbaum
This sums up what I felt on playing with it, much more lucidly than I could've done. I hope it doesn't just become a huge and very clever white elephant
- Andrew Clegg
Yes, this is a neat description of the basic problem. "Here's what I think the answer is to your question, and NEVER YOU MIND how I know." The issue of WA employees having to curate the data is also important - how is that sustainable?
- Matthew Todd
"As the sheer volume of stuff on the Web keeps growing, keyword search keeps getting closer to its breaking point. Adding structure to the Web is one way to make sense of all that data, and Google is starting to tackle the problem with a Google Labs project called Google Squared, which Marissa Mayer mentioned earlier today at the company’s Searchology briefing."
- Chris, Taskerrific Guy
The NCBI BioSystems Database currently contains biological pathways from two source databases, KEGG and the EcoCyc subset of BioCyc, and is designed to accommodate other types of biosystems such as diseases as data about them become available. Through these collaborations, the BioSystems database facilitates access to, and provides the ability to compute on, a wide range of biosystems data. Detailed diagrams and annotations for individual biosystems are then available on the web sites of the source databases.
- Pierre Lindenbaum
Yes, many bio wikis are more like a sandwich toaster; full of initial promise but quickly languishing, grease-coated, at the back of a high cupboard.
- Neil Saunders
Love the analogy. Is it just me, or does this observation make the whole Wikipedia phenomenon that much more impressive? I mean, there are actually people there that do care about (and in some cases, only care about) keeping things clean. And lots of them! Always wonder though if these people have found their long-term niche, or if there's just a steady stream of cannon fodder...
- Andrew Su
My experience is that people either "get" and love the wiki ethos, or they don't, so may well be that it's a niche. Also worth remembering the long tail; there's a very small number of people responsible for most wikipedia activity, a few hundred at most.
- Neil Saunders
But Neil, "a very small number of people responsible for most wikipedia activity" = short head... (Looking for the published analysis on this subject...)
- Andrew Su
under figure 4.30 of http://libresoft.es/Members..., the authors do observe a "highly biased distributions towards a small core of very active logged authors". But as I recall, if you look at amount of contributed content (as opposed to number of revisions) the distribution is much flatter... Still looking for the reference...
- Andrew Su
FWIW, this paper has another take on WP editing patterns: http://www.viktoria.se/altchi.... "the rise and decline of the influence of elite users found above does not depend on the type of metric used (either percentage of edits or percentage of changed content). However, while the percentage of edits declined sharply in the 2005-2006 period, the percentage of changed content has remained remarkably stable." -- a lot seems to come down to how you count...
- Andrew Su