Weirdly, I can read the page at the link you gave, but when I click on "printer friendly" for easier cutting-and-pasting, I get a paywall. Anyway, sent you a copy of the text by email, let me know if that's not what you needed.
- Bill Hooker
Several questions: what proteins in a cell? How do the complexes form? Evolution of gene duplications/epigenetics? Personalising treatment will be final part of talk.
- Scott Edmunds
Cool -- protein-protein docking programs can often pick out the correct binders from among all the incorrect ones. Even if the absolute score or binding-energy estimate is wrong, the relative estimates are good enough to place correct binders among the top predictions.
- Barb Bryant
Moreover, correct binders often have many conformations near the correct conformation that have good scores, which helps in the comparison to other (wrong) binding pairs.
- Barb Bryant
(Question I have about looking for evidence at the protein level of the many observed mRNA splice isoforms: can they see peptides that span exon boundaries? Wouldn't that provide very good evidence, though of course such peptides are much rarer than peptides that reside entirely within an exon?)
- Barb Bryant
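A rough sketch of the coordinate check behind that question, assuming each identified peptide has already been mapped to a position in a protein and that the coding length contributed by each exon is known. The function name and inputs are hypothetical and not taken from the talk.

```python
from itertools import accumulate


def spanned_junctions(exon_cds_lengths, pep_start, pep_len):
    """Return indices of the exon-exon junctions a peptide crosses.

    exon_cds_lengths: coding nucleotides contributed by each exon, in order.
    pep_start: 0-based index of the peptide's first residue in the protein.
    pep_len: peptide length in residues.
    Junction i lies between exon i and exon i + 1.
    """
    nt_start = 3 * pep_start              # first coding nucleotide covered
    nt_end = 3 * (pep_start + pep_len)    # one past the last nucleotide covered
    # Cumulative exon ends; the end of the last exon is not a junction.
    boundaries = list(accumulate(exon_cds_lengths))[:-1]
    # A junction is crossed when the peptide covers nucleotides on both sides.
    return [i for i, b in enumerate(boundaries) if nt_start < b < nt_end]


# Toy transcript with three exons (assumed lengths).
exons = [100, 50, 80]
junction_hits = spanned_junctions(exons, pep_start=30, pep_len=10)
within_exon = spanned_junctions(exons, pep_start=0, pep_len=10)
```

A peptide with a non-empty result supports the specific splice junction, whereas one that sits inside a single exon only supports that exon being expressed.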
Chromatin structure plays a role in eukaryotic genomic evolution, as do DNA replication dynamics. The act of replication induces cellular stress, with exposed single-strand DNA leading to DNA damage.
- Barb Bryant
Ernst & Kellis: we now have new genome-wide data showing chromatin states -- and can know which regions are euchromatic.
- Barb Bryant
We also have maps of speed of replication along the human genome, showing late and early replication locations.
- Barb Bryant
They use phylostratification for evolutionary trees of each human gene, back to when there was a duplication.
- Barb Bryant
We now put together gene age, replication time and euchromatin. We get a surprising result. Old families replicate earlier; newer genes replicate later and are more exposed to replicative stress (and therefore are more vulnerable to change?)
- Barb Bryant
Genes that replicate late in the cell cycle are in heterochromatin-rich regions.
- Barb Bryant
Development of specialized cell types is one consequence of this process.
- Barb Bryant
Moving now to Bioinformatics in personalized cancer treatment
- Iddo Friedberg
Paper on PALB2, pancreatic cancer susceptibility gene found by exomic sequencing, resulting in better, rationally targeted cancer treatment for a patient.
- Barb Bryant
International Cancer Consortium: Spain will sequence 500 genomes of Leukemia patients
- Iddo Friedberg
Redundancy in genomics can be exploited --> CaBLAST. Works on compressed data. Size of compressed DB is proportional to the size of non-redundant data
- Iddo Friedberg
coarse analysis on compressed data - refined analysis on relevant regions
- Shannon McWeeney
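A toy illustration of the coarse-then-refined idea. This is not CaBLAST's algorithm: it only collapses exact duplicate sequences and uses substring matching in place of alignment, and all names are made up for the example.

```python
from collections import defaultdict


def compress(db):
    """Collapse identical sequences: unique sequence -> list of record ids."""
    groups = defaultdict(list)
    for rec_id, seq in db.items():
        groups[seq].append(rec_id)
    return dict(groups)


def search(compressed, query):
    """Coarse pass over unique sequences, then refine only within the hits."""
    # Coarse: scan each unique sequence once, however many records share it.
    coarse_hits = [seq for seq in compressed if query in seq]
    # Refined: report the match offset for every record behind a coarse hit.
    return sorted(
        (rec_id, seq.find(query))
        for seq in coarse_hits
        for rec_id in compressed[seq]
    )


# Assumed toy database with one redundant pair.
db = {"a": "ACGTACGT", "b": "ACGTACGT", "c": "TTTT"}
compressed = compress(db)
hits = search(compressed, "GTAC")
```

The coarse pass costs time proportional to the number of unique sequences, which is the sense in which the work scales with the non-redundant data rather than the full database.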
@emergentnexus-- I think what you were talking about was this, right? "Ortholog assignments: Ensembl homology descriptions “ortholog 1:1” and “apparent ortholog 1:1” were used to annotate orthologous pairs. The apparent orthologs were treated as 1-to-1 orthologs since this description can result from a situation where a gene duplication is...
Don't know much about FF. We could create a "group," but that would seem overkill.
- Erick Matsen
I asked Matt Hahn about this, here's his reply: "one-to-many relationships are only between a single-copy gene and the co-orthologous inparalogs in the other species; we certainly did not include outparalogs in this. Of course the family-based tests do lump out- and inparalogs, but this is explicitly not the case in all other analyses."
- Iddo Friedberg
Eric - maybe you can just post the previous tweets about this here
- Jonathan Eisen
@iddux: they describe two situations in their paper. First: "Ensembl homology descriptions “ortholog 1:1” and “apparent ortholog 1:1” were used to annotate orthologous pairs. The apparent orthologs were treated as 1-to-1 orthologs since this description can result from a situation where a gene duplication is actually followed by gene losses in both lineages, but more often occurs because of an incorrect tree topology and incorrect duplication node labeling."
- Ruchira S. Datta
The first situation is what they call orthologs. They only call 1:1 orthologs, orthologs. The usual definition of 1:1 ortholog: among the homologous sequences in the gene family, there is one sequence in human and one sequence in mouse. In this situation the orthology relationship is high-confidence, which is why it is called out specially by Compara. I read from the text above that this is the *only* situation they call orthologs. The conventional definition would certainly call these orthologs.
- Ruchira S. Datta
Second: "Paralog assignments: all between-species paralogs were treated as outparalogs." This is the only other situation they describe. Thus, I had concluded that this included all the other between-species homolog pairs in their dataset. If they left out a bunch of human-mouse homolog pairs that are neither between-species outparalogs nor 1:1 orthologs, then I would say that invalidates the results since that's exactly what the ortholog conjecture is about.
- Ruchira S. Datta
They say: "the fact that within-species gene pairs cannot be confused with between-species gene pairs (of any kind) means that our main results are robust to the exact tree topologies." I don't understand how any results about the ortholog conjecture can be robust to the exact tree topologies. If they think their "main results" are that in-species paralogs are more similar than orthologs, the ortholog conjecture bears no relationship with that statement.
- Ruchira S. Datta
Here is a, hopefully clear, formulation of the ortholog conjecture: functional divergence speeds up after duplication events. In the context of the paper, this is what the ortholog conjecture says: suppose in some protein family T, first there was a duplication event D, leading to two ancestral groups of sequences A1 and A2. Then there was a speciation event between primates and rodents. This would lead to 4 sequences: From A1: human gene H1 & mouse gene M1. From A2: human gene H2 & mouse gene M2.
- Ruchira S. Datta
The ortholog conjecture says that H1 is likely to be more functionally similar to M1 than to M2, and that H2 is likely to be more functionally similar to M2 than to M1. That's it. That's what one has to refute to refute the ortholog conjecture.
- Ruchira S. Datta
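The four-gene scenario above can be written down as a small labelled gene tree, with each pair classified by the event at its last common ancestor. This is only a sketch: the node names S1, S2 and D and the helper functions are illustrative, and real orthology calls (e.g. in Compara) involve tree reconciliation that is not shown here.

```python
# Toy gene tree: child -> parent. Internal nodes carry an event label.
PARENT = {
    "H1": "S1", "M1": "S1",   # primate/rodent speciation within lineage A1
    "H2": "S2", "M2": "S2",   # primate/rodent speciation within lineage A2
    "S1": "D", "S2": "D",     # A1 and A2 both arise from duplication D
}
EVENT = {"D": "duplication", "S1": "speciation", "S2": "speciation"}


def ancestors(node):
    """Return the node followed by its ancestors up to the root."""
    path = [node]
    while node in PARENT:
        node = PARENT[node]
        path.append(node)
    return path


def relationship(a, b):
    """Classify two distinct genes by the event at their last common ancestor."""
    seen = set(ancestors(a))
    lca = next(n for n in ancestors(b) if n in seen)
    return "orthologs" if EVENT[lca] == "speciation" else "paralogs"


h1_m1 = relationship("H1", "M1")
h1_m2 = relationship("H1", "M2")
h1_h2 = relationship("H1", "H2")
```

With this labelling, H1/M1 and H2/M2 are the ortholog pairs, while H1/M2 and H1/H2 both trace back to the duplication D. The conjecture compares functional similarity across exactly those two kinds of pair.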
You can see how if the only orthologs you are considering are 1:1 orthologs, there is no way of confirming or refuting the ortholog conjecture.
- Ruchira S. Datta
The general theory also includes the concept of subfunctionalization. This concept says that since the functional constraint is somewhat relaxed, H1 and H2 can develop more specific functions. If that's the case, and there's a completely different protein family of 1:1 orthologs H3 and M3, then one might expect H1 to be more functionally similar to M1 than H3 is to M3. But this depends on comparing functional similarity between families.
- Ruchira S. Datta
If the "main result" is that in-species paralogs are more similar than orthologs: this is saying that in our example, H1 is more similar to H2 than H1 is to M1. That may very well be. It's comparing apples and oranges. To apply the ortholog conjecture to in-species paralogs, we would have to say "the ortholog of H1 in human is more similar to it than any paralog of H1 in human", which is trivially true, since the only possible ortholog of H1 in human is H1 itself.
- Ruchira S. Datta
So, here's what I thought when I made my tweet: in the situation I described, for the sequence H1, they are calling both M1 and M2 between-species outparalogs of H1, rather than using Compara to designate M1 as the ortholog and M2 as the paralog. Their language about being "robust to exact tree topologies" also gave me this impression--there is no way to decide which sequence to call M1 and which to call M2 while staying robust to exact tree topologies.
- Ruchira S. Datta
So, if that's the case, here's what their curve says: the between-species outparalog with high sequence identity to H1 is highly functionally similar to it. That's likely to be M1. The between-species outparalog with low sequence identity to H1 has low functional similarity to it. That's likely to be M2. H1 is more functionally similar to M1 than to M2 -- exactly what the ortholog conjecture says.
- Ruchira S. Datta
The one-to-one ortholog M3 retains constant functional similarity to H3 regardless of sequence identity. That's also consistent with the theory: since there is no duplicate to take up the slack, the function is constrained to remain the same, so only sequence changes that don't affect the function are viable.
- Ruchira S. Datta
Calling both M1 and M2 "between-species outparalogs" of H1 is completely contrary to the usual definition of orthology: one of them is the ortholog and one of them is the paralog. (Or, due to gene losses, it could be in some instance that neither of them is an ortholog, but to assume that this is always the case would be very strange.)
- Ruchira S. Datta
The most usual reason why biologists apply orthology for function prediction is that they have a model organism where experiments are easy, and they have a species of interest where experiments are hard (or haven't been done for whatever reason). They want to transfer information from the gene in the model organism to the correct one in their species of interest.
- Ruchira S. Datta
If they can actually do experiments in their species of interest (and thus derive information on in-species paralogs), then more power to them. But this is not the case for the vast, vast majority of species for which we have gene sequences.
- Ruchira S. Datta
OK, reading this but it's a lot to take in (and dinner beckons). Will continue later tonight. Think we might get Matt/Pedja in on this?
- Iddo Friedberg
Well, if y'all could check whether my interpretation of the paper has any obvious flaws first, I would appreciate it. After all, if I'm the only one who understands what they wrote this way, and some alternative interpretation makes more sense, I'd rather just retract my comments/tweets. If none of you points out some alternative interpretation, we could ask them for clarification.
- Ruchira S. Datta
BTW, between-species outparalog is a bit redundant: an inparalog is a paralog in the same species, where the duplication occurred after the last speciation event (i.e., within the species itself).
- Ruchira S. Datta
For anyone who wants to enter this discussion, it's about "Testing the Ortholog Conjecture with Comparative Functional Genomic Data from Mammals" by Nehrt et al. http://bit.ly/lFD5wH
- Erick Matsen
Ruchira-- from my perspective as a naive observer, it does appear to me that the difference has to do with the definition of the "ortholog conjecture". What they call the OC is what you call the apples-to-oranges comparison.
- Erick Matsen
I do wish they included a diagram of the type you describe to clarify their definitions. I think it's perfectly reasonable to ask the authors to clarify. From Matt's comment via Iddo, it does appear to me that their terminology is different than yours.
- Erick Matsen
Namely, say there is a gene H1 that is single copy in human, and orthologous to M1. Then say there has been a duplication in mouse leading to M2. It does seem like his comment is saying that H1 vs (M1,M2) would be considered paralogs.
- Erick Matsen
Interesting discussion! I wasn't aware that there is ambiguity in these definitions. We study a single gene in fruit flies (FoxP) which has duplicated twice in vertebrates, yielding Foxp1-4. In our manuscript, we write that FoxP is the fly orthologue to FoxP2 (which is how we found it in the database). However, the largest similarity (on the genomic level) is to FoxP1. What are the human and fly FoxP's to each other?
- Björn Brembs
Erick: As I've explained, the apples-to-oranges comparison is neither addressed by orthological theory nor the usual case needed in applications, so I don't understand the framing of that as "the ortholog conjecture."
- Ruchira S. Datta
In the case you describe, (M1, M2) would be co-orthologs of H1, and M1 and M2 would be inparalogs of each other, in the standard terminology. The ortholog conjecture would be agnostic as to which of M1 and M2, if either, would be more functionally similar to H1.
- Ruchira S. Datta
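A minimal sketch of the single-copy case, assuming a toy tree in which the mouse duplication (here called Dm) happens after the human-mouse speciation (here called S). The node names and helper functions are hypothetical.

```python
# Toy gene tree: child -> parent. The duplication Dm sits below speciation S.
PARENT = {"H1": "S", "Dm": "S", "M1": "Dm", "M2": "Dm"}
EVENT = {"S": "speciation", "Dm": "duplication"}
SPECIES = {"H1": "human", "M1": "mouse", "M2": "mouse"}


def ancestors(node):
    """Return the node followed by its ancestors up to the root."""
    path = [node]
    while node in PARENT:
        node = PARENT[node]
        path.append(node)
    return path


def lca_event(a, b):
    """Event label at the last common ancestor of two distinct genes."""
    seen = set(ancestors(a))
    return EVENT[next(n for n in ancestors(b) if n in seen)]


def orthologs_of(gene, species):
    """All genes in `species` that diverged from `gene` by speciation."""
    return sorted(
        g for g, sp in SPECIES.items()
        if sp == species and lca_event(gene, g) == "speciation"
    )


mouse_orthologs = orthologs_of("H1", "mouse")
m1_m2_event = lca_event("M1", "M2")
```

H1 gets two mouse orthologs (a one-to-many relationship, i.e. co-orthologs), and M1 and M2 are related to each other by a duplication that postdates the speciation, which is what makes them inparalogs.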
Björn, sequence similarity is only a rough guide in my view -- however, it is the basis of many graph-based ortholog prediction methods. I personally would want to look at a tree. I'll see if I can do that momentarily.
- Ruchira S. Datta
However, just based on what you said, Foxp1-4 would all be co-orthologs of fly FoxP. The tree topology you describe would give no basis for labelling one of them as "the" ortholog.
- Ruchira S. Datta
Björn, here is what I get from our PHOG 1.0 server with the fruitfly protein as a query: http://phylofacts.berkeley.edu/ortholo... I chose a distance threshold of 0.9375, as this is the distance tuned for human-fruitfly in PHOG 1.0.
- Ruchira S. Datta
It looks like they also list all of FOXP1-4 as co-orthologs, as I would expect.
- Ruchira S. Datta
So, after duplication the co-orthologs start diverging from each other. The ortholog conjecture doesn't say which of them would be most functionally similar to the single fruitfly protein. The authors say it is likely to be the one with most sequence similarity. That's what a reasonable person would guess and not contrary to the ortholog conjecture.
- Ruchira S. Datta
In fact, graph-based orthology prediction methods would probably pick that one (the most sequence similar one).
- Ruchira S. Datta
Erick, though, the case of H1 vs (M1, M2) ought not to be what the between-species outparalogs curve describes, because M1 and M2 are inparalogs, not outparalogs.
- Ruchira S. Datta
Phew, that means we didn't screw it up :-) That's how we have it. Because of different isoforms, the sequence alignment is somewhat tricky, but it seems the fly FoxP gene is most similar to the FoxP1 gene in vertebrates. Functionally, not much is known. What little is known seems consistent with the idea that both FoxP1 and Foxp2 share similar functionality, but in different specializations.
- Björn Brembs
At least, we found FoxP because FoxP2 is involved in language and language is thought by some (since Skinner, 1957: "Verbal Behavior") to be an operant behavior. We found the fly FoxP to be involved in operant learning, but not in other forms of learning.
- Björn Brembs
Björn, how interesting! I would think language requires a large collection of behaviors, some of which might be operant and others not, but of course I don't know much about it.
- Ruchira S. Datta
Ruchira-- I would like for some of your explanation here to end up in a blog post, or something! I'd like to hear the authors' response as well. Any such plans?
- Erick Matsen
Erick, maybe this is the right time for Iddo to ask the authors for clarification?
- Ruchira S. Datta
Erick, since the paper is in PLoS, it seems like the natural spot would be a comment on the article itself.
- Ruchira S. Datta
I'm at JFK on the way to the EBI Genome Campus in Cambridge, UK for Quest for Orthologs 2011, the orthologists' powwow.
- Ruchira S. Datta
I mentioned the paper in relation to a talk and people have brought up some other issues. Maybe a response will be assembled.
- Ruchira S. Datta
I would suggest at least posting a link to this discussion on the PLoS site.
- Jonathan Eisen
I've wondered this. I sorta figured it was because archaea tend to be found in environments where it's less likely that potential hosts would encounter them. But that raises a further question: why do archaea tend to occupy extreme niches? Is it because bacteria out-compete them elsewhere?
- Bill Hooker
There are quite a few mesophilic archaea. By some estimates, 20% of the ocean microbial mass. Not to mention all the archaeal commensals that live with their animal hosts. So it might be that archaeal extremophiles are the majority in extreme conditions, but the majority of archaea are mesophiles.
- Iddo Friedberg
I was kidding, but is it really not avail. anymore? Could be the evil speculative domain parking stuff practiced by the likes of GoDaddy et al.
- Mr. Gunn
1 Gazillion points to Mr./Dr./Prof. J. Eisen for creating these soon-to-be-infamous Eisenome and Eisenomics blog sites via this humble FriendFeed thread !!
- Graham Steel
Despite my use of twitter for the #scio10 meeting, I continue to be amazed by how awesome friendfeed is --- friendfeed - I am back
- Jonathan Eisen
Not that I won't still tweet - but I do love friendfeed
- Jonathan Eisen
Make good use of it. It's a rare scientist that can play this trick of a self-named -ome. "Suome" doesn't quite roll off the tongue. "Lindenbaumome"? "Steelome?" "Szczesnyome"? None quite have the same ring. "Hookerome" is close, but I'm sure that's just putting a nerdy spin on a third-grade playground taunt (just a guess Bill)...
- Andrew Su
Andrew, I think we all have to hand this one to you for coming up with the idea in the first place :-)
- Graham Steel
@Andrew "Iddome"? Going back to the hiring and P&T discussion, an inner joke in my department says I was hired only because one of the senior researchers there is working on indoleamine 2,3-dioxygenase (IDO). Hey, it's as good a metric as the ISI impact factor...
- Iddo Friedberg
+1 Iddo - thanks for a hilarious discussion Andrew, Graham, Jonathan....
- Mr. Gunn
Andrew, even the judge at our wedding made fun of my name, pointing out that well, of course my wife wasn't going to take *that* name. I'm just wondering how I get a grant to study my own -ome... :-)
- Bill Hooker