one family with many domain architectures, all sharing a kinase domain
- Ruchira S. Datta
Multidomain sequences evolve via gene duplication and domain shuffling.
- Gabriele Sales
multidomain sequences evolve via gene duplication and domain shuffling
- Ruchira S. Datta
The same domain may appear in multiple, unrelated proteins.
- Gabriele Sales
A definition will be presented that is in line with Fitch's proposition of homology.
- Roland Krause
can have cases where genes share common ancestry, but the domain architecture has changed
- Ruchira S. Datta
Difference between sequences related by vertical descent and related by domain insertion.
- Roland Krause
Two kinds of relations among genomes: relation by vertical descent or relation by domain insertion.
- Gabriele Sales
similarly can have the converse: through domain shuffling, genes that are not homologous can come to have the same domain architecture
- Ruchira S. Datta
Is it possible to distinguish these two cases?
- Gabriele Sales
Given two sequences with similarity: can one distinguish the two scenarios?
- Roland Krause
orthologs are a subset of homologs, and homologs intersect with the set of significantly similar sequences
- Ruchira S. Datta
also have distant homologs which don't appear to be significantly similar
- Ruchira S. Datta
A Venn diagram, including orthologs, homologs, distant homologs and significantly similar sequences with modification.
- Roland Krause
inferences that can be drawn from vertical descent (similar molecular functions) and domain insertion (binding partners) are different
- Allyson Lister
Biological interpretation of vertical descent: molecular function; regulation; comparative mapping; processes of gene duplication and genome rearrangement.
- Gabriele Sales
Interpretations of domain insertion: protein specialization; ligand specificity; localization; process of domain shuffling.
- Gabriele Sales
vertical descent implies similar molecular function and regulation, supports comparative mapping, and is useful for studying processes of duplication and genome rearrangement
- Ruchira S. Datta
domain insertion leads to relationships of protein specialization, ligand binding, and cellular localization
- Ruchira S. Datta
In animals and plants multidomain sequences become more important than in bacteria.
- Gabriele Sales
The more higher eukaryotes are sequenced, the more pressing this problem becomes.
- Roland Krause
therefore, among similar sequences, want to distinguish which are related by vertical descent, and which by domain insertion
- Ruchira S. Datta
people look at sequence similarity E-value, and at alignment coverage
- Ruchira S. Datta
Alignment length is typically used to distinguish domain re-arrangements. Needs a decent model.
- Roland Krause
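The E-value plus alignment-coverage heuristic mentioned in these notes can be sketched roughly as follows. This is only an illustration of the common practice, not the speaker's method; the `Hit` fields, the function names, and both thresholds (`max_evalue`, `min_coverage`) are assumptions chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    """One pairwise alignment. Field names are hypothetical."""
    query_len: int
    subject_len: int
    query_start: int    # 1-based, inclusive
    query_end: int
    subject_start: int  # 1-based, inclusive
    subject_end: int
    evalue: float


def coverage(hit: Hit) -> float:
    """Smaller of the two per-sequence fractions spanned by the alignment."""
    q_span = hit.query_end - hit.query_start + 1
    s_span = hit.subject_end - hit.subject_start + 1
    return min(q_span / hit.query_len, s_span / hit.subject_len)


def classify(hit: Hit, max_evalue: float = 1e-5, min_coverage: float = 0.8) -> str:
    """Label a hit using assumed thresholds (illustrative values only)."""
    if hit.evalue > max_evalue:
        return "not significant"
    if coverage(hit) >= min_coverage:
        return "full-length match"
    # Significant but local: possibly only a shared domain.
    return "partial match"


full = Hit(300, 300, 1, 290, 5, 295, 1e-40)
local = Hit(300, 900, 10, 280, 400, 670, 1e-30)
weak = Hit(300, 300, 1, 290, 5, 295, 0.5)
print(classify(full), classify(local), classify(weak), sep=" | ")
```

A "partial match" here only says the alignment is local; by itself it does not establish whether the pair is related by vertical descent or by domain insertion.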
Good example that sequence similarity or E-values alone cannot distinguish the two cases.
- Roland Krause
The goal of this method is to identify sequence pairs related by vertical descent (VD) and domain insertion (DI), and it should work on a broad range of families.
- Allyson Lister
And needs to be computationally feasible.
- Roland Krause
To test, they looked at 20 well-studied families related by vertical descent.
- Allyson Lister
They had a much larger set of negative examples (40,000).
- Allyson Lister
PSI-BLAST performs worse than BLAST for multidomain proteins with variable architecture (!), as it pulls in non-homologous parts of sequences.
- Roland Krause
All methods do well with conserved multidomain proteins. They were more challenged by variable multidomain proteins, where PSI-BLAST doesn't do as well as BLAST. Both methods are extremely challenged when all the sequences are put into the analysis together.
- Allyson Lister
Pairwise comparisons are not sufficient. Try networks instead.
- Gabriele Sales
Pairwise sequence comparisons might not be enough; use the structure of the similarity networks.
- Roland Krause
Two sequences are compared in the context of their respective neighborhoods (i.e. other sequences that show similarity).
- Gabriele Sales
Domain architecture is implicitly present in the network.
- Allyson Lister
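One way to picture comparing two sequences in the context of their neighborhoods is the sketch below. It uses a plain Jaccard overlap of neighbor sets, which is a stand-in chosen for simplicity and not the scoring function presented in the talk; the `scores` layout, the sequence names, and the `min_score` cutoff are all hypothetical.

```python
def neighbors(scores, seq, min_score):
    """Sequences whose similarity score to `seq` reaches `min_score`."""
    return {other for other, s in scores.get(seq, {}).items() if s >= min_score}


def neighborhood_similarity(scores, a, b, min_score=50.0):
    """Jaccard overlap of the two neighborhoods (each includes the sequence itself)."""
    na = neighbors(scores, a, min_score) | {a}
    nb = neighbors(scores, b, min_score) | {b}
    return len(na & nb) / len(na | nb)


# Toy network: K1-K3 form one family; X shares a single domain with K1
# and otherwise belongs with Y1 and Y2. Scores are made up.
scores = {
    "K1": {"K2": 200, "K3": 180, "X": 90},
    "K2": {"K1": 200, "K3": 190},
    "K3": {"K1": 180, "K2": 190},
    "X": {"K1": 90, "Y1": 150, "Y2": 140},
    "Y1": {"X": 150, "Y2": 160},
    "Y2": {"X": 140, "Y1": 160},
}

same_family = neighborhood_similarity(scores, "K1", "K2")
shared_domain = neighborhood_similarity(scores, "K1", "X")
print(same_family, shared_domain)
```

In this toy network K1 and X are directly similar, yet their neighborhoods barely overlap, while K1 and K2 share most of theirs. That is the sense in which domain architecture can be implicit in the network.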
Open question. The model is explicitly based on insertion and deletion. What about de novo sequence formation?
- Gabriele Sales
Comment by Kevin Karplus: Use log scale for false positives in the ROC plots.
- Roland Krause
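The suggestion to use a log scale for false positives can be illustrated with a small sketch. One plausible motivation, given the large negative set noted above, is that the low-false-positive region gets squeezed on a linear axis. The function names and the toy scores are assumptions, and ties between scores are not treated specially.

```python
import math


def roc_counts(scores, labels):
    """Cumulative (false positives, true positives), sweeping from the top score down."""
    ranked = sorted(zip(scores, labels), key=lambda p: p[0], reverse=True)
    fp = tp = 0
    points = []
    for _, is_positive in ranked:
        if is_positive:
            tp += 1
        else:
            fp += 1
        points.append((fp, tp))
    return points


def log_fp_axis(points):
    """Drop points with zero false positives and map the FP count to log10."""
    return [(math.log10(fp), tp) for fp, tp in points if fp > 0]


# Toy data: True marks a positive (homologous) pair.
points = roc_counts([0.9, 0.8, 0.7, 0.6], [True, False, True, False])
print(points)
print(log_fp_axis(points))
```

Points with zero false positives cannot be placed on a log axis, so the sketch drops them; a plot would typically start at the first false positive.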