Friday, April 28, 2006

Exploring Functional Landscapes of Proteins via Manifold Embeddings of the Gene Ontology


Gilad Lerman
UM, Mathematics

11:15 am

402 Walter Library

While rigorous measures of similarity for protein sequence and structure are now well established, the problem of defining functional relationships has been particularly daunting. Here, we present several novel manifold embedding techniques to estimate distances between Gene Ontology (GO) functional annotations. We apply the embeddings to define functional distances between protein domains. To evaluate accuracy, we correlate the functional distance with the well-established measures of sequence, structural, and phylogenetic similarity. Finally, we show that manual classification of structures into folds and superfamilies is mirrored by proximity in the newly defined function space. We show how functional distances place structure-function relationships in biological context, resulting in novel inferences about divergent and convergent evolution. Our methods and results can be readily generalized and applied to a wide array of biologically relevant investigations, such as the accuracy of annotation transference, the relationship between sequence, structure, and function, or the coherence of expression modules. This is joint work with Borya Shakhnovich, Bioinformatics Program, Boston University.