University of Minnesota


An optimization framework for applying discriminative classifiers to protein domain segmentation

by

Rui Kuang
Computer Science

Friday, October 6, 2006
12:00 Lunch
12:15 Seminar

402 Walter Library

The problem of locating and classifying the domains in a protein sequence is a long-standing challenge in computational biology. State-of-the-art methods perform domain segmentation by scanning the query sequence using a database of generative models of known domain families. However, many studies suggest that detection of remote homology can be accomplished more effectively by using a discriminative classifier rather than a generative model. We propose several dynamic programming and linear programming algorithms for simultaneous domain segmentation and classification in the remote homology setting. Our algorithms recognize domains using support vector machine (SVM) classifiers trained for remote homology detection. We optimize the segmentation of the sequence based on the classification scores of domain recognizers over subsequences of the protein. Our experiments on two SCOP datasets show that the proposed algorithms achieve significantly better results for both domain recognition and boundary identification than a baseline algorithm based on PSI-BLAST. Moreover, use of the optimization framework to combine segment-based SVM prediction scores outperforms the simple greedy approach of locating the highest scoring non-overlapping segments.
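
To make the contrast between the optimization framework and the greedy approach concrete, here is a minimal sketch in Python. It is not the speaker's implementation. It assumes a hypothetical scoring function score(start, end, family) that stands in for an SVM domain recognizer applied to a subsequence, treats positions not covered by any domain as contributing zero, and uses made-up toy scores. The dynamic program picks the set of non-overlapping labeled segments with the highest total score; the greedy routine repeatedly takes the single highest-scoring segment that does not overlap an earlier pick.

```python
from typing import Callable, List, Optional, Sequence, Tuple

# (start, end_exclusive, family label); all names here are illustrative.
Segment = Tuple[int, int, str]
ScoreFn = Callable[[int, int, str], float]


def segment_dp(n: int, families: Sequence[str], score: ScoreFn,
               min_len: int = 1, max_len: Optional[int] = None
               ) -> Tuple[float, List[Segment]]:
    """Maximize the summed score of non-overlapping labeled segments in [0, n)."""
    if max_len is None:
        max_len = n
    best = [0.0] * (n + 1)   # best[j]: optimal total score for the prefix [0, j)
    back: List[Optional[Tuple[int, str]]] = [None] * (n + 1)
    for j in range(1, n + 1):
        best[j] = best[j - 1]            # option 1: position j-1 is in no domain
        for i in range(max(0, j - max_len), j - min_len + 1):
            for fam in families:         # option 2: a domain of family fam on [i, j)
                cand = best[i] + score(i, j, fam)
                if cand > best[j]:
                    best[j] = cand
                    back[j] = (i, fam)
    # Trace back from the end of the sequence to recover the chosen segments.
    segments: List[Segment] = []
    j = n
    while j > 0:
        choice = back[j]
        if choice is None:
            j -= 1
        else:
            i, fam = choice
            segments.append((i, j, fam))
            j = i
    segments.reverse()
    return best[n], segments


def segment_greedy(n: int, families: Sequence[str], score: ScoreFn
                   ) -> Tuple[float, List[Segment]]:
    """Baseline: take the highest-scoring segments that do not overlap earlier picks."""
    candidates = [(score(i, j, f), i, j, f)
                  for i in range(n) for j in range(i + 1, n + 1) for f in families]
    candidates.sort(key=lambda c: c[0], reverse=True)
    chosen: List[Segment] = []
    total = 0.0
    for s, i, j, f in candidates:
        if s <= 0:
            break                        # remaining candidates cannot help
        if all(j <= a or i >= b for a, b, _ in chosen):
            chosen.append((i, j, f))
            total += s
    return total, sorted(chosen)


# Toy scores (invented for illustration): one long, strong hit for family "A"
# competes with two shorter hits that together score higher.
TOY = {(0, 10, "A"): 4.0, (0, 4, "A"): 2.0, (4, 10, "B"): 2.5, (3, 8, "B"): 3.0}


def toy_score(i: int, j: int, fam: str) -> float:
    return TOY.get((i, j, fam), 0.0)


if __name__ == "__main__":
    print(segment_dp(10, ["A", "B"], toy_score))      # two-domain segmentation
    print(segment_greedy(10, ["A", "B"], toy_score))  # single long segment
```

On these toy scores the greedy routine commits to the single segment scoring 4.0 and can add nothing else, while the dynamic program finds the two adjacent segments whose scores sum to 4.5. Real SVM outputs can be negative and would need calibration before being summed across families; that step is outside this sketch.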