
DTC Seminar Series

Finding Pattern Similarity in Large Databases


Philip S. Yu

Monday, May 12, 2003
3:30 pm

402 Walter Library

Finding pattern similarity in large databases has widespread applications. For example, in bioinformatics, researchers often need to retrieve sequences that have portions similar to a given sequence, or identify genes whose expression levels rise and fall coherently under a set of experimental conditions; i.e., they exhibit fluctuations of similar shape when conditions change. In e-commerce, collaborative filtering methods are often employed to make product recommendations based on the purchase patterns of other consumers with similar preferences. As the size of the database increases, brute-force scanning of the database to discover pattern similarity becomes prohibitively expensive. We examine the issues in using indexing to accelerate the retrieval of similar patterns and present efficient indexing methods to facilitate similarity search. We then consider how to perform clustering based on pattern similarity.


Philip S. Yu received M.S. and Ph.D. degrees in Electrical Engineering from Stanford University. He is with the IBM Thomas J. Watson Research Center, where he is currently manager of the Software Tools and Techniques group. His research interests include data mining, Internet applications and technologies, database systems, multimedia systems, parallel and distributed processing, and performance modeling. Dr. Yu has published more than 340 papers in refereed journals and conferences. He holds or has applied for more than 250 US patents. Dr. Yu is a Fellow of the ACM and a Fellow of the IEEE. He is the Editor-in-Chief of IEEE Transactions on Knowledge and Data Engineering. He is also an associate editor of ACM Transactions on Internet Technology and of Knowledge and Information Systems. He is a member of the IEEE Data Engineering steering committee and is also on the steering committee of the IEEE Conference on Data Mining. In addition to serving as a program committee member for various conferences, he was program co-chair of the 11th International Conference on Data Engineering and the 6th Pacific Area Conference on Knowledge Discovery and Data Mining, and program chair of the 2nd International Workshop on Research Issues on Data Engineering: Transaction and Query Processing, the PAKDD Workshop on Knowledge Discovery from Advanced Databases, and the 2nd International Workshop on Advanced Issues of E-Commerce and Web-based Information Systems. He has received several IBM and external honors, including a Best Paper Award, two IBM Outstanding Innovation Awards, an Outstanding Technical Achievement Award, two Research Division Awards and the 71st plateau of Invention Achievement Awards. He also received an IEEE Region 1 Award for "promoting and perpetuating numerous new electrical engineering concepts" in 1999. Dr. Yu is an IBM Master Inventor.