University of Minnesota

DTC Leading Edge Seminar Series

Automatic Summarization


Frank Schilder
Thomson Reuters

Wednesday, October 13, 2010
3:30 p.m. reception
4:00 p.m. seminar

401/402 Walter Library

Professionals (e.g., scientists, journalists, lawyers) often need to get a quick impression of scientific findings, news stories or legal issues. To do this, they rely on summaries and abstracts, but how are those summaries produced? Human-written summaries are too time-intensive to be useful for people seeking minute-by-minute information updates in our fast-paced internet age. Thomson Reuters R&D has therefore spent the last five years conducting research on automatic text summarization and has developed several innovative solutions. Natural Language Processing (NLP) techniques have been used to create automatic summaries that tend to be of lower quality than human-produced summaries but are fast and still valuable for the information seeker. I will present one of our NLP approaches to multi-document summarization and discuss how the quality of summaries can be automatically evaluated. Our summarization system is based on a machine learning method called support vector regression (SVR).


Frank Schilder is a lead research scientist in the Research & Development department of Thomson Reuters. He joined Thomson Reuters in 2004, where he has been doing applied research on summarization technologies and information extraction systems. His summarization work has been implemented as the snippet generator for search results in WestlawNext, the new legal research system produced by Thomson Reuters. His current research activities include participation in research competitions such as the Text Analysis Conference (TAC), run by the National Institute of Standards and Technology (NIST). He obtained a Ph.D. in Cognitive Science from the University of Edinburgh, Scotland, in 1997. From 1997 to 2003, he was employed by the Department for Informatics at the University of Hamburg, Germany, first as a post-doctoral researcher and later as an assistant professor. Frank has authored several journal articles and book chapters, including "Natural language processing: Overview" in the Encyclopedia of Language and Linguistics, co-authored with Peter Jackson, the chief scientist of Thomson Reuters. He serves as a reviewer for journals in computational linguistics and as a program committee member for various conferences organized by the Association for Computational Linguistics (ACL).