Combining Large-Vocabulary Continuous Speech Recognition and Spoken Term Detection for Robust Speech
Tue, Oct 20, 2009 @ 10:30 AM - 12:00 PM
Ming Hsieh Department of Electrical and Computer Engineering
Conferences, Lectures, & Seminars
Guest Speaker:
Douglas W. Oard
University of Maryland
Abstract: Well-tuned Large-Vocabulary Continuous Speech Recognition (LVCSR) has been shown to be generally more effective than vocabulary-independent techniques as a basis for topic-based ranked retrieval of spoken content. Tuning LVCSR systems to a topic domain can be costly, however, and Out-Of-Vocabulary (OOV) query terms can adversely affect retrieval effectiveness when that tuning is not performed. I will show that retrieval effectiveness for queries with OOV terms can nevertheless be substantially improved by combining evidence from LVCSR with additional evidence from utterance-scale Spoken Term Detection (STD). The combination is performed by using relevance judgments from held-out topics to learn generic (i.e., topic-independent), smooth, non-decreasing transformations from LVCSR and STD system scores to relevance probabilities. I'll describe an evaluation using a test collection that includes conversational speech audio from an oral history collection, topics based on actual requests for information in that collection, and relevance judgments made by trained experts. For short queries, our combined system recovers 57% of the mean average precision that could have been obtained through LVCSR domain tuning. This is joint work with Scott Olsson, using a test collection built in collaboration with Sam Gustman of the USC Shoah Foundation Institute for Visual History and Education.
About the speaker: Douglas Oard is an Associate Professor at the University of Maryland, College Park, with joint appointments in the College of Information Studies and the Institute for Advanced Computer Studies; he is on sabbatical at Berkeley's iSchool for the Fall 2009 semester. Dr. Oard earned his Ph.D. in Electrical Engineering from the University of Maryland.
His research interests center around the use of emerging technologies to support information seeking by end users, with recent work on interactive techniques for cross-language information retrieval and techniques for search and sense-making in conversational media. Additional information is available at http://www.glue.umd.edu/~oard/.
Hosted by Shrikanth Narayanan
Location: Hughes Aircraft Electrical Engineering Center (EEB) - 248
Audiences: Everyone Is Invited
Contact: Talyia Veal