Combining LVCSR and STD for Robust Speech Retrieval
Douglas W. Oard
University of Maryland
Tuesday, September 01, 2009
12:30
Well-tuned Large-Vocabulary Continuous Speech Recognition (LVCSR) has
generally been shown to be more effective than vocabulary-independent
techniques as a basis for topic-based ranked retrieval of spoken content.
Tuning LVCSR systems to a topic domain can be costly, however, and
Out-Of-Vocabulary (OOV) query terms can adversely affect retrieval
effectiveness when that tuning is not performed. I will show
that retrieval effectiveness for queries with OOV terms can be
substantially improved by combining evidence from LVCSR with additional
evidence from utterance-scale Spoken Term Detection (STD). The combination
is performed by using relevance judgments from held-out topics to learn
generic (i.e., topic-independent), smooth, non-decreasing transformations
from LVCSR and STD system scores to relevance probabilities. I'll describe
an evaluation using a test collection that includes conversational speech
audio from an oral history collection, topics based on actual requests for
information in that collection, and relevance judgments made by trained
experts. For short queries, our combined system recovers 57% of the mean
average precision that could have been obtained through LVCSR domain
tuning. This is joint work with Scott Olsson. I'll conclude my talk with
a few remarks about some recently completed work on conversational text,
and on my plans for speech-related work during my sabbatical visit here at
Berkeley.
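
As a rough illustration of the kind of score-to-probability mapping the
abstract describes, the sketch below fits a non-decreasing map from system
scores to relevance probabilities using held-out relevance judgments, then
merges the two probabilities. It is not the method presented in the talk:
pool-adjacent-violators isotonic regression is swapped in as a simple
stand-in (it is non-decreasing but a step function, not smooth), the
noisy-OR combination rule is an assumption, and all function names and
data are hypothetical.

```python
from bisect import bisect_right


def fit_monotone_map(scores, labels):
    """Fit a non-decreasing step function from scores to P(relevant).

    Uses pool-adjacent-violators on held-out (score, 0/1 judgment) pairs.
    Returns (starts, probs): block start scores and block probabilities.
    """
    # Aggregate judgments that share the same score.
    grouped = {}
    for s, y in zip(scores, labels):
        total, count = grouped.get(s, (0.0, 0))
        grouped[s] = (total + y, count + 1)

    blocks = []  # each block: [start_score, label_sum, count]
    for s in sorted(grouped):
        total, count = grouped[s]
        blocks.append([s, total, count])
        # Merge backwards while the previous block's mean exceeds the
        # current block's mean (compared by cross-multiplication).
        while (len(blocks) > 1
               and blocks[-2][1] * blocks[-1][2] > blocks[-1][1] * blocks[-2][2]):
            _, t, c = blocks.pop()
            blocks[-1][1] += t
            blocks[-1][2] += c

    starts = [b[0] for b in blocks]
    probs = [b[1] / b[2] for b in blocks]
    return starts, probs


def predict(model, score):
    """Look up the fitted probability for a new score."""
    starts, probs = model
    i = bisect_right(starts, score) - 1
    return probs[max(i, 0)]  # scores below the first block use its value


def combine(p_lvcsr, p_std):
    """Noisy-OR merge of two relevance probabilities (an assumed rule)."""
    return 1.0 - (1.0 - p_lvcsr) * (1.0 - p_std)


# Hypothetical held-out data: system scores and expert judgments (1 = relevant).
lvcsr_model = fit_monotone_map([0.1, 0.2, 0.3, 0.4, 0.5], [0, 1, 0, 1, 1])
std_model = fit_monotone_map([0.2, 0.6, 0.7, 0.9], [0, 0, 1, 1])
p = combine(predict(lvcsr_model, 0.3), predict(std_model, 0.8))
```

Because the maps are fit on judgments pooled across held-out topics, they
stay topic-independent, and an utterance that only the STD system scores
highly (as can happen for an OOV query term) still receives a usable
combined probability.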
Bio: Douglas Oard is an Associate Professor at the University of Maryland,
College Park, with joint appointments in the College of Information
Studies and the Institute for Advanced Computer Studies; he is on
sabbatical at Berkeley's iSchool for the Fall 2009 semester. Dr. Oard
earned his Ph.D. in Electrical Engineering from the University of
Maryland. His research interests center on the use of emerging
technologies to support information seeking by end users, with recent work
on interactive techniques for cross-language information retrieval and
techniques for search and sense-making in conversational media.
Additional information is available at http://www.glue.umd.edu/~oard/.