Spoken Document Retrieval and Browsing
Ciprian Chelba
Google
Tuesday, September 02, 2008
12:30
Ever-increasing computing power and connectivity bandwidth, together
with falling storage costs, result in an overwhelming amount of data of
various types being produced, exchanged, and stored. Consequently,
search emerges as a key application as more and more data is
saved.
Speech search has received little attention because
large collections of untranscribed spoken material have not been
available, mostly due to storage constraints. As storage becomes
cheaper, the availability and usefulness of large collections of
spoken documents are limited only by the lack of adequate
technology to exploit them. Manually transcribing speech is expensive
and sometimes outright impossible due to privacy concerns. This leads
us to explore an automatic approach to searching and navigating
spoken document collections.
The talk will focus on techniques for indexing and retrieving
spoken audio files and will present results on a corpus (MIT iCampus)
of recorded academic lectures.