Biologically-Inspired Multistream Features for Large Vocabulary Speech Recognition
Suman Ravuri
ICSI
Tuesday, June 09, 2009
12:30
Most modern features used for automatic speech recognition (ASR) are
carefully crafted single-stream features that attempt to modify the
entire spectrum in some meaningful manner. In this work, we instead use
multiple feature streams to perform ASR in a large-vocabulary setting.
The streams are inspired by ferrets' cortical responses to audio
stimuli and are split into different temporal and spectral
modulations. The basic advantage of this approach over single-stream
features is that if one or more of the streams are corrupted by noise,
the remaining streams can still be used to perform robust speech
recognition. We show how these
features are incorporated into an automatic speech recognition system
and, using only four streams, obtain results that rival those of
state-of-the-art systems. We also discuss the opportunities and
challenges that arise when moving from four to hundreds of streams.