Event

 
 

Biologically-Inspired Multistream Features for Large Vocabulary Speech Recognition

Suman Ravuri

ICSI

Tuesday, June 09, 2009
12:30

Most modern features used for automatic speech recognition (ASR) are carefully crafted single features that attempt to modify the entire spectrum in some meaningful manner. In this work, we instead use multiple feature streams to perform ASR in a large vocabulary setting. The streams are inspired by ferrets' cortical responses to audio stimuli and are split into different temporal and spectral modulations. The basic advantage of this approach over single-stream features is that if one or more of the streams is corrupted by noise, the remaining streams can still be used to perform robust speech recognition. We show how these features are incorporated into an ASR system and, using only four streams, obtain results that rival state-of-the-art systems. Finally, we discuss the opportunities and challenges posed by moving from four to hundreds of streams.

 