Easy Does It: Robust Spectro-Temporal Many-Stream ASR Without Fine Tuning Streams

Title: Easy Does It: Robust Spectro-Temporal Many-Stream ASR Without Fine Tuning Streams
Publication Type: Conference Paper
Year of Publication: 2012
Authors: Ravuri, S., & Morgan, N.
Page(s): 4309-4312
Other Numbers: 3242
Abstract

Previous work has shown that spectro-temporal features reduce the word error rate for automatic speech recognition under noisy conditions. These systems, however, required significant hand-tuning to determine which spectral and temporal modulations should be included in a particular stream. In this work, each stream is restricted to a single spectral and a single temporal modulation, and the streams' posterior probabilities are combined once each stream has been discriminatively trained via a multilayer perceptron. We show that this combination structure performs as well as or better than more elaborate methods in which multiple spectral and temporal modulations are hand-picked per stream. In addition, these types of features outperform standard noise-robust features such as the “Advanced Front End” features, whereas our hand-picked spectro-temporal features do not.
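The abstract does not say which rule is used to combine the per-stream posteriors, so the following is only an illustrative sketch of one common choice: a weighted geometric mean (an average in the log domain) of the class posteriors from each stream for a single frame, followed by renormalization. The function name `combine_posteriors`, the uniform default weights, and the toy numbers are all assumptions made for illustration and are not taken from the paper.

```python
import math


def combine_posteriors(stream_posteriors, weights=None):
    """Combine per-stream class posteriors for one frame.

    stream_posteriors: list of posterior vectors, one per stream
        (e.g. one per spectral/temporal modulation pair).
    weights: optional per-stream weights; uniform if omitted (assumption).
    Returns a single posterior vector that sums to 1.
    """
    n_streams = len(stream_posteriors)
    n_classes = len(stream_posteriors[0])
    if weights is None:
        weights = [1.0 / n_streams] * n_streams

    eps = 1e-12  # floor to avoid log(0)
    # Weighted sum of log posteriors for each class (geometric mean).
    log_combined = [
        sum(w * math.log(max(p[c], eps))
            for w, p in zip(weights, stream_posteriors))
        for c in range(n_classes)
    ]

    # Renormalize in a numerically stable way.
    peak = max(log_combined)
    unnormalized = [math.exp(v - peak) for v in log_combined]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


# Toy example: three hypothetical streams, three phone classes.
frame = [
    [0.7, 0.2, 0.1],
    [0.5, 0.3, 0.2],
    [0.6, 0.1, 0.3],
]
combined = combine_posteriors(frame)
```

In a full system the combined posteriors would be computed for every frame and passed on to the recognizer; other combination rules (for example a plain arithmetic average) would fit the same interface.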

URL: http://www.icsi.berkeley.edu/pubs/speech/easydoesit12.pdf
Bibliographic Notes

Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2012), pp. 4309-4312, Kyoto, Japan

Abbreviated Authors

S. Ravuri and N. Morgan

ICSI Research Group

Speech

ICSI Publication Type

Article in conference proceedings