Easy Does It: Robust Spectro-Temporal Many-Stream ASR Without Fine Tuning Streams
Title | Easy Does It: Robust Spectro-Temporal Many-Stream ASR Without Fine Tuning Streams |
Publication Type | Conference Paper |
Year of Publication | 2012 |
Authors | Ravuri, S., & Morgan, N. |
Page(s) | 4309-4312 |
Other Numbers | 3242 |
Abstract | Previous work has shown that spectro-temporal features reduce the word error rate for automatic speech recognition under noisy conditions. These systems, however, required significant hand-tuning to determine which spectral and temporal modulations should be included in a particular stream. In this work, streams are split into one spectral and temporal modulation each, and their posterior probabilities are combined once each stream is discriminatively trained via a multilayer perceptron. We show that this combination structure performs as well as or better than more elaborate methods in which multiple spectral and temporal modulations are hand-picked per stream. In addition, these types of features outperform standard noise-robust features such as the Advanced Front End features, whereas our hand-picked spectro-temporal features do not. |
URL | http://www.icsi.berkeley.edu/pubs/speech/easydoesit12.pdf |
Bibliographic Notes | Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2012), pp. 4309-4312, Kyoto, Japan |
Abbreviated Authors | S. Ravuri and N. Morgan |
ICSI Research Group | Speech |
ICSI Publication Type | Article in conference proceedings |
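
The posterior-combination step described in the abstract can be illustrated with a minimal sketch. This record does not state which combination rule the paper uses, so the equal-weight log-domain average (normalized geometric mean) below is an assumption chosen only for illustration. The function name `combine_posteriors`, the `eps` floor, and the three example streams with their values are likewise hypothetical.

```python
import math


def combine_posteriors(stream_posteriors, eps=1e-10):
    """Combine per-stream class posteriors for a single frame.

    stream_posteriors: list of lists, one inner list per stream, each a
    posterior distribution over the same set of classes.
    Returns the normalized geometric mean of the streams.
    NOTE: this rule is an assumption, not necessarily the paper's method.
    """
    n_streams = len(stream_posteriors)
    n_classes = len(stream_posteriors[0])

    # Average the log-posteriors across streams, class by class.
    log_avg = []
    for c in range(n_classes):
        total = sum(math.log(max(p[c], eps)) for p in stream_posteriors)
        log_avg.append(total / n_streams)

    # Renormalize so the result sums to 1 (subtract the max for stability).
    peak = max(log_avg)
    unnorm = [math.exp(v - peak) for v in log_avg]
    z = sum(unnorm)
    return [u / z for u in unnorm]


# Hypothetical posteriors from three streams over three classes.
streams = [
    [0.70, 0.20, 0.10],
    [0.60, 0.30, 0.10],
    [0.50, 0.25, 0.25],
]
combined = combine_posteriors(streams)
```

In a many-stream system, each inner list would come from the multilayer perceptron trained on one stream, and the combined distribution would be passed on to the recognizer. A weighted average or a learned merger could be substituted without changing the interface.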