Comparing Different Flavors of Spectro-Temporal Features for ASR

Title: Comparing Different Flavors of Spectro-Temporal Features for ASR
Publication Type: Conference Paper
Year of Publication: 2011
Authors: Meyer, B. T., Ravuri, S., Schädler, M. René, & Morgan, N.
Page(s): 1269-1272
Other Numbers: 3181
Abstract

In the last decade, several studies have shown that the robustness of ASR systems can be increased when 2D Gabor filters are used to extract specific modulation frequencies from the input pattern. This paper analyzes important design parameters for spectro-temporal features based on a Gabor filter bank: We perform experiments with filters that exhibit different phase sensitivity. Further, we analyze if non-linear weighting with a multi-layer perceptron (MLP) and a subsequent concatenation with mel-frequency cepstral coefficients (MFCCs) has beneficial effects. For the Aurora2 noisy digit recognition task, the use of phase sensitive filters improved the MFCC baseline, whereas using filters that neglect phase information did not. While MLP processing alone did not have a large effect on the overall performance, the best results were obtained for MLP-processed phase sensitive filters and added MFCCs, with relative error reductions
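
The distinction between phase-sensitive filters and filters that neglect phase can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the filter shape (complex sinusoidal carrier under a Hann envelope), the filter sizes, and the helper names `gabor_filter_2d` and `correlate_same` are assumptions made for illustration. Here the real part of the complex filter output stands in for a phase-sensitive feature and the magnitude for a feature that discards phase.

```python
import numpy as np


def gabor_filter_2d(omega_k, omega_n, size_k=9, size_n=25):
    """Hypothetical complex 2D Gabor filter (carrier times Hann envelope).

    omega_k: spectral modulation frequency in radians per channel
    omega_n: temporal modulation frequency in radians per frame
    size_k, size_n: odd filter extents in channels and frames (assumed values)
    """
    k = np.arange(size_k) - size_k // 2
    n = np.arange(size_n) - size_n // 2
    # Drop the zero end points of the Hann window so every tap is non-zero.
    env_k = np.hanning(size_k + 2)[1:-1]
    env_n = np.hanning(size_n + 2)[1:-1]
    envelope = np.outer(env_k, env_n)
    carrier = np.exp(1j * (omega_k * k[:, None] + omega_n * n[None, :]))
    return envelope * carrier


def correlate_same(spec, kernel):
    """Zero-padded 2D correlation with output the same shape as `spec`.

    Assumes odd kernel extents. Requires NumPy >= 1.20 for sliding_window_view.
    """
    pad_k, pad_n = kernel.shape[0] // 2, kernel.shape[1] // 2
    padded = np.pad(spec, ((pad_k, pad_k), (pad_n, pad_n)))
    windows = np.lib.stride_tricks.sliding_window_view(padded, kernel.shape)
    return np.einsum("ijkl,kl->ij", windows, kernel)


# Toy "spectrogram": 23 channels x 64 frames with a pure temporal modulation.
omega = 2 * np.pi / 8
frames = np.arange(64)
spec = np.tile(np.cos(omega * frames), (23, 1))

response = correlate_same(spec, gabor_filter_2d(0.0, omega))
phase_sensitive = response.real      # changes when the pattern is shifted in time
phase_neglecting = np.abs(response)  # roughly constant away from the edges
```

Shifting the modulation pattern by a quarter period changes the real-part feature substantially, while the magnitude feature away from the edges stays nearly the same, which is one way to read "neglecting phase information".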

Acknowledgment

This work was partially funded by the Deutscher Akademischer Austausch Dienst (DAAD) through a postdoctoral fellowship granted to Bernd Meyer. Further support was provided to Suman Ravuri by the National Defense Science and Engineering Graduate Fellowship (NDSEG), and to Nelson Morgan by Cisco Systems.

URL: http://www.icsi.berkeley.edu/pubs/speech/comparingdifferent11.pdf
Bibliographic Notes

Proceedings of the 12th Annual Conference of the International Speech Communication Association (Interspeech 2011), Florence, Italy, pp. 1269-1272

Abbreviated Authors

B. T. Meyer, S. V. Ravuri, M. R. Schaedler, and N. Morgan

ICSI Research Group

Speech

ICSI Publication Type

Article in conference proceedings