Blind Non-Intrusive Speech Intelligibility Prediction Using Twin-HMMs

Title: Blind Non-Intrusive Speech Intelligibility Prediction Using Twin-HMMs
Publication Type: Conference Paper
Year of Publication: 2016
Authors: Karbasi, M., Abdelaziz A. Hussen, Meutzner H., & Kolossa D.
Published in: Proceedings of Interspeech 2016
Abstract

Automatic prediction of speech intelligibility is highly desirable in the speech research community, since listening tests are time-consuming and cannot be used online. Most of the available objective speech intelligibility measures are intrusive methods: they require a clean reference signal in addition to the corresponding noisy or processed signal at hand. To overcome the problem of predicting speech intelligibility in the absence of a clean reference signal, we proposed in [1] a recognition/synthesis framework called the twin hidden Markov model (THMM) for synthesizing the clean features required by an intrusive intelligibility prediction method. This framework can predict speech intelligibility as well as well-known intrusive methods such as the short-time objective intelligibility (STOI) measure. The original THMM, however, requires the correct transcription for synthesizing the clean reference features, which is not always available. In this paper, we go one step further and investigate the use of the recognized transcription instead of the oracle transcription, in order to obtain a more widely applicable speech intelligibility prediction. We show that the output of the newly proposed blind approach is highly correlated with human speech recognition results, collected via crowdsourcing in different noise conditions.

Acknowledgment

This research has received funding from the European Union’s Seventh Framework Programme FP7/2007-2013/ under REA grant agreement n° [317521]. The authors would like to thank Jon Barker for providing a noisy version of the Grid database with comprehensive listening test results.

ICSI Research Group: Speech