Broad Phonetic Classes for Speaker Verification with Noisy, Large-Scale Data

Title: Broad Phonetic Classes for Speaker Verification with Noisy, Large-Scale Data
Publication Type: Technical Report
Year of Publication: 2014
Authors: Lei, H., & Mirghafori, N.
Other Numbers: 3689
Abstract

While the incorporation of phonetic information has contributed to speaker verification improvements for lexically unconstrained speech in the past, improvements have not been widely observed using the state-of-the-art i-vector system, which typically performs best using a "bag-of-frames" approach. This work explores ways to incorporate Broad Phonetic Class (BPC) information for the i-vector system with noisy speech data that is not lexically constrained. Different approaches for combining the BPCs have been examined. Results suggest that, through parallelization and combination strategies, BPCs may contribute to roughly a 13% improvement over an i-vector baseline system. However, confounding factors such as increased parameter size, use of noise-generated speech data, and the advantage of combination strategies are potential caveats to attributing the improvement to the discriminating power of BPCs alone.

Acknowledgment

This work was funded by Air Force Research Laboratory (AFRL) award FA8750-12-1-0016. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the view of the AFRL.

URL: https://www.icsi.berkeley.edu/pubs/techreports/TR-14-002.pdf
Bibliographic Notes

ICSI Technical Report TR-14-001

Abbreviated Authors

H. Lei and N. Mirghafori

ICSI Research Group

Speech

ICSI Publication Type

Technical Report