Event

 
 

A Study of the Extended Baum-Welch Algorithm and Conditional Maximum Likelihood Estimation Using Hidden Markov Models

Steve Wegmann

Nuance

Friday, August 22, 2008
2:00 pm

Maximum mutual information (MMI), a special case of conditional maximum likelihood, is an estimation technique that has been applied successfully to speech recognition problems using hidden Markov models for many years: at first on small tasks using multinomial models, later on small tasks using multivariate normal models, and now on very large tasks using mixtures of multivariate normal models. Although MMI has been extremely successful at reducing the word error rate on difficult speech recognition tasks, the current MMI machinery, the Extended Baum-Welch algorithm, is not as well understood as maximum likelihood estimation using the Baum-Welch algorithm. Previous work has concentrated on showing that the conditional likelihood increases with each iteration of Extended Baum-Welch, but it has largely ignored questions such as: what happens to the model parameters during repeated iterations of Extended Baum-Welch? In this talk, we examine this and related questions by running hundreds of iterations of Extended Baum-Welch on a standard Wall Street Journal task. We also report on preliminary work that attempts to identify why Extended Baum-Welch is so successful at reducing the error rate.

---- bio ----

Steven Wegmann is a Principal Research Scientist at Nuance Communications, where he studies the mathematical and statistical properties of algorithms used in speech recognition and heads a small group of acoustic modelers.

Steven came to Nuance via its acquisition of VoiceSignal Technologies. At VoiceSignal he was the lead acoustic modeler in a small group that developed models for speaker-independent name and digit recognition products in more than twenty languages; these products shipped in hundreds of millions of cell phones. This group also developed models for discrete and continuous dictation, mobile search, and HMM-based synthesis products.

Before joining VoiceSignal, and before a brief post-acquisition stint at Lernout and Hauspie, Steven was a member of the research department at Dragon Systems, where he was a key contributor to a variety of government-sponsored research projects. Earlier in his career, he was a mathematician who specialized in algebraic topology. He obtained his Ph.D. in mathematics at the University of Warwick while he was a Marshall Scholar.

 
Copyright © 2005 International Computer Science Institute. All Rights Reserved.