Event

Why Does Maximum Mutual Information Estimation Work So Well?

Steven Wegmann and Larry Gillick

Nuance Communications

Tuesday, January 20, 2009
12:30

Why does maximum mutual information estimation (MMI) consistently outperform maximum likelihood estimation (MLE) on speech recognition tasks that use hidden Markov models? A standard statistical argument shows that if our model assumptions were correct, MMI would not outperform MLE. The natural question, then, is which erroneous model assumptions MMI is compensating for. In this talk we attempt to answer this question using two methods. In the first, we simulate training and test data that depart from our models in controlled ways and examine recognition results before and after MMI. In the second, we assess how the data differ from our models by studying expected and observed properties of the scores that the models emit.

Copyright © 2005 International Computer Science Institute. All Rights Reserved.