Event

The Complexity of Phrase Alignment Models

John DeNero


Monday, April 28, 2008
12:30

Unsupervised models that align phrases instead of words are a natural progression in machine translation. However, while some effective word alignment models (Model 1, Model 2, and HMM) can be learned tractably with EM, phrase alignment models cannot. I'll talk about how to show that estimation and inference under these models are intractable. Then, I'll present two useful approximation techniques. First, I'll talk about how to cast phrase alignment search as an integer linear programming (ILP) problem and find the optimal alignment reliably and quickly with off-the-shelf ILP software. Second, we'll look at how to estimate translation probabilities under a phrase alignment model using a Gibbs sampling procedure. I'll present two new estimators of phrase alignment probabilities and show how they perform in a phrase-based translation pipeline.

 
Copyright © 2005 International Computer Science Institute. All Rights Reserved.