Event


Generative Models for Unsupervised Language Learning

Tom Griffiths

UCB Psychology Department

Tuesday, February 10, 2009
12:30

Learning a language requires making inferences about its components at many different levels, from sorting sounds into phonemes to recognizing which words are semantically related. Human learners typically make these inferences without direct instruction, performing a kind of unsupervised learning. In statistics, unsupervised learning is often treated as a problem of density estimation: a class of generative models is specified, and learning consists of estimating the parameters of a model in that class. From this perspective, understanding human learning reduces to the question of how to specify appropriate generative models for natural language. I will talk about recent work exploring two kinds of generative models for unsupervised language learning: nonparametric Bayesian models, which capture the statistics of natural languages in a way that helps identify latent structure, and topic models, which pick out the long-range correlations between word occurrences that are relevant to identifying semantic relatedness.

 