An Efficient Probabilistic Context-Free Parsing Algorithm That Computes Prefix Probabilities

TitleAn Efficient Probabilistic Context-Free Parsing Algorithm That Computes Prefix Probabilities
Publication TypeTechnical Report
Year of Publication1993
AuthorsStolcke, A.
Other Numbers853
Abstract

We describe an extension of Earley's parser for stochastic context-free grammars that computes the following quantities given a stochastic context-free grammar and an input string: a) probabilities of successive prefixes being generated by the grammar; b) probabilities of substrings being generated by the nonterminals, including the entire string being generated by the grammar; c) most likely (Viterbi) parse of the string; d) posterior expected number of applications of each grammar production, as required for reestimating rule probabilities. (a) and (b) are computed incrementally in a single left-to-right pass over the input. Our algorithm compares favorably to standard bottom-up parsing methods for SCFGs in that it works efficiently on sparse grammars by making use of Earley's top-down control structure. It can process any context-free rule format without conversion to some normal form, and combines computations for (a) through (d) in a single algorithm. Finally, the algorithm has simple extensions for processing partially bracketed inputs, and for finding partial parses and their likelihoods on ungrammatical inputs.

URLhttp://www.icsi.berkeley.edu/ftp/global/pub/techreports/1993/tr-93-065.pdf
Bibliographic Notes

ICSI Technical Report TR-93-065

Abbreviated Authors

A. Stolcke

ICSI Publication Type

Technical Report