Automatic Induction of Finite State Transducers for Simple Phonological Rules

Title: Automatic Induction of Finite State Transducers for Simple Phonological Rules
Publication Type: Technical Report
Year of Publication: 1994
Authors: Gildea, D., & Jurafsky, D.
Other Numbers: 922
Abstract

This paper presents a method for learning phonological rules from sample pairs of underlying and surface forms, without negative evidence. The learned rules are represented as finite state transducers that accept underlying forms as input and generate surface forms as output. The algorithm for learning them is an extension of the OSTIA algorithm for learning general subsequential finite state transducers. Although OSTIA is capable of learning arbitrary subsequential finite state transducers in the limit, large dictionaries of actual English pronunciations did not give enough samples to correctly induce phonological rules. We therefore augmented OSTIA with two kinds of knowledge specific to natural language phonology, representing a naturalness bias from "universal grammar." A bias that underlying phones are often realized as phonetically similar or identical surface phones was implemented by using alignment information between the underlying and surface strings. A bias that phonological rules apply across natural phonological classes was implemented by learning decision trees based on phonetic features at each state of the transducer. These additions helped in learning more compact, accurate, and general transducers than the unmodified OSTIA algorithm. An implementation of the algorithm successfully learns a number of English postlexical rules, including flapping, t-insertion, and t-deletion.
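
As an illustration of the representation described in the abstract, the sketch below hand-codes a three-state subsequential transducer for a deliberately simplified flapping rule: /t/ surfaces as a flap between a stressed and an unstressed vowel. It is not the learning algorithm from the report and is not taken from it. The phone symbols, the stress-digit convention ("1" stressed, "0" unstressed), the flap symbol "dx", and the function names are all assumptions made for this example.

```python
# Hypothetical, hand-written transducer; the report induces such machines from data.
# Assumed convention: vowels carry a stress digit, e.g. "ah1" (stressed), "er0" (unstressed).

def is_stressed_vowel(sym):
    return sym.endswith("1")

def is_unstressed_vowel(sym):
    return sym.endswith("0")

def flap(underlying):
    """Map a list of underlying phones to surface phones.

    States: 0 = default, 1 = just read a stressed vowel,
    2 = read stressed vowel + "t" (output of "t" is delayed).
    """
    state = 0
    surface = []
    for sym in underlying:
        if state == 2:
            if is_unstressed_vowel(sym):
                # Context satisfied: emit the flap instead of the pending "t".
                surface += ["dx", sym]
                state = 0
                continue
            # Context failed: emit the pending "t" unchanged, then handle sym normally.
            surface.append("t")
            state = 0
        if state == 1 and sym == "t":
            # Delay output until the next symbol decides between "t" and "dx".
            state = 2
            continue
        surface.append(sym)
        state = 1 if is_stressed_vowel(sym) else 0
    if state == 2:
        # State-final output: flush a word-final pending "t".
        surface.append("t")
    return surface

print(flap(["b", "ah1", "t", "er0"]))  # "butter"-like form
print(flap(["ah0", "t", "ae1", "k"]))  # "attack"-like form
```

The delayed output in state 2 and the flush at the end of the input are what make the machine subsequential: it stays deterministic on the input while still conditioning the realization of /t/ on the following phone.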

URL: http://www.icsi.berkeley.edu/ftp/global/pub/techreports/1994/tr-94-052.pdf
Bibliographic Notes

ICSI Technical Report TR-94-052

Abbreviated Authors

D. Gildea and D. Jurafsky
