Event

 
 

Articulatory inversion: implementing improvements and exploring implications for perceptual studies

Pierre Divenyi and Adam Lammert

Speech and Hearing Research Lab, VA Martinez

Monday, July 28, 2008
12:30

Over the last quarter century, a number of investigators have sought to decompose the speech signal into functions of the articulatory structures that generate it. Our work is aimed, in part, at recovering those underlying articulatory gesture functions. The approach consists of building a frame-by-frame training codebook of conditional probabilities: the probability that a given gesture has a given magnitude, given a particular acoustic pattern. The gestures underlying a test signal are then recovered by dynamic programming, which estimates the minimum-cost path through the conditional probabilities. The gestures used to build the codebook were derived from TaDA, an articulatory synthesizer developed at Haskins Laboratories. If, hypothetically, human listeners have a similar codebook at their disposal and can extract the gesture functions from the speech reaching their ears, then speech may be perceived by reconstituting the message from the slowly varying gesture functions rather than by phonemic segmentation of the acoustic signal. To illustrate this postulated perceptual process, we will show the results of a study in which the central portion of spondee words was excised and replaced by various fillers that retain different types and amounts of information from the excised speech. Confusions based on gesture distances between the presented and reported words possess interesting properties.
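
The recovery step described above can be pictured with a minimal sketch. It is an illustration under stated assumptions, not the speakers' implementation: the function name `recover_gesture_track`, the quantization of gesture magnitude into discrete levels, the use of negative log probability as the per-frame cost, and the `smooth_weight` penalty on frame-to-frame jumps are all hypothetical choices made here for the example. The abstract specifies only that a minimum-cost path is found by dynamic programming.

```python
import numpy as np


def recover_gesture_track(cond_prob, smooth_weight=1.0):
    """Return the minimum-cost sequence of magnitude levels for one gesture.

    cond_prob[t, k] is assumed to hold the codebook's conditional probability
    that the gesture has magnitude level k, given the acoustic pattern
    observed at frame t (hypothetical layout).
    """
    n_frames, n_levels = cond_prob.shape
    levels = np.arange(n_levels)

    # Assumed local cost: negative log of the conditional probability.
    local_cost = -np.log(np.clip(cond_prob, 1e-12, None))

    # Assumed transition cost: gestures vary slowly, so penalize large jumps.
    # jump_cost[i, j] is the cost of moving from level i to level j.
    jump_cost = smooth_weight * np.abs(levels[:, None] - levels[None, :])

    total = np.empty((n_frames, n_levels))
    back = np.zeros((n_frames, n_levels), dtype=int)
    total[0] = local_cost[0]

    # Forward pass: best cumulative cost of reaching each level at each frame.
    for t in range(1, n_frames):
        candidates = total[t - 1][:, None] + jump_cost
        back[t] = np.argmin(candidates, axis=0)
        total[t] = candidates[back[t], levels] + local_cost[t]

    # Backward pass: trace the minimum-cost path from the last frame.
    path = np.empty(n_frames, dtype=int)
    path[-1] = int(np.argmin(total[-1]))
    for t in range(n_frames - 1, 0, -1):
        path[t - 1] = back[t, path[t]]
    return path
```

With `smooth_weight` set to 0 the sketch reduces to picking the most probable level in each frame independently. A positive weight lets neighboring frames override an isolated outlier, which is one plausible way to favor slowly varying gesture functions.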

 
Copyright © 2005 International Computer Science Institute. All Rights Reserved.