The ICSI Meeting Corpus: Close-Talking and Far-Field, Multi-Channel Transcriptions for Speech and Language Researchers

TitleThe ICSI Meeting Corpus: Close-Talking and Far-Field, Multi-Channel Transcriptions for Speech and Language Researchers
Publication TypeConference Paper
Year of Publication2004
AuthorsEdwards, J.
Published inProceedings of the Workshop on Compiling and Processing Spoken Language Corpora at the Fourth International Conference on Language Resources and Evaluation (LREC 2004)
Page(s)8-11
Other Numbers486
Abstract

The recently-completed ICSI Meeting Corpus is available through the LDC. It consists of audio and transcripts of 75 research meetings, ranging in size from 3 to 10 people, with an average of 6 people. The meetings were recorded by means of both close-talking (headset or lapel) microphones and far-field (table-top) microphones. The close-talking microphones enable separation of each person's audible activities from those of every other participant. The far-field microphones provide a view of the meeting as a whole. The transcripts preserve words and other communicative phenomena, displayed in musical score format, time-synchronized to the digitized audio recordings. The corpus is intended as a resource for both speech researchers and language researchers. This paper describes the methods used to prepare the corpus, some interesting challenges and solutions, and the benefits of using both close-talking and far-field microphones.

URLhttp://www.icsi.berkeley.edu/ftp/global/pub/speech/papers/edwards-lrec2004.pdf
Bibliographic Notes

Proceedings of the Workshop on Compiling and Processing Spoken Language Corpora at the Fourth International Conference on Language Resources and Evaluation (LREC 2004), pp. 8-11

Abbreviated Authors

J. A. Edwards

ICSI Research Group

Speech

ICSI Publication Type

Article in conference proceedings