The ICSI Meeting Corpus: Close-Talking and Far-Field, Multi-Channel Transcriptions for Speech and Language Researchers
Title | The ICSI Meeting Corpus: Close-Talking and Far-Field, Multi-Channel Transcriptions for Speech and Language Researchers |
Publication Type | Conference Paper |
Year of Publication | 2004 |
Authors | Edwards, J. |
Published in | Proceedings of the Workshop on Compiling and Processing Spoken Language Corpora at the Fourth International Conference on Language Resources and Evaluation (LREC 2004) |
Page(s) | 8-11 |
Other Numbers | 486 |
Abstract | The recently completed ICSI Meeting Corpus is available through the LDC. It consists of audio and transcripts of 75 research meetings, ranging in size from 3 to 10 people, with an average of 6 people. The meetings were recorded by means of both close-talking (headset or lapel) microphones and far-field (table-top) microphones. The close-talking microphones enable separation of each person's audible activities from those of every other participant. The far-field microphones provide a view of the meeting as a whole. The transcripts preserve words and other communicative phenomena, displayed in musical score format, time-synchronized to the digitized audio recordings. The corpus is intended as a resource for both speech researchers and language researchers. This paper describes the methods used to prepare the corpus, some interesting challenges and solutions, and the benefits of using both close-talking and far-field microphones. |
URL | http://www.icsi.berkeley.edu/ftp/global/pub/speech/papers/edwards-lrec2004.pdf |
Bibliographic Notes | Proceedings of the Workshop on Compiling and Processing Spoken Language Corpora at the Fourth International Conference on Language Resources and Evaluation (LREC 2004), pp. 8-11 |
Abbreviated Authors | J. A. Edwards |
ICSI Research Group | Speech |
ICSI Publication Type | Article in conference proceedings |