Multimodal Addressee Detection in Multiparty Dialogue Systems
Title | Multimodal Addressee Detection in Multiparty Dialogue Systems |
Publication Type | Conference Paper |
Year of Publication | 2015 |
Authors | Tsai, T. J., Stolcke, A., & Slaney, M. |
Other Numbers | 3811 |
Abstract | Addressee detection answers the question, "Are you talking to me?" When multiple users interact with a dialogue system, it is important to know when a user is speaking to the computer and when he or she is speaking to another person. We approach this problem from a multimodal perspective, using lexical, acoustic, visual, dialog state, and beam-forming information. Using data from a multiparty dialogue system, we demonstrate the benefit of using multiple modalities over using a single modality. We also assess the relative importance of the various modalities in predicting the addressee. In our experiments, we find that acoustic features are by far the most important, that ASR and system-state information are useful, and that visual and beam-forming features provide little additional benefit. Our study suggests that acoustic, lexical, and system-state information are an effective, economical combination of modalities to use in addressee detection. |
URL | https://www.icsi.berkeley.edu/pubs/speech/mmaddressee15.pdf |
Bibliographic Notes | Proceedings of the 40th IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2015), Brisbane, Australia |
Abbreviated Authors | T. J. Tsai, A. Stolcke, and M. Slaney |
ICSI Research Group | Speech |
ICSI Publication Type | Article in conference proceedings |