Speech and Human-Computer Interaction in BID
John Canny
UC Berkeley
Tuesday, July 07, 2009
12:30
At the Berkeley Institute of Design, we have active research projects in mobile interaction, smart spaces, and technology for developing
regions. All of these areas favor speech and natural language interfaces. This talk will cover several of our projects. First, in
embedded speech, we have been porting an open-source large-vocabulary speech recognizer (CMU Sphinx 3) to mobile platforms.
We have a prototype that runs on most ARM-based smartphones. Second, our lab uses a voice interface for lighting control that learns users' lighting preferences and the phrases they use to describe them. Third, the lab has a coarse-resolution distributed microphone array for speaker
localization and noise cancellation. Fourth, we have developed a pronunciation feedback system for learners of English as part of our
MILLEE project (English-as-a-second-language learning in India and for migrant communities in the US). Fifth, we are developing voice-based conversational agents for language learning. Finally, we are exploring the general problem of human learning of complex action, using speech as an example. We treat this as a massive model selection problem and use a simple cross-validation criterion to enable a
fraction of the potential models during training. This model assumes that learning is online, i.e., that it uses a single-pass training phase over a
massive data set.