Using a GPU, Online Diarization = Offline Diarization

Title: Using a GPU, Online Diarization = Offline Diarization
Publication Type: Technical Report
Year of Publication: 2012
Authors: Friedland, G.
Other Numbers: 3233
Abstract

This article presents a low-latency, online speaker diarization system ("who is speaking now?") based on the repeated execution of a GPU-optimized, highly efficient offline diarization system ("who spoke when"). The system fulfills all requirements of the diarization task, i.e., it does not require any a priori information about the input, including specific speaker models. In contrast to earlier attempts at online diarization, the system achieves similar accuracy to the underlying offline system and does not require explicit detection of new speakers. Using GPUs, online diarization has become a side-effect of offline diarization, obsoleting the requirement for specialized online diarization systems.
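The core idea in the abstract, answering "who is speaking now?" by re-running an offline "who spoke when" pass over all audio received so far, can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: `toy_offline_diarize` is a hypothetical stand-in (a greedy clustering of 1-D feature values), not the GPU-optimized system from the report, and `online_from_offline` is a hypothetical wrapper name.

```python
from typing import Callable, Iterable, Iterator, List, Sequence


def toy_offline_diarize(features: Sequence[float], threshold: float = 1.0) -> List[int]:
    """Hypothetical stand-in for an offline diarizer (NOT the report's system).

    Greedy 1-D clustering: a frame joins the nearest existing cluster centre
    if it lies within `threshold`; otherwise it starts a new cluster.
    Returns one speaker label per frame, with no prior speaker models.
    """
    centres: List[float] = []
    labels: List[int] = []
    for x in features:
        if centres:
            nearest = min(range(len(centres)), key=lambda i: abs(centres[i] - x))
            if abs(centres[nearest] - x) <= threshold:
                labels.append(nearest)
                continue
        # No centre is close enough: treat this frame as a new speaker.
        centres.append(x)
        labels.append(len(centres) - 1)
    return labels


def online_from_offline(
    chunks: Iterable[Sequence[float]],
    offline_diarize: Callable[[Sequence[float]], List[int]] = toy_offline_diarize,
) -> Iterator[int]:
    """Yield the current speaker label after each incoming chunk.

    Each step runs a full offline pass over everything heard so far and
    reports the label assigned to the most recent frame.
    """
    history: List[float] = []
    for chunk in chunks:
        history.extend(chunk)
        if not history:
            continue  # nothing received yet, so there is nothing to label
        labels = offline_diarize(history)
        yield labels[-1]


if __name__ == "__main__":
    stream = [[0.0, 0.1], [5.0, 5.2], [0.2], [9.9]]
    print(list(online_from_offline(stream)))
```

In this sketch a new speaker appears simply because the offline pass forms a new cluster, so no separate new-speaker detector is involved. The cost of each step grows with the length of the history, which is presumably why a very fast offline system is a precondition for the approach. The toy clusterer numbers speakers in order of first appearance, so its labels stay stable between runs; how the actual system keeps labels consistent across repeated executions is not stated in this record.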

Acknowledgment

This research is partly supported by Microsoft (Award #024263) and Intel (Award #024894) funding, by matching funding by U.C. Discovery (Award #DIG07-10227), and a CISCO URP grant. I want to thank the following persons for their support in writing this article: Adam Janin, Luke Gottlieb, Carlos Vaquero, Henry Cook, Mary Knox, and Nelson Morgan.

URL: http://www.icsi.berkeley.edu/pubs/techreports/TR-12-004.pdf
Bibliographic Notes

ICSI Technical Report TR-12-004

Abbreviated Authors

G. Friedland

ICSI Research Group

Speech

ICSI Publication Type

Technical Report