Acoustic Super Models for Large Scale Video Event Detection

Title: Acoustic Super Models for Large Scale Video Event Detection
Publication Type: Conference Paper
Year of Publication: 2011
Authors: Mertens, R., Lei, H., Gottlieb, L., Friedland, G., & Divakaran, A.
Other Numbers: 3202
Abstract

Given the exponential growth of videos published on the Internet, mechanisms for clustering, searching, and browsing large numbers of videos have become a major research area. More importantly, there is a demand for event detectors that go beyond the simple finding of objects but rather detect more abstract concepts, such as “feeding an animal” or a “wedding ceremony”. This article presents an approach for event classification that enables searching for arbitrary events, including more abstract concepts, in found video collections based on the analysis of the audio track. The approach does not rely on speech processing and is language-independent; instead, it generates models for a set of example query videos using a mixture of two types of audio features: Linear-Frequency Cepstral Coefficients and Modulation Spectrogram Features. This approach can be used in complement with video analysis and requires no domain-specific tagging. Application of the approach to the TRECVid MED 2011 development set, which consists of more than 4000 random “wild” videos from the Internet, has shown a

Acknowledgment

This work has been supported by the Intelligence Advanced Research Projects Activity (IARPA) via Department of Interior National Business Center contract number D11-PC20066. The U.S. Government is authorized to reproduce and distribute reprints for Governmental purposes notwithstanding any copyright annotation thereon. The views and conclusions contained herein are those of the authors and should not be interpreted as necessarily representing the official policies or endorsements, either expressed or implied, of IARPA, DOI/NBC, or the U.S. Government.

URL: http://www.icsi.berkeley.edu/pubs/speech/ICSI_acousticsuper11.pdf
Bibliographic Notes

Proceedings of the ACM International Workshop on Events in Multimedia (EiMM11), Scottsdale, Arizona

Abbreviated Authors

R. Mertens, H. Lei, L. Gottlieb, G. Friedland, and A. Divakaran

ICSI Research Group

Audio and Multimedia

ICSI Publication Type

Article in conference proceedings