Supervised Deep Hashing for Highly Efficient Cover Song Detection
Title | Supervised Deep Hashing for Highly Efficient Cover Song Detection |
Publication Type | Conference Paper |
Year of Publication | 2019 |
Authors | Ye, Z., Choi J., & Friedland G. |
Published in | Proceedings of the 2019 IEEE Conference on Multimedia Information Processing and Retrieval (MIPR) |
Abstract | This paper proposes a supervised deep hashing approach for highly efficient and effective cover song detection. Our system consists of two identical sub-neural networks, each having a hash layer that learns a binary representation of the input audio in the form of spectral features. A loss function joins the two sub-network outputs by minimizing the Hamming distance between a pair of audio files covering the same musical work. We further enhance system performance with loudness embedding, beat synchronization, and early fusion of the input audio features. The resulting 128-bit hash achieves state-of-the-art mean pairwise accuracy. This system demonstrates the feasibility of memory-efficient, real-time cover song detection with satisfactory accuracy at large scale. |
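The abstract describes a Siamese architecture whose hash layer binarizes spectral features and whose loss pulls cover pairs together in Hamming space. A minimal NumPy sketch of that idea is below; it is illustrative only, not the paper's implementation, and the random projection `W`, the `tanh` relaxation, and the contrastive margin are all assumptions not taken from the paper.

```python
import numpy as np

def hash_codes(features, W):
    # Hash layer sketch: linear projection, tanh relaxation, then sign
    # to obtain a ±1 binary code (the paper's learned layer is assumed
    # to be replaced here by a fixed random projection W).
    return np.sign(np.tanh(features @ W))

def pairwise_hamming_loss(h1, h2, is_cover, margin=64.0):
    # For ±1 codes of length n, Hamming distance = (n - dot(h1, h2)) / 2.
    d = (h1.shape[0] - float(h1 @ h2)) / 2.0
    # Cover pairs: minimize distance. Non-covers: push beyond a margin
    # (a standard contrastive formulation, assumed for illustration).
    return d if is_cover else max(0.0, margin - d)

rng = np.random.default_rng(0)
W = rng.standard_normal((40, 128))        # 40-dim spectral feature -> 128-bit hash
a = rng.standard_normal(40)               # one rendition of a work
b = a + 0.05 * rng.standard_normal(40)    # a slightly perturbed "cover"
h_a, h_b = hash_codes(a, W), hash_codes(b, W)
loss = pairwise_hamming_loss(h_a, h_b, is_cover=True)
```

At query time, only the 128-bit codes need to be stored and compared, which is what makes the approach memory-efficient and fast at large scale.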
Acknowledgment | This project is partially funded by an AWS Research Grant and a collaborative Strategic Initiative grant led by Lawrence Livermore National Laboratory (U.S. Dept. of Energy contract DE-AC52-07NA27344). Any findings and conclusions are the authors' and do not necessarily reflect the views of the funders. |
URL | https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=8695391 |
ICSI Research Group | Audio and Multimedia |