Automatic Music Summarization Method by using the Bit Error Rate of the Audio Fingerprint and a System thereof

Kim, Minseong; Park, Mansoo; Kim, Hoirin

doi:10.9717/kmms.2013.16.4.453

Journal of Korea Multimedia Society (한국멀티미디어학회논문지)

Volume 16, Issue 4 / Pages 453-463 / 2013 / 1229-7771 (pISSN) / 2384-0102 (eISSN)

Korea Multimedia Society (한국멀티미디어학회)

오디오 핑거프린트의 비트에러율을 이용한 자동 음악 요약 기법 및 시스템

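Kim, Minseong (Defense Agency for Technology and Quality);
Park, Mansoo (Konan Technology);
Kim, Hoirin (Korea Advanced Institute of Science and Technology)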

Received: 2013.03.10
Accepted: 2013.04.09
Published: 2013.04.30

https://doi.org/10.9717/kmms.2013.16.4.453

Abstract

In this paper, we present an effective method and system for music summarization that automatically extracts the chorus portion of a piece of music. Music summarization technology is useful for browsing songs and for generating sample clips for an online music service. Conventional automatic music summarization methods rely on a 2-dimensional similarity matrix, statistical models, or clustering techniques. In contrast, the proposed method extracts the music summary by calculating the bit error rate (BER) between audio fingerprint blocks extracted from a song. As a result, it can directly use a large audio fingerprint database that has already been built for a music retrieval solution. This shows the possibility of developing a variety of new algorithms and solutions on top of an audio fingerprint database. In addition, experiments show that the proposed method captures the chorus of a song more effectively than a conventional method.

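This paper addresses a method and a system for automatically extracting the chorus section of a piece of music. Music summarization technology that automatically extracts the chorus can make it easier to find a particular song in a large music database, and it can be used to generate sample clips for an online streaming service. To this end, existing algorithms have been built on 2-dimensional similarity matrices, probabilistic models, neural network models, tempo feature vectors, clustering techniques, and similar tools. In this paper, the audio fingerprint of a song is extracted, and the music summary is then derived from the bit error rate between pairs of audio fingerprint segments within the song. If the audio fingerprints used by a music retrieval solution already exist in a database, they can be loaded directly and the bit error rate computed from them to extract the summary. Because an existing database can be used as is, without modification, this approach demonstrates the potential for a variety of algorithms and solutions built on music databases. The experiments also showed that the method extracts the chorus considerably better than the conventional approach.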
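The abstract does not spell out how the BER is turned into a summary, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes each sub-fingerprint is a 32-bit integer, that a "block" is a fixed-length run of sub-fingerprints, and that the chorus candidate is the block that recurs most often elsewhere in the song. The function names, the `hop` parameter, and the 0.35 threshold are illustrative assumptions.

```python
from typing import List, Optional, Sequence, Tuple


def bit_error_rate(block_a: Sequence[int], block_b: Sequence[int], bits: int = 32) -> float:
    """Fraction of differing bits between two equal-length fingerprint blocks."""
    if len(block_a) != len(block_b) or len(block_a) == 0:
        raise ValueError("blocks must be non-empty and of equal length")
    mask = (1 << bits) - 1
    # XOR leaves a 1 wherever the two sub-fingerprints disagree.
    errors = sum(bin((a ^ b) & mask).count("1") for a, b in zip(block_a, block_b))
    return errors / (bits * len(block_a))


def find_repeated_block(
    fingerprint: Sequence[int],
    block_len: int,
    hop: int = 1,
    threshold: float = 0.35,  # assumed match threshold, not taken from the paper
) -> Tuple[Optional[int], int]:
    """Return (start index, repeat count) of the block that recurs most often.

    A block "recurs" when a non-overlapping block has a BER below `threshold`.
    Returns (None, 0) when no block repeats.
    """
    starts = range(0, len(fingerprint) - block_len + 1, hop)
    blocks: List[Tuple[int, Sequence[int]]] = [
        (s, fingerprint[s:s + block_len]) for s in starts
    ]
    best_start, best_count = None, 0
    for s, blk in blocks:
        count = sum(
            1
            for t, other in blocks
            if abs(s - t) >= block_len and bit_error_rate(blk, other) < threshold
        )
        if count > best_count:  # strict comparison keeps the earliest best block
            best_start, best_count = s, count
    return best_start, best_count


# Toy "song": verse, chorus, another verse, chorus again (4 sub-fingerprints each).
verse_1 = [0x00000000] * 4
chorus = [0xFFFFFFFF] * 4
verse_2 = [0x0F0F0F0F] * 4
song = verse_1 + chorus + verse_2 + chorus

print(find_repeated_block(song, block_len=4, hop=4))
```

Because the comparison works only on stored fingerprint bits, a sketch like this could in principle run against fingerprints loaded from an existing retrieval database without touching the audio. The pairwise scan is quadratic in the number of blocks, which is acceptable for a single song but would need pruning for long recordings.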
