A Design of Matching Engine for a Practical Query-by-Singing/Humming System with Polyphonic Recordings

Lee, Seok-Pil;Yoo, Hoon;Jang, Dalwon;

doi:10.3837/tiis.2014.02.0024

KSII Transactions on Internet and Information Systems (TIIS)

Volume 8, Issue 2 / Pages 723-736 / 2014 / 1976-7277 (pISSN) / 1976-7277 (eISSN)

Korean Society for Internet Information (한국인터넷정보학회)

Lee, Seok-Pil (Department of Digital Media Technology, Sangmyung University) ;
Yoo, Hoon (Department of Digital Media Technology, Sangmyung University) ;
Jang, Dalwon (Korea Electronic Technology Institute)

Received: 2014.12.20
Reviewed: 2014.01.21
Published: 2014.02.27

https://doi.org/10.3837/tiis.2014.02.0024

Abstract

This paper proposes a matching engine for a query-by-singing/humming (QbSH) system whose database is built from polyphonic music files such as MP3 files. Pitch sequences extracted from polyphonic recordings may be distorted, so we use a chroma-scale representation, pre-processing, compensation, and asymmetric dynamic time warping to reduce the influence of these distortions. In an experiment with a 28-hour music database, our QbSH system based on a polyphonic database achieves a mean reciprocal rank (MRR) of 0.725, which is very promising in comparison with a published QbSH system based on a monophonic database. The matching engine can also be used for a QbSH system based on a MIDI database, and its performance in that setting was verified in MIREX 2011.
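
As a rough illustration of the terms used in the abstract, the sketch below folds pitch values onto a chroma (one-octave) scale, compares a query with a reference using a simple asymmetric dynamic time warping, and computes MRR from a list of ranks. It is a minimal sketch under stated assumptions, not the engine described in the paper: the function names, the MIDI-note pitch encoding, the circular semitone distance, and the DTW step pattern are all hypothetical choices made for this example, and the paper's pre-processing and compensation steps are not shown.

```python
import math


def to_chroma(midi_pitches):
    """Fold MIDI note numbers (possibly fractional) onto one octave, [0, 12)."""
    return [p % 12.0 for p in midi_pitches]


def chroma_dist(a, b):
    """Circular distance between two chroma values, in semitones (0 to 6)."""
    d = abs(a - b) % 12.0
    return min(d, 12.0 - d)


def asymmetric_dtw(query, reference):
    """DTW cost in which every query frame is consumed exactly once.

    Allowed predecessors of cell (i, j) are (i-1, j), (i-1, j-1) and
    (i-1, j-2). This step pattern is an assumption for illustration only.
    """
    n, m = len(query), len(reference)
    cost = [[math.inf] * m for _ in range(n)]
    for j in range(m):  # the match may start anywhere in the reference
        cost[0][j] = chroma_dist(query[0], reference[j])
    for i in range(1, n):
        for j in range(m):
            best = cost[i - 1][j]
            if j >= 1:
                best = min(best, cost[i - 1][j - 1])
            if j >= 2:
                best = min(best, cost[i - 1][j - 2])
            cost[i][j] = best + chroma_dist(query[i], reference[j])
    # The match may end anywhere; normalise by the query length.
    return min(cost[n - 1]) / n


def mean_reciprocal_rank(ranks):
    """ranks: 1-based rank of the correct song per query, None if not found."""
    return sum(1.0 / r for r in ranks if r is not None) / len(ranks)


if __name__ == "__main__":
    query = to_chroma([60, 62, 64])              # sung one octave lower
    reference = to_chroma([55, 72, 74, 76, 50])  # melody embedded in a song
    print(asymmetric_dtw(query, reference))
    print(mean_reciprocal_rank([1, 2, None, 4]))
```

In this toy example the query is sung an octave below the reference melody, and the chroma folding removes that octave difference before matching. Octave errors are one kind of distortion that a chroma-scale representation can absorb.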
