http://dx.doi.org/10.3837/tiis.2014.02.0024

A Design of Matching Engine for a Practical Query-by-Singing/Humming System with Polyphonic Recordings  

Lee, Seok-Pil (Department of Digital Media Technology, Sangmyung University)
Yoo, Hoon (Department of Digital Media Technology, Sangmyung University)
Jang, Dalwon (Korea Electronics Technology Institute)
Publication Information
KSII Transactions on Internet and Information Systems (TIIS) / v.8, no.2, 2014, pp. 723-736
Abstract
This paper proposes a matching engine for a query-by-singing/humming (QbSH) system that works with polyphonic music files such as MP3 files. Pitch sequences extracted from polyphonic recordings may be distorted, so the engine uses a chroma-scale representation, pre-processing, compensation, and asymmetric dynamic time warping to reduce the influence of these distortions. In an experiment with a 28-hour music database, the QbSH system based on a polyphonic database achieves a mean reciprocal rank (MRR) of 0.725, which is very promising in comparison with the published QbSH system based on a monophonic database. The matching engine can also be used for a QbSH system based on a MIDI database, and its performance in that setting was verified at MIREX 2011.
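The ideas named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the circular chroma distance, the step pattern (each query frame used exactly once, while the reference may repeat, advance by one, or skip one frame), and the free start and end points in the reference are all assumptions chosen for illustration. The pre-processing and compensation steps are not modelled. The last function shows how an MRR figure such as 0.725 is computed from per-query ranks.

```python
import math


def to_chroma(pitch):
    """Fold a pitch in semitones (e.g. a MIDI note number) into one octave [0, 12)."""
    return pitch % 12.0


def chroma_dist(a, b):
    """Circular distance between two chroma values, in semitones (0 to 6)."""
    d = abs(a - b) % 12.0
    return min(d, 12.0 - d)


def asymmetric_dtw(query, reference):
    """Average per-frame cost of the best alignment of `query` inside `reference`.

    Assumed step pattern: the query always advances by one frame, while the
    reference index may stay, advance by one, or advance by two.
    """
    if not query or not reference:
        raise ValueError("both sequences must be non-empty")
    m = len(reference)
    # Free start: the query may begin at any reference frame.
    prev = [chroma_dist(query[0], r) for r in reference]
    for q in query[1:]:
        cur = [math.inf] * m
        for j in range(m):
            best = prev[j]                      # reference frame repeated
            if j >= 1:
                best = min(best, prev[j - 1])   # reference advances by one
            if j >= 2:
                best = min(best, prev[j - 2])   # reference skips one frame
            cur[j] = best + chroma_dist(q, reference[j])
        prev = cur
    # Free end: the query may stop at any reference frame.
    # Every query frame is counted once, so dividing by len(query) normalizes the cost.
    return min(prev) / len(query)


def mean_reciprocal_rank(ranks):
    """MRR from 1-based ranks of the correct song; None means it was not retrieved."""
    return sum(0.0 if r is None else 1.0 / r for r in ranks) / len(ranks)


# Hypothetical data: a reference melody and a query sung one octave higher.
reference = [to_chroma(p) for p in [60, 62, 64, 65, 67, 69, 71, 72]]
query = [to_chroma(p) for p in [76, 77, 79, 81]]
print(asymmetric_dtw(query, reference))
print(mean_reciprocal_rank([1, 2, 4]))
```

Folding pitches into one octave makes the octave-shifted query match its reference segment at zero cost, which is one reason a chroma-scale representation tolerates octave errors in pitch sequences extracted from polyphonic audio.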
Keywords
Query-by-singing/humming; music information retrieval; matching engine; dynamic time warping; pitch sequence; MIREX