Development of Audio Melody Extraction and Matching Engine for MIREX 2011 tasks

  • Song, Chai-Jong (DigitalMedia R&D Center, Broadcasting and ICT R&D Division, KETI) ;
  • Jang, Dalwon (DigitalMedia R&D Center, Broadcasting and ICT R&D Division, KETI) ;
  • Lee, Seok-Pil (DigitalMedia R&D Center, Broadcasting and ICT R&D Division, KETI) ;
  • Park, Hochong (Dept. of Electronics Engineering, Kwangwoon University)
  • 송재종 (전자부품연구원 정보통신미디어본부 디지털미디어연구센터) ;
  • 장달원 (전자부품연구원 정보통신미디어본부 디지털미디어연구센터) ;
  • 이석필 (전자부품연구원 정보통신미디어본부 디지털미디어연구센터) ;
  • 박호종 (광운대학교 전자공학과)
  • Published : 2012.07.05

Abstract

In this paper, we propose a method for extracting the predominant melody of polyphonic music based on harmonic structure. Harmonic structure is an important feature of a monophonic signal: its spectrum has peaks at integer multiples of the fundamental frequency. We extract all fundamental frequency candidates contained in the polyphonic signal by checking whether each satisfies the required conditions of harmonic structure. We then group the harmonic peaks corresponding to each extracted fundamental frequency, calculate the average harmonic energy of each group, and rank the candidates accordingly. Pitch tracking based on the rank and continuity of the extracted fundamental frequencies then determines the predominant melody. For the query by singing/humming (QbSH) task, we propose a matching engine based on Dynamic Time Warping (DTW). Our system reduces false alarms by combining the distances of multiple DTW processes. To improve performance, we introduce the asymmetric sense, pitch level compensation, and distance intransitiveness into the DTW algorithm.
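
As an illustration of the matching idea, the following is a minimal sketch of textbook DTW between two pitch sequences, preceded by a simple form of pitch level compensation. It is not the authors' engine: the abstract does not specify how the asymmetric sense, distance intransitiveness, or the combination of multiple DTW processes is implemented, so none of these are reproduced here. The function names, the use of semitone values, and the choice of mean subtraction as the compensation step are all assumptions made for this example.

```python
import math


def compensate_pitch_level(pitches):
    """Shift a non-empty pitch sequence (assumed in semitones) to zero mean.

    Assumption: mean subtraction stands in for pitch level compensation,
    so that a query sung in a different key can still match.
    """
    mean = sum(pitches) / len(pitches)
    return [p - mean for p in pitches]


def dtw_distance(query, reference):
    """Textbook DTW: absolute-difference cost, steps (1,0), (0,1), (1,1)."""
    n, m = len(query), len(reference)
    # cost[i][j] = best accumulated cost aligning query[:i] with reference[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(query[i - 1] - reference[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # query frame repeated against reference
                cost[i][j - 1],      # reference frame repeated against query
                cost[i - 1][j - 1],  # both sequences advance
            )
    return cost[n][m]


def match(query, reference):
    """Hypothetical helper: DTW distance after level compensation."""
    return dtw_distance(
        compensate_pitch_level(query), compensate_pitch_level(reference)
    )


if __name__ == "__main__":
    reference = [60, 62, 64, 65, 67]         # melody as MIDI note numbers
    transposed = [p + 3 for p in reference]  # same melody, three semitones up
    print(dtw_distance(transposed, reference))  # penalised by the key change
    print(match(transposed, reference))         # near zero after compensation
```

In this sketch, a pure transposition is removed entirely by the compensation step, while tempo differences are absorbed by the warping path. A query that is both transposed and time-stretched generally keeps a small residual distance, because stretching changes the sequence mean.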

Keywords