• Title/Summary/Keyword: Polyphonic Music

Search Results: 17

Tempo Detection of Polyphonic Music Signal (다성 음악 신호의 템포 검출 기술)

  • Lee, Donggyu; Kim, Kijun; Park, Hochong
    • Proceedings of the Korean Society of Broadcast Engineers Conference / 2012.07a / pp.167-170 / 2012
  • In this paper, we propose a method for detecting the tempo pair of a polyphonic music signal using a meter classification method. Tempo detection consists of extracting note onsets to capture the periodic flow of the music, and then converting that period into a tempo. The proposed technique extracts tempo candidates in multiple relationships that are presumed to be the tempo, classifies the candidates by meter, and detects the final tempo pair by considering the speed of the piece. We confirmed that the proposed method detects the tempo pair with high accuracy.

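
The onset-period-to-tempo step described in the abstract can be sketched as follows; this is a minimal illustration under stated assumptions (onset times already extracted, given in seconds), with hypothetical function names, not the authors' implementation:

```python
# Illustrative sketch, not the paper's code: estimate a base tempo from
# note-onset times and return the multiple-related tempo candidate pair.

def period_to_bpm(period_s: float) -> float:
    """Convert a beat period in seconds to beats per minute."""
    return 60.0 / period_s

def tempo_pair_candidates(onsets):
    """Take the median inter-onset interval as the beat period and
    return the (tempo, double-tempo) candidate pair."""
    intervals = sorted(b - a for a, b in zip(onsets, onsets[1:]))
    period = intervals[len(intervals) // 2]
    base = period_to_bpm(period)
    return base, 2.0 * base
```

The paper's meter classification step, which chooses the final pair among such multiple-related candidates, is omitted here.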

Single Channel Polyphonic Music Separation Using Sparseness and Overlapping NMF (Overlapping NMF와 Sparseness를 이용한 단일 채널 다성 음악의 음원 분리)

  • Kim, Min-Je; Choi, Seung-Jin
    • Proceedings of the Korean Information Science Society Conference / 2005.07b / pp.769-771 / 2005
  • In this paper, we present a method for separating musical instrument sound sources from their monaural mixture. We take the harmonic structure of music into account and use sparseness and the overlapping NMF [1] to select representative spectral basis vectors, which are then used to reconstruct the unmixed sounds. A spectral basis selection method is illustrated, and experimental results on monaural mixtures of voice/cello and trumpet/viola confirm the validity of the proposed method.

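
For readers unfamiliar with the underlying factorization, a minimal multiplicative-update NMF (standard Lee-Seung NMF, not the overlapping variant the paper uses) can be sketched like this; all names here are illustrative:

```python
import numpy as np

def nmf(V, rank, n_iter=200, eps=1e-9):
    """Plain multiplicative-update NMF: approximate a nonnegative
    matrix V as W @ H with W, H >= 0. A simplified stand-in for the
    overlapping NMF used in the paper."""
    rng = np.random.default_rng(0)
    n, m = V.shape
    W = rng.random((n, rank)) + eps
    H = rng.random((rank, m)) + eps
    for _ in range(n_iter):
        # Multiplicative updates keep all entries nonnegative.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H
```

Applied to a magnitude spectrogram, the columns of `W` play the role of the spectral basis vectors from which the paper selects representatives for each source.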

Development of Audio Melody Extraction and Matching Engine for MIREX 2011 tasks

  • Song, Chai-Jong; Jang, Dalwon; Lee, Seok-Pil; Park, Hochong
    • Proceedings of the Korean Society of Broadcast Engineers Conference / 2012.07a / pp.164-166 / 2012
  • In this paper, we propose a method for extracting the predominant melody of polyphonic music based on harmonic structure. Harmonic structure is an important feature of a monophonic signal, which has spectral peaks at integer multiples of its fundamental frequency. We extract all fundamental frequency candidates contained in the polyphonic signal by verifying the required conditions of harmonic structure. We then combine the harmonic peaks corresponding to each extracted fundamental frequency and assign a rank to each candidate after computing its average harmonic energy. We run pitch tracking based on the rank and continuity of the extracted fundamental frequencies, and determine the predominant melody. For the query-by-singing/humming (QbSH) task, we propose a Dynamic Time Warping (DTW)-based matching engine. Our system reduces false alarms by combining the distances of multiple DTW processes. To improve performance, we introduce an asymmetric sense, pitch level compensation, and distance intransitiveness into the DTW algorithm.

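
The "harmonic average energy" ranking idea can be sketched as follows; this is a hypothetical simplification (F0 candidates given as spectral bin indices), not the authors' implementation:

```python
# Illustrative sketch: score an F0 candidate (a spectral bin index) by
# averaging the magnitude spectrum at integer multiples of that bin.
# A true F0 with strong harmonics scores higher than a spurious candidate.

def harmonic_energy(spectrum, f0_bin, n_harmonics=5):
    """Average spectral energy at the first n_harmonics multiples of f0_bin."""
    peaks = [spectrum[k * f0_bin]
             for k in range(1, n_harmonics + 1)
             if k * f0_bin < len(spectrum)]
    return sum(peaks) / len(peaks) if peaks else 0.0
```

Ranking all candidates by this score, then tracking the top-ranked, temporally continuous candidates, mirrors the pitch-tracking stage the abstract describes.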

A study on improving the performance of the machine-learning based automatic music transcription model by utilizing pitch number information (음고 개수 정보 활용을 통한 기계학습 기반 자동악보전사 모델의 성능 개선 연구)

  • Daeho Lee; Seokjin Lee
    • The Journal of the Acoustical Society of Korea / v.43 no.2 / pp.207-213 / 2024
  • In this paper, we study how to improve the performance of a machine learning-based automatic music transcription model by adding musical information to the input data. The added musical information is the number of pitches occurring in each time frame, obtained by counting the notes activated in the ground-truth score. This pitch count information is concatenated to the log mel-spectrogram, which is the input of the existing model. We use an automatic music transcription model comprising four types of blocks that predict four types of musical information, and we demonstrate that the simple method of adding, to the existing input, the pitch count information corresponding to the musical information predicted by each block helps train the model. To evaluate the performance improvement, we conducted an experiment using the MIDI Aligned Piano Sounds (MAPS) data; when all pitch count information was used, performance improved by 9.7 % in frame-based F1 score and by 21.8 % in note-based F1 score including offset.
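
The concatenation step the abstract describes can be sketched in a few lines; shapes and names here are illustrative assumptions, not the paper's exact layout:

```python
import numpy as np

def add_pitch_count_feature(log_mel, piano_roll):
    """Append per-frame active-pitch counts as one extra feature row.
    log_mel: (n_mels, n_frames) input features; piano_roll: binary
    (n_pitches, n_frames) ground-truth note activations."""
    counts = piano_roll.sum(axis=0, keepdims=True).astype(log_mel.dtype)
    return np.concatenate([log_mel, counts], axis=0)
```

The augmented feature matrix then replaces the plain log mel-spectrogram as model input during training.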

Study of the Musical Spaces Composition in Daniel Libeskind Architecture (다니엘 리베스킨트 건축의 음악적 공간 구성에 관한 연구)

  • Song, Dae-Ho
    • Journal of the Korea Academia-Industrial cooperation Society / v.16 no.1 / pp.793-800 / 2015
  • This study analyzes the correlation between Daniel Libeskind's architectural works and the influence of the music he studied in his youth and of the opera "Moses und Aron". The results show that, first, Libeskind created a convergence of invisible lines in his architecture by borrowing the composition of freely flowing scales in a score. Second, he composed geometrical shapes of contrapuntal repetition based on the double tunes of musical structure, in other words, forms of polyphonic proportion; he expressed geometrical, free linear rhythm by planning compositions of multidimensional "hybrid" spaces and contrasts of material and form, the result of an intertextual combination of architecture and music. Third, he tried to express the pain, fear, and anxiety of the past spatially, and constructed "spaces of absence" in his works through inspiration from Arnold Schönberg's works.

A Novel Query-by-Singing/Humming Method by Estimating Matching Positions Based on Multi-layered Perceptron

  • Pham, Tuyen Danh; Nam, Gi Pyo; Shin, Kwang Yong; Park, Kang Ryoung
    • KSII Transactions on Internet and Information Systems (TIIS) / v.7 no.7 / pp.1657-1670 / 2013
  • The increasing number of music files on smartphones and MP3 players makes it difficult to find the music files people want. Query-by-Singing/Humming (QbSH) systems have therefore been developed to retrieve music from a user's humming or singing without requiring detailed information such as the title or singer of a song. Most previous research on QbSH has used musical instrument digital interface (MIDI) files as reference songs. However, producing MIDI files is a time-consuming process, and more and more music files are newly published as the music market grows. Consequently, using the more common MPEG-1 audio layer 3 (MP3) files as reference songs is considered an alternative. However, there is little previous research on QbSH with MP3 files, because an MP3 file has a different waveform from the humming/singing query due to background music and multiple (polyphonic) melodies. To overcome these problems, we propose a new QbSH method using MP3 files on mobile devices. This research is novel in four ways. First, it is the first research on QbSH using MP3 files as reference songs. Second, the start and end positions on the MP3 file to be matched are estimated using a multi-layered perceptron (MLP) before matching against the humming/singing query file. Third, for more accurate results, four MLPs are used, which produce the start and end positions for the dynamic time warping (DTW) matching algorithm and for the chroma-based DTW algorithm, respectively. Fourth, the two matching scores from the DTW and chroma-based DTW algorithms are combined using the PRODUCT rule, yielding higher matching accuracy. Experimental results with the AFA MP3 database show that the accuracy of the proposed method (Top-1 accuracy of 98 %, with an MRR of 0.989) is much higher than that of other methods. We also show the effectiveness of the proposed system on a consumer mobile device.
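
The DTW matching at the core of this and the preceding QbSH entries is standard; a textbook baseline (without this paper's MLP position estimation or chroma features) can be sketched as follows:

```python
# Classic dynamic time warping distance between two pitch sequences,
# using an absolute-difference local cost. A minimal baseline only.

def dtw_distance(a, b):
    """DTW distance: minimal accumulated cost over all monotonic
    alignments of sequence a against sequence b."""
    INF = float("inf")
    n, m = len(a), len(b)
    D = [[INF] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            D[i][j] = cost + min(D[i - 1][j],      # insertion
                                 D[i][j - 1],      # deletion
                                 D[i - 1][j - 1])  # match
    return D[n][m]
```

Because DTW tolerates local tempo variation, a query sung slightly slower than the reference can still align with zero or low cost.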

Blind Rhythmic Source Separation (블라인드 방식의 리듬 음원 분리)

  • Kim, Min-Je; Yoo, Ji-Ho; Kang, Kyeong-Ok; Choi, Seung-Jin
    • The Journal of the Acoustical Society of Korea / v.28 no.8 / pp.697-705 / 2009
  • An unsupervised (blind) method is proposed for extracting rhythmic sources from commercial polyphonic music limited to a single channel. Commercial music signals are usually provided with no more than two channels, while they often contain multiple instruments, including singing voice. Therefore, instead of using conventional models of mixing environments or statistical characteristics, we introduce source-specific characteristics for separating or extracting sources in such underdetermined environments. In this paper, we concentrate on extracting rhythmic sources from mixtures with other harmonic sources. An extension of nonnegative matrix factorization (NMF), called nonnegative matrix partial co-factorization (NMPCF), is used to analyze multiple relationships between spectral and temporal properties in the given input matrices. Moreover, the temporal repeatability of the rhythmic sound sources is exploited as a common rhythmic property among segments of an input mixture signal. The proposed method shows separation quality that is acceptable, though not superior to reference prior-knowledge-based drum source separation systems; however, it is more broadly applicable because of its blind separation, for example, when no prior information is available or when the target rhythmic source is irregular.
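
The core NMPCF idea of a basis shared across segments can be sketched as a toy joint factorization; this is an illustrative simplification under stated assumptions (two segments, plain multiplicative updates), not the paper's exact algorithm, and all names are hypothetical:

```python
import numpy as np

def nmpcf(V1, V2, shared_rank, own_rank, n_iter=300, eps=1e-9):
    """Toy partial co-factorization: two segment spectrograms V1, V2 are
    factorized as Vi ~ Ws @ Hs[i] + W[i] @ H[i], where the basis Ws (the
    repeating rhythmic part) is shared across both segments."""
    rng = np.random.default_rng(1)
    n = V1.shape[0]
    Vs = (V1, V2)
    Ws = rng.random((n, shared_rank)) + eps
    W = [rng.random((n, own_rank)) + eps for _ in Vs]
    Hs = [rng.random((shared_rank, V.shape[1])) + eps for V in Vs]
    H = [rng.random((own_rank, V.shape[1])) + eps for V in Vs]
    for _ in range(n_iter):
        num = np.zeros_like(Ws)
        den = np.zeros_like(Ws)
        for i, V in enumerate(Vs):
            # Per-segment multiplicative updates of the activations
            # and the segment-specific (harmonic) basis.
            R = Ws @ Hs[i] + W[i] @ H[i]
            Hs[i] *= (Ws.T @ V) / (Ws.T @ R + eps)
            R = Ws @ Hs[i] + W[i] @ H[i]
            H[i] *= (W[i].T @ V) / (W[i].T @ R + eps)
            R = Ws @ Hs[i] + W[i] @ H[i]
            W[i] *= (V @ H[i].T) / (R @ H[i].T + eps)
            # Accumulate the shared-basis update over both segments.
            R = Ws @ Hs[i] + W[i] @ H[i]
            num += V @ Hs[i].T
            den += R @ Hs[i].T
        Ws *= num / (den + eps)  # joint update of the shared basis
    return Ws, Hs, W, H
```

Sharing `Ws` across segments is what encodes the temporal repeatability the abstract describes: only components that recur in every segment can be explained by the shared basis, so they are attributed to the rhythmic source.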