Music Transcription

Reducing latency of neural automatic piano transcription models (인공신경망 기반 저지연 피아노 채보 모델)

  • Dasol Lee; Dasaem Jeong
    • The Journal of the Acoustical Society of Korea, v.42 no.2, pp.102-111, 2023
  • Automatic Music Transcription (AMT) is the task of detecting and recognizing musical note events in a given audio recording. In this paper, we focus on reducing the latency of real-time AMT systems for piano music. Although neural AMT models have been adapted for real-time piano transcription, they suffer from high latency, which hinders their usefulness in interactive scenarios. To tackle this issue, we explore several techniques for reducing the intrinsic latency of a neural piano-transcription network: reducing the window and hop sizes of the Fast Fourier Transform (FFT), modifying the convolutional layers' kernel sizes, and shifting labels along the time axis so that the model learns to predict onsets earlier. Our experiments demonstrate that combining these approaches lowers latency while maintaining high transcription accuracy. Specifically, our modified models achieved note F1 scores of 92.67% and 90.51% at latencies of 96 ms and 64 ms, respectively, compared to the baseline model's note F1 score of 93.43% at a latency of 160 ms. This methodology has potential for training AMT models for various interactive scenarios, including real-time feedback for piano education.
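The latency arithmetic behind these modifications can be illustrated with a small sketch. The sample rate, the latency formula, and the function name below are assumptions for illustration, not the authors' implementation:

```python
# Hypothetical sketch of the intrinsic-latency trade-off described in the
# abstract. SAMPLE_RATE and the formula are assumptions, not from the paper.
SAMPLE_RATE = 16000  # Hz (assumed)

def intrinsic_latency_ms(window_size, hop_size, lookahead_frames, label_shift_frames):
    """Delay from a note sounding until the model can report it.

    window_size, hop_size -- FFT parameters, in samples
    lookahead_frames      -- future frames the convolutional kernels consume
    label_shift_frames    -- frames by which onset labels were moved earlier
    """
    samples = (window_size // 2            # half the analysis window
               + lookahead_frames * hop_size   # convolutional look-ahead
               - label_shift_frames * hop_size)  # earlier-onset training
    return 1000.0 * samples / SAMPLE_RATE

# Shrinking the window and hop, trimming look-ahead, and shifting labels
# each reduce the figure:
baseline = intrinsic_latency_ms(2048, 512, 3, 0)
reduced = intrinsic_latency_ms(1024, 256, 1, 2)
```

Each of the three techniques in the abstract maps to one term of the sum, which is why they can be combined freely.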

An Improved Automatic Music Transcription Method Using TV-Filter and Optimal Note Combination (TV-필터와 최적 음표조합을 이용한 개선된 가변템포 음악채보방법)

  • Ju, Young-Ho; Lee, Joonwhoan
    • Journal of the Korean Institute of Intelligent Systems, v.23 no.4, pp.371-377, 2013
  • This paper proposes three methods for improving the accuracy of automatic music transcription of monophonic sound with time-varying tempo. First, smoothing the pitch data with a TV (Total Variation) filter reduces fragmentation in the pitch-segmentation result. Second, a measure-finding method that combines three approaches, based respectively on pitch, on the energy of the sound data, and on rules, produces a more stable result. Third, the note-length encoding is corrected optimally, so that the resulting encoding minimizes the sum of quantization errors within a measure while the note lengths sum to the number of beats. In experiments on 16 children's songs, measure finding was complete, and the encoding accuracy for note length and pitch was about 91.3% and 86.7%, respectively.
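The optimal note-combination step can be sketched as a small search: pick quantized note values whose sum fills the measure exactly while minimizing the total quantization error. The note-value set, function name, and brute-force search below are illustrative assumptions, not the paper's algorithm:

```python
# Sketch of the optimal note-combination objective: minimize quantization
# error subject to the note lengths summing to the beats in the measure.
# NOTE_VALUES and the exhaustive search are illustrative assumptions.
from itertools import product

NOTE_VALUES = [0.25, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0]  # sixteenth ... whole, in beats

def quantize_measure(durations, beats_per_measure):
    """durations: raw note lengths (in beats) detected within one measure."""
    best, best_err = None, float("inf")
    for combo in product(NOTE_VALUES, repeat=len(durations)):
        if abs(sum(combo) - beats_per_measure) > 1e-9:
            continue  # constraint: lengths must fill the measure exactly
        err = sum(abs(c - d) for c, d in zip(combo, durations))
        if err < best_err:  # objective: minimal total quantization error
            best, best_err = combo, err
    return list(best)
```

The brute-force search is exponential in the number of notes per measure; it is only meant to show the constraint and objective, not an efficient formulation.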

Automatic Music Transcription System Using SIDE (SIDE를 이용한 자동 음악 채보 시스템)

  • Hyoung, A-Young; Lee, Joon-Whoan
    • The KIPS Transactions:PartB, v.16B no.2, pp.141-150, 2009
  • This paper proposes a system that automatically transcribes a singing voice into musical notes. First, the system uses the Stabilized Inverse Diffusion Equation (SIDE) to divide the song into a series of syllabic segments based on pitch detection. Given this segmentation, the method recognizes the duration of each segment through clustering based on a genetic algorithm. The study also introduces the concept of a 'relative interval' to recognize intervals with respect to the singer's pitch, and adopts a measure-extraction algorithm that uses pause information to achieve more precise transcription. In experiments on 16 nursery songs, the measure recognition rate was 91.5% and the DMOS score reached 3.82, demonstrating the effectiveness of the system.
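The 'relative interval' idea, expressing each note relative to the singer's own pitch so that recognition does not depend on the absolute key, can be sketched as follows (the choice of the first note as reference and the function names are assumptions for illustration):

```python
# Illustrative sketch of a 'relative interval' representation: each note is
# expressed as semitones from a reference pitch taken from the singer.
import math

def hz_to_semitones(f, ref):
    # 12 semitones per octave; an octave doubles the frequency
    return 12 * math.log2(f / ref)

def relative_intervals(pitches_hz):
    """Semitone offsets of each note from the singer's first pitch."""
    ref = pitches_hz[0]  # assumed reference: the opening note
    return [round(hz_to_semitones(f, ref)) for f in pitches_hz]

# The same melody sung in a different key yields identical relative intervals:
a = relative_intervals([220.0, 246.94, 277.18])  # A3 B3 C#4
b = relative_intervals([293.66, 329.63, 369.99])  # D4 E4 F#4
```

Transposition invariance is the point: both renditions map to the same interval sequence, so the transcription does not depend on which key the singer happens to use.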

Finding Measure Position Using Combination Rules of Musical Notes in Monophonic Song (단일 음원 노래에서 음표의 조합 규칙을 이용한 마디 위치 찾기)

  • Park, En-Jong; Shin, Song-Yi; Lee, Joon-Whoan
    • The Journal of the Korea Contents Association, v.9 no.10, pp.1-12, 2009
  • When notes are combined within one measure, their durations stand in regular multiple relations. This paper presents a method for finding exact measure positions in a monophonic song based on those relations. In the proposed method, the individual note durations are first segmented, and rules stating the multiple relations are then used to locate the measure positions. The measures can serve as foundational information for extracting the beat and tempo of a song, which in turn can serve as background knowledge for an automatic music transcription system. The proposed method exactly detected the measure positions in 11 of 12 monophonic songs sung by men and women. The extracted measure positions, combined with music theory, also allow the beat and tempo of a song to be derived.
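The multiple-relation idea can be sketched as a running-total check: if note durations within a measure stand in simple multiples of the beat, candidate bar lines fall where the cumulative duration reaches a whole number of measures. The tolerance value and function name below are illustrative assumptions, not the paper's rule set:

```python
# Illustrative sketch of finding candidate measure positions from the
# multiple relations of note durations. The tolerance is an assumption.
def measure_positions(durations, measure_len, tol=0.05):
    """durations: note lengths in beats; measure_len: beats per measure.
    Returns note indices after which a bar line can be placed."""
    positions, total = [], 0.0
    for i, d in enumerate(durations):
        total += d
        ratio = total / measure_len
        # a bar line is possible where the running total is a whole
        # number of measures (within tolerance)
        if round(ratio) > 0 and abs(ratio - round(ratio)) < tol:
            positions.append(i + 1)  # bar line after note i
    return positions
```

For example, the durations 1, 1, 2, 2, 1, 1, 4 in 4/4 time admit bar lines after the third, sixth, and seventh notes, where the running totals reach 4, 8, and 12 beats.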