• Title/Summary/Keyword: piano transcription (피아노 채보)

Reducing latency of neural automatic piano transcription models (인공신경망 기반 저지연 피아노 채보 모델)

  • Dasol Lee; Dasaem Jeong
    • The Journal of the Acoustical Society of Korea, v.42 no.2, pp.102-111, 2023
  • Automatic Music Transcription (AMT) is the task of detecting and recognizing musical note events in a given audio recording. In this paper, we focus on reducing the latency of real-time AMT systems for piano music. Although neural AMT models have been adapted for real-time piano transcription, they suffer from high latency, which limits their usefulness in interactive scenarios. To tackle this issue, we explore several techniques for reducing the intrinsic latency of a neural network for piano transcription: reducing the window and hop sizes of the Fast Fourier Transform (FFT), modifying the kernel size of the convolutional layers, and shifting the labels along the time axis so that the model learns to predict onsets earlier. Our experiments demonstrate that combining these approaches lowers latency while maintaining high transcription accuracy. Specifically, our modified models achieved note F1 scores of 92.67% and 90.51% at latencies of 96 ms and 64 ms, respectively, compared with the baseline model's note F1 score of 93.43% at a latency of 160 ms. This methodology has potential for training AMT models for various interactive scenarios, including real-time feedback for piano education.
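
The label-shifting idea mentioned in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and not taken from the paper: the function names `shift_labels_earlier` and `frames_to_ms`, the plain list-of-lists label format, the shift direction (labels moved toward earlier frame indices, which is one reading of "predict onset earlier"), and the example hop size and sample rate.

```python
def shift_labels_earlier(labels, shift_frames):
    """Shift frame-wise labels toward earlier frames (hypothetical helper).

    labels: list of per-frame rows, e.g. one 0/1 onset flag per piano key.
    The first `shift_frames` rows are dropped and the tail is padded with
    all-zero rows, so the output has the same number of frames as the input.
    """
    if shift_frames < 0:
        raise ValueError("shift_frames must be non-negative")
    if not labels:
        return []
    n_keys = len(labels[0])
    kept = [list(row) for row in labels[shift_frames:]]
    n_pad = min(shift_frames, len(labels))
    padding = [[0] * n_keys for _ in range(n_pad)]
    return kept + padding


def frames_to_ms(n_frames, hop_size, sample_rate):
    """Convert a frame count to milliseconds for a given hop size in samples."""
    return n_frames * hop_size * 1000 / sample_rate


# Illustrative values only: a 4-frame, 2-key onset roll and an assumed
# 512-sample hop at 16 kHz (neither comes from the abstract).
onsets = [[0, 0], [1, 0], [0, 1], [0, 0]]
shifted = shift_labels_earlier(onsets, 1)
shift_ms = frames_to_ms(1, hop_size=512, sample_rate=16000)
```

In this sketch, a model trained on `shifted` would be asked to report each onset one frame sooner than a model trained on `onsets`; `frames_to_ms` only shows how a shift measured in frames relates to time for a chosen hop size.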

Vision-Based Piano Music Transcription System (비전 기반 피아노 자동 채보 시스템)

  • Park, Sang-Uk; Park, Si-Hyun; Park, Chun-Su
    • Journal of IKEEE, v.23 no.1, pp.249-253, 2019
  • Most commercialized music transcription systems operate on audio information. However, these conventional systems have the disadvantages of environmental dependency, equipment dependency, and time latency. This paper studies a vision-based music transcription system that uses video information instead of the audio information on which music transcription programs traditionally rely. Computer vision technology is widely used to analyze and apply information captured by equipment such as cameras. In this paper, we created a program that generates a MIDI file, an electronic representation of musical notes, from video of a piano performance recorded with a smartphone camera.
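
As a rough illustration of the last step of such a pipeline, the sketch below turns per-frame key-press detections into note events that a MIDI writer could consume. It is not the authors' program: the function name `frames_to_notes`, the input format (one set of pressed MIDI pitch numbers per video frame, assumed to come from some vision stage), and the frame rate are all hypothetical.

```python
def frames_to_notes(pressed_per_frame, fps):
    """Convert per-frame pressed-key sets into (pitch, onset_s, offset_s) events.

    pressed_per_frame: iterable of collections of MIDI pitch numbers, one per
    video frame (hypothetical output of a key-press detector).
    fps: video frame rate in frames per second.
    """
    if fps <= 0:
        raise ValueError("fps must be positive")
    frames = [set(f) for f in pressed_per_frame]
    active = {}  # pitch -> index of the frame where the press started
    notes = []
    for i, pressed in enumerate(frames):
        # Keys that appear in this frame but were not held before: note starts.
        for pitch in pressed:
            if pitch not in active:
                active[pitch] = i
        # Keys that were held but are no longer pressed: note ends.
        for pitch in [p for p in active if p not in pressed]:
            start = active.pop(pitch)
            notes.append((pitch, start / fps, i / fps))
    # Close any notes still held when the video ends.
    for pitch, start in active.items():
        notes.append((pitch, start / fps, len(frames) / fps))
    notes.sort(key=lambda event: (event[1], event[0]))
    return notes


# Illustrative input: middle C (60) held for two frames, E (64) for two frames.
events = frames_to_notes([{60}, {60, 64}, {64}, set()], fps=2)
```

Each resulting tuple carries a pitch plus onset and offset times in seconds, which is the information a MIDI note-on/note-off pair needs; writing the actual file would require a MIDI library and is left out of this sketch.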