• Title/Abstract/Keywords: optical music recognition

Search results: 13 (processing time: 0.023 s)

YOLO-based Optical Music Recognition and Virtual Reality Content Creation Method

  • 오경민;홍요섭;백건영;전찬준
    • Smart Media Journal
    • /
    • Vol. 10, No. 4
    • /
    • pp.80-90
    • /
    • 2021
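  • We propose applying the results of deep-learning-based optical music recognition (OMR) to a virtual reality (VR) game. The deep learning model is YOLO v5; to detect objects that the model misses, we apply a Hough transform, staff-size adjustment, and related steps. The VR game reads the output file to analyze BPM, maximum combo count, pitch, and rhythm, and uses Object Pooling for resource management to prevent notes from lagging. By building a VR game from the musical elements obtained through OMR, we confirm that the approach both provides VR content and broadens the applicability of OMR.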
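The Object Pooling idea mentioned in this abstract can be illustrated with a minimal sketch. The `Note` and `NotePool` classes and their fields are hypothetical names chosen for illustration; the paper's actual game engine code is not given in the abstract.

```python
from collections import deque


class Note:
    """A hypothetical rhythm-game note; the fields are illustrative only."""

    def __init__(self):
        self.pitch = None
        self.beat = None
        self.active = False


class NotePool:
    """Pre-allocates notes and recycles them instead of creating new ones."""

    def __init__(self, size):
        self._free = deque(Note() for _ in range(size))

    def acquire(self, pitch, beat):
        # Reuse a free note if one exists; allocate only as a fallback.
        note = self._free.popleft() if self._free else Note()
        note.pitch, note.beat, note.active = pitch, beat, True
        return note

    def release(self, note):
        # Reset the note and return it to the pool for later reuse.
        note.pitch, note.beat, note.active = None, None, False
        self._free.append(note)


pool = NotePool(size=1)
first = pool.acquire("C4", 0.0)
pool.release(first)
second = pool.acquire("E4", 1.0)  # the same object as `first`, recycled
```

Recycling objects this way avoids repeated allocation while notes are spawning, which is the usual motivation for pooling in real-time applications.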

Optical Music Score Recognition System for Smart Mobile Devices

  • Han, SeJin;Lee, GueeSang
    • International Journal of Contents
    • /
    • Vol. 10, No. 4
    • /
    • pp.63-68
    • /
    • 2014
  • In this paper, we propose a smart system that can optically recognize a music score within a document and play the music after recognition. Many historic handwritten documents have now been digitized. Converting images of a music score within documents into digital files is particularly difficult and requires considerable resources because a music score has a 2D structure with both staff lines and symbols. The proposed system takes an input image using a mobile device equipped with a camera module, and the image is optimized via preprocessing. Binarization, music sheet correction, staff line recognition, vertical line detection, note recognition, and symbol recognition are then applied, and a music file is generated in an XML format. The MusicXML file is recorded as digital information, and based on that file, we can modify the result, logically correct errors, and finally generate a MIDI file. Our system reduces misrecognition and can recognize a wider range of music scores because we have implemented distortion correction and vertical line detection. Through an experiment with a variety of music scores, we show that the proposed method is practical and that it has potential for wide application.
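As a rough illustration of the staff line recognition step, the sketch below uses a horizontal projection over a binarized image. This is a common textbook technique and an assumption here; the abstract does not say which method the authors use. The function names and the `ratio` threshold are hypothetical.

```python
def staff_line_rows(binary, ratio=0.8):
    """Return row indices whose share of ink pixels (1s) is at least `ratio`."""
    width = len(binary[0])
    return [y for y, row in enumerate(binary) if sum(row) >= ratio * width]


def group_rows(rows):
    """Merge consecutive row indices into (top, bottom) line segments."""
    lines = []
    for y in rows:
        if lines and y == lines[-1][1] + 1:
            lines[-1] = (lines[-1][0], y)  # extend the current line
        else:
            lines.append((y, y))  # start a new line
    return lines


# Tiny binarized image: 1 = ink, 0 = background.
image = [
    [0, 0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1, 1],
    [1, 1, 1, 1, 1, 0],
    [0, 0, 1, 0, 0, 0],
    [1, 1, 1, 1, 1, 1],
    [0, 0, 0, 0, 0, 0],
]
lines = group_rows(staff_line_rows(image))
print(lines)  # [(1, 2), (4, 4)]
```

A projection like this only works once the page is roughly deskewed, which is one reason a correction step would come before staff line recognition.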

Improved Lexicon-driven based Chord Symbol Recognition in Musical Images

  • Dinh, Cong Minh;Do, Luu Ngoc;Yang, Hyung-Jeong;Kim, Soo-Hyung;Lee, Guee-Sang
    • International Journal of Contents
    • /
    • Vol. 12, No. 4
    • /
    • pp.53-61
    • /
    • 2016
  • Although extensively developed, optical music recognition systems have mostly focused on musical symbols (notes, rests, etc.), while disregarding the chord symbols. The process becomes difficult when the images are distorted or slurred, although this can be resolved using optical character recognition systems. Moreover, the appearance of outliers (lyrics, dynamics, etc.) increases the complexity of the chord recognition. Therefore, we propose a new approach addressing these issues. After binarization, un-distortion, and stave and lyric removal of a musical image, a rule-based method is applied to detect the potential regions of chord symbols. Next, a lexicon-driven approach is used to optimally and simultaneously separate and recognize characters. The score that is returned from the recognition process is used to detect the outliers. The effectiveness of our system is demonstrated through impressive accuracy of experimental results on two datasets having a variety of resolutions.

Camera-based Music Score Recognition Using Inverse Filter

  • Nguyen, Tam;Kim, SooHyung;Yang, HyungJeong;Lee, GueeSang
    • International Journal of Contents
    • /
    • Vol. 10, No. 4
    • /
    • pp.11-17
    • /
    • 2014
  • The influence of the acquisition environment on music score images captured by a camera has not yet been seriously examined. All existing Optical Music Recognition (OMR) systems attempt to recognize music score images captured by a scanner under ideal conditions. Therefore, when such systems process images affected by distortion, different viewpoints, or suboptimal illumination, their performance, in terms of recognition accuracy and processing time, is unacceptable for deployment in practice. In this paper, a novel, lightweight but effective approach for dealing with the issues caused by camera-based music scores is proposed. Based on the staff line information, musical rules, run-length code, and projection, all regions of interest are determined. Templates created from an inverse filter are then used to recognize the music symbols. In this way, fragmentation and deformation problems, as well as missed recognition, can be overcome using the developed method. The system was evaluated on a dataset consisting of real images captured by a smartphone. The achieved recognition rate and processing time were relatively competitive with state-of-the-art work. In addition, the system was designed to be lightweight compared with the other approaches, which mostly adopted machine learning algorithms, to allow further deployment on portable devices with limited computing resources.
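To make the run-length code concrete, here is a minimal sketch that run-length encodes pixel columns and takes the most frequent black and white runs as estimates of staff line thickness and staff line spacing. This is a widely used OMR heuristic and an assumption here, not necessarily the authors' procedure. The function names are hypothetical.

```python
from collections import Counter
from itertools import groupby


def run_lengths(column):
    """Run-length encode a sequence of 0/1 pixels as (value, length) pairs."""
    return [(value, len(list(group))) for value, group in groupby(column)]


def estimate_staff_metrics(columns):
    """Most common black run -> line thickness; most common white run -> spacing."""
    black, white = Counter(), Counter()
    for column in columns:
        for value, length in run_lengths(column):
            (black if value else white)[length] += 1
    return black.most_common(1)[0][0], white.most_common(1)[0][0]


# One pixel column read top to bottom: 1 = ink, 0 = background.
column = [0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 1, 1, 0, 0, 0]
thickness, spacing = estimate_staff_metrics([column])
print(thickness, spacing)  # 1 3
```

Once these two numbers are known, regions of interest can be sized relative to the staff rather than in absolute pixels.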

Staff-line and Measure Detection using a Convolutional Neural Network for Handwritten Optical Music Recognition

  • Park, Jong-Won;Kim, Dong-Sam;Kim, Jun-Ho
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • Vol. 26, No. 7
    • /
    • pp.1098-1101
    • /
    • 2022
  • With the development of computer music notation programs, sheet music is now often drawn on a computer. However, handwritten notation is still widely used for educational purposes or to write down music quickly, for example in listening and dictation exercises. Previous OMR studies focused on recognizing printed sheet music produced by notation programs. Handwritten OMR from camera images performs poorly because writing styles differ from person to person and because of lens distortion. In this study, as a preprocessing step for recognizing handwritten sheet music, we propose a method for recognizing staves using linear regression and a method for recognizing barlines using a CNN. The F1 scores of staff recognition and barline detection are 99.09% and 95.48%, respectively. These methods are expected to contribute to improving the accuracy of handwritten music recognition.
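The idea of recognizing a staff line with linear regression can be sketched as an ordinary least-squares fit through candidate staff-line pixels. How the paper selects those pixels is not stated in the abstract, so the sample points and the `fit_line` helper are assumptions made for illustration.

```python
def fit_line(points):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx  # assumes the points do not all share one x value
    return slope, mean_y - slope * mean_x


# Hypothetical (x, y) pixel coordinates along one slightly tilted staff line.
points = [(0, 10.0), (10, 10.5), (20, 11.0), (30, 11.5)]
slope, intercept = fit_line(points)
```

A fitted slope and intercept describe a hand-drawn or tilted line compactly, which a strict row-by-row search would miss.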

Score Image Retrieval to Inaccurate OMR performance

  • Kim, Haekwang
    • Journal of Broadcast Engineering
    • /
    • Vol. 26, No. 7
    • /
    • pp.838-843
    • /
    • 2021
  • This paper presents an algorithm for effectively retrieving score information for an input score image. The originality of the proposed algorithm is that it is designed to be robust to recognition errors made by an OMR (Optical Music Recognition) system, whereas existing methods such as the pitch histogram require the erroneous OMR result to be corrected before the retrieval process. This approach lets people retrieve scores without the musical training needed for error correction. OMR takes a score image as input, recognizes musical symbols, and produces a structured symbolic notation of the score as output, for example in MusicXML format. Among the musical symbols on a score, filled noteheads are observed to be rarely misdetected, because their simple black filled round shape is easy for OMR to process. Barlines, which separate measures, are also robust to OMR errors because they are long vertical lines of uniform length. The proposed algorithm consists of a descriptor for a score and a similarity measure between a query score and a reference score. The descriptor is based on the note-count, the number of filled noteheads in a measure. Each part of a score is represented by a sequence of note-count numbers. The descriptor is an n-gram sequence of the note-count sequence. Simulation results show that the proposed algorithm works successfully to a certain degree in score-image-based retrieval from erroneous OMR output.
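The note-count n-gram descriptor can be sketched as follows. The abstract does not specify the similarity measure, so the Jaccard overlap used here is an assumed stand-in, and the function names and sample sequences are hypothetical.

```python
def ngrams(note_counts, n=3):
    """Slide a window of length n over the per-measure note counts."""
    return [tuple(note_counts[i:i + n]) for i in range(len(note_counts) - n + 1)]


def similarity(query, reference, n=3):
    """Jaccard overlap of n-gram sets (an assumption, not the paper's measure)."""
    q, r = set(ngrams(query, n)), set(ngrams(reference, n))
    union = q | r
    return len(q & r) / len(union) if union else 0.0


# Filled-notehead counts per measure; the query has one OMR miscount (4 -> 5).
reference = [4, 4, 2, 3, 4, 1]
query = [4, 4, 2, 3, 5, 1]
score = similarity(query, reference)
```

A single miscounted measure corrupts only the n-grams that overlap it, so the remaining n-grams still match the reference. This is the kind of error tolerance the abstract describes.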

Design of optimal multiplexed filter and an analysis on the similar discrimination for music notations recognition

  • 유진선;김남
    • Journal of the Korean Institute of Telematics and Electronics D
    • /
    • Vol. 34D, No. 6
    • /
    • pp.65-74
    • /
    • 1997
  • In this paper, an SA-multiplexed filter is designed using SA (simulated annealing) to recognize, in an optical pattern recognition system, music notation patterns that vary in size, shape, and position and include many similar shapes. Through a learning process, this filter produces correlation results at the desired location and can identify patterns of the same class and classify patterns of similar classes for scale-variant or rotation-variant music notation patterns. The optimum filter is also designed, using SA, to analyze the discrimination of similar patterns at the acquired position, and it enhances optical diffraction efficiency as well as peak beam intensity. Compared with the POF (phase-only filter) and the cosine-BPOF (cosine binary phase-only filter), it has excellent discrimination capability even when the difference rate is 0.1%.

Super-resolution in Music Score Images by Instance Normalization

  • Tran, Minh-Trieu;Lee, Guee-Sang
    • Smart Media Journal
    • /
    • Vol. 8, No. 4
    • /
    • pp.64-71
    • /
    • 2019
  • The performance of an OMR (Optical Music Recognition) system is usually determined by the characterizing features of the input music score images. Low resolution is one of the main factors leading to degraded image quality. In this paper, we handle the low-resolution problem using the super-resolution technique. We propose the use of a deep neural network with instance normalization to improve the quality of music score images. We apply instance normalization, which has proven to be beneficial in single-image enhancement. It works better than batch normalization, which shows the effectiveness of shifting the mean and variance of deep features at the instance level. The proposed method provides an end-to-end mapping between the low-resolution and high-resolution images. New images are then created whose resolution is four times higher than that of the original images. Our model has been evaluated on the "DeepScores" dataset, and the results show that it outperforms other existing methods.
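What normalizing "at the instance level" means can be shown with a minimal pure-Python sketch that normalizes each channel of each image with its own mean and variance. The learnable scale and shift used in deep learning frameworks are omitted, and the helper names and toy data are assumptions, not the authors' network.

```python
import math


def instance_norm(channel, eps=1e-5):
    """Normalize one channel of one image to zero mean and unit variance."""
    values = [v for row in channel for v in row]
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    scale = 1.0 / math.sqrt(var + eps)
    return [[(v - mean) * scale for v in row] for row in channel]


def instance_norm_batch(batch):
    """Apply instance_norm to every channel of every image independently."""
    return [[instance_norm(channel) for channel in image] for image in batch]


# Batch layout: batch[image][channel][row][column].
batch = [
    [[[0.0, 2.0], [4.0, 6.0]]],          # dark image, one channel
    [[[100.0, 102.0], [104.0, 106.0]]],  # bright image, same structure
]
normalized = instance_norm_batch(batch)
```

Both toy images normalize to the same values because each uses its own statistics. Batch normalization would instead pool the statistics across the two images.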

A Covariance-matching-based Model for Musical Symbol Recognition

  • Do, Luu-Ngoc;Yang, Hyung-Jeong;Kim, Soo-Hyung;Lee, Guee-Sang;Dinh, Cong Minh
    • Smart Media Journal
    • /
    • Vol. 7, No. 2
    • /
    • pp.23-33
    • /
    • 2018
  • A musical sheet is read by optical music recognition (OMR) systems that automatically recognize and reconstruct the read data to convert them into a machine-readable format such as XML so that the music can be played. This process, however, is very challenging due to the large variety of musical styles, symbol notation, and other distortions. In this paper, we present a model for the recognition of musical symbols through the use of a mobile application, whereby a camera is used to capture the input image; therefore, additional difficulties arise due to variations of the illumination and distortions. For our proposed model, we first generate a line adjacency graph (LAG) to remove the staff lines and to perform primitive detection. After symbol segmentation using the primitive information, we use a covariance-matching method to estimate the similarity between every symbol and pre-defined templates. This method generates the three hypotheses with the highest scores for likelihood measurement. We also add a global consistency (time measurements) to verify the three hypotheses in accordance with the structure of the musical sheets; one of the three hypotheses is chosen through a final decision. The results of the experiment show that our proposed method leads to promising results.