Search | Korea Science

A Threshold Adaptation based Voice Query Transcription Scheme for Music Retrieval (음악검색을 위한 가변임계치 기반의 음성 질의 변환 기법)

Han, Byeong-Jun;Rho, Seung-Min;Hwang, Een-Jun
- The Transactions of The Korean Institute of Electrical Engineers / v.59 no.2 / pp.445-451 / 2010
This paper presents a threshold-adaptation-based voice query transcription scheme for music information retrieval. The proposed scheme analyzes a monophonic voice signal and generates its transcription for diverse music retrieval applications. For accurate transcription, we propose several advanced features: (i) the Energetic Feature eXtractor (EFX) for onset, peak, and transient area detection; (ii) Modified Windowed Average Energy (MWAE), which defines multiple small but coherent windows with local threshold values and serves as an offset detector; and (iii) the Circular Average Magnitude Difference Function (CAMDF) for accurate estimation of the fundamental frequency (F0) of each frame. To evaluate the proposed scheme, we implemented a prototype music transcription system called AMT2 (Automatic Music Transcriber version 2) and carried out various experiments. In the experiments, we used the QBSH corpus [1], which was adopted in the MIREX 2006 contest data set. The experimental results show that the proposed scheme can improve transcription performance.
https://doi.org/10.5370/KIEE.2010.59.2.445
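
The abstract above names CAMDF as its F0 estimator but does not give the formula. The sketch below assumes the commonly cited definition, D(tau) = sum over n of |x[(n + tau) mod N] - x[n]|, and picks the lag with the smallest value. The function names, the search range, and the test tone are illustrative assumptions, not details taken from the paper.

```python
import math

def camdf(frame):
    """Circular AMDF: D(tau) = sum_n |x[(n + tau) mod N] - x[n]| (assumed definition)."""
    n = len(frame)
    return [sum(abs(frame[(i + tau) % n] - frame[i]) for i in range(n))
            for tau in range(n)]

def estimate_f0(frame, sample_rate, f_min, f_max):
    """Return sample_rate / lag for the lag with the smallest CAMDF value in range."""
    d = camdf(frame)
    lag_min = max(1, int(sample_rate / f_max))
    lag_max = min(len(frame) // 2, int(sample_rate / f_min))
    best_lag = min(range(lag_min, lag_max + 1), key=lambda tau: d[tau])
    return sample_rate / best_lag

# Synthetic 200 Hz tone; the frame holds exactly ten periods (40 samples each).
sample_rate = 8000
frame = [math.sin(2 * math.pi * 200.0 * i / sample_rate) for i in range(400)]

# The search range is kept below one octave under the tone, because a lag of
# two periods matches a periodic signal just as well as a lag of one period.
f0 = estimate_f0(frame, sample_rate, f_min=120.0, f_max=1000.0)
```

A practical transcriber would need some rule for the octave ambiguity noted in the comment; how AMT2 handles it is not stated in the abstract.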

Vision-Based Piano Music Transcription System (비전 기반 피아노 자동 채보 시스템)

Park, Sang-Uk;Park, Si-Hyun;Park, Chun-Su
- Journal of IKEEE / v.23 no.1 / pp.249-253 / 2019
Most commercialized music-transcription systems operate on audio information. However, these conventional systems suffer from environmental dependency, equipment dependency, and time latency. This paper studies a vision-based music-transcription system that uses video information rather than the audio information on which music-transcription programs traditionally rely. Computer vision technology is widely used to analyze and apply information from equipment such as cameras. In this paper, we created a program that generates a MIDI file, an electronic form of musical notes, from smartphone camera recordings of piano playing.
https://doi.org/10.7471/ikeee.2019.23.1.249

Music Transcription Using Non-Negative Matrix Factorization (비음수 행렬 분해 (NMF)를 이용한 악보 전사)

Park, Sang-Ha;Lee, Seok-Jin;Sung, Koeng-Mo
- The Journal of the Acoustical Society of Korea / v.29 no.2 / pp.102-110 / 2010
Music transcription is the task of extracting pitch (the height of a musical note) and rhythm (the length of a musical note) information from an audio file and producing a music score. In this paper, we decompose a waveform into frequency and rhythm components using Non-Negative Matrix Factorization (NMF) and Non-Negative Sparse Coding (NNSC), which are often used for source separation and data clustering. The fundamental frequency is then calculated from the decomposed frequency components using the subharmonic summation method, so that the accurate pitch of each note can be estimated. The proposed method successfully performed music transcription, with results superior to those of conventional methods that used either NMF or NNSC.
https://doi.org/10.7776/ASK.2010.29.2.102
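
The subharmonic summation step mentioned in the abstract above can be illustrated with a small sketch. It scores each candidate fundamental by summing the spectrum at its integer multiples with geometrically decaying weights. The decay factor 0.84, the number of harmonics, the candidate list, and the toy spectrum are all assumptions made for illustration; the paper's own settings are not given in the abstract.

```python
def subharmonic_sum(spectrum, bin_hz, f0, n_harmonics=5, decay=0.84):
    """Sum decayed spectral magnitudes at integer multiples of a candidate f0."""
    total = 0.0
    for n in range(1, n_harmonics + 1):
        k = round(n * f0 / bin_hz)          # nearest bin to the n-th harmonic
        if k < len(spectrum):
            total += decay ** (n - 1) * spectrum[k]
    return total

def estimate_pitch(spectrum, bin_hz, candidates):
    """Return the candidate fundamental with the largest subharmonic sum."""
    return max(candidates, key=lambda f0: subharmonic_sum(spectrum, bin_hz, f0))

# Toy magnitude spectrum of one decomposed component: 10 Hz bins, with the
# second harmonic (440 Hz) stronger than the fundamental (220 Hz).
bin_hz = 10.0
spectrum = [0.0] * 200
for harmonic_hz, amplitude in [(220, 0.6), (440, 1.0), (660, 0.8)]:
    spectrum[round(harmonic_hz / bin_hz)] = amplitude

candidates = [110.0, 220.0, 440.0, 660.0]
pitch = estimate_pitch(spectrum, bin_hz, candidates)
```

Picking the strongest peak alone would report 440 Hz here; summing over harmonics favors 220 Hz, which is why the technique suits spectral components whose fundamental is weak.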

Humming based High Quality Music Creation (허밍을 이용한 고품질 음악 생성)

Lee, Yoonjae;Kim, Sunmin
- Proceedings of the Korean Society for Noise and Vibration Engineering Conference / 2014.10a / pp.146-149 / 2014
This paper describes a humming-based automatic music creation method. Composing music is generally difficult for people without a background in music theory. However, almost anyone can produce a main melody by humming. With this motivation, melody and chord sequences are estimated by analyzing the humming. In this work, the humming is produced without a metronome. Then, based on the estimated chord sequence, accompaniment is generated using the MIDI template matched to each chord. Five genres are supported for music creation. The melody transcription is evaluated in terms of onset and pitch estimation accuracy, and a mean opinion score (MOS) evaluation is used to assess the created music.

A study on improving the performance of the machine-learning based automatic music transcription model by utilizing pitch number information (음고 개수 정보 활용을 통한 기계학습 기반 자동악보전사 모델의 성능 개선 연구)

Daeho Lee;Seokjin Lee
- The Journal of the Acoustical Society of Korea / v.43 no.2 / pp.207-213 / 2024
In this paper, we study how to improve the performance of a machine-learning-based automatic music transcription model by adding musical information to the input data. The added information is the number of pitches that occur in each time frame, obtained by counting the number of notes activated in the ground-truth annotation. The resulting pitch number information is concatenated to the log mel-spectrogram, which is the input of the existing model. We use an automatic music transcription model composed of four types of blocks, each predicting one of four types of musical information, and demonstrate that a simple method, adding to the existing input the pitch number information that corresponds to the musical information predicted by each block, helps in training the model. To evaluate the performance improvement, we conducted an experiment using the MIDI Aligned Piano Sounds (MAPS) data. When all pitch number information was used, the frame-based F1 score improved by 9.7 % and the note-based F1 score including offset improved by 21.8 %.
https://doi.org/10.7776/ASK.2024.43.2.207
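
The input augmentation described in the abstract above can be sketched in a few lines: count the active notes in each frame of a binary piano roll and append that count to the frame's feature vector. The function names, the tiny 4-bin "log mel" frames, and the 5-pitch roll are hypothetical stand-ins; the actual model, feature sizes, and any scaling of the count are not specified in the abstract.

```python
def pitch_counts(piano_roll):
    """Number of active pitches in each time frame of a binary piano roll."""
    return [sum(frame) for frame in piano_roll]

def append_pitch_count(features, piano_roll):
    """Concatenate the per-frame pitch count to each per-frame feature vector."""
    counts = pitch_counts(piano_roll)
    if len(counts) != len(features):
        raise ValueError("features and piano roll must have the same number of frames")
    return [list(frame) + [float(count)] for frame, count in zip(features, counts)]

# Three time frames of made-up log mel values (4 bins each).
log_mel = [
    [-3.1, -2.0, -4.5, -6.0],
    [-1.2, -0.8, -3.9, -5.1],
    [-5.0, -5.2, -5.9, -6.3],
]
# Ground-truth piano roll for the same frames (5 pitches, 1 = note active).
piano_roll = [
    [1, 0, 0, 0, 0],
    [1, 0, 1, 0, 0],
    [0, 0, 0, 0, 0],
]

augmented = append_pitch_count(log_mel, piano_roll)
```

Each augmented frame now carries one extra value (1, 2, and 0 active notes here) alongside the original spectral features.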

NMF Based Music Transcription Using Feature Vector Database (특징행렬 데이터베이스를 이용한 NMF 기반 음악전사)

Shin, Ok Keun;Ryu, Da Hyun
- Journal of Advanced Marine Engineering and Technology / v.36 no.8 / pp.1129-1135 / 2012
To transcribe music with NMF by extracting the feature matrix and the weight matrix at the same time, it is necessary to know the dimension of the feature matrix in advance and to determine the pitch of each extracted feature vector. Another drawback of this approach is that accurately extracting the feature matrix becomes more difficult as the number of pitches in the target music increases. In this study, we prepare a feature matrix database and apply the matrix to transcribe real music. Transcription experiments are conducted by applying the feature matrix to music played on the same piano from which the feature matrix was extracted, as well as to music played on another piano. These results are also compared with those of another experiment in which the feature matrix and weight matrix are extracted simultaneously, without using the database. We observed that the proposed method outperforms the method in which the two matrices are extracted at the same time.
https://doi.org/10.5916/jkosme.2012.36.8.1129
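
One way to read the database approach in the abstract above is NMF with the feature matrix W held fixed, so that only the weight matrix H is estimated. The sketch below uses the standard multiplicative update for the Euclidean cost, H <- H * (W^T V) / (W^T W H). The update rule, the two toy spectral templates, and the 0.5 activation threshold are assumptions for illustration and may differ from the paper's procedure.

```python
def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(m):
    return [list(col) for col in zip(*m)]

def activations(v, w, n_iter=200, eps=1e-9):
    """Estimate H >= 0 with V ~ W H while the dictionary W stays fixed."""
    wt = transpose(w)
    wtv = matmul(wt, v)
    wtw = matmul(wt, w)
    h = [[1.0] * len(v[0]) for _ in range(len(w[0]))]
    for _ in range(n_iter):
        wtwh = matmul(wtw, h)
        h = [[h[k][t] * wtv[k][t] / (wtwh[k][t] + eps) for t in range(len(h[0]))]
             for k in range(len(h))]
    return h

# Hypothetical database: one spectral template (column) per pitch, 4 frequency bins.
w = [[1.0, 0.0],
     [0.5, 0.5],
     [0.0, 1.0],
     [0.0, 0.2]]
# Observed spectrogram: pitch 0 alone, pitch 1 alone, then both together.
h_true = [[1.0, 0.0, 1.0],
          [0.0, 1.0, 1.0]]
v = matmul(w, h_true)

h = activations(v, w)
active = [[value > 0.5 for value in row] for row in h]
```

Because each row of H is tied to a known template, the pitch of every activation is known in advance, which is the labeling problem the abstract attributes to joint extraction.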

Structural Analysis Algorithm for Automatic Transcription 'Pansori' (판소리 자동채보를 위한 구조분석 알고리즘)

Ju, Young-Ho;Kim, Joon-Cheol;Seo, Kyoung-Suk;Lee, Joon-Whoan
- The Journal of the Korea Contents Association / v.14 no.2 / pp.28-38 / 2014
For Western music, there has been a large body of research on music information analysis for automatic transcription and content-based music retrieval, but similar research on Korean traditional music is hard to find. In this paper, we propose several algorithms to automatically analyze the structure of the Korean traditional music 'Pansori'. The proposed algorithm automatically distinguishes the 'sound' part from the 'speech' part, named 'sori' and 'aniri', respectively, using the ratio of phonetic and pause time intervals. To classify the rhythm, called 'jangdan', the algorithm makes a robust decision through a majority voting process based on template matching. An algorithm based on the Kalman filter is also suggested to detect the bar positions in the 'sori' part. Every algorithm proposed in the paper works well enough on the sample 'Pansori' recordings that the results may be used to transcribe 'Pansori' automatically.
https://doi.org/10.5392/JKCA.2014.14.02.028
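
The majority-voting idea in the abstract above can be sketched as follows: each observed rhythmic pattern votes for its nearest template, and the most frequent label wins, so a few noisy cycles do not change the decision. The template names, the four-value accent patterns, and the squared-distance measure are hypothetical; the paper's actual jangdan templates and matching score are not given in the abstract.

```python
from collections import Counter

def best_template(pattern, templates):
    """Name of the template with the smallest squared distance to the pattern."""
    def distance(name):
        return sum((a - b) ** 2 for a, b in zip(pattern, templates[name]))
    return min(templates, key=distance)

def classify_by_vote(patterns, templates):
    """Let every observed pattern vote for its nearest template; return the winner."""
    votes = Counter(best_template(p, templates) for p in patterns)
    return votes.most_common(1)[0][0]

# Hypothetical accent-strength templates for two rhythm classes.
templates = {
    "jangdan_a": [1.0, 0.0, 0.5, 0.0],
    "jangdan_b": [1.0, 0.5, 0.0, 0.5],
}
# Five observed cycles; the third and fifth are noisy and resemble the other class.
patterns = [
    [0.9, 0.1, 0.4, 0.0],
    [1.0, 0.0, 0.6, 0.1],
    [0.8, 0.6, 0.1, 0.4],
    [1.0, 0.1, 0.5, 0.1],
    [0.9, 0.4, 0.1, 0.5],
]

label = classify_by_vote(patterns, templates)
```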

Automatic Music Transcription Considering Time-Varying Tempo (가변 템포를 고려한 자동 음악 채보)

Ju, Youngho;Babukaji, Baniya;Lee, Joonwhan
- The Journal of the Korea Contents Association / v.12 no.11 / pp.9-19 / 2012
The time-varying tempo of a song is one source of error in identifying note durations in automatic music recognition. This paper proposes an improved music transcription scheme that identifies note durations while taking the time-varying tempo into account. In the proposed scheme, the measures are found first, and the tempo, defined as the playing time of each measure, is then estimated. The tempo is then used to resize each IOI (Inter-Onset Interval) and to identify the accurate note duration, which increases the degree of correspondence to the music piece. In the experiment, the proposed scheme found accurate measure positions for 14 of 16 monophonic children's songs recorded by men and women. It also achieved degrees of matching to the original music piece of about 89.4% for note duration identification and 84.8% for pitch identification.
https://doi.org/10.5392/JKCA.2012.12.11.009
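
The IOI resizing described in the abstract above can be sketched as a per-measure normalization: each inter-onset interval is divided by the playing time of the measure it starts in, converted to beats, and snapped to a duration grid. The 4-beat measure, the half-beat grid, and the sample timings are assumptions for illustration, and measure detection itself is taken as given.

```python
def note_durations(onsets, measure_starts, beats_per_measure=4, grid=0.5):
    """Convert inter-onset intervals (seconds) to beats using each measure's own length."""
    durations = []
    for onset, next_onset in zip(onsets, onsets[1:]):
        # Index of the measure in which this note starts.
        m = max(i for i, start in enumerate(measure_starts[:-1]) if start <= onset)
        measure_time = measure_starts[m + 1] - measure_starts[m]
        beats = (next_onset - onset) / measure_time * beats_per_measure
        durations.append(round(beats / grid) * grid)   # snap to the duration grid
    return durations

# Two measures: the first lasts 2.0 s, the second is sung slower and lasts 2.4 s.
measure_starts = [0.0, 2.0, 4.4]
# Note onsets in seconds; the final value marks the end of the last note.
onsets = [0.0, 0.5, 1.0, 2.0, 2.6, 3.2, 4.4]

durations = note_durations(onsets, measure_starts)
```

Both measures come out as the same rhythm (one beat, one beat, two beats) even though the raw intervals differ, which is the effect of estimating tempo per measure rather than once for the whole song.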

Gutenberg Galaxy and Music (구텐베르크 은하계와 음악)

KIM, Hyokyung
- Trans- / v.5 / pp.49-64 / 2018
Marshall McLuhan, a media scholar, coined the term Gutenberg Galaxy to mean the new environment formed by printing technology, and he insisted that it changed human life entirely. In human history, media evolved from oral communication through transcription to printing technology. This evolution of media and the environment created by media are the most important points of McLuhan's theory. He sees the world as the result of media evolution. In McLuhan's view, the Gutenberg Galaxy is the first environment composed by media. Based on McLuhan's theory, this study focuses on the environment created by media and applies it to Western music history. The link between the Gutenberg Galaxy and Western music, especially in the Romantic era, is the main subject of the study. The book is the most representative medium of printing technology. In the era of oral communication and transcription, communication was limited by spatial restrictions. The book, however, was free of spatial conditions, and this characteristic made knowledge free. The knowledge delivered by oral communication and transcription mostly concerned the mundane world, because it was so close to human life, even when it narrated God's world. The book, free to expand knowledge beyond the world, made knowledge transcendent and expanded human sight into the transcendent world. The modern Western world is the product of the knowledge expanded by the book, and so is its music. In the age of printing technology, music began to gain popularity through printed sheet music. Delivered through the printed sheet, music acquired transcendence and mystery as it met the spirit of the times. This link, formed in the age of the Gutenberg Galaxy, is the main focus of the study, which aims to demonstrate the link between media and Western music.

Extraction of Chord and Tempo from Polyphonic Music Using Sinusoidal Modeling

Kim, Do-Hyoung;Chung, Jae-Ho
- The Journal of the Acoustical Society of Korea / v.22 no.4E / pp.141-149 / 2003
As music in digital form has become widely used, many people have become interested in the automatic extraction of information inherent in the music itself, such as the key of a piece, chord progression, melody progression, and tempo. Although some studies have been attempted, consistent and reliable extraction of musical information has not been achieved. In this paper, we propose a method to extract chord and tempo information from general polyphonic music signals. A chord can be expressed as a combination of musical notes, and each of those notes in turn consists of several frequency components. Thus, extracting chord information requires analyzing the frequency components contained in the musical signal. In this study, we use sinusoidal modeling, which uses sinusoids corresponding to the frequencies of musical tones, and show that it yields reliable chord extraction results. We also found that the tempo of music, one of the remarkable features of a music signal, supports chord extraction when the two are used together. The proposed musical feature extraction scheme can be used in many application fields, such as digital music services that use queries on musical features, the operation of music databases, and music players with a chord display function.
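
The relationship between frequency components, notes, and chords stated in the abstract above can be illustrated with a small sketch. It assumes the sinusoidal analysis has already produced a list of note frequencies, maps each to a pitch class with the standard MIDI formula (A4 = 440 Hz), and picks the major or minor triad that covers the most pitch classes. The triad-only templates and the scoring rule are simplifying assumptions, not the paper's method.

```python
import math

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]
TRIADS = (("", (0, 4, 7)), ("m", (0, 3, 7)))   # major and minor triad intervals

def pitch_class(freq_hz):
    """Pitch class (0 = C) of the equal-tempered note nearest to freq_hz."""
    midi = round(69 + 12 * math.log2(freq_hz / 440.0))
    return midi % 12

def detect_chord(frequencies):
    """Return the triad whose notes cover the most detected pitch classes."""
    classes = {pitch_class(f) for f in frequencies}
    best_name, best_score = None, -1
    for root in range(12):
        for suffix, intervals in TRIADS:
            score = sum((root + i) % 12 in classes for i in intervals)
            if score > best_score:
                best_name, best_score = NOTE_NAMES[root] + suffix, score
    return best_name

# Frequencies of C4, E4, G4, and C5, as a sinusoidal analysis might report them.
chord = detect_chord([261.63, 329.63, 392.00, 523.25])
```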

