Search | Korea Science

Automatic melody extraction algorithm using a convolutional neural network

Lee, Jongseol;Jang, Dalwon;Yoon, Kyoungro
- KSII Transactions on Internet and Information Systems (TIIS) / v.11 no.12 / pp.6038-6053 / 2017
In this study, we propose an automatic melody extraction algorithm using deep learning. Feature images, generated from the energy of frequency bands, are extracted from polyphonic audio files, and a deep learning technique, a convolutional neural network (CNN), is applied to the feature images. In the training data, each short frame of polyphonic music is labeled with a musical note, and a CNN-based classifier is trained to determine the pitch value of a short frame of the audio signal. Because we aim to build a novel structure for melody extraction, the proposed algorithm is kept simple: instead of using various signal processing techniques, we use only a CNN to find the melody in polyphonic audio. Despite its simple structure, promising results are obtained in the experiments. Compared with state-of-the-art algorithms, the proposed algorithm did not give the best result, but its results were comparable, and we believe they could be improved with appropriate training data. In this paper, melody extraction and the proposed algorithm are introduced first, and the proposed algorithm is then explained in detail. Finally, we present our experiments and a comparison of the results.
https://doi.org/10.3837/tiis.2017.12.019
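
The abstract above describes training pairs made of a band-energy feature image and a musical-note label. The following is a minimal sketch of how such a pair might be prepared, not the authors' implementation: the function names, the frame and band sizes, the equal-width band split, and the use of MIDI note numbers as labels are all assumptions for illustration. The CNN itself is omitted.

```python
import numpy as np

def hz_to_midi(freq_hz):
    """Map a frequency in Hz to the nearest MIDI note number (A4 = 440 Hz = 69)."""
    return int(round(69 + 12 * np.log2(freq_hz / 440.0)))

def band_energy_image(signal, frame_len=1024, hop=512, n_bands=32):
    """Stack per-frame band energies into a 2-D array of shape (bands, frames)."""
    signal = np.asarray(signal, dtype=float)
    window = np.hanning(frame_len)
    columns = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len] * window
        power = np.abs(np.fft.rfft(frame)) ** 2
        # Assumption: equal-width frequency bands; the paper's band layout is not given here.
        bands = np.array_split(power, n_bands)
        columns.append([band.sum() for band in bands])
    return np.array(columns).T

# Hypothetical usage: one second of a 440 Hz tone sampled at 16 kHz.
fs = 16000
tone = np.sin(2 * np.pi * 440 * np.arange(fs) / fs)
image = band_energy_image(tone)   # input a classifier would see
label = hz_to_midi(440.0)         # note label for these frames
```

A classifier trained on such pairs would take a patch of `image` as input and predict `label` for the corresponding frame.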

Treatment Effect of a Modified Melodic Intonation Therapy (MMIT) in Korean Aphasics

Ko, Do-Heung;Jeong, Ok-Ran
- Speech Sciences / v.4 no.2 / pp.91-102 / 1998
The present study attempted to modify conventional Melodic Intonation Therapy (MIT) in three respects: the number of syllables in adjacent target utterances (ATU), the melody patterns of ATU, and initial listening to the melody and intoned speech with the eyes closed. The modified Melodic Intonation Therapy (MMIT) was applied to two Korean patients with severe aphasia. The patients exhibited severely nonfluent aphasia resulting from a left cerebrovascular accident (CVA). The purpose of the modification was to avoid perseveration and improve reflective listening skills. First, the treatment program avoided ATU with the same number of syllables. Second, four different melody patterns were developed: rising type, falling type, V-type, and inverted V-type. Each type of prosodic pattern was preceded and followed by a different type of melody. These two variations were intended to decrease perseverative behaviors. Finally, the patients kept their eyes closed while the clinician played and hummed a target melody at the initial stage of the program, in order to improve reflective listening skills. A single-subject alternating treatment design was used. The effects of MMIT were compared with those of conventional MIT. Varying the number of syllables and the type of melodic pattern decreased perseverative behaviors and produced more correct names. Initial listening to the target melody with the patients' eyes closed seemed to increase their attentiveness and result in more fluent production of target utterances. Probable reasons for the effectiveness of MMIT are discussed.

Representative Melodies Retrieval using Waveform and FFT Analysis of Audio (오디오의 파형과 FFT 분석을 이용한 대표 선율 검색)

Chung, Myoung-Bum;Ko, Il-Ju
- Journal of KIISE:Software and Applications / v.34 no.12 / pp.1037-1044 / 2007
Recently, content-based music retrieval systems have extracted a representative melody from each piece of music and used it as an index to reduce search time. Existing studies have extracted the representative melody from MIDI data, which has the weakness that only MIDI data can be used. Therefore, this paper proposes a representative melody retrieval method that uses digital signal processing and can be applied to all audio file formats. First, we use the Fast Fourier Transform (FFT) to find the tempo and the nodes for representative melody retrieval. We then measure how often high values appear in the PCM data of each node. The point where the high values are most concentrated is the starting point of the representative melody, and the eight nodes from the starting point form the representative melody section of the audio data. To verify the performance of the method, we chose a thousand songs and carried out an experiment to extract a representative melody from each. As a result, the accuracy of the extracted representative melodies was 79.5% among the 737 songs for which a tempo was found.
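
The selection step described above (count high values per node, start where they are most concentrated, keep eight nodes) can be sketched as follows. This is an illustrative reading of the abstract only: the node length is assumed to be already known from the tempo analysis, and `threshold_ratio`, which decides what counts as a "high value", is a hypothetical parameter.

```python
import numpy as np

def representative_section(pcm, node_len, threshold_ratio=0.8, n_nodes=8):
    """Return (start_node, end_node, counts) for the representative section.

    pcm        : 1-D array of PCM samples
    node_len   : samples per node (assumed to come from a prior tempo estimate)
    """
    amplitude = np.abs(np.asarray(pcm, dtype=float))
    n_total = len(amplitude) // node_len
    nodes = amplitude[:n_total * node_len].reshape(n_total, node_len)
    # Assumption: a "high value" is a sample at or above a fraction of the peak amplitude.
    threshold = threshold_ratio * amplitude.max()
    counts = (nodes >= threshold).sum(axis=1)
    start = int(np.argmax(counts))          # node where high values are most concentrated
    end = min(start + n_nodes, n_total)     # eight nodes from the starting point
    return start, end, counts

# Hypothetical usage: 20 quiet nodes with one loud node at index 5.
pcm = np.full(20 * 100, 0.1)
pcm[500:600] = 1.0
start, end, counts = representative_section(pcm, node_len=100)
```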

Improved Melody Recognition Performance of a Cochlear Implant Speech Processing Strategy Using Instantaneous Frequency Encoding Based on Teager Energy Operator

Choi, Sung-Jin;Ryu, Sang-Baek;Kim, Kyung-Hwan
- Journal of Biomedical Engineering Research / v.31 no.6 / pp.417-426 / 2010
We present a speech processing strategy incorporating instantaneous frequency (IF) encoding to enhance the melody recognition performance of cochlear implants. To extract the IF from incoming sound, we propose the use of a Teager energy operator (TEO), which is advantageous for its lower computational load. Using time-frequency analysis, we verified that the TEO-based method provides proper IF encoding of the input sound, which is crucial for melody recognition. A similar benefit could also be obtained with a Hilbert transform (HT), but at a much higher computational cost. The melody recognition performance of the proposed speech processing strategy was compared with that of a conventional strategy using envelope extraction and that of the HT-based IF encoding. Hearing tests on normal subjects were performed using acoustic simulation and a musical contour identification task. No significant difference in melody recognition performance was observed between the TEO-based and HT-based IF encodings, and both were superior to the conventional strategy. However, the TEO-based strategy had the advantage of being approximately 35% faster than the HT-based strategy.
https://doi.org/10.9718/JBER.2010.31.6.417
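
For readers unfamiliar with the Teager energy operator, the sketch below shows its discrete form, Ψ[x(n)] = x(n)² − x(n−1)·x(n+1), and one common way to turn it into a frequency estimate, the DESA-2 energy separation formula. The abstract does not say which TEO-based estimator the authors use, so DESA-2 and the function names here are assumptions for illustration.

```python
import numpy as np

def teager_energy(x):
    """Discrete Teager energy: psi[n] = x[n]^2 - x[n-1] * x[n+1] (interior samples only)."""
    x = np.asarray(x, dtype=float)
    return x[1:-1] ** 2 - x[:-2] * x[2:]

def desa2_frequency(x, fs):
    """Estimate instantaneous frequency in Hz with DESA-2 (valid below fs / 4)."""
    x = np.asarray(x, dtype=float)
    y = x[2:] - x[:-2]                    # y[n] = x[n+1] - x[n-1]
    psi_x = teager_energy(x)[1:-1]        # trimmed so it aligns with psi_y
    psi_y = teager_energy(y)
    cos_2omega = 1.0 - psi_y / (2.0 * psi_x)
    omega = 0.5 * np.arccos(np.clip(cos_2omega, -1.0, 1.0))
    return omega * fs / (2.0 * np.pi)

# Hypothetical usage: a pure 440 Hz tone sampled at 16 kHz.
fs = 16000
x = np.cos(2 * np.pi * 440 * np.arange(2000) / fs)
freq = desa2_frequency(x, fs)
```

Each output sample needs only a handful of multiplications and one arccosine, which is consistent with the lower computational load claimed for the TEO relative to a Hilbert transform. On noisy or silent input `psi_x` can approach zero, so a practical version would need to guard the division.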

Music Retrieval Using the Geometric Hashing Technique (기하학적 해싱 기법을 이용한 음악 검색)

Jung, Hyosook;Park, Seongbin
- The Journal of Korean Association of Computer Education / v.8 no.5 / pp.109-118 / 2005
In this paper, we present a music retrieval system that compares the geometric structure of a melody specified by a user with those in a music database. The system finds matches between a query melody and melodies in the database by analyzing both structural and contextual features. The retrieval method is based on the geometric hashing algorithm, which consists of two steps: a preprocessing step and a recognition step. During the preprocessing step, we divide a melody into several fragments and analyze the pitch and duration of each note in the fragments to find a structural feature. To find a contextual feature, we find a main chord for each fragment. During the recognition step, we divide the query melody specified by the user into several fragments and search the database for all fragments that are structurally and contextually similar to them. A vote is cast for each matching fragment, and the piece of music with the most total votes is the one that contains a melody matching the query melody. Using our approach, we can find similar melodies in a music database quickly. The method can also be applied to detect plagiarism in music.
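
The preprocessing, recognition, and voting steps described above can be illustrated with a much simplified sketch. It hashes only a structural feature (pitch and duration relative to the first note of each fragment) and omits the contextual chord feature; the fragment size, key format, and function names are assumptions, not the authors' design.

```python
from collections import defaultdict

def fragment_keys(notes, size=3):
    """Yield one hash key per fragment of `size` consecutive notes.

    `notes` is a list of (midi_pitch, duration) pairs. Pitches and durations are
    expressed relative to the fragment's first note, so the key is unchanged by
    transposition or a uniform tempo change.
    """
    for i in range(len(notes) - size + 1):
        fragment = notes[i:i + size]
        base_pitch, base_dur = fragment[0]
        yield tuple((p - base_pitch, round(d / base_dur, 2)) for p, d in fragment[1:])

def build_index(database):
    """Preprocessing step: map each fragment key to the titles that contain it."""
    index = defaultdict(set)
    for title, notes in database.items():
        for key in fragment_keys(notes):
            index[key].add(title)
    return index

def retrieve(index, query):
    """Recognition step: each matching fragment casts a vote; the top title wins."""
    votes = defaultdict(int)
    for key in fragment_keys(query):
        for title in index.get(key, ()):
            votes[title] += 1
    return max(votes, key=votes.get) if votes else None

# Hypothetical usage: the query is song "A" transposed up a fourth at double speed.
database = {
    "A": [(60, 1), (62, 1), (64, 2), (65, 1), (67, 2)],
    "B": [(60, 1), (60, 1), (67, 1), (67, 1), (69, 1)],
}
index = build_index(database)
query = [(65, 0.5), (67, 0.5), (69, 1), (70, 0.5), (72, 1)]
match = retrieve(index, query)
```

Because lookups are dictionary accesses, the cost of a query depends on the number of query fragments rather than on the size of the database, which is the reason hashing-based retrieval can be fast.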

An Automatic Rhythm and Melody Composition System Considering User Parameters and Chord Progression Based on a Genetic Algorithm (유전알고리즘 기반의 사용자 파라미터 설정과 코드 진행을 고려한 리듬과 멜로디 자동 작곡 시스템)

Jeong, Jaehun;Ahn, Chang Wook
- Journal of KIISE / v.43 no.2 / pp.204-211 / 2016
In this paper, we propose an automatic melody composition system that can generate a sophisticated melody by adding non-harmonic tones to a given chord progression. The overall procedure consists of two parts: rhythm generation and melody generation. In the rhythm generation part, we designed new fitness functions for rhythm that can be controlled through user-set parameters. In the melody generation part, we designed new fitness functions for melody based on harmony theory. We also designed evolutionary operators that take the musical context into account to improve computational efficiency. In the experiments, we compared four metaheuristics for optimizing the rhythm fitness functions: Simple Genetic Algorithm (SGA), Elitism Genetic Algorithm (EGA), Differential Evolution (DE), and Particle Swarm Optimization (PSO). Furthermore, we compared the proposed genetic algorithm for melody with the four algorithms to verify its performance. In addition, composition results are presented and analyzed with respect to musical correctness.
https://doi.org/10.5626/JOK.2016.43.2.204
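
As a rough illustration of how a genetic algorithm with elitism can search for a melody over a chord progression, consider the toy sketch below. The fitness function (reward chord tones and stepwise motion), the chord table, and every parameter are assumptions chosen for brevity; the paper's harmony-theory fitness functions and context-aware operators are not reproduced here.

```python
import random

# Pitch classes of each chord (C major triads only; a hypothetical, minimal table).
CHORD_TONES = {"C": {0, 4, 7}, "F": {5, 9, 0}, "G": {7, 11, 2}}

def fitness(melody, chords):
    """Toy fitness: 2 points per chord tone, 1 point per step of at most 2 semitones."""
    chord_score = sum(1 for n, c in zip(melody, chords) if n % 12 in CHORD_TONES[c])
    step_score = sum(1 for a, b in zip(melody, melody[1:]) if abs(a - b) <= 2)
    return 2 * chord_score + step_score

def evolve(chords, pop_size=30, generations=50, mutation_rate=0.1, seed=0):
    """Evolve one MIDI note per chord; returns (best melody, best fitness per generation)."""
    rng = random.Random(seed)
    n = len(chords)
    pop = [[rng.randint(60, 72) for _ in range(n)] for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        pop.sort(key=lambda m: fitness(m, chords), reverse=True)
        history.append(fitness(pop[0], chords))
        next_pop = [pop[0][:]]                           # elitism: keep the best unchanged
        while len(next_pop) < pop_size:
            p1, p2 = rng.sample(pop[:pop_size // 2], 2)  # parents from the better half
            cut = rng.randrange(1, n)                    # one-point crossover
            child = p1[:cut] + p2[cut:]
            child = [rng.randint(60, 72) if rng.random() < mutation_rate else g
                     for g in child]
            next_pop.append(child)
        pop = next_pop
    return max(pop, key=lambda m: fitness(m, chords)), history

# Hypothetical usage with an eight-chord progression.
chords = ["C", "C", "F", "F", "G", "G", "C", "C"]
best, history = evolve(chords)
```

Because the best individual is copied unchanged into each new generation, the best fitness can never decrease from one generation to the next.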

Humming based High Quality Music Creation (허밍을 이용한 고품질 음악 생성)

Lee, Yoonjae;Kim, Sunmin
- Proceedings of the Korean Society for Noise and Vibration Engineering Conference / 2014.10a / pp.146-149 / 2014
In this paper, a humming-based automatic music creation method is described. Composing music is generally difficult for members of the general public who have no knowledge of music theory. However, almost anyone can create a main melody by humming. With this motivation, melody and chord sequences are estimated by analyzing the humming. In this paper, the humming is produced without a metronome. Then, based on the estimated chord sequence, an accompaniment is generated using the MIDI template matched to each chord. Five genres are supported in the music creation. The melody transcription is evaluated in terms of onset and pitch estimation accuracy, and a MOS evaluation is used to evaluate the created music.

Study on the song title query by humming melody information (허밍 운율정보를 이용한 곡목 검색 기술)

Lee Ji-Yeoun;Hahn Min-Soo
- MALSORI / no.44 / pp.131-143 / 2002
Music query by humming is a challenging problem because the humming signal inevitably contains considerable variation and inaccuracy. In this paper, we suggest an algorithm for querying a desired song from a music database by humming its melody. To accommodate people's inaccurate humming, a new melody representation technique is proposed. Our algorithm is based on pitch and duration information and performs fairly well. A correct query rate of 85% is achieved for the top 3 matches when tested with 20 songs.
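
The abstract does not describe its melody representation, so the sketch below only illustrates the general idea of matching on pitch and duration while tolerating inaccurate humming. It encodes each note transition as a coarse symbol (pitch up, down, or same; duration longer, shorter, or equal) and ranks songs by edit distance. The symbol scheme, the tolerances, and the function names are all assumptions.

```python
def contour(notes, pitch_tol=0.5, dur_tol=0.25):
    """Encode (pitch, duration) notes as coarse transition symbols such as 'U=' or 'DL'."""
    symbols = []
    for (p0, d0), (p1, d1) in zip(notes, notes[1:]):
        pitch = "S" if abs(p1 - p0) <= pitch_tol else ("U" if p1 > p0 else "D")
        ratio = d1 / d0
        dur = "=" if abs(ratio - 1) <= dur_tol else ("L" if ratio > 1 else "s")
        symbols.append(pitch + dur)
    return symbols

def edit_distance(a, b):
    """Levenshtein distance between two sequences."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (x != y)))
        prev = cur
    return prev[-1]

def top_matches(query, database, k=3):
    """Return the k titles whose contours are closest to the query's contour."""
    q = contour(query)
    return sorted(database, key=lambda t: edit_distance(q, contour(database[t])))[:k]

# Hypothetical usage: an off-key, unevenly timed hum of the first melody.
database = {
    "twinkle": [(60, 1), (60, 1), (67, 1), (67, 1), (69, 1), (69, 1), (67, 2)],
    "scale_up": [(60, 1), (62, 1), (64, 1), (65, 1), (67, 1), (69, 1), (71, 1)],
    "scale_down": [(71, 1), (69, 1), (67, 1), (65, 1), (64, 1), (62, 1), (60, 1)],
}
hum = [(58.3, 0.9), (58.1, 1.0), (65.2, 1.1), (65.0, 1.0),
       (67.3, 1.0), (66.9, 1.1), (65.1, 2.1)]
ranking = top_matches(hum, database, k=3)
```

Returning the top three candidates rather than a single best match mirrors the way the abstract reports its results.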

Indexing and Retrieval Mechanism using Variation Patterns of Theme Melodies in Content-based Music Information Retrievals (내용 기반 음악 정보 검색에서 주제 선율의 변화 패턴을 이용한 색인 및 검색 기법)

구경이;신창환;김유성
- Journal of KIISE:Databases / v.30 no.5 / pp.507-520 / 2003
In this paper, we propose an automatic method for constructing a theme melody index for a large music database, together with an associative content-based music retrieval mechanism that mainly uses the constructed theme melody index to improve response time for users. First, the system automatically extracts the theme melody from a music file using a graphical clustering algorithm based on the similarities between motifs of the music. To place an extracted theme melody in the metric space of an M-tree, we chose the average length variation and the average pitch variation of the theme melody as the major features. Moreover, we added a pitch signature and a length signature, which summarize the pitch variation pattern and the length variation pattern of a theme melody, respectively, to increase the precision of retrieval results. We also propose an associative content-based music retrieval mechanism in which the k-nearest neighbor search and range search algorithms of the M-tree are used to select melodies similar to the user's query melody from the theme melody index. To improve user satisfaction, the proposed retrieval mechanism includes ranking and user relevance feedback functions. We also implemented the proposed mechanisms as essential components of a content-based music retrieval system to verify their usefulness.
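
To make the two major features and the two query types concrete, the sketch below computes an (average pitch variation, average length variation) point for a melody and answers k-nearest neighbor and range queries over such points. Two simplifications are assumed: "variation" is taken to be the mean absolute difference between successive notes, and a brute-force scan stands in for the M-tree, which would return the same answers while pruning most distance computations.

```python
import math

def melody_features(notes):
    """Map (midi_pitch, length) notes to (avg pitch variation, avg length variation)."""
    pairs = list(zip(notes, notes[1:]))
    pitch_var = sum(abs(b[0] - a[0]) for a, b in pairs) / len(pairs)
    length_var = sum(abs(b[1] - a[1]) for a, b in pairs) / len(pairs)
    return (pitch_var, length_var)

def knn_search(index, query_point, k):
    """Return the k titles whose feature points are nearest to the query point."""
    return sorted(index, key=lambda t: math.dist(index[t], query_point))[:k]

def range_search(index, query_point, radius):
    """Return every title whose feature point lies within `radius` of the query point."""
    return [t for t in index if math.dist(index[t], query_point) <= radius]

# Hypothetical usage with three indexed theme melodies.
index = {"a": (2.0, 0.5), "b": (5.0, 1.0), "c": (2.2, 0.4)}
query_point = (2.1, 0.5)
nearest = knn_search(index, query_point, k=2)
within = range_search(index, query_point, radius=0.2)
```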

Learning French Intonation with a Base of the Visualization of Melody (억양의 시각화를 통한 프랑스어의 억양학습)

Lee, Jung-Won
- Speech Sciences / v.10 no.4 / pp.63-71 / 2003
This study reports an experiment on learning French intonation based on the visualization of melody, a technique that was employed in the early sixties to reeducate people with communication disorders. In this paper, however, the visualization of melody was applied to foreign language learning and produced successful results in many ways, especially in learning foreign intonation. We used PitchWorks to visualize some French intonation samples and conducted an experiment on learning intonation based on a bitmap picture projected on a screen. The students could see the melody curve while listening to the sentences. As the results of the experiment verify, the students achieved a great deal in learning intonation. They were much more motivated to learn and showed greater improvement in recognizing intonation contours than when learning by hearing alone. However, the lack of animation in the bitmap file could reduce the experiment to nothing more than boring pattern practice. It would be better to use a sound analyzer designed to analyze pitch, such as PitchWorks, so that the students can actually see their own fluctuating intonation visualized on the screen.

