• Title/Summary/Keyword: Classification of Music


Musical Genre Classification Based on Deep Residual Auto-Encoder and Support Vector Machine

  • Xue Han;Wenzhuo Chen;Changjian Zhou
    • Journal of Information Processing Systems / v.20 no.1 / pp.13-23 / 2024
  • Music brings pleasure and relaxation to people, so it is necessary to classify musical genres by scene. Identifying favorite musical genres from massive music data is a time-consuming and laborious task. Recent studies have suggested that machine learning algorithms are effective at distinguishing between musical genres, but meeting practical requirements for accuracy or timeliness remains challenging. In this study, a hybrid machine learning model that combines a deep residual auto-encoder (DRAE) and a support vector machine (SVM) is proposed for musical genre recognition. Eight features manually extracted from the Mel-frequency cepstral coefficients (MFCC) were employed in the preprocessing stage as the hybrid music data source. During the training stage, the DRAE extracted feature maps, which were then used as input to the SVM classifier. The experimental results indicate that this method achieves a 91.54% F1-score and 91.58% top-1 accuracy, outperforming existing approaches. By combining a deep architecture with a conventional machine learning algorithm, this approach opens a new direction for musical genre classification tasks.
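The MFCC front end this abstract relies on can be sketched with plain NumPy. The frame length, hop size, filter count, and the synthetic test tone below are illustrative assumptions, not the paper's settings.

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sr):
    # Triangular filters spaced evenly on the mel scale
    mels = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mels) / sr).astype(int)
    fb = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(1, n_filters + 1):
        l, c, r = bins[i - 1], bins[i], bins[i + 1]
        for k in range(l, c):
            fb[i - 1, k] = (k - l) / max(c - l, 1)
        for k in range(c, r):
            fb[i - 1, k] = (r - k) / max(r - c, 1)
    return fb

def mfcc(signal, sr=22050, n_fft=512, hop=256, n_filters=26, n_ceps=13):
    # Frame, window, magnitude spectrum
    frames = [signal[s:s + n_fft] * np.hanning(n_fft)
              for s in range(0, len(signal) - n_fft + 1, hop)]
    mag = np.abs(np.fft.rfft(np.array(frames), axis=1))
    # Mel filterbank energies -> log -> DCT-II gives the cepstral coefficients
    energies = np.log(mag @ mel_filterbank(n_filters, n_fft, sr).T + 1e-10)
    n = np.arange(n_filters)
    dct = np.cos(np.pi * np.outer(np.arange(n_ceps), 2 * n + 1) / (2 * n_filters))
    return energies @ dct.T

# A 0.5 s 440 Hz tone as a stand-in for real audio
t = np.arange(0, 0.5, 1 / 22050)
coeffs = mfcc(np.sin(2 * np.pi * 440 * t))
```

Each row of `coeffs` is a 13-dimensional cepstral vector for one frame; a model such as the DRAE+SVM pipeline would consume statistics derived from such frames.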

Deep Learning Music Genre Classification System Model Improvement Using Generative Adversarial Networks (GAN) (생성적 적대 신경망(GAN)을 이용한 딥러닝 음악 장르 분류 시스템 모델 개선)

  • Bae, Jun
    • Journal of the Korea Institute of Information and Communication Engineering / v.24 no.7 / pp.842-848 / 2020
  • Music markets have entered the streaming era, and there is active demand for and research on automatic music genre classification systems that select and propose music matching consumers' tastes. We use a generative adversarial network (GAN) to further develop the Softmax-based deep learning music genre classification system proposed in our previous paper, improving its accuracy on songs that system could not classify. In the previous study, when a song's spectrogram was too ambiguous to determine its genre, the song had to be left unclassified. In this paper, we propose a system that increases genre classification accuracy for such songs by using a GAN to convert their spectrograms into ones that are easier to read. The experimental results are superior to those of the existing method.

Music Genre Classification based on Musical Features of Representative Segments (대표구간의 음악 특징에 기반한 음악 장르 분류)

  • Lee, Jong-In;Kim, Byeong-Man
    • Journal of KIISE: Software and Applications / v.35 no.11 / pp.692-700 / 2008
  • In some previous work on musical genre classification, human experts specify the segments of a song from which musical features are extracted. Although this approach can enhance performance, it requires manual intervention and thus cannot easily be applied to new incoming songs. To extract musical features without manual intervention, most recent research on music genre classification extracts features from a predetermined part of a song (for example, the 30 seconds after the initial 30 seconds), which may cause a loss of accuracy. In this paper, to alleviate this accuracy problem, we propose a new method that extracts features from representative segments (the main theme part) identified by structural analysis of the music piece. The proposed method detects segments with repeated melody in a song and selects representative ones among them by considering their positions and energies. Experimental results show that the proposed method significantly improves accuracy compared with the approach using a predetermined part.
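The repeated-melody detection step can be illustrated with a frame-to-frame self-similarity matrix: repetition shows up as a high-scoring diagonal at the lag separating the two occurrences. The one-hot "motif" frames below are a toy stand-in for real audio features, and the diagonal-mean scoring rule is a simplified assumption, not the authors' exact structure analysis.

```python
import numpy as np

def repeated_segment_lag(features, min_lag=1):
    """Return the time lag (in frames) with the strongest repetition.

    features: (n_frames, dim) array of per-frame feature vectors.
    A repeated section produces a high-mean diagonal in the
    cosine-similarity matrix at the lag between its occurrences.
    """
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    unit = features / np.maximum(norms, 1e-10)
    sim = unit @ unit.T
    n = len(features)
    scores = [sim.diagonal(lag).mean() for lag in range(min_lag, n)]
    return min_lag + int(np.argmax(scores))

# Toy "song": motif A (frames 0-9) repeats at frames 20-29
basis = np.eye(30)
a, b, c = basis[:10], basis[10:20], basis[20:30]
song = np.vstack([a, b, a, c])
lag = repeated_segment_lag(song, min_lag=5)   # strongest repetition at lag 20
```

In a real system, the frames within the detected repeated region (weighted by position and energy, as the abstract describes) would be the representative segment used for feature extraction.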

An Implementation of a Classification and Recommendation Method for a Music Player Using Customized Emotion (맞춤형 감성 뮤직 플레이어를 위한 음악 분류 및 추천 기법 구현)

  • Song, Yu-Jeong;Kang, Su-Yeon;Ihm, Sun-Young;Park, Young-Ho
    • KIPS Transactions on Software and Data Engineering / v.4 no.4 / pp.195-200 / 2015
  • Recently, most people use Android-based smartphones, and every smartphone has a music player, yet it is hard to find a personalized music player that reflects the user's preferences. In this paper, we propose an emotion-based music player that analyzes and classifies music by the user's emotion, recommends music, applies the user's preferences, and visualizes the music with color. With the proposed music player, users can select music easily through an application optimized for them.

Implementation of Melody Generation Model Through Weight Adaptation of Music Information Based on Music Transformer (Music Transformer 기반 음악 정보의 가중치 변형을 통한 멜로디 생성 모델 구현)

  • Seunga Cho;Jaeho Lee
    • IEMEK Journal of Embedded Systems and Applications / v.18 no.5 / pp.217-223 / 2023
  • In this paper, we propose a new model for the conditional generation of music that considers key and rhythm, fundamental elements of music. MIDI sheet music is converted into WAV format, which is then transformed into a Mel spectrogram using the short-time Fourier transform (STFT). From this representation, key and rhythm are classified by two convolutional neural networks (CNNs), and this information is fed into the Music Transformer. The key and rhythm information is combined with the embedding vectors of the MIDI events by multiplying each by its own weight. Several experiments are conducted, including a procedure for determining the optimal weights. This research is a new effort to integrate these essential elements into music generation; it explains the detailed structure and operating principles of the model and verifies its effects and potential through experiments. In this study, rhythm classification accuracy reached 94.7%, key classification accuracy reached 92.1%, and the negative log-likelihood under the chosen embedding weights was 3.01.
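The weighted combination of conditioning information with MIDI-event embeddings might look roughly like the following sketch. The additive form, the shapes, and the weight values are assumptions for illustration; the abstract states only that weights multiply the embedding vectors.

```python
import numpy as np

def condition_embeddings(event_emb, key_emb, rhythm_emb, w_key, w_rhythm):
    # Scale each conditioning embedding by its weight and add it to
    # every MIDI-event embedding (broadcast over the sequence axis).
    return event_emb + w_key * key_emb + w_rhythm * rhythm_emb

rng = np.random.default_rng(1)
events = rng.normal(size=(16, 64))   # 16 MIDI events, 64-dim embeddings
key = rng.normal(size=64)            # key embedding from the key CNN
rhythm = rng.normal(size=64)         # rhythm embedding from the rhythm CNN
conditioned = condition_embeddings(events, key, rhythm, w_key=0.3, w_rhythm=0.7)
```

The "optimal weights" experiments in the paper would then amount to sweeping `w_key` and `w_rhythm` and comparing the resulting generation losses.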

Client-driven Music Genre Classification Framework (클라이언트 중심의 음악 장르 분류 프레임워크)

  • Mujtaba, Ghulam;Park, Eun-Soo;Kim, Seunghwan;Ryu, Eun-Seok
    • Proceedings of the Korean Society of Broadcast Engineers Conference / 2020.07a / pp.714-716 / 2020
  • We propose a client-driven music genre classification solution that identifies the music genre using a deep convolutional neural network operating on the time-domain signal. The proposed method uses the computational resources of the client device (a Jetson TX2) to identify the genre. We use the widely adopted GTZAN genre collection dataset for reliable benchmarking. HTTP live streaming (HLS) client and server sides are implemented locally to validate the effectiveness of the proposed method, and a persistent HTTP broadcast connection is used to reduce response overhead and network bandwidth. The proposed model identifies the genre of music files with 97% accuracy and, owing to its simplicity, can support a wide range of client hardware.


Noise Source Localization by Applying MUSIC with Wavelet Transformation (웨이블렛 변환과 MUSIC 기법을 이용한 소음원 추적)

  • Cho, Tae-Hwan;Ko, Byeong-Sik;Lim, Jong-Myung
    • Transactions of the Korean Society of Automotive Engineers / v.16 no.2 / pp.18-28 / 2008
  • In inverse acoustic problems with nearfield sources, it is important to separate multiple acoustic sources and to measure the position of each target. This paper proposes a new algorithm that applies MUSIC (Multiple Signal Classification) to the outputs of a discrete wavelet transformation, with sub-bands selected by an entropy threshold. Numerical experiments show that the proposed method estimates positions more precisely than the conventional MUSIC algorithm for moderately correlated signals at relatively low signal-to-noise ratios.
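The core MUSIC step shared by the proposed method and the conventional baseline, scanning a noise-subspace pseudospectrum for peaks, can be sketched for a farfield uniform linear array. The array geometry, source angle, and noise level below are illustrative assumptions, and the wavelet sub-band selection stage is omitted.

```python
import numpy as np

def music_spectrum(snapshots, n_sources, angles_deg, d=0.5):
    """MUSIC pseudospectrum over a grid of angles.

    snapshots: (n_sensors, n_snapshots) complex array outputs.
    d: element spacing in wavelengths (uniform linear array).
    """
    m = snapshots.shape[0]
    r = snapshots @ snapshots.conj().T / snapshots.shape[1]  # sample covariance
    _, vecs = np.linalg.eigh(r)             # eigenvalues in ascending order
    noise = vecs[:, : m - n_sources]        # noise subspace
    spec = []
    for th in np.deg2rad(angles_deg):
        a = np.exp(-2j * np.pi * d * np.arange(m) * np.sin(th))  # steering vector
        p = a.conj() @ noise
        # Pseudospectrum peaks where the steering vector is (near-)orthogonal
        # to the noise subspace
        spec.append(1.0 / np.real(p @ p.conj()))
    return np.array(spec)

# One source at 20 degrees, 8-element array, light sensor noise
rng = np.random.default_rng(2)
m, n = 8, 200
a_true = np.exp(-2j * np.pi * 0.5 * np.arange(m) * np.sin(np.deg2rad(20.0)))
s = rng.normal(size=n) + 1j * rng.normal(size=n)
x = np.outer(a_true, s) + 0.01 * (rng.normal(size=(m, n)) + 1j * rng.normal(size=(m, n)))
grid = np.arange(-90, 91)
est = grid[int(np.argmax(music_spectrum(x, 1, grid)))]
```

The wavelet preprocessing in the paper would replace `snapshots` with entropy-selected sub-band outputs before this covariance/eigendecomposition step.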

Multiple octave-band based genre classification algorithm for music recommendation (음악추천을 위한 다중 옥타브 밴드 기반 장르 분류기)

  • Lim, Shin-Cheol;Jang, Sei-Jin;Lee, Seok-Pil;Kim, Moo-Young
    • Journal of the Korea Institute of Information and Communication Engineering / v.15 no.7 / pp.1487-1494 / 2011
  • In this paper, a novel genre classification algorithm is proposed for music recommendation systems. To improve classification accuracy, the band-pass filters for the octave-based spectral contrast (OSC) feature are designed considering the psycho-acoustic model and the actual frequency ranges of musical instruments. The GTZAN database, comprising 10 genres, was used for 10-fold cross-validation experiments. The proposed multiple-octave-band OSC improves accuracy by 2.26% compared with the conventional OSC, and a combined feature vector based on the proposed OSC and mel-frequency cepstral coefficients (MFCC) gives even better accuracy.
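The OSC feature itself can be sketched as a peak-minus-valley measure per octave band of a magnitude spectrum. The simple doubling band edges, the lowest band edge, and the α fraction below are generic assumptions; the paper's contribution is precisely a more careful, psycho-acoustically motivated band design.

```python
import numpy as np

def octave_spectral_contrast(mag, sr, n_fft, f_low=200.0, alpha=0.2):
    """Peak/valley spectral contrast per octave band for one magnitude
    spectrum of length n_fft // 2 + 1. Band edges double per band."""
    freqs = np.fft.rfftfreq(n_fft, 1.0 / sr)
    edges, f = [0.0], f_low
    while f < sr / 2:
        edges.append(f)
        f *= 2.0
    edges.append(sr / 2)
    contrasts = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        band = mag[(freqs >= lo) & (freqs < hi)]
        if band.size == 0:
            contrasts.append(0.0)
            continue
        k = max(1, int(alpha * band.size))
        s = np.sort(band)
        valley = np.log(s[:k].mean() + 1e-10)   # average of the lowest bins
        peak = np.log(s[-k:].mean() + 1e-10)    # average of the highest bins
        contrasts.append(peak - valley)
    return np.array(contrasts)

sr, n_fft = 22050, 1024
tone = np.sin(2 * np.pi * 1000 * np.arange(n_fft) / sr)
spec = np.abs(np.fft.rfft(np.hanning(n_fft) * tone))
osc = octave_spectral_contrast(spec, sr, n_fft)   # one contrast value per band
```

Concatenating these per-band contrasts (and, as in the paper, MFCCs) over many frames yields the feature vector fed to the genre classifier.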

A Study on ISAR Imaging Algorithm for Radar Target Recognition (표적 구분을 위한 ISAR 영상 기법에 대한 연구)

  • Park, Jong-Il;Kim, Kyung-Tae
    • The Journal of Korean Institute of Electromagnetic Engineering and Science / v.19 no.3 / pp.294-303 / 2008
  • ISAR (Inverse Synthetic Aperture Radar) images represent the 2-D (two-dimensional) spatial distribution of the RCS (radar cross section) of an object, and they can be applied to the problem of target identification. The traditional approach to ISAR imaging uses a 2-D IFFT (inverse fast Fourier transform). However, the 2-D IFFT yields low-resolution ISAR images, especially when the measured frequency bandwidth and angular region are limited. To improve on the resolution of the Fourier transform, various high-resolution spectral estimation approaches have been applied to ISAR imaging, such as AR (auto-regressive), MUSIC (Multiple Signal Classification), and modified MUSIC algorithms. In this study, these high-resolution spectral estimators, as well as the 2-D IFFT approach, are combined with a recently developed ISAR image classification algorithm, and their performance is carefully analyzed and compared in the framework of radar target recognition.
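The baseline 2-D IFFT imaging step can be demonstrated on synthetic data: a single point scatterer produces a 2-D complex exponential across frequency steps and aspect angles, and the IFFT focuses it to one pixel. The grid sizes and the scatterer's normalized coordinates below are illustrative assumptions.

```python
import numpy as np

def isar_image(field, nx=64, ny=64):
    """Form an ISAR image by 2-D IFFT of frequency/aspect-domain data,
    zero-padded to nx x ny, with the image centered via fftshift."""
    return np.fft.fftshift(np.abs(np.fft.ifft2(field, s=(nx, ny))))

# Synthetic measurement: 32 frequency steps x 32 aspect angles
nf, na = 32, 32
f_idx, a_idx = np.meshgrid(np.arange(nf), np.arange(na), indexing="ij")
u, v = 0.25, -0.125          # scatterer position in normalized cycles/sample
data = np.exp(2j * np.pi * (u * f_idx + v * a_idx))
img = isar_image(data, 64, 64)
peak = np.unravel_index(np.argmax(img), img.shape)   # scatterer pixel
```

The limited-aperture resolution problem the abstract describes corresponds to the sinc-shaped sidelobes around this peak, which AR- or MUSIC-based estimators replace with sharper spectral estimates.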

Music Genre Classification using Spikegram and Deep Neural Network (스파이크그램과 심층 신경망을 이용한 음악 장르 분류)

  • Jang, Woo-Jin;Yun, Ho-Won;Shin, Seong-Hyeon;Cho, Hyo-Jin;Jang, Won;Park, Hochong
    • Journal of Broadcast Engineering / v.22 no.6 / pp.693-701 / 2017
  • In this paper, we propose a new method for music genre classification using spikegrams and a deep neural network. The human auditory system encodes incoming sound in the time and frequency domains so as to maximize the amount of information delivered to the brain using minimal energy and resources. A spikegram is a waveform analysis method based on this encoding function of the auditory system. In the proposed method, we analyze the signal using a spikegram and extract a feature vector composed of the key information for genre classification, which serves as the input to the neural network. We measure music genre classification performance on the GTZAN dataset, which consists of 10 music genres, and confirm that the proposed method performs well with a low-dimensional feature vector compared with current state-of-the-art methods.
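A spikegram is typically computed by matching pursuit over shift-invariant auditory kernels (e.g. gammatones), with each iteration emitting one "spike" (kernel, time, amplitude). The sketch below shows only that greedy loop, using generic toy kernels rather than auditory ones as an assumption.

```python
import numpy as np

def matching_pursuit(signal, dictionary, n_spikes):
    """Greedy matching pursuit: each iteration finds the unit-norm atom
    and shift most correlated with the residual, records a spike
    (atom index, shift, amplitude), and subtracts it."""
    residual = signal.astype(float).copy()
    spikes = []
    for _ in range(n_spikes):
        best = None
        for ai, atom in enumerate(dictionary):
            # Correlate the atom with the residual at every valid shift
            corr = np.correlate(residual, atom, mode="valid")
            shift = int(np.argmax(np.abs(corr)))
            if best is None or abs(corr[shift]) > abs(best[2]):
                best = (ai, shift, corr[shift])
        ai, shift, amp = best
        residual[shift:shift + len(dictionary[ai])] -= amp * dictionary[ai]
        spikes.append((ai, shift, amp))
    return spikes, residual

# Two unit-norm toy kernels; the signal is built from shifted copies of them
k1 = np.hanning(16); k1 /= np.linalg.norm(k1)
k2 = np.sin(np.linspace(0, 4 * np.pi, 16)); k2 /= np.linalg.norm(k2)
sig = np.zeros(64)
sig[5:21] += 2.0 * k1
sig[30:46] -= 1.5 * k2
spikes, residual = matching_pursuit(sig, [k1, k2], n_spikes=2)
```

The resulting sparse spike list is the spikegram; genre-classification features are then summarized from the distribution of spike kernels, times, and amplitudes.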