Search | Korea Science

Multiclass Music Classification Approach Based on Genre and Emotion

Jonghwa Kim
- International Journal of Internet, Broadcasting and Communication, v.16 no.3, pp.27-32, 2024
Reliable and fine-grained musical metadata are required for efficient search of the rapidly increasing number of music files. In particular, since the primary motives for listening to music are its emotional effect, diversion, and the memories it awakens, emotion classification is crucial alongside genre classification. In this paper, as an initial approach towards a "ground-truth" dataset for music emotion and genre classification, we carefully built a music corpus from labels provided by a large number of ordinary people. To verify the suitability of the dataset through classification results, we extracted features according to the MPEG-7 audio standard and applied different machine learning models, based on statistics and on deep neural networks, to classify the dataset automatically. Using standard hyperparameter settings, we reached an accuracy of 93% for genre classification and 80% for emotion classification, and we believe that our dataset can serve as a meaningful comparative dataset in this research field.
https://doi.org/10.7236/IJIBC.2024.16.3.27

Background music monitoring framework and dataset for TV broadcast audio

Hyemi Kim;Junghyun Kim;Jihyun Park;Seongwoo Kim;Chanjin Park;Wonyoung Yoo
- ETRI Journal, v.46 no.4, pp.697-707, 2024
Music identification is widely regarded as a solved problem for music searching in quiet environments, but its performance tends to degrade in TV broadcast audio owing to the presence of dialogue or sound effects. In addition, constructing an accurate dataset for measuring the performance of background music monitoring in TV broadcast audio is challenging. We propose a framework for monitoring background music by automatic identification and introduce a background music cue sheet. The framework comprises three main components: music identification, music-speech separation, and music detection. In addition, we introduce the Cue-K-Drama dataset, which includes reference songs, audio tracks from 60 episodes of five Korean TV drama series, and corresponding cue sheets that provide the start and end timestamps of background music. Experimental results on the constructed and existing datasets demonstrate that the proposed framework, which incorporates music identification with music-speech separation and music detection, effectively enhances TV broadcast audio monitoring.
https://doi.org/10.4218/etrij.2023-0249

Attention-based CNN-BiGRU for Bengali Music Emotion Classification

Subhasish Ghosh;Omar Faruk Riad
- International Journal of Computer Science & Network Security, v.23 no.9, pp.47-54, 2023
Deep learning models, particularly CNNs and RNNs, are frequently used for Bengali music emotion classification, but previous research suffered from low accuracy and overfitting. In this research, we design an attention-based Conv1D and BiGRU model for emotion classification on our Bengali music dataset, and comparative experiments show that the proposed model classifies emotions more accurately. WAV preprocessing makes use of MFCCs. To reduce the dimensionality of the feature space, contextual features are extracted by two Conv1D layers. Dropout is used to address overfitting. Two bidirectional GRU networks update the previous and future emotion representations of the output from the Conv1D layers. The two BiGRU layers are connected to an attention mechanism that gives various MFCC feature vectors more attention. Moreover, the attention mechanism has increased the accuracy of the proposed classification model. The resulting vector is finally classified into four emotion classes (Angry, Happy, Relax, and Sad) using a dense, fully connected layer with softmax activation. The proposed Conv1D+BiGRU+Attention model is more efficient at classifying emotions in the Bengali music dataset than baseline methods. On our Bengali music dataset, the performance of the proposed model is 95%.
https://doi.org/10.22937/IJCSNS.2023.23.9.6
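
The attention step mentioned in the abstract above can be illustrated with a minimal, dependency-free sketch of attention pooling over a sequence of feature vectors. This is not the authors' implementation: the scoring vector `query`, the dot-product scoring, and the toy frame values are all assumptions made for illustration, and the real model learns its parameters from data.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_pool(frames, query):
    """Collapse a sequence of feature vectors into one weighted-average vector.

    frames: list of equal-length float lists (e.g. per-frame recurrent outputs).
    query:  hypothetical scoring vector of the same length (learned in practice).
    """
    # Score each frame by its dot product with the query vector.
    scores = [sum(f * q for f, q in zip(frame, query)) for frame in frames]
    weights = softmax(scores)
    dim = len(frames[0])
    # Weighted sum of frames, one output value per feature dimension.
    pooled = [sum(w * frame[d] for w, frame in zip(weights, frames))
              for d in range(dim)]
    return pooled, weights

# Toy input (assumed values): three frames with two features each.
frames = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
pooled, weights = attention_pool(frames, query=[2.0, 0.0])
```

Frames that score higher against the query receive larger weights, so the pooled vector is dominated by them; a dense softmax layer would then classify that single vector.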

Generating Data and Applying Machine Learning Methods for Music Genre Classification (음악 장르 분류를 위한 데이터 생성 및 머신러닝 적용 방안)

Bit-Chan Eom;Dong-Hwi Cho;Choon-Sung Nam
- Journal of Internet Computing and Services, v.25 no.4, pp.57-64, 2024
This paper aims to enhance the accuracy of music genre classification for tracks whose genre information is not provided, by using machine learning to classify a large amount of music data. Instead of using the GTZAN dataset commonly employed in previous research on music genre classification, the paper proposes collecting and preprocessing data. To create a dataset with better classification performance than the GTZAN dataset, we extract specific segments with the highest energy level of the onset. We use 57 features, including Mel Frequency Cepstral Coefficients (MFCC), as the main characteristics of the music data used for training. Using a Support Vector Machine (SVM) model on the preprocessed data, we achieved a training accuracy of 85% and a testing accuracy of 71% when classifying into the Classical, Jazz, Country, Disco, Soul, Rock, Metal, and Hiphop genres.
https://doi.org/10.7472/jksii.2024.25.4.57

Real-time Background Music System for Immersive Dialogue in Metaverse based on Dialogue Emotion (메타버스 대화의 몰입감 증진을 위한 대화 감정 기반 실시간 배경음악 시스템 구현)

Kirak Kim;Sangah Lee;Nahyeon Kim;Moonryul Jung
- Journal of the Korea Computer Graphics Society, v.29 no.4, pp.1-6, 2023
Background music is often used to enhance immersive experiences in metaverse environments. However, the background music is mostly pre-matched and repeated, which can distract users because it does not align well with rapidly changing user-interactive content. We therefore implemented a system that provides a more immersive metaverse conversation experience by 1) developing a regression neural network that extracts emotions from an utterance using KEMDy20, the Korean multimodal emotion dataset; 2) selecting music corresponding to the extracted emotions from the DEAM dataset, in which music is tagged with arousal-valence levels; and 3) combining these with a virtual space where users can have real-time conversations with avatars.
https://doi.org/10.15701/kcgs.2023.29.4.1
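
Step 2 of the abstract above, choosing music that matches a predicted emotion, can be sketched as a nearest-neighbour lookup in the arousal-valence plane. The track names, their coordinates, the [-1, 1] scaling, and the Euclidean distance are all hypothetical choices for illustration; the abstract does not specify how the authors match utterance emotions to DEAM tags.

```python
import math

# Hypothetical catalogue: track name -> (valence, arousal), both assumed in [-1, 1].
TRACKS = {
    "calm_piano": (0.4, -0.6),
    "upbeat_pop": (0.8, 0.7),
    "dark_ambient": (-0.7, -0.3),
    "tense_strings": (-0.5, 0.8),
}

def pick_track(valence, arousal, tracks=TRACKS):
    """Return the track whose tag lies closest to the predicted emotion point."""
    target = (valence, arousal)
    return min(tracks, key=lambda name: math.dist(tracks[name], target))

# A cheerful, energetic utterance and an angry, agitated one (assumed predictions).
happy_choice = pick_track(0.7, 0.5)
angry_choice = pick_track(-0.6, 0.9)
```

Because the regression network outputs continuous values, a distance-based lookup like this needs no discrete emotion labels, which is one plausible reason to tag music with arousal-valence levels in the first place.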

Super-resolution in Music Score Images by Instance Normalization

Tran, Minh-Trieu;Lee, Guee-Sang
- Smart Media Journal, v.8 no.4, pp.64-71, 2019
The performance of an OMR (Optical Music Recognition) system is usually determined by the characterizing features of the input music score images. Low resolution is one of the main factors leading to degraded image quality. In this paper, we handle the low-resolution problem using a super-resolution technique. We propose the use of a deep neural network with instance normalization to improve the quality of music score images. We apply instance normalization, which has proven to be beneficial in single image enhancement. It works better than batch normalization, which shows the effectiveness of shifting the mean and variance of deep features at the instance level. The proposed method provides an end-to-end mapping between low-resolution and high-resolution images. New images are then created whose resolution is four times higher than that of the original images. Our model has been evaluated on the "DeepScores" dataset and outperforms other existing methods.
https://doi.org/10.30693/SMJ.2019.8.4.64
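
The difference between instance and batch normalization referred to in the abstract above can be shown with a small pure-Python sketch. It assumes feature maps flattened to lists and omits the learnable scale and shift that real layers usually add; the toy batch values are invented for illustration and this is not the authors' network.

```python
import math

def _normalize(values, mean, var, eps):
    """Shift to zero mean and scale to unit variance using the given statistics."""
    return [(v - mean) / math.sqrt(var + eps) for v in values]

def _stats(values):
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return mean, var

def instance_norm(batch, eps=1e-5):
    """batch[n][c] is a flat list of activations for sample n, channel c.
    Statistics are computed separately for every (sample, channel) pair."""
    return [[_normalize(ch, *_stats(ch), eps) for ch in sample] for sample in batch]

def batch_norm(batch, eps=1e-5):
    """Statistics are shared across all samples for each channel."""
    n_channels = len(batch[0])
    out = [[None] * n_channels for _ in batch]
    for c in range(n_channels):
        pooled = [v for sample in batch for v in sample[c]]
        mean, var = _stats(pooled)
        for n, sample in enumerate(batch):
            out[n][c] = _normalize(sample[c], mean, var, eps)
    return out

# Two single-channel "images" with the same pattern but different brightness.
batch = [[[0.0, 1.0, 2.0, 3.0]], [[10.0, 11.0, 12.0, 13.0]]]
inst = instance_norm(batch)
bn = batch_norm(batch)
```

With instance normalization both toy images map to the same normalized pattern, whereas batch normalization keeps the brightness offset between them because the statistics are pooled over the batch.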

Client-driven Music Genre Classification Framework (클라이언트 중심의 음악 장르 분류 프레임워크)

Mujtaba, Ghulam;Park, Eun-Soo;Kim, Seunghwan;Ryu, Eun-Seok
- Proceedings of the Korean Society of Broadcast Engineers Conference, 2020.07a, pp.714-716, 2020
We propose a unique client-driven music genre classification solution that identifies the music genre using a deep convolutional neural network operating on the time-domain signal. The proposed method uses the computational resources of the client device (Jetson TX2) to identify the music genre. We use the well-known GTZAN genre collection dataset to obtain reliable benchmarking performance. The HTTP live streaming (HLS) client and server sides are designed locally to validate the effectiveness of the proposed method. An HTTP persistent broadcast connection is adapted to reduce corresponding responses and network bandwidth. The proposed model can identify the genre of music files with 97% accuracy. Owing to its simplicity, it can support a wide range of client hardware.

Camera-based Music Score Recognition Using Inverse Filter

Nguyen, Tam;Kim, SooHyung;Yang, HyungJeong;Lee, GueeSang
- International Journal of Contents, v.10 no.4, pp.11-17, 2014
The influence of the acquisition environment on music score images captured by a camera has not yet been seriously examined. All existing Optical Music Recognition (OMR) systems attempt to recognize music score images captured by a scanner under ideal conditions. Therefore, when such systems process images affected by distortion, different viewpoints, or suboptimal illumination, their performance, in terms of recognition accuracy and processing time, is unacceptable for deployment in practice. In this paper, a novel, lightweight but effective approach for dealing with the issues caused by camera-based music scores is proposed. Based on the staff line information, musical rules, run length code, and projection, all regions of interest are determined. Templates created from an inverse filter are then used to recognize the music symbols. Therefore, all fragmentation and deformation problems, as well as missed recognition, can be overcome using the developed method. The system was evaluated on a dataset consisting of real images captured by a smartphone. The achieved recognition rate and processing time were relatively competitive with state-of-the-art work. In addition, the system was designed to be lightweight compared with other approaches, which mostly adopt machine learning algorithms, to allow further deployment on portable devices with limited computing resources.
https://doi.org/10.5392/IJoC.2014.10.4.011

An investigation of chroma n-gram selection for cover song search (커버곡 검색을 위한 크로마 n-gram 선택에 관한 연구)

Seo, Jin Soo;Kim, Junghyun;Park, Jihyun
- The Journal of the Acoustical Society of Korea, v.36 no.6, pp.436-441, 2017
Computing music similarity is indispensable in constructing a music retrieval system. Among various music-retrieval tasks, this paper focuses on cover song search. We investigate a cover song search method based on the chroma n-gram to reduce the storage needed for the feature DB and to enhance search accuracy. Specifically, we propose the t-tab n-gram, an n-gram selection method, and an n-gram set comparison method. Experiments on a widely used music dataset confirmed that the proposed method improves cover song search accuracy and reduces feature storage.
https://doi.org/10.7776/ASK.2017.36.6.436
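
The general idea of comparing songs by sets of chroma n-grams, as named in the abstract above, can be sketched as follows. This is a generic illustration, not the paper's t-tab n-gram or its selection method: the Jaccard overlap, the search over 12 transpositions, and the toy pitch-class sequences are all assumptions.

```python
def chroma_ngrams(chroma_ids, n=3):
    """Return the set of length-n windows over a sequence of chroma indices (0-11)."""
    return {tuple(chroma_ids[i:i + n]) for i in range(len(chroma_ids) - n + 1)}

def transpose(chroma_ids, semitones):
    """Shift every chroma index by a number of semitones, wrapping at the octave."""
    return [(c + semitones) % 12 for c in chroma_ids]

def jaccard(a, b):
    """Set overlap in [0, 1]; defined as 0.0 when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def cover_similarity(query, reference, n=3):
    """Best Jaccard overlap over all 12 transpositions of the query,
    so that a cover performed in a different key can still match."""
    ref = chroma_ngrams(reference, n)
    return max(jaccard(chroma_ngrams(transpose(query, k), n), ref)
               for k in range(12))

# Toy sequences (assumed): dominant chroma bin per frame.
original = [0, 4, 7, 0, 5, 9, 7, 4]
cover = transpose(original, 2)          # same melody, two semitones higher
other = [1, 1, 3, 6, 8, 8, 10, 3]       # unrelated sequence

cover_score = cover_similarity(cover, original)
other_score = cover_similarity(other, original)
```

Storing only a set of n-grams per song, rather than the full chroma sequence, is what makes this kind of representation compact; which n-grams to keep is the question the paper's selection method addresses.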

A Study on Substitutability and Complementarity of Music Downloading and Streaming and the Moderating Role of LTE Penetration on Its Relationship (디지털 음악의 다운로드와 스트리밍 서비스 간에 보완성과 대체성 및 LTE 보급률의 조절효과에 관한 연구)

Heo, Kyeongseok;Choi, Sukwoong;Kim, Namil;Kim, Wonjoon
- The Journal of the Korea Contents Association, v.18 no.5, pp.490-501, 2018
Rapid technological innovation led by digitization has significantly changed the business of digital content goods, and has led to the emergence of new forms of services, such as music streaming. However, whether the streaming service is a threat to the traditional downloading service is still under debate. In this study, we examine whether music downloading is a substitute for or a complement to music streaming by investigating the moderating effects of LTE technology penetration. Using a unique dataset on the online music market from a dominant music platform in Korea, we found that music downloading services are complementary to music streaming services, but this complementary relationship is significantly and positively moderated by the introduction of LTE technology.
https://doi.org/10.5392/JKCA.2018.18.05.490
