• Title/Summary/Keyword: Music classification

Music Genre Classification using Spikegram and Deep Neural Network (스파이크그램과 심층 신경망을 이용한 음악 장르 분류)

  • Jang, Woo-Jin;Yun, Ho-Won;Shin, Seong-Hyeon;Cho, Hyo-Jin;Jang, Won;Park, Hochong
    • Journal of Broadcast Engineering
    • /
    • v.22 no.6
    • /
    • pp.693-701
    • /
    • 2017
  • In this paper, we propose a new method for music genre classification using a spikegram and a deep neural network. The human auditory system encodes the input sound in the time and frequency domains so as to maximize the amount of sound information delivered to the brain while using minimal energy and resources. The spikegram is a method of analyzing a waveform based on this encoding function of the auditory system. In the proposed method, we analyze the signal using the spikegram and extract a feature vector composed of key information for genre classification, which is used as the input to the neural network. We measure the performance of music genre classification using the GTZAN dataset, which consists of 10 music genres, and confirm that the proposed method provides good performance with a low-dimensional feature vector compared to current state-of-the-art methods.

Enhancement of Speech/Music Classification for 3GPP2 SMV Codec Employing Discriminative Weight Training (변별적 가중치 학습을 이용한 3GPP2 SVM의 실시간 음성/음악 분류 성능 향상)

  • Kang, Sang-Ick;Chang, Joon-Hyuk;Lee, Seong-Ro
    • The Journal of the Acoustical Society of Korea
    • /
    • v.27 no.6
    • /
    • pp.319-324
    • /
    • 2008
  • In this paper, we propose a novel approach to improve the performance of speech/music classification for the selectable mode vocoder (SMV) of 3GPP2 using discriminative weight training based on the minimum classification error (MCE) algorithm. We first present an analysis of the features and the classification method adopted in the conventional SMV. The proposed speech/music decision rule is then expressed as the geometric mean of optimally weighted features selected from the SMV. The performance of the proposed algorithm is evaluated under various conditions and yields better results than the conventional scheme of the SMV.
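
The decision rule in this abstract can be illustrated with a minimal sketch. The abstract does not give the exact form of the rule, so the version below assumes positive feature values, a score of the form prod(f_i ** w_i) normalized by the sum of the weights, and a hypothetical threshold; the function names are illustrative and are not taken from the paper or from the SMV codec.

```python
import math


def weighted_geometric_mean(features, weights):
    """Return exp(sum(w_i * ln f_i) / sum(w_i)) for positive features.

    Assumption: this is one common form of a weighted geometric mean;
    the paper's exact formulation may differ.
    """
    if len(features) != len(weights):
        raise ValueError("features and weights must have the same length")
    if any(f <= 0 for f in features):
        raise ValueError("features must be positive")
    total = sum(weights)
    # Summing logs avoids overflow/underflow of a direct product.
    log_sum = sum(w * math.log(f) for f, w in zip(features, weights))
    return math.exp(log_sum / total)


def is_music(features, weights, threshold):
    """Hypothetical rule: label the frame as music when the score exceeds the threshold."""
    return weighted_geometric_mean(features, weights) > threshold
```

In a discriminative training setup, the weights would be learned (for example by minimizing a classification error criterion) rather than fixed by hand as in this sketch.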

A Robust Content-Based Music Retrieval System

  • Lee Kang-Kyu;Yoon Won-Jung;Park Kyu-Sik
    • Proceedings of the IEEK Conference
    • /
    • summer
    • /
    • pp.229-232
    • /
    • 2004
  • In this paper, we propose a robust music retrieval system based on the content analysis of music. A new feature extraction method called Multi-Feature Clustering (MFC) is proposed to make the retrieval performance robust. It is demonstrated that the use of MFC significantly improves the stability of the music retrieval system and yields better classification accuracy.

Extraction and classification of tempo stimuli from electroencephalography recordings using convolutional recurrent attention model

  • Lee, Gi Yong;Kim, Min-Soo;Kim, Hyoung-Gook
    • ETRI Journal
    • /
    • v.43 no.6
    • /
    • pp.1081-1092
    • /
    • 2021
  • Electroencephalography (EEG) recordings taken during the perception of music tempo contain information that can be used to estimate the tempo of a music piece. If information about this tempo stimulus can be extracted from EEG recordings and classified, it can be used to construct a music-based brain-computer interface. This study proposes a novel convolutional recurrent attention model (CRAM) to extract and classify features corresponding to tempo stimuli from EEG recordings of listeners who concentrated on the tempo of the music. The proposed CRAM is composed of six modules, namely, network inputs, two-dimensional convolutional bidirectional gated recurrent unit-based sample encoder, sample-level intuitive attention, segment encoder, segment-level intuitive attention, and softmax layer, to effectively model spatiotemporal features and improve the classification accuracy of tempo stimuli. To evaluate the proposed method's performance, we conducted experiments on two benchmark datasets. The proposed method achieves promising results, outperforming recent methods.

Improving SVM with Second-Order Conditional MAP for Speech/Music Classification (음성/음악 분류 향상을 위한 2차 조건 사후 최대 확률기법 기반 SVM)

  • Lim, Chung-Soo;Chang, Joon-Hyuk
    • Journal of the Institute of Electronics Engineers of Korea SP
    • /
    • v.48 no.5
    • /
    • pp.102-108
    • /
    • 2011
  • Support vector machines are well known for their outstanding performance in pattern recognition. One example of their application is music/speech classification for a standardized codec such as the 3GPP2 selectable mode vocoder. In this paper, we propose a novel scheme that improves the speech/music classification of support vector machines based on the second-order conditional maximum a posteriori (MAP) criterion. While conventional support vector machine optimization techniques apply during the training phase, the proposed technique can be adopted in the classification phase. The proposed approach can therefore be developed and employed in parallel with conventional optimizations, resulting in a synergistic boost in classification performance. According to experimental results, the proposed algorithm shows its compatibility and potential for improving the performance of support vector machines.

Fine-tuning SVM for Enhancing Speech/Music Classification (SVM의 미세조정을 통한 음성/음악 분류 성능향상)

  • Lim, Chung-Soo;Song, Ji-Hyun;Chang, Joon-Hyuk
    • Journal of the Institute of Electronics Engineers of Korea SP
    • /
    • v.48 no.2
    • /
    • pp.141-148
    • /
    • 2011
  • Support vector machines have been extensively studied and used in pattern recognition for years. One interesting application of this technique is music/speech classification for a standardized codec such as the 3GPP2 selectable mode vocoder. In this paper, we propose a novel approach that improves the speech/music classification of support vector machines. While conventional support vector machine optimization techniques apply during the training phase, the proposed technique can be adopted in the classification phase. The proposed approach can therefore be developed and employed in parallel with conventional optimizations, resulting in a synergistic boost in classification performance. We first analyze the impact of the kernel width parameter on the classifications made by support vector machines. From this analysis, we observe that the outputs of support vector machines can be fine-tuned with the kernel width parameter. To make the most of this capability, we identify a strong correlation among neighboring input frames and use this correlation information as a guide for adjusting the kernel width parameter. According to the experimental results, the proposed algorithm is found to have potential for improving the performance of support vector machines.
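
The effect of the kernel width on a support vector machine's output can be sketched as follows. This is only an illustration under stated assumptions: it uses a Gaussian (RBF) kernel, and the support vectors, coefficients, and bias are made-up toy values rather than anything from the paper. It shows that the same input frame yields a different decision value when only the kernel width `sigma` changes at classification time.

```python
import math


def rbf_kernel(x, y, sigma):
    """Gaussian kernel exp(-||x - y||^2 / (2 * sigma^2))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def svm_decision(x, support_vectors, coeffs, bias, sigma):
    """f(x) = sum_i coeff_i * K(sv_i, x) + bias, with coeff_i = alpha_i * y_i."""
    return sum(
        c * rbf_kernel(sv, x, sigma) for sv, c in zip(support_vectors, coeffs)
    ) + bias


# Toy model (hypothetical values): one support vector per class.
support_vectors = [(0.0, 0.0), (2.0, 0.0)]
coeffs = [1.0, -1.0]
bias = 0.0
frame = (0.5, 0.0)

narrow = svm_decision(frame, support_vectors, coeffs, bias, sigma=0.5)
wide = svm_decision(frame, support_vectors, coeffs, bias, sigma=2.0)
print(narrow, wide)  # the decision value shrinks toward 0 as sigma grows
```

How the width should be adjusted from the correlation between neighboring frames is specific to the paper and is not reproduced here.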

Target signal detection using MUSIC spectrum in noise environments (MUSIC 스펙트럼을 이용한 잡음환경에서의 목표 신호 구간 검출)

  • Park, Sang-Jun;Jeong, Sang-Bae
    • Phonetics and Speech Sciences
    • /
    • v.4 no.3
    • /
    • pp.103-110
    • /
    • 2012
  • In this paper, a target signal detection method using the multiple signal classification (MUSIC) algorithm is proposed. The MUSIC algorithm is a subspace-based direction of arrival (DOA) estimation method. Using the inverse of the eigenvalue-weighted eigen spectra, the algorithm detects the DOAs of multiple sources. To apply the algorithm to target signal detection for GSC-based beamforming, we utilize its spectral response for the DOA of the target source in noisy conditions. The performance of the proposed target signal detection method is compared with those of the normalized cross-correlation (NCC), fixed beamforming, and power ratio methods. Experimental results show that the proposed algorithm significantly outperforms the conventional ones in terms of receiver operating characteristic (ROC) curves.
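
As a rough illustration of subspace-based DOA estimation, the sketch below implements the textbook MUSIC pseudo-spectrum for a toy case: a two-element array with half-wavelength spacing and a single narrowband source, so the 2x2 covariance matrix can be decomposed in closed form with the standard library. The array geometry, the simulation helper, and all names are assumptions made for this example; the eigenvalue-weighted spectrum mentioned in the abstract is not reproduced.

```python
import cmath
import math
import random


def steering(theta_deg):
    """Steering vector of a 2-element array with half-wavelength spacing."""
    phase = -math.pi * math.sin(math.radians(theta_deg))
    return (1.0 + 0.0j, cmath.exp(1j * phase))


def simulate_snapshots(theta_deg, count, noise_std, seed=0):
    """Hypothetical test signal: one unit-power source plus white noise."""
    rng = random.Random(seed)
    a = steering(theta_deg)
    snapshots = []
    for _ in range(count):
        s = cmath.exp(2j * math.pi * rng.random())  # random source phase
        n0 = complex(rng.gauss(0, noise_std), rng.gauss(0, noise_std))
        n1 = complex(rng.gauss(0, noise_std), rng.gauss(0, noise_std))
        snapshots.append((a[0] * s + n0, a[1] * s + n1))
    return snapshots


def covariance(snapshots):
    """Sample covariance entries (r00, r01, r11) of R = mean(x x^H)."""
    n = len(snapshots)
    r00 = sum(abs(x0) ** 2 for x0, _ in snapshots) / n
    r11 = sum(abs(x1) ** 2 for _, x1 in snapshots) / n
    r01 = sum(x0 * x1.conjugate() for x0, x1 in snapshots) / n
    return r00, r01, r11


def noise_eigenvector(r00, r01, r11):
    """Unit eigenvector for the smallest eigenvalue of the 2x2 Hermitian R."""
    lam_min = (r00 + r11) / 2.0 - math.sqrt(
        ((r00 - r11) / 2.0) ** 2 + abs(r01) ** 2
    )
    v = (r01, lam_min - r00)
    norm = math.sqrt(abs(v[0]) ** 2 + abs(v[1]) ** 2)
    if norm == 0.0:
        raise ValueError("channels are uncorrelated; noise subspace is ambiguous")
    return (v[0] / norm, v[1] / norm)


def music_spectrum(snapshots, angles_deg):
    """Pseudo-spectrum P(theta) = 1 / |e_n^H a(theta)|^2 on a grid of angles."""
    e = noise_eigenvector(*covariance(snapshots))
    spectrum = []
    for theta in angles_deg:
        a = steering(theta)
        proj = e[0].conjugate() * a[0] + e[1].conjugate() * a[1]
        spectrum.append(1.0 / (abs(proj) ** 2 + 1e-12))  # avoid division by zero
    return spectrum
```

A typical use is to evaluate `music_spectrum` on a grid such as `range(-90, 91)` and take the angle at the largest value as the DOA estimate. Larger arrays and multiple sources require a general eigendecomposition, which this two-element sketch deliberately avoids.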

Comparison of ICA-based and MUSIC-based Approaches Used for the Extraction of Source Time Series and Causality Analysis (뇌 신호원의 시계열 추출 및 인과성 분석에 있어서 ICA 기반 접근법과 MUSIC 기반 접근법의 성능 비교 및 문제점 진단)

  • Jung, Young-Jin;Kim, Do-Won;Lee, Jin-Young;Im, Chang-Hwan
    • Journal of Biomedical Engineering Research
    • /
    • v.29 no.4
    • /
    • pp.329-336
    • /
    • 2008
  • Recently, causality analysis of source time series extracted from EEG or MEG signals has become increasingly important in human brain mapping studies and the noninvasive diagnosis of various brain diseases. Two approaches have been widely used for the analyses: one is independent component analysis (ICA), and the other is multiple signal classification (MUSIC). To the best of our knowledge, however, no comparison studies revealing the differences between the two approaches have been reported. In the present study, we compared the performance of the two techniques, ICA and MUSIC, focusing on how accurately they can estimate and separate various brain electrical signals such as linear, nonlinear, and chaotic signals without a priori knowledge. Results of realistic simulation studies, adopting the directed transfer function (DTF) and Granger causality (GC) as measures of the accurate extraction of source time series, demonstrated that the MUSIC-based approach is more reliable than the ICA-based approach.

A New Tempo Feature Extraction Based on Modulation Spectrum Analysis for Music Information Retrieval Tasks

  • Kim, Hyoung-Gook
    • The Journal of The Korea Institute of Intelligent Transport Systems
    • /
    • v.6 no.2
    • /
    • pp.95-106
    • /
    • 2007
  • This paper proposes an effective tempo feature extraction method for music information retrieval. The tempo information is modeled by the narrow-band temporal modulation components, which are decomposed into a modulation spectrum via joint frequency analysis. In the implementation, the tempo feature is extracted directly from the modified discrete cosine transform coefficients, which are the output of a partial MP3 (MPEG-1 Layer 3) decoder. Then, different features are extracted from the amplitudes of the modulation spectrum and applied to different music information retrieval tasks. The logarithmic-scale modulation frequency coefficients are employed in automatic music emotion classification and music genre classification, and the classification precision in both systems is improved significantly. The bit vectors derived from the adaptive modulation spectrum are used in an audio fingerprinting task, where they prove to be highly robust. The experimental results in these tasks validate the effectiveness of the proposed tempo feature.
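
The idea of reading tempo from a modulation spectrum can be sketched in a simplified form. The paper works on MDCT coefficients from a partial MP3 decoder; the example below instead assumes a single amplitude envelope sampled at a known frame rate, takes the magnitude of its discrete Fourier transform, and reports the strongest modulation frequency as beats per minute. All function names and parameter defaults are illustrative assumptions.

```python
import cmath
import math


def modulation_spectrum(envelope, frame_rate):
    """Magnitude DFT of a mean-removed envelope.

    Returns (freqs_hz, magnitudes) for the positive-frequency bins,
    excluding the DC bin.
    """
    n = len(envelope)
    mean = sum(envelope) / n
    centered = [v - mean for v in envelope]
    freqs, mags = [], []
    for k in range(1, n // 2):
        acc = sum(
            c * cmath.exp(-2j * math.pi * k * t / n)
            for t, c in enumerate(centered)
        )
        freqs.append(k * frame_rate / n)
        mags.append(abs(acc))
    return freqs, mags


def dominant_tempo_bpm(envelope, frame_rate, min_bpm=40.0, max_bpm=240.0):
    """Return the modulation frequency with the largest magnitude, in BPM."""
    freqs, mags = modulation_spectrum(envelope, frame_rate)
    candidates = [
        (m, f) for f, m in zip(freqs, mags) if min_bpm <= f * 60.0 <= max_bpm
    ]
    if not candidates:
        raise ValueError("no modulation bins fall inside the tempo range")
    _, best_freq = max(candidates)
    return best_freq * 60.0
```

For example, an envelope that pulses twice per second, sampled at 50 frames per second for 4 seconds, has its modulation peak at 2 Hz, which corresponds to 120 BPM. The direct DFT here is quadratic in the envelope length, so an FFT would be preferable for long signals.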

Design of MUSIC-based DoA Estimator for Bluetooth Applications (Bluetooth 응용을 위한 MUSIC 알고리즘 기반 DoA 추정기의 설계)

  • Kim, Jongmin;Oh, Dongjae;Park, Sanghoon;Lee, Seunghyeok;Jung, Yunho
    • Journal of IKEEE
    • /
    • v.24 no.1
    • /
    • pp.339-346
    • /
    • 2020
  • In this paper, we propose an angle estimator for Bluetooth low-power applications based on the multiple signal classification (MUSIC) algorithm, and present the results of its FPGA implementation. Because the MUSIC algorithm achieves high accuracy at the cost of a large amount of computation, it is designed as high-speed hardware, and the number of snapshots is made variable to cope with the various resolution requirements of indoor systems. When implemented on a Xilinx Zynq-7000, the design used 9,081 LUTs and operated at a frequency of 100 MHz.