• Title/Summary/Keyword: speech/music classification (음성/음악 분류)


The Comparison of features for Speech/Music Discrimination (음성/음악 분류를 위한 특징 비교)

  • Lee Kyong Rok;Seo Bong Su;Kim Jin Young
    • Proceedings of the Acoustical Society of Korea Conference / spring / pp.157-160 / 2000
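  • In this paper, we carry out speech/music classification experiments as a preprocessing stage of audio indexing, the part of multimedia indexing that extracts desired information from multimedia data. In audio indexing, the speech/music classifier separates the information-bearing speech portion from the original audio signal. The experiments use feature parameters that are widely used in speech/music classification: Mel cepstrum, normalized log energy, and zero-crossings [1, 2, 3]. The feature space is modeled with a Gaussian mixture model (GMM), and audio signals are classified under both a three-class scheme (speech, music, speech+music) and a two-class scheme (speech, music). In both the three-class and the two-class experiments, the Mel cepstrum gave the best results.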
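
Two of the features named in the abstract above, zero-crossings and normalized log energy, can be illustrated with a minimal sketch. The frame length, the hop size, and the normalization (subtracting the utterance's peak log energy) are assumptions made for illustration; the abstract does not specify them, and the Mel cepstrum is not computed here.

```python
import math

def frame_signal(samples, frame_len, hop):
    """Split a sample list into frames; a trailing partial frame is dropped."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, hop)]

def zero_crossings(frame):
    """Count sign changes between consecutive samples (0 counts as positive)."""
    return sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))

def log_energy(frame, floor=1e-10):
    """Natural log of the frame's sum of squares, floored to avoid log(0)."""
    return math.log(max(sum(x * x for x in frame), floor))

def frame_features(samples, frame_len=256, hop=128):
    """Return one (zero_crossings, normalized_log_energy) pair per frame."""
    frames = frame_signal(samples, frame_len, hop)
    if not frames:
        return []
    energies = [log_energy(f) for f in frames]
    peak = max(energies)
    # Assumption: normalize by subtracting the peak log energy of the signal.
    return [(zero_crossings(f), e - peak) for f, e in zip(frames, energies)]
```

With this normalization the loudest frame has a normalized log energy of 0 and every other frame is negative, so the feature does not depend on the overall recording level.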


Analysis and Implementation of Speech/Music Classification for 3GPP2 SMV Based on GMM (3GPP2 SMV의 실시간 음성/음악 분류 성능 향상을 위한 Gaussian Mixture Model의 적용)

  • Song, Ji-Hyun;Lee, Kye-Hwan;Chang, Joon-Hyuk
    • The Journal of the Acoustical Society of Korea / v.26 no.8 / pp.390-396 / 2007
  • In this letter, we propose a novel approach to improving speech/music classification in the selectable mode vocoder (SMV) of 3GPP2 using a Gaussian mixture model (GMM) based on the expectation-maximization (EM) algorithm. We first analyze the features and the classification method adopted in the conventional SMV. Feature vectors for the GMM are then selected from the relevant parameters of the SMV for efficient speech/music classification. The proposed algorithm is evaluated under various conditions and yields better results than the conventional SMV scheme.

A Comparison of Speech/Music Discrimination Features for Audio Indexing (오디오 인덱싱을 위한 음성/음악 분류 특징 비교)

  • Lee Kyong Rok;Seo Bong Su;Kim Jin Young (이경록;서봉수;김진영)
    • The Journal of the Acoustical Society of Korea / v.20 no.2 / pp.10-15 / 2001
  • In this paper, we compare combinations of features for speech/music discrimination, the task of classifying audio signals as speech or music. Audio signals are classified into three classes (speech, music, speech and music) and into two classes (speech, music). Experiments are carried out on three types of features (Mel-cepstrum, energy, and zero-crossings) to find the best feature combination for speech/music discrimination. We use a Gaussian mixture model (GMM) as the discrimination algorithm and combine the different features into a single vector before modeling the data with the GMM. With three classes, the best result is achieved using Mel-cepstrum, energy, and zero-crossings in a single feature vector (speech: 95.1%, music: 61.9%, speech & music: 55.5%). With two classes, the best result is achieved both with Mel-cepstrum and energy and with Mel-cepstrum, energy, and zero-crossings in a single feature vector (speech: 98.9%, music: 100%).
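
The classification step described in this abstract, one GMM per class scored against a combined feature vector, could look roughly like the sketch below. It assumes diagonal covariances and already-trained models. The `MODELS` parameters are invented for illustration and are not taken from the paper, and EM training is not shown.

```python
import math

def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at vector x."""
    return sum(-0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
               for xi, m, v in zip(x, mean, var))

def gmm_log_likelihood(x, gmm):
    """Log p(x) under a mixture given as a list of (weight, mean, var)."""
    terms = [math.log(w) + log_gauss_diag(x, m, v) for w, m, v in gmm]
    top = max(terms)  # log-sum-exp for numerical stability
    return top + math.log(sum(math.exp(t - top) for t in terms))

def classify(frames, models):
    """Pick the class whose GMM gives the highest total log-likelihood."""
    scores = {label: sum(gmm_log_likelihood(x, gmm) for x in frames)
              for label, gmm in models.items()}
    return max(scores, key=scores.get)

# Hypothetical two-dimensional models; all numbers are made up.
MODELS = {
    "speech": [(0.5, [0.0, 0.0], [1.0, 1.0]), (0.5, [2.0, 0.0], [1.0, 1.0])],
    "music": [(1.0, [6.0, 4.0], [1.0, 1.0])],
}
```

Each frame vector stands for the concatenated features of one frame. A three-class setup would add a third entry such as "speech+music" to the dictionary without changing `classify`.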


Comparison & Analysis of Speech/Music Discrimination Features through Experiments (실험에 의한 음성·음악 분류 특징의 비교 분석)

  • Lee, Kyung-Rok;Ryu, Shi-Woo;Gwark, Jae-Young
    • Proceedings of the Korea Contents Association Conference / 2004.11a / pp.308-313 / 2004
  • In this paper, we compare and analyze the speech/music discrimination performance of combinations of feature parameters. Audio signals are classified into three classes (speech, music, speech and music). Three types of features are used in the experiments: Mel-cepstrum, energy, and zero-crossings. We then compare the feature combinations to find the one with the best speech/music discrimination performance. The best result is achieved using Mel-cepstrum, energy, and zero-crossings in a single feature vector (speech: 95.1%, music: 61.9%, speech & music: 55.5%).


Enhancement of Speech/Music Classification for 3GPP2 SMV Codec Employing Discriminative Weight Training (변별적 가중치 학습을 이용한 3GPP2 SVM의 실시간 음성/음악 분류 성능 향상)

  • Kang, Sang-Ick;Chang, Joon-Hyuk;Lee, Seong-Ro
    • The Journal of the Acoustical Society of Korea / v.27 no.6 / pp.319-324 / 2008
  • In this paper, we propose a novel approach to improving speech/music classification in the selectable mode vocoder (SMV) of 3GPP2 using discriminative weight training based on the minimum classification error (MCE) algorithm. We first analyze the features and the classification method adopted in the conventional SMV. The proposed speech/music decision rule is then expressed as the geometric mean of optimally weighted features selected from the SMV. The proposed algorithm is evaluated under various conditions and yields better results than the conventional SMV scheme.
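
One way to read the decision rule in this abstract is as a weighted geometric mean of positive feature values compared against a threshold. The sketch below follows that reading, which is an assumption: the abstract does not give the exact formula, and the function names, weights, and threshold are hypothetical. MCE training of the weights is not shown.

```python
import math

def weighted_geometric_mean(features, weights):
    """exp(sum_i w_i * ln f_i / sum_i w_i); features must be positive."""
    if len(features) != len(weights):
        raise ValueError("features and weights must have the same length")
    if any(f <= 0 for f in features):
        raise ValueError("geometric mean needs strictly positive features")
    total = sum(weights)
    log_sum = sum(w * math.log(f) for f, w in zip(features, weights))
    return math.exp(log_sum / total)

def is_music(features, weights, threshold):
    """Hypothetical rule: decide 'music' when the weighted mean exceeds threshold."""
    return weighted_geometric_mean(features, weights) > threshold
```

With equal weights this reduces to the ordinary geometric mean, so `weighted_geometric_mean([4.0, 9.0], [1.0, 1.0])` is 6 up to floating-point rounding. Larger weights pull the result toward the corresponding feature.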

Speech/Music Discrimination Using Spectrum Analysis and Neural Network (스펙트럼 분석과 신경망을 이용한 음성/음악 분류)

  • Keum, Ji-Soo;Lim, Sung-Kil;Lee, Hyon-Soo
    • The Journal of the Acoustical Society of Korea / v.26 no.5 / pp.207-213 / 2007
  • In this research, we propose an efficient speech/music discrimination method that uses spectrum analysis and a neural network. The proposed method extracts a duration feature parameter (MSDF) from a spectral peak track obtained by analyzing the spectrum, and uses it together with the MFSC as a feature for the speech/music discriminator. A neural network serves as the speech/music discriminator, and we performed various experiments to evaluate the proposed method with respect to training pattern selection, training pattern size, and neural network architecture. The results show improved performance and stability with respect to training pattern selection and model composition compared with the previous method. When the MSDF and MFSC are used as features with more than 50 seconds of training patterns, the discrimination rate is 94.97% for speech and 92.38% for music. This is an improvement of 1.25% for speech and 1.69% for music over the use of the MFSC.

Analysis and Implementation of Speech/Music Classification for 3GPP2 SMV Codec Based on Support Vector Machine (SMV코덱의 음성/음악 분류 성능 향상을 위한 Support Vector Machine의 적용)

  • Kim, Sang-Kyun;Chang, Joon-Hyuk
    • Journal of the Institute of Electronics Engineers of Korea SP / v.45 no.6 / pp.142-147 / 2008
  • In this paper, we propose a novel approach to improving speech/music classification in the selectable mode vocoder (SMV) of 3GPP2 using the support vector machine (SVM). The SVM builds an optimal hyperplane that separates the classes without error while maximizing the distance between the hyperplane and the closest vectors. We first analyze the features and the classification method adopted in the conventional SMV. Feature vectors for the SVM are then selected from the relevant parameters of the SMV for efficient speech/music classification. The proposed algorithm is evaluated under various conditions and yields better results than the conventional SMV scheme.

Improvement of Speech/Music Classification Based on RNN in EVS Codec for Hearing Aids (EVS 코덱에서 보청기를 위한 RNN 기반의 음성/음악 분류 성능 향상)

  • Kang, Sang-Ick;Lee, Sang Min
    • Journal of rehabilitation welfare engineering & assistive technology / v.11 no.2 / pp.143-146 / 2017
  • In this paper, a novel approach is proposed to improve the performance of speech/music classification using a recurrent neural network (RNN) in the enhanced voice services (EVS) codec of 3GPP for hearing aids. Feature vectors applied to the RNN are selected from the relevant parameters of the EVS for efficient speech/music classification. The proposed algorithm is evaluated under various conditions on a large set of speech/music data. It yields better results than the conventional scheme implemented in the EVS.

A Technique to Improve the Practicality of SVM-based Speech/Music Classifiers Through Hierarchical Classification (계층구조의 분류를 통한 서포트벡터머신 기반의 음성/음악 분류기의 실용도 향상기법)

  • Choi, Seokhwan;Cho, Youngok;Cho, Jiu;Lim, Chungsoo;Lee, Yeonwoo;Lee, Seong Ro
    • Proceedings of the Korea Information Processing Society Conference / 2012.04a / pp.1033-1034 / 2012
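  • This paper proposes a technique to improve the practicality of a support vector machine (SVM)-based speech/music classifier, which was proposed for variable bit-rate codecs that aim to use limited bandwidth efficiently. SVM-based speech/music classifiers offer high classification accuracy but require heavy computation, which makes them ill-suited to real-time use. We therefore propose a technique that improves the practicality of SVM-based speech/music classifiers through hierarchical classification.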

Efficient Implementation of SVM-Based Speech/Music Classification on Embedded Systems (SVM 기반 음성/음악 분류기의 효율적인 임베디드 시스템 구현)

  • Lim, Chung-Soo;Chang, Joon-Hyuk
    • The Journal of the Acoustical Society of Korea / v.30 no.8 / pp.461-467 / 2011
  • Accurate classification of input signals is a key prerequisite for variable bit-rate coding, which was introduced to use limited communication bandwidth effectively. In particular, the recent surge of multimedia services has elevated the importance of speech/music classification. Among the many speech/music classifiers, those based on the support vector machine (SVM) have a strong selling point in their high classification accuracy, but their computational complexity and memory requirements hinder their use in actual implementations. Techniques that reduce the computational complexity and memory requirements are therefore essential, particularly for embedded systems. We first analyze the implementation of an SVM-based classifier on embedded systems in terms of execution time and energy consumption, and then propose two techniques that ease the implementation requirements: one removes support vectors that contribute insignificantly to the final classification, and the other skips processing of some input signals by exploiting the strong correlations between speech/music frames. These are post-processing techniques that can work with any other optimization techniques applied during the training phase of the SVM. Through experiments, we validate the proposed algorithms in terms of classification accuracy, execution time, and energy consumption.
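
The two post-processing ideas in this abstract can be sketched as follows. The pruning criterion (dropping support vectors whose coefficient magnitude falls below a cutoff), the RBF kernel, and the fixed skip interval are all assumptions chosen for illustration. The abstract does not state how contributions are measured or which frames are skipped.

```python
import math

def rbf(u, v, gamma):
    """RBF kernel exp(-gamma * ||u - v||^2)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(u, v)))

def decision(x, svs, coefs, bias, gamma):
    """SVM decision value: sum_i coef_i * K(sv_i, x) + bias."""
    return sum(c * rbf(sv, x, gamma) for sv, c in zip(svs, coefs)) + bias

def prune(svs, coefs, min_abs_coef):
    """Assumed criterion: keep support vectors with |coef| >= min_abs_coef."""
    kept = [(sv, c) for sv, c in zip(svs, coefs) if abs(c) >= min_abs_coef]
    return [sv for sv, _ in kept], [c for _, c in kept]

def classify_with_skipping(frames, classify_frame, skip):
    """Classify every (skip + 1)-th frame and reuse that label in between."""
    labels, last = [], None
    for i, frame in enumerate(frames):
        if i % (skip + 1) == 0:
            last = classify_frame(frame)
        labels.append(last)
    return labels
```

Pruning shortens the kernel sum evaluated for each frame, and skipping reduces how often that sum is evaluated at all. Both trade some accuracy for lower execution time, so the cutoff and the skip interval would need tuning on real data.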