A Comparison of Speech/Music Discrimination Features for Audio Indexing


The Journal of the Acoustical Society of Korea (한국음향학회지)

Volume 20, Issue 2 / Pages 10-15 / 2001 / 1225-4428 (pISSN) / 2287-3775 (eISSN)

The Acoustical Society of Korea (한국음향학회)


오디오 인덱싱을 위한 음성/음악 분류 특징 비교

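이경록 (Department of Electronics Engineering, Chonnam National University) ;
서봉수 (Department of Electronics Engineering, Chonnam National University) ;
김진영 (Department of Electronics Engineering, Chonnam National University)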

Published : 2001.02.01


Abstract

This paper compares combinations of features for speech/music discrimination, the task of classifying audio signals as speech or music. Audio signals are classified under a three-class scheme (speech, music, speech and music) and a two-class scheme (speech, music). Experiments were carried out with three types of features, Mel-cepstrum, energy, and zero-crossings, to find the best combination for speech/music discrimination. A Gaussian Mixture Model (GMM) is used as the discrimination algorithm, and the different features are combined into a single vector before the data are modeled with the GMM. In the three-class case, the best result is achieved using Mel-cepstrum, energy, and zero-crossings in a single feature vector (speech: 95.1%, music: 61.9%, speech and music: 55.5%). In the two-class case, the best result is achieved by two feature vectors, Mel-cepstrum with energy, and Mel-cepstrum with energy and zero-crossings (speech: 98.9%, music: 100%).

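In this paper, we compare combinations of the features used in speech/music discrimination experiments, which classify speech and music in audio signals. Audio signals were classified into three categories (speech, music, speech+music) and into two categories (speech, music). The experiments used Mel-cepstrum, energy, and zero-crossings as features and sought the feature combination with the best speech/music discrimination performance. A Gaussian Mixture Model (GMM) was used as the classification algorithm, and the different features were combined in a single feature space before the data were modeled with the GMM. With the three-class criterion, the combination of Mel-cepstrum and zero-crossings gave the best result (speech: 95.1%, music: 61.9%, speech+music: 55.5%); with the two-class criterion, the combination of Mel-cepstrum and energy and the combination of Mel-cepstrum, energy, and zero-crossings gave the best result (speech: 98.9%, music: 100%).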
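The pipeline the abstract describes (per-frame features joined into one vector, one statistical model per class, classification by likelihood) can be illustrated with a minimal Python sketch. It is not the authors' implementation: Mel-cepstrum is left out for brevity, a single diagonal Gaussian (a one-component GMM) stands in for the full GMM, and the training signals are synthetic toy data. All function names, the frame length, and the toy signals are assumptions made for illustration.

```python
import math
import random


def frame_features(frame):
    """Return [log-energy, zero-crossing rate] for one frame of samples."""
    energy = sum(s * s for s in frame) / len(frame)
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    # The two features are combined into a single vector before modeling.
    return [math.log(energy + 1e-12), crossings / (len(frame) - 1)]


def fit_diag_gaussian(vectors):
    """Fit per-dimension mean and variance (assumption: one-component GMM)."""
    n, dim = len(vectors), len(vectors[0])
    mean = [sum(v[d] for v in vectors) / n for d in range(dim)]
    var = [sum((v[d] - mean[d]) ** 2 for v in vectors) / n + 1e-6
           for d in range(dim)]
    return mean, var


def log_likelihood(vec, model):
    """Log density of one feature vector under a diagonal Gaussian."""
    mean, var = model
    return sum(-0.5 * (math.log(2 * math.pi * s) + (x - m) ** 2 / s)
               for x, m, s in zip(vec, mean, var))


def classify(frames, models):
    """Pick the class whose model gives the highest summed log-likelihood."""
    feats = [frame_features(f) for f in frames]
    scores = {label: sum(log_likelihood(v, m) for v in feats)
              for label, m in models.items()}
    return max(scores, key=scores.get)


def toy_frames(kind, n_frames, rng, frame_len=200):
    """Hypothetical stand-in data: noisy bursts vs. a steady tone."""
    frames = []
    for _ in range(n_frames):
        if kind == "speech":
            amp = rng.uniform(0.05, 1.0)
            frames.append([amp * rng.uniform(-1, 1) for _ in range(frame_len)])
        else:
            phase = rng.uniform(0, 2 * math.pi)
            frames.append([0.5 * math.sin(2 * math.pi * 5 * t / frame_len + phase)
                           for t in range(frame_len)])
    return frames


rng = random.Random(0)
models = {
    label: fit_diag_gaussian([frame_features(f)
                              for f in toy_frames(label, 50, rng)])
    for label in ("speech", "music")
}
print(classify(toy_frames("speech", 10, rng), models))
print(classify(toy_frames("music", 10, rng), models))
```

Adding a third model trained on mixed signals would extend the same scheme to the three-class case, and a multi-component GMM fitted with EM would replace `fit_diag_gaussian` in a closer reproduction.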


