• Title/Summary/Keyword: cepstral coefficients

Search Result 113, Processing Time 0.033 seconds

Design of a Quantization Algorithm of the Speech Feature Parameters for the Distributed Speech Recognition (분산 음성 인식 시스템을 위한 특징 계수 양자화 방식 설계)

  • Lee Joonseok;Yoon Byungsik;Kang Sangwon
    • The Journal of the Acoustical Society of Korea
    • /
    • v.24 no.4
    • /
    • pp.217-223
    • /
    • 2005
  • In this paper, we propose a predictive block constrained trellis coded quantization (BC-TCQ) to quantize cepstral coefficients for the distributed speech recognition. For Prediction of the cepstral coefficients. the 1st order auto-regressive (AR) predictor is used. To quantize the prediction error signal effectively. we use a BC-TCQ. The performance is compared to the split vector quantizers used in the ETSI standard, demonstrating reduction in the cepstral distance and computational complexity.

Spectrum Representation Based on LPC Cepstral VQ for Low Bit Rate CELP Coder (LPC Cepstral 벡터 양자화에 의한 저 전송율 CELP 음성부호기의 스펙트럼 표기)

  • 정재호
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.19 no.4
    • /
    • pp.761-771
    • /
    • 1994
  • This paper focuses on how spectrum information can be represented efficiently in a very low bit rate CELP speech coder. To achieve the goal, an LPC cepstral coefficients VQ scheme representing the spectrum information in a CELP coder is proposed. To represent the spectrum information using LPC cepstrums, three different cepstral distance measures having different spectral meanings in the frequency domain are considered, and their performances are compared and analyzed. The experimental results show that spectrum information in low bit rate CELP coders can be represented very efficiently using the proposed LPC cepstral vector quantization scheme.

  • PDF

A New Feature for Speech Segments Extraction with Hidden Markov Models (숨은마코프모형을 이용하는 음성구간 추출을 위한 특징벡터)

  • Hong, Jeong-Woo;Oh, Chang-Hyuck
    • Communications for Statistical Applications and Methods
    • /
    • v.15 no.2
    • /
    • pp.293-302
    • /
    • 2008
  • In this paper we propose a new feature, average power, for speech segments extraction with hidden Markov models, which is based on mel frequencies of speech signals. The average power is compared with the mel frequency cepstral coefficients, MFCC, and the power coefficient. To compare performances of three types of features, speech data are collected for words with explosives which are generally known hard to be detected. Experiments show that the average power is more accurate and efficient than MFCC and the power coefficient for speech segments extraction in environments with various levels of noise.

Noise Robust Text-Independent Speaker Identification for Ubiquitous Robot Companion (지능형 서비스 로봇을 위한 잡음에 강인한 문맥독립 화자식별 시스템)

  • Kim, Sung-Tak;Ji, Mi-Kyoung;Kim, Hoi-Rin;Kim, Hye-Jin;Yoon, Ho-Sub
    • 한국HCI학회:학술대회논문집
    • /
    • 2008.02a
    • /
    • pp.190-194
    • /
    • 2008
  • This paper presents a speaker identification technique which is one of the basic techniques of the ubiquitous robot companion. Though the conventional mel-frequency cepstral coefficients guarantee high performance of speaker identification in clean condition, the performance is degraded dramatically in noise condition. To overcome this problem, we employed the relative autocorrelation sequence mel-frequency cepstral coefficient which is one of the noise robust features. However, there are two problems in relative autocorrelation sequence mel-frequency cepstral coefficient: 1) the limited information problem. 2) the residual noise problem. In this paper, to deal with these drawbacks, we propose a multi-streaming method for the limited information problem and a hybrid method for the residual noise problem. To evaluate proposed methods, noisy speech is used in which air conditioner noise, classic music, and vacuum noise are artificially added. Through experiments, proposed methods provide better performance of speaker identification than the conventional methods.

  • PDF

A Study on Function Recognition of EMG Signal Using LPC Cepstrum Coefficients (LPC 켑스트럼 계수를 이용한 EMG 신호의 기능 인식에 관한 연구)

  • Wang, Sung-Moon;Chung, Tae-Yun;Choi, Yun-Ho;Byun, Youn-Shik;Park, Sang-Hui
    • Journal of the Korean Institute of Telematics and Electronics
    • /
    • v.27 no.2
    • /
    • pp.126-134
    • /
    • 1990
  • In this study, eight function discrimination and recognition of the EMG signal from the biceps and triceps of 4 subjects were executed, using the Euclidean and weighted cepstral distance measure with LPC cepstrum coefficients. In case of Euclidean cepstral distance measure, as the number of LPC cepstrum coefficients was increased in 8, 10, 12, 14 the recognition rates of functions are 94.69, 95.63, 96.56, and 96.88[%], respectively, but increasing rates of recognition were inclined to decrease. In case of weighted cepstral distance measure, when the number of LPC cepstrum coefficients was 8, 10, 12 and 14, the recognition rates of functions were 91.88, 95, 99.69, and 96.63[%], respectively.

  • PDF

Speaker Identification Using Incremental Learning

  • Kim, Jinsu;Son, Sung-Han;Cho, Byungsun;Park, Kang-Bak;Tsuji, Teruo;Hanamoto, Tsuyoshi
    • 제어로봇시스템학회:학술대회논문집
    • /
    • 2002.10a
    • /
    • pp.75.5-75
    • /
    • 2002
  • $\textbullet$ FFT $\textbullet$ Autocorrelation $\textbullet$ Levinson_Durbin resolution $\textbullet$ LP coefficients $\textbullet$ LP cepstral Coefficients $\textbullet$ Incremental Learning

  • PDF

A study on Effective Feature Parameters Comparison for Speaker Recognition (화자인식에 효과적인 특징벡터에 관한 비교연구)

  • Park TaeSun;Kim Sang-Jin;Kwang Moon;Hahn Minsoo
    • Proceedings of the KSPS conference
    • /
    • 2003.05a
    • /
    • pp.145-148
    • /
    • 2003
  • In this paper, we carried out comparative study about various feature parameters for the effective speaker recognition such as LPC, LPCC, MFCC, Log Area Ratio, Reflection Coefficients, Inverse Sine, and Delta Parameter. We also adopted cepstral liftering and cepstral mean subtraction methods to check their usefulness. Our recognition system is HMM based one with 4 connected-Korean-digit speech database. Various experimental results will help to select the most effective parameter for speaker recognition.

  • PDF

Digital Isolated Word Recognition System based on MFCC and DTW Algorithm (MFCC와 DTW에 알고리즘을 기반으로 한 디지털 고립단어 인식 시스템)

  • Zang, Xian;Chong, Kil-To
    • Proceedings of the KIEE Conference
    • /
    • 2008.10b
    • /
    • pp.290-291
    • /
    • 2008
  • The most popular speech feature used in speech recognition today is the Mel-Frequency Cepstral Coefficients (MFCC) algorithm, which could reflect the perception characteristics of the human ear more accurately than other parameters. This paper adopts MFCC and its first order difference, which could reflect the dynamic character of speech signal, as synthetical parametric representation. Furthermore, we quote Dynamic Time Warping (DTW) algorithm to search match paths in the pattern recognition process. We use the software "GoldWave" to record English digitals in the lab environments and the simulation results indicate the algorithm has higher recognition accuracy than others using LPCC, etc. as character parameters in the experiment for Digital Isolated Word Recognition (DIWR) system.

  • PDF

Performance Comparison of Automatic Detection of Laryngeal Diseases by Voice (후두질환 음성의 자동 식별 성능 비교)

  • Kang Hyun Min;Kim Soo Mi;Kim Yoo Shin;Kim Hyung Soon;Jo Cheol-Woo;Yang Byunggon;Wang Soo-Geun
    • MALSORI
    • /
    • no.45
    • /
    • pp.35-45
    • /
    • 2003
  • Laryngeal diseases cause significant changes in the quality of speech production. Automatic detection of laryngeal diseases by voice is attractive because of its nonintrusive nature. In this paper, we apply speech recognition techniques to detection of laryngeal cancer, and investigate which feature parameters and classification methods are appropriate for this purpose. Linear Predictive Cepstral Coefficients (LPCC) and Mel-Frequency Cepstral Coefficients (MFCC) are examined as feature parameters, and parameters reflecting the periodicity of speech and its perturbation are also considered. As for classifier, multilayer perceptron neural networks and Gaussian Mixture Models (GMM) are employed. According to our experiments, higher order LPCC with the periodic information parameters yields the best performance.

  • PDF

Discriminative Weight Training for Gender Identification (변별적 가중치 학습을 적용한 성별인식 알고리즘)

  • Kang, Sang-Ick;Chang, Joon-Hyuk
    • The Journal of the Acoustical Society of Korea
    • /
    • v.27 no.5
    • /
    • pp.252-255
    • /
    • 2008
  • In this paper, we apply a discriminative weight training to a support vector machine (SVM) based gender identification. In our approach, the gender decision rule is expressed as the SVM of optimally weighted mel-frequency cepstral coefficients (MFCC) based on a minimum classification error (MCE) method which is different from the previous works in that different weights are assigned to each MFCC filter bank which is considered more realistic. According to the experimental results, the proposed approach is found to be effective for gender identification using SVM.