• Title/Summary/Keyword: speech feature parameters

A study on Effective Feature Parameters Comparison for Speaker Recognition (화자인식에 효과적인 특징벡터에 관한 비교연구)

  • Park TaeSun;Kim Sang-Jin;Kwang Moon;Hahn Minsoo
    • Proceedings of the KSPS conference / 2003.05a / pp.145-148 / 2003
  • In this paper, we carry out a comparative study of various feature parameters for effective speaker recognition, including LPC, LPCC, MFCC, log area ratio, reflection coefficients, inverse sine, and delta parameters. We also apply cepstral liftering and cepstral mean subtraction to check their usefulness. Our recognition system is HMM-based and uses a 4-connected-Korean-digit speech database. The experimental results should help in selecting the most effective parameters for speaker recognition.
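As a rough illustration of the two normalization steps named in this abstract, the sketch below applies cepstral mean subtraction and sinusoidal liftering to a list of cepstral frames. It is not the authors' implementation: the function names, the default lifter length of 22, and the convention that index 0 of each frame holds c0 are assumptions made for the example.

```python
import math


def cepstral_mean_subtraction(frames):
    """Subtract the per-coefficient mean taken over all frames of one utterance."""
    n_frames = len(frames)
    n_coeffs = len(frames[0])
    means = [sum(f[i] for f in frames) / n_frames for i in range(n_coeffs)]
    return [[f[i] - means[i] for i in range(n_coeffs)] for f in frames]


def sinusoidal_lifter(frame, lifter=22):
    """Weight coefficient n by 1 + (L / 2) * sin(pi * n / L).

    Assumes frame[0] is c0, so c0 keeps a weight of exactly 1.
    """
    return [
        c * (1.0 + (lifter / 2.0) * math.sin(math.pi * n / lifter))
        for n, c in enumerate(frame)
    ]


# Toy input: two frames with two cepstral coefficients each (illustrative values).
frames = [[1.0, 2.0], [3.0, 6.0]]
normalized = [sinusoidal_lifter(f) for f in cepstral_mean_subtraction(frames)]
```

Subtracting the utterance-level mean removes any offset that is constant across frames, which is why the technique is usually applied per utterance rather than over the whole database.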

Speech Query Recognition for Tamil Language Using Wavelet and Wavelet Packets

  • Iswarya, P.;Radha, V.
    • Journal of Information Processing Systems / v.13 no.5 / pp.1135-1148 / 2017
  • Speech recognition is one of the fascinating fields in computer science. The accuracy of a speech recognition system may be reduced by noise present in the speech signal. Noise removal is therefore an essential step in an Automatic Speech Recognition (ASR) system, and this paper proposes a new technique called combined thresholding for noise removal. Feature extraction is the process of converting an acoustic signal into its most valuable set of parameters. This paper also concentrates on improving Mel Frequency Cepstral Coefficient (MFCC) features by introducing the Discrete Wavelet Packet Transform (DWPT) in place of the Discrete Fourier Transform (DFT) block to provide efficient signal analysis. Because the feature vector varies in size, a Self-Organizing Map (SOM) is used to choose the correct feature vector length. Since a single classifier does not provide enough accuracy, this research proposes an Ensemble Support Vector Machine (ESVM) classifier that takes the fixed-length feature vector from the SOM as input, termed ESVM_SOM. The experimental results show that the proposed methods provide better results than the existing methods.

A Korean Speech Recognition Using Fuzzy Rule Base (Fuzzy Rule Base를 이용한 한국어 연속 음성인식)

  • Song, Jeong-Young
    • The Journal of Engineering Research / v.2 no.1 / pp.13-21 / 1997
  • This paper describes how to represent variations of feature parameters to improve the recognition of continuous speech. For speech recognition, feature parameters such as formant frequencies, pitch, logarithmic energy, and zero-crossing rate are generally used. However, their values and variations depend on the speaker, for example on differences between men and women and on age. It is difficult to decide a priori the width of this variation. Hence, we represent the variation by introducing fuzziness and recognize continuous speech by fuzzy inference using fuzzy production rules.
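The idea of replacing a hard threshold with a graded membership can be sketched as follows. The abstract does not give the paper's membership functions or rules, so the triangular shape, the min operator for AND, and all numeric ranges below are assumptions chosen only for illustration.

```python
def triangular(x, left, peak, right):
    """Membership grade in [0, 1] for a triangular fuzzy set."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


def rule_strength(grades):
    """Firing strength of an AND rule: the minimum of its antecedent grades."""
    return min(grades)


# Hypothetical rule: IF F1 is "low" AND F2 is "high" THEN candidate label A.
# The formant ranges (in Hz) are made-up values, not taken from the paper.
f1_low = triangular(320.0, 200.0, 300.0, 400.0)
f2_high = triangular(2250.0, 2000.0, 2500.0, 3000.0)
strength = rule_strength([f1_low, f2_high])
```

A measured value that falls between two speakers' typical ranges then contributes a partial grade to each matching rule instead of being rejected outright.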

Voice Activity Detection Algorithm Using Speech Periodicity and QSNR in Noisy Environment (음성의 주기성과 QSNR을 이용한 잡음환경에서의 음성검출 알고리즘)

  • Jeong, Ju-Hyun;Song, Hwa-Jeon;Kim, Hyung-Soon
    • Proceedings of the KSPS conference / 2005.11a / pp.59-62 / 2005
  • Voice activity detection (VAD) is important in many areas of speech processing technology. Speech/nonspeech discrimination in noisy environments is a difficult task because the feature parameters used for VAD are sensitive to the surrounding environment. As a result, VAD performance is severely degraded at low signal-to-noise ratios (SNRs). In this paper, a new VAD algorithm is proposed based on the degree of voicing and Quantile SNR (QSNR). These two feature parameters are more robust in noisy environments than other features such as energy and spectral entropy. The effectiveness of the proposed algorithm is evaluated under the diverse noisy environments in the Aurora2 DB. According to our experiments, the proposed VAD outperforms the ETSI Advanced Frontend VAD.
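One common way to quantify periodicity in a frame is the peak of its normalized autocorrelation over a plausible pitch-lag range. The sketch below uses that measure as a stand-in for "degree of voicing"; the abstract does not define the paper's measure or the QSNR computation, so the function name, the normalization by zero-lag energy, and the lag range are all assumptions.

```python
import math


def degree_of_voicing(frame, min_lag, max_lag):
    """Peak autocorrelation over [min_lag, max_lag], normalized by frame energy.

    Values near 1 suggest a strongly periodic (voiced) frame; values near 0
    suggest no periodicity in the searched lag range.
    """
    energy = sum(x * x for x in frame)
    if energy == 0.0:
        return 0.0
    best = 0.0
    for lag in range(min_lag, max_lag + 1):
        r = sum(frame[i] * frame[i + lag] for i in range(len(frame) - lag))
        best = max(best, r / energy)
    return best


# Synthetic test frames: a sinusoid with a 40-sample period and a single impulse.
periodic = [math.sin(2.0 * math.pi * i / 40.0) for i in range(400)]
impulse = [1.0] + [0.0] * 399

voiced_score = degree_of_voicing(periodic, 20, 100)
unvoiced_score = degree_of_voicing(impulse, 20, 100)
```

Because the score is a ratio, it does not change when the frame is scaled, which is one reason periodicity measures tend to be less sensitive to level than raw energy.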

Enhancement of Ship's Wheel Order Recognition System using Speaker's Intention Predictive Parameters (화자의도예측 파라미터를 이용한 조타명령 음성인식 시스템의 개선)

  • Moon, Serng-Bae
    • Journal of Advanced Marine Engineering and Technology / v.32 no.5 / pp.791-797 / 2008
  • The officer of the deck (OOD) may sometimes have to keep lookout as well as handle the autopilot at sea without a quartermaster. The purpose of this paper is to develop a ship's autopilot control module using speech recognition in order to reduce the potential risk of a one-man bridge system. The feature parameters predicting the OOD's intention were extracted from sample wheel orders written in SMCP (IMO Standard Marine Communication Phrases). We designed a pre-recognition procedure that generates candidate words using the DTW (Dynamic Time Warping) algorithm and a post-recognition procedure that makes the final decision among the candidate words using the feature parameters. To evaluate the effectiveness of these procedures, an experiment was conducted with 500 wheel orders.
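The candidate-generation step can be pictured with a textbook DTW distance followed by a nearest-template ranking. This is a minimal sketch, not the paper's procedure: it assumes one scalar feature per frame, an absolute-difference local cost, and hypothetical template labels and values.

```python
def dtw_distance(a, b):
    """Classic dynamic time warping distance between two 1-D sequences."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = best accumulated cost aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def candidate_words(query, templates, k=2):
    """Return the k template labels closest to the query under DTW."""
    ranked = sorted(templates, key=lambda label: dtw_distance(query, templates[label]))
    return ranked[:k]


# Hypothetical templates; real systems would store frame-level feature vectors.
templates = {
    "order_a": [1.0, 2.0, 3.0],
    "order_b": [5.0, 5.0, 5.0],
    "order_c": [3.0, 2.0, 1.0],
}
candidates = candidate_words([1.0, 2.0, 2.0, 3.0], templates, k=2)
```

Returning several candidates instead of a single best match leaves room for a later stage to pick among them using additional information, which is the role the abstract assigns to the post-recognition procedure.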

Speaker Change Detection by Normalization of Phonetic Characteristics (음소 특성 정규화를 통한 화자 변화 검출)

  • Kim Hyung Soon;Park Hae Young;Park Sun Young
    • MALSORI / no.47 / pp.97-107 / 2003
  • Speaker change detection automatically detects the point in time at which the speaker changes. Since the feature parameters used for speaker change detection depend not only on speaker characteristics but also on phonetic characteristics, the spoken content embedded in the feature parameters inevitably degrades the performance of speaker change detection. In this paper, to alleviate this problem, a method that normalizes phonetic variations in speech feature parameters is proposed to emphasize changes due to speaker characteristics. Experimental results show that the proposed method improves the performance of speaker change detection.

Adoption of Support Vector Machine and Independent Component Analysis for Implementation of Speech Recognizer (음성인식기 구현을 위한 SVM과 독립성분분석 기법의 적용)

  • 박정원;김평환;김창근;허강인
    • Proceedings of the IEEK Conference / 2003.07e / pp.2164-2167 / 2003
  • In this paper, we propose an effective speech recognizer based on recognition experiments with three feature parameters (PCA, ICA, and MFCC) using an SVM (Support Vector Machine) classifier. In general, an SVM is a classification method that separates two classes by finding an arbitrary nonlinear boundary in vector space, and it achieves high classification performance even with little training data. We compare the recognition results for each feature parameter and propose the ICA feature as the most effective parameter.

Parameters Comparison in the speaker Identification under the Noisy Environments (화자식별을 위한 파라미터의 잡음환경에서의 성능비교)

  • Choi, Hong-Sub
    • Speech Sciences / v.7 no.3 / pp.185-195 / 2000
  • This paper compares the feature parameters used in speaker identification systems under noisy environments. The feature parameters compared are LP cepstrum (LPCC), cepstral mean subtraction (CMS), pole-filtered CMS (PFCMS), adaptive component weighted cepstrum (ACW), and postfilter cepstrum (PF). A GMM-based text-independent speaker identification system is designed for this purpose. A series of experiments shows that the LPCC parameter is adequate for modelling the speaker when the training and test environments match. Under mismatched training and testing conditions, however, the modified parameters are preferable to LPCC. In particular, the CMS and PFCMS parameters are more effective under microphone mismatch, while the ACW and PF parameters perform well under noisier mismatches.
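For readers unfamiliar with the LP cepstrum, the sketch below converts LPC coefficients to cepstral coefficients with the standard recursion. It assumes the all-pole model H(z) = 1 / (1 - sum_k a_k z^-k) and omits the gain term c0; the function name and argument layout are illustrative, and sign conventions differ between toolkits.

```python
def lpc_to_cepstrum(a, n_ceps):
    """Convert LPC coefficients a_1..a_p to cepstral coefficients c_1..c_n.

    a[k - 1] holds a_k for H(z) = 1 / (1 - sum_k a_k z^-k).
    Recursion: c_n = a_n + (1 / n) * sum_{k=1}^{n-1} k * c_k * a_{n-k},
    with a_n taken as 0 for n > p.
    """
    p = len(a)
    c = [0.0] * (n_ceps + 1)  # c[0] is unused (gain term omitted)
    for n in range(1, n_ceps + 1):
        acc = a[n - 1] if n <= p else 0.0
        # Only terms with 1 <= n - k <= p contribute.
        for k in range(max(1, n - p), n):
            acc += (k / n) * c[k] * a[n - k - 1]
        c[n] = acc
    return c[1:]


# Single-pole check: for a_1 = 0.5 the closed form is c_n = 0.5**n / n.
ceps = lpc_to_cepstrum([0.5], 3)
```

The single-pole case is a convenient sanity check because the logarithm of 1 / (1 - a z^-1) has a known power series.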

A study on the recognition performance of connected digit telephone speech for MFCC feature parameters obtained from the filter bank adapted to training speech database (훈련음성 데이터에 적응시킨 필터뱅크 기반의 MFCC 특징파라미터를 이용한 전화음성 연속숫자음의 인식성능 향상에 관한 연구)

  • Jung Sung Yun;Kim Min Sung;Son Jong Mok;Bae Keun Sung;Kang Jeom Ja
    • Proceedings of the KSPS conference / 2003.05a / pp.119-122 / 2003
  • In general, triangular filters are used in the filter bank when MFCCs are computed from the spectrum of a speech signal. In [1], a new feature extraction approach is proposed that uses specific filter shapes obtained from the spectrum of the training speech data. In this approach, principal component analysis is applied to the spectrum of the training data to obtain the filter coefficients. In this paper, we carry out speech recognition experiments using the approach given in [1] on a large amount of telephone speech data, namely the Korean connected-digit telephone speech database released by SITEC. The experimental results are discussed together with our findings.
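For reference, the conventional triangular filter bank that the abstract takes as its starting point can be sketched as below. This shows only the baseline, not the PCA-derived filters of [1]; the mel formula used (2595 * log10(1 + f / 700)) is one common variant, and the function names and parameter values are assumptions.

```python
import math


def hz_to_mel(f):
    return 2595.0 * math.log10(1.0 + f / 700.0)


def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def triangular_filterbank(n_filters, n_fft, sample_rate):
    """Return n_filters rows of weights over the n_fft // 2 + 1 spectrum bins."""
    n_bins = n_fft // 2 + 1
    mel_max = hz_to_mel(sample_rate / 2.0)
    # n_filters + 2 edge frequencies, equally spaced on the mel scale.
    edges_hz = [mel_to_hz(mel_max * i / (n_filters + 1)) for i in range(n_filters + 2)]
    bin_hz = [k * sample_rate / n_fft for k in range(n_bins)]
    bank = []
    for m in range(1, n_filters + 1):
        lo, mid, hi = edges_hz[m - 1], edges_hz[m], edges_hz[m + 1]
        row = []
        for f in bin_hz:
            if lo < f <= mid:
                row.append((f - lo) / (mid - lo))  # rising slope
            elif mid < f < hi:
                row.append((hi - f) / (hi - mid))  # falling slope
            else:
                row.append(0.0)
        bank.append(row)
    return bank


# Example sizes only: 10 filters, 256-point FFT, 8 kHz sampling.
bank = triangular_filterbank(10, 256, 8000)
```

A data-driven approach such as the one described in the abstract would replace the fixed rising and falling slopes in each row with weights estimated from training spectra.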

Qualitative Classification of Voice Quality of Normal Speech and Derivation of its Correlation with Speech Features (정상 음성의 목소리 특성의 정성적 분류와 음성 특징과의 상관관계 도출)

  • Kim, Jungin;Kwon, Chulhong
    • Phonetics and Speech Sciences / v.6 no.1 / pp.71-76 / 2014
  • In this paper, the voice quality of normal speech is qualitatively classified into five components: breathy, creaky, rough, nasal, and thin/thick voice. To determine whether a correlation exists between subjective and objective measures of voice, each voice is perceptually evaluated on a 1/2/3 scale by speech processing specialists and acoustically analyzed using speech analysis tools such as Praat, MDVP, and VoiceSauce. The speech parameters include features related to the speech source and the vocal tract filter. The statistical analysis uses a two-independent-samples non-parametric test. The analysis identified a significant correlation between the speech feature parameters and the components of voice quality.