• Title/Summary/Keyword: PFCMS

Search Result 3, Processing Time 0.017 seconds

Channel Compensation for Cepstrum-Based Detection of Laryngeal Diseases (켑스트럼 기반의 후두암 감별을 위한 채널보상)

  • Kim Young Kuk;Kim Su Mi;Kim Hyung Soon;Wang Soo-Geun;Jo Cheol-Woo;Yang Byung-Gon
    • MALSORI
    • /
    • no.50
    • /
    • pp.111-122
    • /
    • 2004
  • Automatic detection of laryngeal diseases by voice is attractive because of its non-intrusive nature. Cepstrum based approach to detect laryngeal cancer shows reliable performance even when the periodicity of voice signals is severely lost, but it has a drawback that it is not robust to channel mismatch due to different microphone characteristics. In this paper, to deal with mismatched training and test microphone conditions, we investigate channel compensation techniques such as Cepstral Mean Subtraction (CMS) and Pole Filtered CMS (PFCMS). According to our experiments, PFCMS yields better performance than CMS. By using PFCMS, we obtained 12% and 40% error reduction over baseline and CMS, respectively.

  • PDF

Parameters Comparison in the speaker Identification under the Noisy Environments (화자식별을 위한 파라미터의 잡음환경에서의 성능비교)

  • Choi, Hong-Sub
    • Speech Sciences
    • /
    • v.7 no.3
    • /
    • pp.185-195
    • /
    • 2000
  • This paper seeks to compare the feature parameters used in speaker identification systems under noisy environments. The feature parameters compared are LP cepstrum (LPCC), Cepstral mean subtraction(CMS), Pole-filtered CMS(PFCMS), Adaptive component weighted cepstrum(ACW) and Postfilter cepstrum(PF). The GMM-based text independent speaker identification system is designed for this target. Some series of experiments show that the LPCC parameter is adequate for modelling the speaker in the matched environments between train and test stages. But in the mismatched training and testing conditions, modified parameters are preferable the LPCC. Especially CMS and PFCMS parameters are more effective for the microphone mismatching conditions while the ACW and PF parameters are good for more noisy mismatches.

  • PDF

Formant-broadened CMS Using the Log-spectrum Transformed from the Cepstrum (켑스트럼으로부터 변환된 로그 스펙트럼을 이용한 포먼트 평활화 켑스트럴 평균 차감법)

  • 김유진;정혜경;정재호
    • The Journal of the Acoustical Society of Korea
    • /
    • v.21 no.4
    • /
    • pp.361-373
    • /
    • 2002
  • In this paper, we propose a channel normalization method to improve the performance of CMS (cepstral mean subtraction) which is widely adopted to normalize a channel variation for speech and speaker recognition. CMS which estimates the channel effects by averaging long-term cepstrum has a weak point that the estimated channel is biased by the formants of voiced speech which include a useful speech information. The proposed Formant-broadened Cepstral Mean Subtraction (FBCMS) is based on the facts that the formants can be found easily in log spectrum which is transformed from the cepstrum by fourier transform and the formants correspond to the dominant poles of all-pole model which is usually modeled vocal tract. The FBCMS evaluates only poles to be broadened from the log spectrum without polynomial factorization and makes a formant-broadened cepstrum by broadening the bandwidths of formant poles. We can estimate the channel cepstrum effectively by averaging formant-broadened cepstral coefficients. We performed the experiments to compare FBCMS with CMS, PFCMS using 4 simulated telephone channels. In the experiment of channel estimation, we evaluated the distance cepstrum of real channel from the cepstrum of estimated channel and found that we were able to get the mean cepstrum closer to the channel cepstrum due to an softening the bias of mean cepstrum to speech. In the experiment of text-independent speaker identification, we showed the result that the proposed method was superior than the conventional CMS and comparable to the pole-filtered CMS. Consequently, we showed the proposed method was efficiently able to normalize the channel variation based on the conventional CMS.