• Title/Summary/Keyword: 잡음첨가 음성 신호 (noise-added speech signal)

Search Result 7, Processing Time 0.026 seconds

Extraction of Unvoiced Consonant Regions from Fluent Korean Speech in Noisy Environments (잡음환경에서 우리말 연속음성의 무성자음 구간 추출 방법)

  • 박정임;하동경;신옥근
    • The Journal of the Acoustical Society of Korea / v.22 no.4 / pp.286-292 / 2003
  • Voice activity detection (VAD) is a process that separates the speech region from the silence or noise region of an input speech signal. Because unvoiced consonant signals have characteristics very similar to those of noise signals, carrying out VAD without paying special attention to unvoiced consonants may seriously distort them or lead to erroneous noise estimation. In this paper, we propose a method that explicitly extracts the boundaries between unvoiced consonants and noise in fluent speech so that VAD can be performed more accurately. The proposed method is based on a histogram in the frequency domain, which Hirsch used successfully for noise estimation, and also on a similarity measure of frequency components between adjacent frames. To evaluate the proposed method, unvoiced consonant boundary extraction experiments were performed on seven kinds of noisy speech signals at SNRs of 10 dB and 15 dB.

Automatic Syllable Segmentation Algorithm in Noise Additional Continuous Speech (잡음이 첨가된 연속음성에서의 자동 음절분할 알고리즘)

  • Kim, Young-Sub;Cha, Young-Dong;Kim, Chang-Keun;Lee, Kwang-Seok;Hur, Kang-In
    • Proceedings of the Korea Institute of Convergence Signal Processing / 2006.06a / pp.17-20 / 2006
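  • For automatic syllable segmentation in continuous speech with added noise, this paper proposes two new noise-robust features to complement short-term energy, the feature parameter in conventional use: a spectral density comparison measure and a linear discriminant function based on the pseudo-inverse matrix. Short-term energy performs well in noise-free environments but not in noisy ones. The proposed measures show the opposite behavior. Therefore, by additionally using a syllable-region decision function that combines the parameters with weights chosen according to the ambient noise level, together with a finite-state machine, syllable regions can be separated at or above a certain level of accuracy in noise-free environments as well as in continuous speech with added noise.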
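As a rough illustration of the short-term energy feature mentioned in this abstract, the sketch below computes the energy of each frame of a sampled signal. The function name `short_term_energy` and the parameters `frame_len` and `hop` are assumptions made for this example; the abstract does not give the paper's frame sizes, and the proposed spectral measures are not reproduced here.

```python
def short_term_energy(samples, frame_len, hop):
    """Return the energy (sum of squared samples) of each full frame."""
    energies = []
    # Slide a window of frame_len samples forward by hop samples at a time.
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = samples[start:start + frame_len]
        energies.append(float(sum(x * x for x in frame)))
    return energies


# Toy signal (hypothetical values): two loud samples followed by two silent ones.
print(short_term_energy([1.0, -1.0, 0.0, 0.0], frame_len=2, hop=2))
```

In a clean recording, frames inside a syllable have much higher energy than the gaps between syllables. Added noise raises the energy of the gaps, which is consistent with the abstract's remark that this feature degrades in noisy conditions.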

Recognition of Corrupted Speech by Noise using Wavelet Packets (웨이블릿 페킷을 이용한 잡음에 손상된 음성신호 인식에 관한 연구)

  • Koh Kwang-hyun;Chang Sungwook;Yang Sung-il;Kwon Y.
    • Proceedings of the Acoustical Society of Korea Conference / autumn / pp.89-92 / 1999
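  • When noise that was absent during recognizer training corrupts the signal during recognition, the recognition rate drops. This paper proposes a method that improves the recognition rate by using wavelet packets as a preprocessing step to remove such quality-degrading noise. A Hidden Markov Model was used as the recognizer, and 15th-order cepstrum was used as the feature parameter. Spoken digits sampled at 11 kHz and corrupted with additive white Gaussian noise were used in the recognition experiments. In speaker-independent experiments on noise-corrupted speech at an SNR of 20 dB, the recognition rate for speech restored after wavelet packet denoising improved by about 10%.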
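The test condition described in this abstract, speech corrupted by additive white Gaussian noise at a given SNR, can be sketched as follows. This is a minimal illustration under stated assumptions: the helper names `mean_power` and `add_noise_at_snr` are hypothetical, a sine wave stands in for real speech, and SNR is taken as the ratio of mean signal power to mean noise power over the whole utterance. The wavelet packet denoising step itself is not shown.

```python
import math
import random


def mean_power(samples):
    """Average of the squared sample values."""
    return sum(x * x for x in samples) / len(samples)


def add_noise_at_snr(speech, noise, snr_db):
    """Scale noise so speech power / noise power equals snr_db, then add it."""
    target_noise_power = mean_power(speech) / (10.0 ** (snr_db / 10.0))
    gain = math.sqrt(target_noise_power / mean_power(noise))
    return [s + gain * n for s, n in zip(speech, noise)]


rng = random.Random(0)  # fixed seed so the example is repeatable
# Toy stand-in for a speech signal (assumption): five periods of a sine wave.
speech = [math.sin(2.0 * math.pi * 5.0 * t / 200.0) for t in range(200)]
noise = [rng.gauss(0.0, 1.0) for _ in range(200)]  # white Gaussian noise
noisy = add_noise_at_snr(speech, noise, snr_db=20.0)
```

Scaling the noise rather than the speech keeps the speech level fixed across SNR conditions, which makes results at different SNRs easier to compare.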

The suppression of noise-induced speech distortions for speech recognition (음성인식을 위한 잡음하의 음성왜곡제거)

  • Chi, Sang-Mun;Oh, Yung-Hwan
    • Journal of the Korean Institute of Telematics and Electronics S / v.35S no.12 / pp.93-102 / 1998
  • In noisy environments, human speech production is influenced by the noise (the Lombard effect), and the speech signal is contaminated. These distortions dramatically reduce the performance of speech recognition systems. This paper proposes a method of Lombard effect compensation and noise suppression to improve speech recognition performance in noisy environments. To estimate the intensity of the Lombard effect, which is a nonlinear distortion depending on the ambient noise level, the speaker, and the phonetic unit, we formulate a measure of the Lombard effect level based on the acoustic speech signal and use it to compensate for the effect. The distortions of speech in noisy environments are cancelled as follows. First, spectral subtraction and band-pass filtering are used to cancel noise. Second, energy normalization is proposed to cancel the variation of vocal intensity caused by the Lombard effect. Finally, the Lombard effect level controls the transform that converts the Lombard speech cepstrum to the clean speech cepstrum. The proposed method was validated on a 50-word Korean recognition task. Average recognition rates were 82.6%, 95.7%, and 97.6% with the proposed method, compared with 46.3%, 75.5%, and 87.4% without any compensation, at SNRs of 0, 10, and 20 dB, respectively.
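The first step listed in this abstract, spectral subtraction, can be illustrated with a generic magnitude-domain sketch. It is not the authors' implementation: the function name `spectral_subtract`, the `over_subtraction` factor, and the spectral `floor` are assumptions for this example, and the noise magnitudes are assumed to have been estimated beforehand (for instance from non-speech frames).

```python
def spectral_subtract(noisy_mag, noise_mag, over_subtraction=1.0, floor=0.01):
    """Subtract an estimated noise magnitude from each frequency bin.

    Bins that would fall below floor * noisy magnitude are clamped to
    that floor so no bin becomes negative.
    """
    cleaned = []
    for y, n in zip(noisy_mag, noise_mag):
        diff = y - over_subtraction * n
        cleaned.append(max(diff, floor * y))
    return cleaned


# Hypothetical magnitudes for three frequency bins of one frame.
frame = [4.0, 2.0, 1.0]
noise_estimate = [1.0, 1.0, 2.0]
print(spectral_subtract(frame, noise_estimate, floor=0.25))
```

The floor is a common way to avoid negative magnitudes when the noise estimate exceeds the observed value in a bin, as it does in the third bin of this example.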

Normalization of Spectral Magnitude and Cepstral Transformation for Compensation of Lombard Effect (롬바드 효과의 보정을 위한 스펙트럼 크기의 정규화와 켑스트럼 변환)

  • Chi, Sang-Mun;Oh, Yung-Hwan
    • The Journal of the Acoustical Society of Korea / v.15 no.4 / pp.83-92 / 1996
  • This paper describes Lombard effect compensation and noise suppression to reduce speech recognition errors in noisy environments. The Lombard effect is represented by the variation of the spectral envelope of energy-normalized words and the variation of overall vocal intensity. The variation of the spectral envelope can be compensated by a linear transformation in the cepstral domain. The variation of vocal intensity is canceled by spectral magnitude normalization. Spectral subtraction is used to suppress noise contamination, and band-pass filtering is used to emphasize dynamic features. To understand the Lombard effect and verify the effectiveness of the proposed method, speech data were collected in simulated noisy environments. Recognition experiments were conducted with contamination by noise from automobile cabins, an exhibition hall, downtown telephone booths, crowded streets, and computer rooms. The experiments confirmed the effectiveness of the proposed method.

Speech Recognition in Noisy environment using Transition Constrained HMM (천이 제한 HMM을 이용한 잡음 환경에서의 음성 인식)

  • Kim, Weon-Goo;Shin, Won-Ho;Youn, Dae-Hee
    • The Journal of the Acoustical Society of Korea / v.15 no.2 / pp.85-89 / 1996
  • In this paper, a transition constrained Hidden Markov Model (HMM), in which transitions between states occur only within prescribed time slots, is proposed and its performance is evaluated in noisy environments. The transition constrained HMM can explicitly limit state durations and describe the temporal structure of the speech signal accurately, simply, and efficiently. It is not only superior to the conventional HMM but also requires much less computation time. To evaluate its performance, speaker-independent isolated word recognition experiments were conducted using a semi-continuous HMM with noisy speech at SNRs of 20, 10, and 0 dB. The experimental results show that the proposed method is robust to environmental noise. The word recognition rates of 81.08% and 75.36% obtained with the conventional HMM increased by 7.31% and 10.35%, respectively, with the transition constrained HMM when two kinds of noise were added at 10 dB SNR.

A Robust Speaker Identification Method Based on the Wavelet Filter Banks (웨이블렛 필터뱅크에 기반을 둔 강인한 화자식별 기법)

  • Lee, Dae-Jong;Gwak, Geun-Chang;Yu, Jeong-Ung;Jeon, Myeong-Geun
    • The KIPS Transactions: Part C / v.9C no.4 / pp.459-466 / 2002
  • This paper proposes a robust speaker identification algorithm based on wavelet filter banks and a multiple decision-making scheme. Because the proposed algorithm performs identification independently for each subband, the effect of noise in a subband can be localized. This makes the results more robust to environmental noises, which are generally band-limited. In the experiments, the proposed method showed an improvement of 15 to 60% over the vector quantization method in various noisy environments.