• Title/Abstract/Keywords: speech distortion

Search results: 227 items (processing time: 0.024 s)

Spectral subtraction based on speech state and masking effect

  • 김우일;강선미;고한석
    • 대한전자공학회:학술대회논문집 / 대한전자공학회 1998년도 하계종합학술대회논문집 / pp.599-602 / 1998
  • In this paper, a speech enhancement method based on phonemic properties and the masking effect is proposed. It is a modified form of spectral subtraction in which a spectral sharpening process is applied in the unvoiced state, taking the phonemic properties into account. The masking threshold is used to remove the residual noise. In terms of SNR, the proposed method performs comparably to classical spectral subtraction. With the proposed scheme, however, the unvoiced regions of the enhanced speech exhibit relatively less signal distortion.
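
For orientation, the classical spectral subtraction that this abstract uses as its baseline can be sketched as below. This is a minimal illustration, not the authors' method: it omits the spectral sharpening and masking-threshold steps, and the function name, `alpha` (over-subtraction factor), and `beta` (spectral floor) are assumptions introduced here.

```python
def spectral_subtract(noisy_mag, noise_mag, alpha=1.0, beta=0.01):
    """Classical magnitude spectral subtraction for a single frame.

    noisy_mag: magnitude spectrum of the noisy frame (list of floats)
    noise_mag: estimated noise magnitude spectrum (same length)
    alpha:     over-subtraction factor (hypothetical default)
    beta:      spectral floor, as a fraction of the noisy magnitude
    """
    if len(noisy_mag) != len(noise_mag):
        raise ValueError("spectra must have the same length")
    enhanced = []
    for y, n in zip(noisy_mag, noise_mag):
        # Subtract the noise estimate, but never go below the floor,
        # so that no bin ends up with a negative magnitude.
        enhanced.append(max(y - alpha * n, beta * y))
    return enhanced
```

Bins where the noise estimate exceeds the observed magnitude are clamped to the floor, and the isolated bins that survive this clamping are the usual source of the residual noise the abstract refers to.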


Noise Reduction Using MMSE Estimator-based Adaptive Comb Filtering

  • 박정식;오영환
    • 대한음성학회지:말소리 / No. 60 / pp.181-190 / 2006
  • This paper describes a speech enhancement scheme that leads to significant improvements in recognition performance when used in the ASR front-end. The proposed approach is based on adaptive comb filtering and an MMSE-related parameter estimator. While adaptive comb filtering reduces noise components remarkably, it is rarely effective in reducing non-stationary noises. Furthermore, due to the uniformly distributed frequency response of the comb-filter, it can cause serious distortion to clean speech signals. This paper proposes an improved comb-filter that adjusts its spectral magnitude to the original speech, based on the speech absence probability and the gain modification function. In addition, we introduce the modified comb filtering-based speech enhancement scheme for ASR in mobile environments. Evaluation experiments carried out using the Aurora 2 database demonstrate that the proposed method outperforms conventional adaptive comb filtering techniques in both clean and noisy environments.
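
As background, a plain comb filter with uniform weights can be sketched as follows. It averages samples spaced one pitch period apart, which preserves a periodic (voiced) signal and attenuates components that do not repeat at that period. This is not the MMSE-based adaptive variant the paper proposes; the function name, the fixed `period`, and the uniform tap weights are assumptions made for illustration.

```python
def comb_filter(x, period, num_taps=3):
    """FIR comb filter: average samples spaced `period` samples apart.

    x:        input samples (list of floats)
    period:   pitch period in samples (assumed known and constant here)
    num_taps: odd number of taps centred on the current sample
    """
    if period <= 0:
        raise ValueError("period must be positive")
    if num_taps < 1 or num_taps % 2 == 0:
        raise ValueError("num_taps must be a positive odd number")
    half = num_taps // 2
    y = []
    for n in range(len(x)):
        acc, count = 0.0, 0
        for k in range(-half, half + 1):
            idx = n + k * period
            # Near the signal edges, average only the taps that exist.
            if 0 <= idx < len(x):
                acc += x[idx]
                count += 1
        y.append(acc / count)
    return y
```

Because every tap carries the same weight, a filter like this also smears speech whose pitch drifts from frame to frame, which is the distortion problem the abstract attributes to conventional comb filtering.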


A STUDY ON THE SPEECH SYNTHESIS-BY-RULE SYSTEM APPLIED MULTIBAND EXCITATION SIGNAL

  • Kyung, Younjeong;Kim, Geesoon;Lee, Hwangsoo;Lee, Yanghee
    • 한국음향학회:학술대회논문집 / 한국음향학회 1994년도 FIFTH WESTERN PACIFIC REGIONAL ACOUSTICS CONFERENCE SEOUL KOREA / pp.1098-1103 / 1994
  • In this paper, we design and implement a Korean speech synthesis-by-rule system. The system applies a multiband excitation signal to voiced sounds. The multiband excitation signal is obtained by mixing an impulse spectrum and a white noise spectrum. We find that this improves the quality of the synthesized speech. We also classify the voiced sounds using a cepstral Euclidean distance measure to reduce memory overhead. A representative excitation signal for each group of voiced sounds is used as the excitation signal during synthesis. This grouping does not affect the quality of the synthesized speech. Experimental results show that the method eliminates the "buzziness" of synthesized speech and reduces its spectral distortion.


A Study of Acoustic Masking Effect from Formant Enhancement in Digital Hearing Aid

  • 전유용;길세기;윤광섭;이상민
    • 전자공학회논문지SC / Vol. 45, No. 5 / pp.13-20 / 2008
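  • Although digital hearing aid algorithms have been developed to compensate for hearing loss and to let hearing-impaired people converse with others, digital hearing aid users still complain that they have difficulty hearing speech. One reason is that the quality of speech delivered through a digital hearing aid is insufficient for understanding, owing to feedback, residual noise, and similar factors. Another possible reason is masking between formants. In this study, we measured the masking characteristics of normal-hearing subjects and of hearing-impaired subjects with presbycusis in order to confirm the degradation of speech perception caused by masking in speech. The experiment consisted of five tests: pure-tone audiometry, a speech reception threshold test, a word discrimination test, a pure-tone masking test, and a speech masking test. In the speech masking test, 25 speech items were used in each speech set. The log likelihood ratio (LLR) was introduced to evaluate the distortion of each speech item objectively. As a result, speech perception decreased as the amount of formant enhancement increased, and although the enhanced speech items in each set had statistically similar LLRs, their speech perception scores did not. This indicates that acoustic masking, rather than distortion, affects speech perception. Indeed, frequency analysis of the speech items that most subjects failed to identify showed a level difference of about 35 dB between the first and second formants, which is similar to the pure-tone masking results (normal-hearing subjects: 36.36 dB; hearing-impaired subjects: 32.86 dB). As the results show, acoustic masking characteristics differ between normal-hearing and hearing-impaired listeners. Masking characteristics should therefore be measured before a hearing aid is worn and applied during fitting.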

Speech Denoising via Low-Rank and Sparse Matrix Decomposition

  • Huang, Jianjun;Zhang, Xiongwei;Zhang, Yafei;Zou, Xia;Zeng, Li
    • ETRI Journal / Vol. 36, No. 1 / pp.167-170 / 2014
  • In this letter, we propose an unsupervised framework for speech noise reduction based on the recent development of low-rank and sparse matrix decomposition. The proposed framework directly separates the speech signal from noisy speech by decomposing the noisy speech spectrogram into three submatrices: the noise structure matrix, the clean speech structure matrix, and the residual noise matrix. Evaluations on the Noisex-92 dataset show that the proposed method achieves a signal-to-distortion ratio approximately 2.48 dB and 3.23 dB higher than that of the robust principal component analysis method and the non-negative matrix factorization method, respectively, when the input SNR is -5 dB.

Analysis of Error Patterns in Korean Connected Digit Telephone Speech Recognition

  • 김민성;정성윤;손종목;배건성;김상훈
    • 대한음성학회지:말소리 / No. 46 / pp.77-86 / 2003
  • Channel distortion and coarticulation effects in Korean connected digit telephone speech make it difficult to achieve high connected digit recognition performance in the telephone environment. In this paper, as basic research toward improving the recognition performance for Korean connected digit telephone speech, recognition error patterns are investigated and analyzed. The Korean connected digit telephone speech database released by SiTEC and the HTK system are used for the recognition experiments. The DWFBA and MRTCN methods are used for feature extraction and channel compensation, respectively. Experimental results are discussed along with our findings.


Abrupt Noise Cancellation and Speech Restoration for Speech Enhancement

  • 손백권;한민수
    • 대한음성학회:학술대회논문집 / 대한음성학회 2003년도 10월 학술대회지 / pp.101-104 / 2003
  • In this paper, speech quality is improved by removing abrupt noise intervals and then filling the gaps with estimates derived from the preceding speech waveform. An abrupt noise detection signal is proposed in the form of a prediction error signal computed with the LP coefficients of the previous frame. Abrupt noise intervals are estimated using spectral energy. After removing the estimated noise intervals, we apply several waveform substitution techniques: zero substitution, previous frame repetition, pattern matching, and pitch waveform replication. To validate the algorithm, an LPC spectral distortion test and a recognition test are carried out, and the results show that speech quality is improved considerably.
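
Of the substitution techniques listed in this abstract, previous frame repetition is the simplest to illustrate. The sketch below assumes the noisy interval has already been detected and is given as sample indices; the function name and its arguments are hypothetical and are not taken from the paper.

```python
def substitute_previous(signal, start, end):
    """Fill signal[start:end] by repeating the samples just before it.

    signal: list of samples
    start:  first index of the detected noise interval (inclusive)
    end:    last index of the detected noise interval (exclusive)
    """
    if not (0 < start < end <= len(signal)):
        raise ValueError("need 0 < start < end <= len(signal)")
    gap = end - start
    # Take up to `gap` samples of history immediately before the interval.
    history = signal[max(0, start - gap):start]
    out = list(signal)
    for i in range(gap):
        # Cycle through the history if it is shorter than the gap.
        out[start + i] = history[i % len(history)]
    return out
```

For example, `substitute_previous([1, 2, 3, 99, 99, 99, 7], 3, 6)` overwrites the three corrupted samples with a copy of the three samples that precede them. Pattern matching and pitch waveform replication refine this idea by choosing which earlier segment to copy.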


Performance Improvement of SPLICE-based Noise Compensation for Robust Speech Recognition

  • 김형순;김두희
    • 음성과학 / Vol. 10, No. 3 / pp.263-277 / 2003
  • One of the major problems in speech recognition is performance degradation due to the mismatch between the training and test environments. Recently, Stereo-based Piecewise LInear Compensation for Environments (SPLICE), a frame-based bias removal algorithm for cepstral enhancement that uses stereo training data and models noisy speech as a mixture of Gaussians, was proposed and showed good performance in noisy environments. In this paper, we propose several methods to improve the conventional SPLICE. First, we apply Cepstral Mean Subtraction (CMS) as a preprocessor to SPLICE, instead of applying it as a postprocessor. Second, to compensate for residual distortion after SPLICE processing, a two-stage SPLICE is proposed. Third, we employ phonetic information for training the SPLICE model. In experiments on the Aurora 2 database, the proposed method outperformed the conventional SPLICE, and we achieved a 50% decrease in word error rate relative to the Aurora baseline system.
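
The CMS step mentioned in this abstract is simple enough to sketch. The version below assumes per-utterance processing over a list of cepstral vectors; the function name and data layout are illustrative assumptions, and the SPLICE stages themselves are not shown.

```python
def cepstral_mean_subtraction(frames):
    """Subtract the per-coefficient mean over all frames of an utterance.

    frames: list of cepstral vectors (lists of floats of equal length)
    Returns a new list of mean-normalised vectors.
    """
    if not frames:
        return []
    dim = len(frames[0])
    if any(len(f) != dim for f in frames):
        raise ValueError("all frames must have the same dimension")
    # A stationary convolutional channel adds a constant offset in the
    # cepstral domain, so removing the long-term mean cancels it.
    means = [sum(f[d] for f in frames) / len(frames) for d in range(dim)]
    return [[f[d] - means[d] for d in range(dim)] for f in frames]
```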


Multi-Channel Speech Enhancement Algorithm Using DOA-based Learning Rate Control

  • 김수환;이영재;김영일;정상배
    • 말소리와 음성과학 / Vol. 3, No. 3 / pp.91-98 / 2011
  • In this paper, a multi-channel speech enhancement method using the linearly constrained minimum variance (LCMV) algorithm and a variable learning rate control is proposed. To control the learning rate for adaptive filters of the LCMV algorithm, the direction of arrival (DOA) is measured for each short-time input signal and the likelihood function of the target speech presence is estimated to control the filter learning rate. Using the likelihood measure, the learning rate is increased during the pure noise interval and decreased during the target speech interval. To optimize the parameter of the mapping function between the likelihood value and the corresponding learning rate, an exhaustive search is performed using the Bark's scale distortion (BSD) as the performance index. Experimental results show that the proposed algorithm outperforms the conventional LCMV with fixed learning rate in the BSD by around 1.5 dB.
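
The abstract does not give the form of the mapping from the speech-presence likelihood to the learning rate. Purely as an illustration of the stated behaviour (a high rate during noise, a low rate during target speech), one could assume a logistic mapping such as the one below; the function name, the logistic form, and every parameter value are assumptions, not details from the paper.

```python
import math


def learning_rate(likelihood, mu_min=0.001, mu_max=0.1, slope=10.0, midpoint=0.5):
    """Map a target-speech likelihood in [0, 1] to a filter learning rate.

    All parameters are hypothetical: mu_min and mu_max bound the rate,
    while slope and midpoint shape the logistic transition between them.
    """
    # Weight close to 1 when target speech is likely, close to 0 in noise.
    speech_weight = 1.0 / (1.0 + math.exp(-slope * (likelihood - midpoint)))
    # High rate in noise-only intervals, low rate while speech is present.
    return mu_max - (mu_max - mu_min) * speech_weight
```

In the paper the parameter of the mapping function is tuned by exhaustive search against the BSD measure; here `slope` and `midpoint` are only placeholders for such a tuned parameter.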


Spectral moment analysis of distortion errors in alveolar fricatives in Korean children

  • 한윤주;김도형;황자은;장대현;김재원
    • 말소리와 음성과학 / Vol. 16, No. 1 / pp.33-40 / 2024
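  • This study examined the acoustic differences between correctly articulated alveolar fricatives and their distortion errors (interdentalization, palatalization, and lateralization) on the spectral moment variables of center of gravity, variance, skewness, and kurtosis. To this end, a retrospective study was conducted using the results of articulation and phonology assessments (Assessment of Phonology & Articulation for Children, APAC; Urimal-test of Articulation and Phonology I, U-TAP I) from 61 children (mean age: 5.6±1.5 years; 19 girls, 42 boys). From target words containing alveolar fricatives, tokens with alveolar fricative distortion errors and correctly articulated tokens were extracted. A total of 169 tokens were used in the spectral moment analysis. The center of gravity was higher for correct articulation than for palatalization, and lower for palatalization than for interdentalization. The variance of interdentalization was higher than that of correct articulation and palatalization. Interdentalization showed higher skewness than correct articulation, and the skewness of palatalization was also higher than that of correct articulation. Finally, the kurtosis of palatalization was higher than that of correct articulation and interdentalization. Within each distortion error type, no significant differences were observed in any spectral moment variable according to position in the word (word-initial onset, word-medial onset) or phonation type (lax, tense). This study confirmed that center of gravity, variance, skewness, and kurtosis show different patterns depending on the type of alveolar fricative, and the objective values presented here may support auditory-perceptual evaluation in clinical practice and serve as baseline data for diagnosing alveolar fricative distortion errors.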
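
The four spectral moments named in this abstract can be computed from a power spectrum as sketched below. This is a generic formulation that treats the normalised spectrum as a probability distribution over frequency. The function name is hypothetical, kurtosis is returned in its non-excess form (some tools subtract 3), and the study's actual analysis settings are not stated in the abstract.

```python
def spectral_moments(freqs, power):
    """Return (center of gravity, variance, skewness, kurtosis).

    freqs: bin centre frequencies in Hz
    power: non-negative power values, one per frequency bin
    """
    if len(freqs) != len(power):
        raise ValueError("freqs and power must have the same length")
    total = sum(power)
    if total <= 0:
        raise ValueError("spectrum must contain some energy")
    # Normalise so the spectrum behaves like a probability distribution.
    p = [w / total for w in power]
    cog = sum(f * w for f, w in zip(freqs, p))
    var = sum((f - cog) ** 2 * w for f, w in zip(freqs, p))
    if var == 0:
        raise ValueError("skewness and kurtosis are undefined for zero variance")
    skew = sum((f - cog) ** 3 * w for f, w in zip(freqs, p)) / var ** 1.5
    # Non-excess kurtosis (an assumption): no constant 3 is subtracted.
    kurt = sum((f - cog) ** 4 * w for f, w in zip(freqs, p)) / var ** 2
    return cog, var, skew, kurt
```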