• Title/Abstract/Keywords: speech distortion

Search results: 227 (processing time: 0.02 s)

An analysis of a statistical difference of acoustic Parameters' distribution between normal voice and pathological voice

  • 김용주;권순복;김기련;신민철;조철우;왕수건
    • Proceedings of the IEEK Conference / Proceedings of the IEEK 2001 Summer Conference (4) / pp.249-252 / 2001
  • The most basic means of communication among humans is the voice. Even setting voice technologies aside, the voice is important and convenient in everyday life. A speech recognition system, however, cannot always expect a normal voice as its input signal. A pathological voice, that is, a voice affected by a problem in the larynx as opposed to a normal voice, is one such special case of input. Distortion of the speech signal by environmental effects such as noise or the transmission channel is a well-known problem; here we instead take up pathological voices with laryngeal disease, which is an intrinsic distortion factor in the voice. In this research we use a statistical method to find the difference in the distribution of acoustic parameters between normal and pathological voices.

Encoding of Speech Spectral Parameters Using Adaptive Vector-Scalar Quantization Methods for Mobile Communication Systems

  • Lee, In-Sung;Kim, Jong-Hark
    • The Journal of the Acoustical Society of Korea / Vol. 17, No. 4E / pp.35-40 / 1998
  • In this paper, an efficient quantization method for line spectrum pairs (LSP) with a cascaded structure of a vector quantizer and a scalar quantizer is proposed. First, the input LSP parameters are vector-quantized using a codebook with a moderate number of entries. In the second stage of quantization, the components of the residual vector are individually quantized by the scalar quantizer. Exploiting the ordering property of LSP parameters and including interframe prediction improve the quantizer performance and remove the stability check routine after the quantization procedure. The new vector-scalar hybrid quantizer, using 26 bits/frame, achieves transparent speech quality: the average spectral distortion is 1 dB and the proportion of frames with spectral distortion above 2 dB is less than 2%. The performance of the proposed quantization method is also evaluated under transmission errors.
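
The two-stage idea in this abstract (vector-quantize first, then scalar-quantize each residual component) can be illustrated with a minimal sketch. It is not the authors' quantizer: the codebook, step size and number of levels below are made-up values, and the LSP ordering property and interframe prediction mentioned in the abstract are left out.

```python
# Sketch of a cascaded vector-scalar quantizer (VQ followed by per-component SQ).
# All names and parameter values here are illustrative assumptions.

def nearest_codeword(vec, codebook):
    """Return the index of the codeword with the smallest squared error."""
    def sq_err(cw):
        return sum((v - c) ** 2 for v, c in zip(vec, cw))
    return min(range(len(codebook)), key=lambda i: sq_err(codebook[i]))

def scalar_quantize(x, step, levels):
    """Uniform scalar quantizer; returns an integer index clipped to the level range."""
    half = levels // 2
    idx = round(x / step)
    return max(-half, min(half, idx))

def encode(vec, codebook, step, levels):
    """Stage 1: VQ index. Stage 2: scalar-quantized residual components."""
    i = nearest_codeword(vec, codebook)
    residual = [v - c for v, c in zip(vec, codebook[i])]
    return i, [scalar_quantize(r, step, levels) for r in residual]

def decode(i, sq_indices, codebook, step):
    """Reconstruct the vector as codeword plus dequantized residual."""
    return [c + q * step for c, q in zip(codebook[i], sq_indices)]
```

For example, `encode([0.42, 0.46, 0.64], [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]], 0.02, 7)` selects the second codeword and quantizes the small residual around it; `decode` then adds the dequantized residual back to that codeword.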

Variation Analysis of Feature Parameters According to the Channel Distortion of Korean Telephone Digit Speech

  • 정성윤;손종목;김민성;배건성
    • Proceedings of the IEEK Conference / Proceedings of the IEEK 2002 Summer Conference (4) / pp.191-194 / 2002
  • The ultimate goal of this work is to improve the speech recognition rate in a telephone environment where the training and test data are matched. To analyze the effect of channel distortion, which changes with every telephone call, MFCC is used as the feature parameter, and CMN, RTCN and RASTA are used as channel compensation techniques. For each case, the variation of the feature parameters is analyzed for all phones. Recognition rates are then obtained for each compensation method using a continuous HMM recognizer, and the relationship between the variation and the recognition rate is examined.
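
Of the compensation techniques named in this abstract, CMN (cepstral mean normalization) is the simplest to illustrate. A fixed channel appears approximately as a constant offset in the cepstral domain, so subtracting the per-utterance mean of each coefficient removes it. The sketch below assumes the cepstral frames have already been computed; the function name is hypothetical and this is not the authors' implementation.

```python
# Sketch of cepstral mean normalization (CMN) over one utterance.
# `frames` is assumed to be a list of equal-length cepstral vectors (e.g. MFCCs).

def cepstral_mean_normalize(frames):
    """Subtract the per-coefficient mean, taken over all frames, from every frame."""
    n = len(frames)
    dim = len(frames[0])
    mean = [sum(f[k] for f in frames) / n for k in range(dim)]
    return [[f[k] - mean[k] for k in range(dim)] for f in frames]
```

Applying the function to an utterance and to the same utterance with a constant offset added to every frame gives the same normalized features, which is the sense in which CMN compensates a fixed channel.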

Noise Reduction Using the Standard Deviation of the Time-Frequency Bin and Modified Gain Function for Speech Enhancement in Stationary and Nonstationary Noisy Environments

  • Lee, Soo-Jeong;Kim, Soon-Hyob
    • The Journal of the Acoustical Society of Korea / Vol. 26, No. 3E / pp.87-96 / 2007
  • In this paper we propose a new noise reduction algorithm for stationary and nonstationary noisy environments. Our algorithm classifies the speech and noise signal contributions in time-frequency bins, and is not based on a spectral algorithm or a minimum statistics approach. It relies on calculating the ratio of the standard deviation of the noisy power spectrum in time-frequency bins to its normalized time-frequency average. We show that good quality can be achieved for the enhanced speech signal by choosing appropriate values for $\delta_t$ and $\delta_f$. The proposed method greatly reduces the noise while providing enhanced speech with lower residual noise and somewhat higher mean opinion score (MOS), background intrusiveness (BAK) and signal distortion (SIG) scores than conventional methods.

Noise Suppression Using Normalized Time-Frequency Bin Average and Modified Gain Function for Speech Enhancement in Nonstationary Noisy Environments

  • Lee, Soo-Jeong;Kim, Soon-Hyob
    • The Journal of the Acoustical Society of Korea / Vol. 27, No. 1E / pp.1-10 / 2008
  • A noise suppression algorithm is proposed for nonstationary noisy environments. The proposed algorithm is different from the conventional approaches such as the spectral subtraction algorithm and the minimum statistics noise estimation algorithm in that it classifies speech and noise signals in time-frequency bins. It calculates the ratio of the variance of the noisy power spectrum in time-frequency bins to its normalized time-frequency average. If the ratio is greater than an adaptive threshold, speech is considered to be present. Our adaptive algorithm tracks the threshold and controls the trade-off between residual noise and distortion. The estimated clean speech power spectrum is obtained by a modified gain function and the updated noisy power spectrum of the time-frequency bin. This new algorithm has the advantages of simplicity and light computational load for estimating the noise. This algorithm reduces the residual noise significantly, and is superior to the conventional methods.
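
The decision rule described in this abstract (compare a variance-to-average ratio of the noisy power spectrum against a threshold) can be sketched roughly as follows. The abstract does not give the exact normalization or the threshold adaptation, so this sketch makes two loud assumptions: the ratio is the plain variance divided by the mean over a short window of frames in one frequency bin, and the threshold is a fixed number rather than an adaptive one. All function and parameter names are hypothetical.

```python
# Rough sketch of a variance-to-average speech-presence test per time-frequency bin.
# Assumptions: simple variance/mean ratio, fixed (not adaptive) threshold.

def variance_to_mean_ratio(powers):
    """Ratio of the variance of a list of power values to their mean."""
    n = len(powers)
    mean = sum(powers) / n
    var = sum((p - mean) ** 2 for p in powers) / n
    return var / mean if mean > 0 else 0.0

def speech_present(power_spectrogram, k, t, width, threshold):
    """Flag bin (frame t, frequency k) as speech if the ratio over the
    last `width` frames exceeds `threshold`."""
    start = max(0, t - width + 1)
    window = [power_spectrogram[u][k] for u in range(start, t + 1)]
    return variance_to_mean_ratio(window) > threshold
```

A bin whose power stays constant over the window gives a ratio of zero and is treated as noise, while a bin with a sudden rise in power gives a large ratio and is treated as speech.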

A Study on a Searching, Extraction and Approximation-Synthesis of Transition Segment in Continuous Speech

  • 이시우
    • The Transactions of the Korea Information Processing Society / Vol. 7, No. 4 / pp.1299-1304 / 2000
  • In a speech coding system that uses voiced and unvoiced excitation sources, speech quality is distorted when voiced sounds and unvoiced consonants coexist in a frame. I therefore propose a method for searching, extracting and approximately synthesizing the TSIUVC (Transition Segment Including UnVoiced Consonant), so that voiced sounds and unvoiced consonants do not coexist in a frame. The method is based on the zero-crossing rate and a pitch detector using the FIR-STREAK digital filter. As a result, the extraction rates of TSIUVC are 84.8% (plosive), 94.9% (fricative) and 92.3% (affricate) for female voices, and 88% (plosive), 94.9% (fricative) and 92.3% (affricate) for male voices. I also obtain high-quality approximation-synthesis waveforms within the TSIUVC by using frequency information below 0.547 kHz and above 2.813 kHz. The method can be applied to low-bit-rate speech coding, speech analysis and speech synthesis.
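
One of the two cues named in this abstract, the zero-crossing rate, is easy to show in isolation; unvoiced consonants typically have a much higher rate than voiced sounds. The sketch below covers only that cue, with a hypothetical function name, and does not reproduce the FIR-STREAK pitch detector or the author's segment search.

```python
# Sketch of the zero-crossing rate of one frame of samples.

def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ (zero counts as positive)."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)
```

A frame that alternates in sign on every sample gives a rate of 1.0, while a frame that never changes sign gives 0.0; a detector would compare the rate with a threshold chosen for the sampling rate and frame length in use.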

An Acoustic Analysis of Vowels for Severe-profound Hearing Impaired Children

  • 허명진
    • Speech Sciences / Vol. 14, No. 2 / pp.65-71 / 2007
  • Children with severe-to-profound hearing impairment have various difficulties in everyday communication because they lack auditory feedback. In particular, their speech shows an unstable voice, omission and distortion of articulation, pitch breaks, cul-de-sac voice and so on, which makes it difficult for them to deliver an intended message accurately. This study analyzes the acoustic characteristics of 4 vowel sounds produced by 35 children with severe-to-profound hearing impairment using CSL (Computerized Speech Lab, Model 4300b). The formant data were obtained from the spectrogram and analyzed by 12 formant filter and auto-correlation among the formants. The results showed that the hearing-impaired children's formant values were very high and that they produced the vowels with excessive tension and an unstable voice. To improve their speech, they would need adequate auditory feedback.

A study on speech enhancement using complex-valued spectrum employing Feature map Dependent attention gate

  • 정재희;김우일
    • The Journal of the Acoustical Society of Korea / Vol. 42, No. 6 / pp.544-551 / 2023
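  • Speech enhancement, which is used to improve the perceptual quality and intelligibility of noisy speech, has moved from methods that use the magnitude spectrum to methods that use the complex spectrum, which can enhance magnitude and phase together. In this paper, to further improve the intelligibility and quality of noisy speech, we study how to apply an attention mechanism to a complex-spectrum-based speech enhancement system. The attention mechanism is based on additive attention, and the attention weights are computed so that the characteristics of the complex spectrum are taken into account. A global average pooling operation is also used to account for the importance of the feature maps. Complex-spectrum-based speech enhancement is performed with the Deep Complex U-Net (DCUNET) model, and the additive attention follows the method proposed in the Attention U-Net model. In speech enhancement experiments on noise data from a living-room environment, the proposed method outperformed the baseline model on the Source to Distortion Ratio (SDR), Perceptual Evaluation of Speech Quality (PESQ) and Short Time Objective Intelligibility (STOI) metrics, and it also showed consistent improvement across various background noise environments under low Signal-to-Noise Ratio (SNR) conditions. These results show that the proposed speech enhancement system can effectively improve the intelligibility and quality of noisy speech.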

A Study on LMS-MPC Method Considering Low Bit Rate

  • 이시우
    • Journal of Digital Convergence / Vol. 10, No. 5 / pp.233-238 / 2012
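  • In speech coding schemes that use voiced and unvoiced excitation sources, the speech waveform is distorted when a vowel and an unvoiced consonant fall within the same frame. To solve this problem, this paper presents LMS-MPC, which applies individual pitch and LMS (Least Mean Square). An evaluation of the SNRseg of the conventional MPC and of LMS-MPC confirmed that LMS-MPC improves it by 1.5 dB for male speech and 1.3 dB for female speech. Because the SNRseg of LMS-MPC is improved over that of MPC, the distortion of the speech waveform can be controlled, and the method is expected to be useful for schemes that encode speech signals with low-bit-rate sources, such as cellular phones and smartphones.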
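
The SNRseg figure quoted in this abstract is a segmental signal-to-noise ratio: the SNR is computed frame by frame and the per-frame values in dB are averaged. The sketch below uses one common form of that definition and assumes time-aligned clean and coded signals of equal length; the frame length, the skipping of silent or error-free frames and the absence of clamping are choices made here, not details taken from the paper.

```python
# Sketch of segmental SNR (SNRseg) between a clean signal and its coded version.
import math

def segmental_snr(clean, coded, frame_len):
    """Mean of per-frame SNRs in dB over full frames (no clamping applied)."""
    snrs = []
    for start in range(0, len(clean) - frame_len + 1, frame_len):
        s = clean[start:start + frame_len]
        y = coded[start:start + frame_len]
        signal = sum(a * a for a in s)
        noise = sum((a - b) ** 2 for a, b in zip(s, y))
        if signal > 0 and noise > 0:  # skip silent or error-free frames
            snrs.append(10.0 * math.log10(signal / noise))
    return sum(snrs) / len(snrs) if snrs else float("nan")
```

Because each frame contributes equally, low-energy frames weigh as much as loud ones, which is why the segmental measure is often preferred over a single whole-utterance SNR when comparing coders.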

Noise Estimation and Suppression methods based on Normalized Variance in Time-Frequency for Speech Enhancement

  • 이수정;김순협
    • Journal of the Institute of Electronics Engineers of Korea SP / Vol. 46, No. 1 / pp.87-94 / 2009
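  • Noise estimation and suppression are key technologies in speech communication and speech recognition. This paper proposes a new noise estimation and suppression method that can be applied to a variety of noise environments. The proposed algorithm is based on the variance of the noisy power spectrum in the time and frequency domains and on its normalized ratio. So that it works well in various noise environments, the method uses an adaptively tracked threshold, which controls the trade-off between speech distortion and residual noise. The performance of the new algorithm was evaluated in various noise environments using ITU-T P.835 (SIG) and segmental SNR, and it showed improved results compared with conventional methods.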