• Title/Summary/Keyword: Speech spectrum

Noise Estimation based on Standard Deviation and Sigmoid Function Using a Posteriori Signal to Noise Ratio in Nonstationary Noisy Environments

  • Lee, Soo-Jeong;Kim, Soon-Hyob
    • International Journal of Control, Automation, and Systems
    • /
    • v.6 no.6
    • /
    • pp.818-827
    • /
    • 2008
  • In this paper, we propose a new noise estimation and reduction algorithm for stationary and nonstationary noisy environments. The approach classifies the speech and noise contributions in each time-frequency bin. It relies on the ratio of the normalized standard deviation of the noisy power spectrum in time-frequency bins to its average. If the ratio is greater than an adaptive estimator, speech is considered to be present. The proposed method uses an auto control parameter so that the adaptive estimator works well in highly nonstationary noisy environments. The auto control parameter is controlled by a linear function of the a posteriori signal-to-noise ratio (SNR) according to the increase or decrease of the noise level. The estimated clean speech power spectrum is obtained from a modified gain function and the updated noisy power spectrum of the time-frequency bin. The new algorithm is simpler and has a lighter computational load when estimating stationary and nonstationary noise. The proposed algorithm is superior to conventional methods. To evaluate the algorithm's performance, we test it on the NOIZEUS database, using the segmental signal-to-noise ratio (SNR) and ITU-T P.835 as evaluation criteria.

A Study on the Technique of Spectrum Flattening for Improved Pitch Detection (개선된 피치검출을 위한 스펙트럼 평탄화 기법에 관한 연구)

  • 강은영;배명진;민소연
    • The Journal of the Acoustical Society of Korea
    • /
    • v.21 no.3
    • /
    • pp.310-314
    • /
    • 2002
  • Exact pitch (fundamental frequency) extraction is important in speech signal processing tasks such as speech recognition, analysis, and synthesis. However, exact pitch extraction from a speech signal is very difficult because of the effects of formants and transitional amplitude. In this paper, the pitch is therefore detected after the formant components are eliminated by flattening the spectrum in the frequency domain, where the effect of phoneme transitions and changes is small. We propose a new method for flattening the log spectrum and compare its performance with the LPC method and the cepstrum method. The results show that the proposed method performs better than the conventional methods.

Prosodic pattern of the children with high-functioning autism spectrum disorder according to sentence type (문장유형에 따른 고기능 자폐스펙트럼장애 아동의 운율 특성)

  • Shin, Hee Baek;Choi, Jieun;Lee, YoonKyoung
    • Phonetics and Speech Sciences
    • /
    • v.8 no.2
    • /
    • pp.65-71
    • /
    • 2016
  • The purpose of this study is to examine the prosodic patterns of children with high-functioning autism spectrum disorder (HFASD) according to sentence type. The participants were 18 children aged 7 to 9 years: 9 children with HFASD and 9 typically developing (TD) children of the same chronological age. A sentence reading task was conducted. Seven interrogative sentences and seven declarative sentences were presented to the participants, who were asked to read each sentence three times. Mean F0, F0 range, intensity, speech rate, and pitch contour were measured for each sentence. The results showed a significant main effect and interaction effect of subject group and sentence type on F0 range. There were significant differences in intensity, mean F0, speech rate, and pitch contour across sentence types. The results indicated that the HFASD group showed no difference in intonation across sentence types. Speakers' intention may have a negative effect on pragmatic aspects. These results suggest that the assessment of and intervention for prosody are important for children with HFASD.

Subband Based Spectrum Subtraction Algorithm (서브밴드에 기반한 스펙트럼 차감 알고리즘)

  • Choi, Jae-Seung
    • The Journal of the Korea institute of electronic communication sciences
    • /
    • v.8 no.4
    • /
    • pp.555-560
    • /
    • 2013
  • This paper first proposes a classification algorithm that detects voiced, unvoiced, and silence segments at each frame using distance measure, logarithmic power, and root mean square methods, and then a spectrum subtraction algorithm based on a subband filter. The proposed algorithm subtracts the spectra of white noise and street noise from the noisy signal at each frame, based on the subband filter. The proposed spectrum subtraction algorithm is demonstrated experimentally using the speech and noise data of the Aurora-2 database. Based on measurements of the signal-to-noise ratio (SNR), the experiments confirm that the proposed algorithm is effective for speech contaminated by noise. The output SNR improved by approximately 2.1 dB for white noise and 1.91 dB for street noise.
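
As a rough illustration of the general idea behind spectrum subtraction, the sketch below subtracts a noise power estimate from a noisy power spectrum bin by bin. It is plain full-band power subtraction, not the subband method of the paper; the function name `spectral_subtract` and the `floor` parameter are assumptions made for this example.

```python
def spectral_subtract(noisy_power, noise_power, floor=0.01):
    """Basic power spectral subtraction for one frame (illustrative only).

    noisy_power: power spectrum of the noisy frame, one value per bin.
    noise_power: estimated noise power spectrum, same length.
    floor: hypothetical spectral-floor factor; each output bin is kept at
           least floor * noisy power so that no bin becomes negative.
    """
    if len(noisy_power) != len(noise_power):
        raise ValueError("spectra must have the same number of bins")
    return [max(y - n, floor * y) for y, n in zip(noisy_power, noise_power)]


if __name__ == "__main__":
    noisy = [4.0, 1.0, 0.5]   # made-up power values for three bins
    noise = [1.0, 1.0, 1.0]   # made-up noise estimate
    print(spectral_subtract(noisy, noise))
```

The floor is one common way to avoid negative power after subtraction; other choices, such as clamping to zero, are equally possible.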

Vector Quantization using Speech Signal Property

  • Ha, Seok-Won;Yoon, Seok-Hyun;Chung, Kwang-Woo;Hong, Kwang-Seok
    • Proceedings of the KSPS conference
    • /
    • 1996.10a
    • /
    • pp.448-455
    • /
    • 1996
  • In this paper, we propose a VQ algorithm that uses the generating order of codewords to quantize the feature vectors of a speech signal. When mapping the speech signal, the proposed algorithm inspects which codeword follows the present codeword and adds a new index to the established codebook. We present a variable bit rate for the new codebook and propose an efficient way of compressing the information. In this way, the amount of computation and the number of codewords to be searched are reduced considerably. The performance of the proposed VQ algorithm is evaluated by spectrum distortion and bit rate. The spectrum distortion is reduced by about 0.22 dB, and the bit rate is reduced by more than 0.21 bit/frame.

KOREAN DIGIT RECOGNITION IN NOISE ENVIRONMENT USING SPECTRAL MAPPING TRAINING

  • Ki Young Lee
    • Proceedings of the Acoustical Society of Korea Conference
    • /
    • 1994.06a
    • /
    • pp.1015-1020
    • /
    • 1994
  • This paper presents a Korean digit recognition method for noisy environments that uses spectral mapping training based on a static supervised adaptation algorithm. In the presented method, spectral mapping from the space of noisy speech spectra to the space of noise-free speech spectra reduces the spectral distortion of noisy speech, and the recognition rate is higher than that of the conventional method using VQ and DTW without noise processing. Even at an SNR of 0 dB, the recognition rate is 10 times that of the conventional method. These results confirm that spectral mapping training can improve recognition performance for speech in noisy environments.

A Speech Enhancement Algorithm based on Human Psychoacoustic Property (심리음향 특성을 이용한 음성 향상 알고리즘)

  • Jeon, Yu-Yong;Lee, Sang-Min
    • The Transactions of The Korean Institute of Electrical Engineers
    • /
    • v.59 no.6
    • /
    • pp.1120-1125
    • /
    • 2010
  • In speech systems such as hearing aids and speech communication, speech quality is degraded by environmental noise. In this study, to enhance speech quality degraded by environmental noise, we propose an algorithm that reduces the noise and reinforces the speech. The minima controlled recursive averaging (MCRA) algorithm is used to estimate the noise spectrum, and a spectral weighting factor is used to reduce the noise. The partial masking effect, one of the properties of human hearing, is introduced to reinforce the speech. We then compare the waveform, spectrogram, Perceptual Evaluation of Speech Quality (PESQ) score, and segmental signal-to-noise ratio (segSNR) of the original speech, noisy speech, noise-reduced speech, and speech enhanced by the proposed method. The speech enhanced by the proposed method is reinforced in the high-frequency region degraded by noise, and both PESQ and segSNR improve, which indicates that speech quality is enhanced.
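
The segmental SNR used as an evaluation measure above can be sketched as the average of per-frame SNR values in dB. The version below is a minimal illustration, not the evaluation code of the paper; the frame length and the clamping range of -10 dB to 35 dB are assumptions (a commonly used convention), as is the function name `segmental_snr`.

```python
import math


def segmental_snr(clean, processed, frame_len=160, lo=-10.0, hi=35.0):
    """Mean per-frame SNR in dB between a clean and a processed signal.

    frame_len, lo and hi are assumed values chosen for this sketch.
    Frames with zero clean energy are skipped; frames with zero error
    are assigned the upper limit hi.
    """
    if len(clean) != len(processed):
        raise ValueError("signals must have the same length")
    frame_snrs = []
    for start in range(0, len(clean) - frame_len + 1, frame_len):
        c = clean[start:start + frame_len]
        p = processed[start:start + frame_len]
        signal_energy = sum(x * x for x in c)
        error_energy = sum((x - y) ** 2 for x, y in zip(c, p))
        if signal_energy == 0.0:
            continue  # silent frame: SNR is undefined, so skip it
        if error_energy == 0.0:
            snr = hi
        else:
            snr = 10.0 * math.log10(signal_energy / error_energy)
        frame_snrs.append(min(max(snr, lo), hi))  # clamp to [lo, hi]
    if not frame_snrs:
        raise ValueError("no usable frames")
    return sum(frame_snrs) / len(frame_snrs)


if __name__ == "__main__":
    clean = [1.0] * 8
    processed = [0.9] * 8  # constant error of 0.1 per sample
    print(segmental_snr(clean, processed, frame_len=4))
```

Clamping each frame keeps a few very quiet or very clean frames from dominating the average.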

Noisy Speech Recognition Based on Noise-Adapted HMMs Using Speech Feature Compensation

  • Chung, Yong-Joo
    • Journal of the Institute of Convergence Signal Processing
    • /
    • v.15 no.2
    • /
    • pp.37-41
    • /
    • 2014
  • The vector Taylor series (VTS) based method usually employs clean speech hidden Markov models (HMMs) when compensating speech feature vectors or adapting the parameters of trained HMMs. It is well known that noisy speech HMMs trained with Multi-condition TRaining (MTR) and the Multi-Model-based Speech Recognition framework (MMSR) perform better than clean speech HMMs in noisy speech recognition. In this paper, we propose a method that uses noise-adapted HMMs in VTS-based speech feature compensation. We derive a novel mathematical relation between the training and test noisy speech feature vectors in the log-spectrum domain, and the VTS is used to estimate the statistics of the test noisy speech. An iterative EM algorithm is used to estimate the training noisy speech from the test noisy speech, along with the noise parameters. The proposed method was applied to noise-adapted HMMs trained by MTR and MMSR and significantly reduced the relative word error rate in noisy speech recognition experiments on the Aurora 2 database.

Extraction of Speaker Recognition Parameter Using Chaos Dimension (카오스차원에 의한 화자식별 파라미터 추출)

  • Yoo, Byong-Wook;Kim, Chang-Seok
    • Speech Sciences
    • /
    • v.1
    • /
    • pp.285-293
    • /
    • 1997
  • This paper investigates the strange attractor of speech, which is regarded as chaotic in that a random-looking signal appears in a deterministic system. The delay time for constructing an attractor that fits the speech signal is determined from the AR model power spectrum. Applying Takens' embedding theorem with this delay time yields an exact correlation dimension. The results show that the correlation dimension serves as a characteristic parameter for speaker recognition and yields a high speaker discrimination rate.

A Study on the Estimation of Glottal Spectrum Slope Using the LSP (Line Spectrum Pairs) (LSP를 이용한 성문 스펙트럼 기울기 추정에 관한 연구)

  • Min, So-Yeon;Jang, Kyung-A
    • Speech Sciences
    • /
    • v.12 no.4
    • /
    • pp.43-52
    • /
    • 2005
  • The common form of the pre-emphasis filter is $H(z) = 1 - az^{-1}$, where $a$ typically lies between 0.9 and 1.0 for voiced signals. This value reflects the degree of the filter and equals R(1)/R(0) in the autocorrelation method. This paper proposes a new flattening algorithm that compensates for the high-frequency components weakened by vocal cord characteristics. We use the interval information of the LSP to estimate the formant frequencies. After the slope and inverse slope are obtained by linear interpolation between formant frequencies, the flattening process is applied. Experimental results show that the proposed algorithm effectively flattens the weakened high-frequency components. That is, the flattening characteristics are improved by using the LSP interval information as a flattening factor when compensating for the weakened high-frequency components.
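
The pre-emphasis relation quoted in this abstract can be illustrated with a short sketch: the coefficient is taken as R(1)/R(0) from the frame's autocorrelation, and the filter H(z) = 1 - a z^{-1} is applied as y[n] = x[n] - a x[n-1]. This covers only the conventional pre-emphasis step, not the LSP-based flattening proposed in the paper; the function names are assumptions made for this example.

```python
def autocorrelation(x, lag):
    """Autocorrelation R(lag) of a frame, summed over the overlapping samples."""
    return sum(x[n] * x[n - lag] for n in range(lag, len(x)))


def preemphasis_coefficient(x):
    """Coefficient a = R(1) / R(0), as in the autocorrelation method."""
    r0 = autocorrelation(x, 0)
    if r0 == 0.0:
        raise ValueError("frame has zero energy")
    return autocorrelation(x, 1) / r0


def preemphasize(x, a):
    """Apply H(z) = 1 - a z^{-1}: y[n] = x[n] - a * x[n-1], with x[-1] = 0."""
    if not x:
        return []
    return [x[0]] + [x[n] - a * x[n - 1] for n in range(1, len(x))]


if __name__ == "__main__":
    frame = [1.0, 1.0, 1.0, 1.0]        # made-up, strongly low-pass frame
    a = preemphasis_coefficient(frame)  # R(1)/R(0) = 3/4
    print(a, preemphasize(frame, a))
```

For long, strongly correlated voiced frames, R(1)/R(0) approaches 1, which is consistent with the 0.9 to 1.0 range mentioned in the abstract.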
