• Title/Summary/Keyword: voiced sound


A Noise Reduction Method with Linear Prediction Using Periodicity of Voiced Speech

  • Sasaoka, Naoto;Kawamura, Arata;Fujii, Kensaku;Itoh, Yoshio;Fukui, Yutaka
    • Proceedings of the IEEK Conference / 2002.07a / pp.102-105 / 2002
  • A noise reduction technique that reduces background noise in a corrupted voice signal is proposed. The proposed method is based on linear prediction and takes advantage of the periodicity of voiced speech. A voiced sound can be regarded as a periodic, stationary signal over a short time interval, so the current voice sample is correlated with the voice signal delayed by one pitch period. A linear predictor estimates only the component of the current signal that is correlated with the delayed signal; the enhanced voice is therefore obtained as the output of the predictor. Simulation results show that the proposed method reduces the background noise.
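
A minimal sketch of the idea behind this method: an adaptive linear predictor whose input is the signal delayed by one pitch period, so that only the pitch-correlated (voiced) component is predicted. The NLMS adaptation, filter length, and step size here are illustrative assumptions, not the paper's configuration.

```python
# Pitch-delayed adaptive linear prediction for noise reduction (sketch).
# Assumptions: the pitch period is known and fixed over the excerpt,
# and an NLMS update is used; neither detail comes from the paper.
import numpy as np

def enhance_voiced(x, pitch_period, filt_len=32, mu=0.5, eps=1e-8):
    """Predict x[n] from samples one pitch period earlier.

    The voiced component repeats with the pitch period, so it is
    predictable from the delayed block; broadband noise is not, so
    the predictor output approximates the clean voice.
    """
    x = np.asarray(x, dtype=float)
    w = np.zeros(filt_len)               # adaptive predictor taps
    y = np.zeros_like(x)                 # enhanced (predicted) signal
    for n in range(pitch_period + filt_len, len(x)):
        # block of samples ending one pitch period before sample n
        u = x[n - pitch_period - filt_len + 1 : n - pitch_period + 1][::-1]
        y[n] = w @ u                     # estimate of the periodic part
        e = x[n] - y[n]                  # prediction error (mostly noise)
        w += mu * e * u / (u @ u + eps)  # NLMS coefficient update
    return y
```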


Detection and Synthesis of Transition Parts of the Speech Signal

  • Kim, Moo-Young
    • The Journal of Korean Institute of Communications and Information Sciences / v.33 no.3C / pp.234-239 / 2008
  • For efficient coding and transmission, the speech signal can be classified into three distinct classes: voiced, unvoiced, and transition. At low bit rates below 4 kbit/s, conventional sinusoidal transform coders synthesize high-quality speech for the purely voiced and unvoiced classes, but not for the transition class. The transition class, which includes plosive sounds and abrupt voiced onsets, lacks periodicity and is therefore often classified and synthesized as unvoiced. In this paper, an efficient algorithm for transition-class detection is proposed that demonstrates superior detection performance not only for clean speech but also for noisy speech. For detected transition frames, phase information is transmitted instead of magnitude information for speech synthesis. Listening tests show that the proposed algorithm produces better speech quality than the conventional one.
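
The abstract does not spell out the detector, so the following is only a generic sketch of frame classification into voiced / unvoiced / transition, using an autocorrelation periodicity measure plus an energy-onset test; the features and thresholds are assumptions for illustration, not the paper's algorithm.

```python
# Generic V/UV/transition frame labelling (illustrative only).
# A frame with a strong autocorrelation peak at a pitch lag is voiced;
# a weakly periodic frame with an abrupt energy rise (plosive, voiced
# onset) is labelled transition; everything else is unvoiced.
import numpy as np

def classify_frames(x, frame_len=320, min_lag=40, max_lag=320):
    labels, prev_energy = [], None
    for start in range(0, len(x) - frame_len + 1, frame_len):
        f = np.asarray(x[start:start + frame_len], dtype=float)
        f -= f.mean()
        energy = f @ f + 1e-12
        # autocorrelation at non-negative lags, normalized by lag 0
        ac = np.correlate(f, f, mode="full")[frame_len - 1:]
        periodicity = ac[min_lag:max_lag].max() / ac[0]
        onset = prev_energy is not None and energy > 4.0 * prev_energy
        if periodicity > 0.5:
            labels.append("voiced")
        elif onset:
            labels.append("transition")
        else:
            labels.append("unvoiced")
        prev_energy = energy
    return labels
```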

An ACLMS-MPC Coding Method Integrated with ACFBD-MPC and LMS-MPC at an 8 kbps Bit Rate

  • Lee, See-woo
    • Journal of Internet Computing and Services / v.19 no.6 / pp.1-7 / 2018
  • This paper presents an 8 kbps ACLMS-MPC (Amplitude Compensation and Least Mean Square - Multi Pulse Coding) method that integrates ACFBD-MPC (Amplitude Compensation Frequency Band Division - Multi Pulse Coding) and LMS-MPC (Least Mean Square - Multi Pulse Coding). It uses V/UV/S (Voiced/Unvoiced/Silence) switching, amplitude compensation of the multi-pulses in each pitch interval, and approximate synthesis of unvoiced sound using specific frequencies in order to reduce distortion of the synthesized waveform. In integrating these methods, it is important to fit the bit rate of the voiced and unvoiced sound sources to 8 kbps while reducing the distortion of the speech waveform; the waveform can then be synthesized efficiently by restoring the individual pitch intervals from the multi-pulses of the representative interval. The ACLMS-MPC method was implemented and its SNR evaluated under 8 kbps coding conditions. The SNR of ACLMS-MPC was 15.0 dB for a female voice and 14.3 dB for a male voice, an improvement of 0.3 dB to 1.8 dB for the male voice and 0.3 dB to 1.6 dB for the female voice over the existing MPC, ACFBD-MPC, and LMS-MPC. These methods are expected to apply to speech coding with low-bit-rate sound sources, such as in cellular or Internet telephony. Future work will evaluate the sound quality of a 6.9 kbps coding method that compensates the amplitude and position of the multi-pulse source simultaneously.
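
For reference, a minimal sketch of one common way such waveform SNR figures are computed, assuming the original and synthesized signals are time-aligned and equally long (the paper's exact measurement procedure is not given in the abstract):

```python
# Waveform SNR in dB between an original and a synthesized signal.
import numpy as np

def snr_db(original, synthesized):
    ref = np.asarray(original, dtype=float)
    err = ref - np.asarray(synthesized, dtype=float)   # coding error
    return 10.0 * np.log10((ref @ ref) / (err @ err + 1e-12))
```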

A Study of Meaningful Speech Sound Block Classification Based on the Discrete Wavelet Transform

  • Baek, Han-Wook;Chung, Chin-Hyun
    • Proceedings of the KIEE Conference / 1999.07g / pp.2905-2907 / 1999
  • Classification of meaningful speech sound blocks provides very important information for speech recognition. The technique described here is based on the DWT (discrete wavelet transform), which provides a fast algorithm and a useful, compact solution for the pre-processing stage of speech recognition. The algorithm is applied to unvoiced/voiced classification and to denoising.
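
A minimal sketch of DWT-based unvoiced/voiced framing with PyWavelets; the wavelet ('db4'), decomposition depth, and band-energy ratio are illustrative assumptions, not the paper's settings.

```python
# Voiced/unvoiced decision from DWT band energies (illustrative).
# Voiced speech concentrates energy in the low-frequency approximation
# band; unvoiced speech spreads it across the detail bands.
import numpy as np
import pywt

def is_voiced(frame, wavelet="db4", level=4, ratio=2.0):
    coeffs = pywt.wavedec(np.asarray(frame, dtype=float), wavelet, level=level)
    approx_energy = float(np.sum(coeffs[0] ** 2))                    # coarse band
    detail_energy = sum(float(np.sum(c ** 2)) for c in coeffs[1:]) + 1e-12
    return approx_energy > ratio * detail_energy
```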


Pitch Detection Using the Wavelet Transform

  • Seok, Jong-Won;Son, Young-Ho;Bae, Keun-Sung
    • Speech Sciences / v.5 no.1 / pp.23-33 / 1999
  • Mallat has shown that, with a proper choice of wavelet function, the local maxima of the wavelet-transformed signal indicate sharp variations in the signal. Since glottal closure causes sharp discontinuities in the speech signal, the dyadic wavelet transform is useful for detecting abrupt changes in voiced sounds, i.e., epochs. In this paper, we investigate the glottal closure instants obtained from wavelet analysis of the speech signal and compare them with those obtained from the EGG signal; we then detect the pitch period of the speech signal on the basis of these results. Experimental results demonstrate that the local maxima of the wavelet-transformed signal give accurate epoch estimates, and that the pitch periods of voiced sound obtained by the proposed algorithm correspond well to those from the EGG.
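
A minimal sketch of the epoch-to-pitch chain described above, using an undecimated wavelet transform and peak picking; the wavelet, level, and peak-picking thresholds are illustrative assumptions, not the paper's parameters.

```python
# Glottal-closure-instant (epoch) detection from local maxima of a
# dyadic (undecimated) wavelet detail signal, then pitch as the median
# spacing between epochs (sketch).
import numpy as np
import pywt
from scipy.signal import find_peaks

def pitch_from_epochs(frame, fs, wavelet="bior1.5", level=3):
    frame = np.asarray(frame, dtype=float)
    n = len(frame) - len(frame) % (2 ** level)   # swt needs a multiple of 2**level
    detail = pywt.swt(frame[:n], wavelet, level=level)[0][1]
    mag = np.abs(detail)
    # sharp discontinuities at glottal closure show up as local maxima
    peaks, _ = find_peaks(mag, height=0.5 * mag.max(),
                          distance=int(0.002 * fs))
    if len(peaks) < 2:
        return None                               # no reliable epochs found
    return float(np.median(np.diff(peaks))) / fs  # pitch period in seconds
```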


Generalized Cross-Correlation with Phase Transform Sound Source Localization Combined with the Steered Response Power Method

  • Kim, Young-Joon;Oh, Min-Jae;Lee, In-Sung
    • The Journal of the Acoustical Society of Korea / v.36 no.5 / pp.345-352 / 2017
  • We propose a method that reduces the direction-estimation error for a sound source in reverberant and noisy environments. The proposed algorithm divides the speech signal into voiced and unvoiced segments using a VAD, and the source direction is estimated only when the current frame is voiced. In that frame, the TDOA (Time Difference of Arrival) across the microphone array is estimated with the GCC-PHAT (Generalized Cross-Correlation with Phase Transform) method. To improve the accuracy of the source location, the peak cross-correlation value of the two signals at the estimated time delay is then compared with the values at the other time delays in the time table. If, within a run of successive voiced frames, the angle of the current frame differs greatly from those of the preceding and following frames, it is replaced with the mean of the angles estimated in those frames.
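
The TDOA step in this pipeline is well defined; below is a minimal GCC-PHAT sketch for one microphone pair. The FFT length and the small regularization constant are implementation choices, and the VAD, angle table, and smoothing logic from the abstract are omitted.

```python
# GCC-PHAT time-delay estimation between two microphone signals.
# The phase transform whitens the cross-spectrum so the correlation
# peak stays sharp under reverberation.
import numpy as np

def gcc_phat(sig, ref, fs, max_tau=None):
    n = len(sig) + len(ref)
    SIG = np.fft.rfft(sig, n=n)
    REF = np.fft.rfft(ref, n=n)
    cross = SIG * np.conj(REF)
    cross /= np.abs(cross) + 1e-12        # PHAT weighting: keep phase only
    cc = np.fft.irfft(cross, n=n)
    max_shift = n // 2
    if max_tau is not None:               # limit search to physical delays
        max_shift = min(int(fs * max_tau), max_shift)
    cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))
    return (np.argmax(np.abs(cc)) - max_shift) / fs   # TDOA in seconds
```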

Multi-Pulse Amplitude and Location Estimation by Maximum-Likelihood Estimation in MPE-LPC Speech Synthesis (MPE-LPC음성합성에서 Maximum- Likelihood Estimation에 의한 Multi-Pulse의 크기와 위치 추정)

  • 이기용;최홍섭;안수길
    • Journal of the Korean Institute of Telematics and Electronics / v.26 no.9 / pp.1436-1443 / 1989
  • In this paper, we propose a maximum-likelihood estimation (MLE) method to obtain the locations and amplitudes of the pulses in MPE (multi-pulse excitation)-LPC speech synthesis, which uses multi-pulses as the excitation source. The MLE method computes the values of the unknown parameters (pulse amplitudes and positions) that maximize the likelihood function for the observed data sequence. In the case of overlapped pulses, the method is equivalent to Ozawa's cross-correlation method, with equal computational cost and sound quality. We show by computer simulation that the multi-pulses obtained by the MLE method are (1) pseudo-periodic in pitch for voiced sound, (2) random for unvoiced sound, and (3) changing from random to periodic over the interval where the original speech changes from unvoiced to voiced. The short-time power spectra of the original speech and of speech synthesized using the multi-pulses as the excitation source are quite similar to each other at the formants.
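
A minimal sketch of the sequential cross-correlation pulse search that the abstract says the MLE solution reduces to; `h` is the LPC synthesis-filter impulse response, the pulse count is an illustrative parameter, and perceptual weighting is omitted.

```python
# Sequential multi-pulse search by cross-correlation (sketch).
# Each iteration places one pulse where the remaining target is most
# correlated with the synthesis-filter impulse response h.
import numpy as np

def multipulse_search(target, h, n_pulses=8):
    residual = np.asarray(target, dtype=float).copy()
    h = np.asarray(h, dtype=float)
    hh = h @ h                               # impulse-response energy
    positions, amplitudes = [], []
    for _ in range(n_pulses):
        # correlation of the residual with h at every candidate shift
        corr = np.correlate(residual, h, mode="full")[len(h) - 1:]
        pos = int(np.argmax(np.abs(corr)))
        amp = corr[pos] / hh                 # optimal amplitude at that shift
        positions.append(pos)
        amplitudes.append(float(amp))
        # remove this pulse's contribution before the next search
        end = min(pos + len(h), len(residual))
        residual[pos:end] -= amp * h[:end - pos]
    return positions, amplitudes
```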


A Study on PCFBD-MPC at 8 kbps

  • Lee, See-woo
    • Journal of Internet Computing and Services / v.18 no.5 / pp.17-22 / 2017
  • In MPC coding that uses voiced and unvoiced excitation sources, the speech waveform can be distorted; this is caused by normalizing the synthesized voiced waveform when restoring the multi-pulses of the representative section. This paper presents PCFBD-MPC (Position Compensation Frequency Band Division - Multi Pulse Coding), which uses V/UV/S (Voiced/Unvoiced/Silence) switching, position compensation of the multi-pulses in each pitch interval, and approximate synthesis of unvoiced sound using specific frequencies in order to reduce distortion of the synthesized waveform. The PCFBD-MPC system was implemented and its SNRseg evaluated under 8 kbps coding conditions. The SNRseg of PCFBD-MPC was 13.4 dB for a female voice and 13.8 dB for a male voice. Future work will evaluate the sound quality of an 8 kbps coding method that compensates the amplitude and position of the multi-pulse source simultaneously. These methods are expected to apply to speech coding with low-bit-rate sound sources, such as in cellular phones or smartphones.
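
The figures here are segmental SNRs rather than the plain SNR of the previous entry. A minimal sketch of SNRseg, assuming time-aligned signals; the 20 ms frame at 8 kHz is an illustrative choice.

```python
# Segmental SNR: the mean of per-frame SNRs in dB (sketch).
import numpy as np

def snr_seg_db(original, synthesized, frame_len=160):
    ref = np.asarray(original, dtype=float)
    err = ref - np.asarray(synthesized, dtype=float)
    snrs = []
    for s in range(0, len(ref) - frame_len + 1, frame_len):
        num = ref[s:s + frame_len] @ ref[s:s + frame_len]
        den = err[s:s + frame_len] @ err[s:s + frame_len] + 1e-12
        snrs.append(10.0 * np.log10(num / den + 1e-12))
    return float(np.mean(snrs))
```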

Pitch Extraction of Speech Signals by Harmonics Analysis

  • Kim, Kee-Hee;Choi, Jung-Ah;Bae, Myung-Jin;Ann, Sou-Guil
    • Proceedings of the KIEE Conference / 1987.07b / pp.1610-1614 / 1987
  • The harmonics of the fundamental frequency of a speech signal form a fine line spectrum in the frequency domain. In this paper, we propose a new algorithm that detects the pitch interval of voiced sound, based on the fact that the number of harmonics can represent the pitch period in the time domain.
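
A hedged sketch of the counting idea: if K harmonic peaks are found up to some frequency, the fundamental lies near that frequency divided by K. The window, band limit, and peak threshold are illustrative assumptions, not the paper's algorithm.

```python
# Pitch estimation by counting harmonic line-spectrum peaks (sketch).
import numpy as np
from scipy.signal import find_peaks

def pitch_by_harmonic_count(frame, fs, band_hz=2000.0):
    frame = np.asarray(frame, dtype=float)
    spec = np.abs(np.fft.rfft(frame * np.hanning(len(frame))))
    freqs = np.fft.rfftfreq(len(frame), 1.0 / fs)
    band = freqs <= band_hz
    peaks, _ = find_peaks(spec[band], height=0.05 * spec[band].max(),
                          distance=3)
    if len(peaks) == 0:
        return None
    # the K-th harmonic sits near K * f0, so divide its frequency by K
    return float(freqs[band][peaks[-1]] / len(peaks))
```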


A Study on Speech Separation in Cochannel Using the Sinusoidal Model

  • Park, Hyun-Gyu;Shin, Joong-In;Park, Sang-Hee
    • Proceedings of the KIEE Conference / 1997.11a / pp.597-599 / 1997
  • Cochannel speaker separation is employed when speech from two talkers has been summed into one signal and it is desirable to recover one or both of the speech signals from the composite signal. Cochannel speech occurs in many common situations, such as when two AM signals containing speech are transmitted on the same frequency or when two people speak simultaneously (e.g., on the telephone). In this paper, a method for separating speech in such situations is proposed; in particular, only the voiced sound among the sound states is separated. The similarity between the original signal and the separated signal is then verified by their cross-correlation.
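
A minimal sketch of the final similarity check, using the peak of the normalized cross-correlation between the original and separated signals; this generic measure is an assumption, as the abstract does not give the exact formula.

```python
# Peak normalized cross-correlation between two signals (sketch);
# values near 1.0 indicate the separated signal closely matches the
# original up to a time shift and gain.
import numpy as np

def similarity(original, separated):
    a = np.asarray(original, dtype=float); a -= a.mean()
    b = np.asarray(separated, dtype=float); b -= b.mean()
    cc = np.correlate(a, b, mode="full")
    return float(np.abs(cc).max() /
                 (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))
```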
