• Title/Summary/Keyword: speech waveform


A Study on the Possibility of Drinking through speech Waveform Compensation in Wireless Communication Environments (무선통신 환경에서 음성파형 보상을 통한 음주가능성 여부에 관한 연구)

  • Lee, Won-Hee;Park, Hyungwoo;Bae, Seong-Geon;Bae, Myung-Jin
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.17 no.3 / pp.47-53 / 2017
  • Preventing drunken operation of vessels through alcohol checks is difficult at sea because the marine transportation environment differs from that of roads. In a previous study, we proposed an algorithm that identifies changes in the voice caused by drinking. With that algorithm, the possibility that ship operators and crew members at long distance have been drinking can be assessed. Because the check uses a voice signal that can be transmitted over long distances, drinking can be assessed in real time regardless of distance. When voice is communicated over VTS wireless devices, however, clipping occurs when the channel environment is uneven, and the accuracy of the drinking-possibility judgment may fall. In this paper, we therefore propose an enhanced method that compensates the signal to reduce the error rate of the judgment caused by distortion of the speech signal.

A Study on ACFBD-MPC in 8kbps (8kbps에 있어서 ACFBD-MPC에 관한 연구)

  • Lee, See-Woo
    • Journal of the Korea Academia-Industrial cooperation Society / v.17 no.7 / pp.49-53 / 2016
  • Recently, the use of signal compression methods to improve the efficiency of wireless networks has increased. In particular, the MPC system uses pitch extraction and separate excitation sources for voiced and unvoiced speech to reduce the bit rate. In general, an MPC system using voiced and unvoiced excitation sources distorts the synthesized speech waveform when voiced and unvoiced consonants occur in the same frame. This is caused by normalization of the synthesized speech waveform in the process of restoring the multi-pulses of the representation segment. This paper presents ACFBD-MPC (Amplitude Compensation Frequency Band Division-Multi Pulse Coding), which applies amplitude compensation to the multi-pulses in each pitch interval and in specific frequency bands to reduce the distortion of the synthesized speech waveform. The experiments were performed with 16 sentences of male and female voices. The voice signal was A/D converted at 10 kHz with 12 bits. The ACFBD-MPC system was implemented and its SNR was estimated under the 8 kbps coding condition. As a result, the SNR of ACFBD-MPC was 13.6 dB for the female voice and 14.2 dB for the male voice. ACFBD-MPC improved the male and female voice by 1 dB and 0.9 dB, respectively, compared to the traditional MPC. This method is expected to be used for cellular telephones and smartphones that use an excitation source with a low bit rate.
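
The abstract above reports SNR figures without stating the formula. As a rough illustration only, the sketch below computes a global waveform SNR in dB, the ratio of original-signal energy to reconstruction-error energy. The function name `snr_db` is hypothetical, and the paper may have used a different variant (for example, a segmental SNR).

```python
import math


def snr_db(original, synthesized):
    """Global SNR in dB between an original and a synthesized waveform.

    Assumption: SNR = 10 * log10(sum(x^2) / sum((x - y)^2)); the source
    abstract does not say which SNR definition was used.
    """
    if len(original) != len(synthesized):
        raise ValueError("waveforms must have the same length")
    signal_energy = sum(x * x for x in original)
    noise_energy = sum((x - y) ** 2 for x, y in zip(original, synthesized))
    if noise_energy == 0:
        return math.inf   # perfect reconstruction
    if signal_energy == 0:
        return -math.inf  # silent original, non-zero error
    return 10.0 * math.log10(signal_energy / noise_energy)
```

A synthesized waveform whose samples are each 10% smaller than the original has an error energy of 1% of the signal energy, which this definition maps to about 20 dB.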

Wavelet-based Pitch Detector for 2.4 kbps Harmonic-CELP Coder (2.4 kbps 하모닉-CELP 코더를 위한 웨이블렛 피치 검출기)

  • 방상운;이인성;권오주
    • The Journal of the Acoustical Society of Korea / v.22 no.8 / pp.717-726 / 2003
  • This paper presents methods for designing a wavelet-based pitch detector for a 2.4 kbps Harmonic-CELP coder and for achieving effective waveform interpolation by deciding the window shape in the transition region. A waveform interpolation coder encodes one pitch-period-sized segment of speech, the prototype segment, for each frame and generates a smooth waveform by interpolating between prototype segments in voiced frames. However, when pitch halving or doubling occurs, harmonic synthesis of the prototype waveforms of the previous and current frames produces not only waveform errors but also a discontinuity at the frame boundary. In addition, because the waveform interpolation coder synthesizes the excitation waveform in the transition region by overlap-add with a triangular window, Harmonic-CELP fails to model speech whose amplitude increases abruptly, and the synthesized waveform increases only linearly. First, to detect the pitch period precisely, we use a hybrid first-stage pitch detector and increase its precision with a second-stage ACF pitch detector. Next, to modify the excitation window, we detect the onset and offset of the frame using GCI. As a result, pitch doubling is removed, the pitch error rate decreases by 5.4% compared with ACF and by 2.66% compared with the wavelet detector, and the MOS improves by 0.13 in the transition region.
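
For readers unfamiliar with the ACF baseline mentioned in this abstract, the following is a minimal sketch of a plain autocorrelation pitch estimate. It is not the paper's hybrid or wavelet detector. The function name `acf_pitch_period` and the lag-range parameters are assumptions made for illustration.

```python
def acf_pitch_period(frame, min_lag, max_lag):
    """Return the lag (in samples) that maximizes the autocorrelation.

    Plain autocorrelation pitch estimate; the lag search range is an
    assumption chosen by the caller. Returns None for a silent frame
    or when no lag in the range fits inside the frame.
    """
    energy = sum(x * x for x in frame)
    if energy == 0:
        return None
    best_lag, best_score = None, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        if lag >= len(frame):
            break
        # Autocorrelation at this lag, normalized by the frame energy.
        r = sum(frame[i] * frame[i + lag] for i in range(len(frame) - lag))
        score = r / energy
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag
```

If the search range is widened to include twice the true period, the estimate can jump to the doubled lag. This is the pitch-doubling failure the abstract refers to.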

A Time-Domain Parameter Extraction Method for Speech Recognition using the Local Peak-to-Peak Interval Information (국소 극대-극소점 간의 간격정보를 이용한 시간영역에서의 음성인식을 위한 파라미터 추출 방법)

  • 임재열;김형일;안수길
    • Journal of the Korean Institute of Telematics and Electronics B / v.31B no.2 / pp.28-34 / 1994
  • In this paper, a new time-domain parameter extraction method for speech recognition is proposed. The suggested method is based on the fact that the local peak-to-peak interval, i.e., the interval between maxima and minima of the speech waveform, is closely related to the frequency components of the speech signal. The parameterization is achieved by a sort of filter bank technique in the time domain. To test the proposed parameter extraction method, an isolated word recognizer based on Vector Quantization and a Hidden Markov Model was constructed. As test material, 22 words spoken by ten males were used, and a recognition rate of 92.9% was obtained. This result leads to the conclusion that the new parameter extraction method can be used in speech recognition systems. Since the proposed method operates in the time domain, real-time parameter extraction can be implemented on a personal computer equipped only with an A/D converter, without any DSP board.
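
The idea in this abstract can be sketched roughly as follows: locate the local maxima and minima, measure the sample distance between consecutive extrema, and count the intervals per frequency band. An interval of d samples between a maximum and the next minimum corresponds to about half a period, so the implied frequency is sample_rate / (2 d). The function names and the band edges are hypothetical, and the paper's actual parameterization is not given in the abstract.

```python
def local_extrema(samples):
    """Indices of strict local maxima and minima (flat plateaus are not detected)."""
    indices = []
    for n in range(1, len(samples) - 1):
        prev, cur, nxt = samples[n - 1], samples[n], samples[n + 1]
        if (cur > prev and cur > nxt) or (cur < prev and cur < nxt):
            indices.append(n)
    return indices


def peak_to_peak_intervals(samples):
    """Sample distances between consecutive local extrema."""
    ext = local_extrema(samples)
    return [b - a for a, b in zip(ext, ext[1:])]


def interval_histogram(samples, sample_rate, band_edges_hz):
    """Count intervals per frequency band, a time-domain stand-in for a filter bank.

    Assumption: each interval is treated as half a period, so it maps to
    the frequency sample_rate / (2 * interval).
    """
    counts = [0] * (len(band_edges_hz) - 1)
    for interval in peak_to_peak_intervals(samples):
        freq = sample_rate / (2.0 * interval)
        for k in range(len(counts)):
            if band_edges_hz[k] <= freq < band_edges_hz[k + 1]:
                counts[k] += 1
                break
    return counts
```

The sketch uses only comparisons, subtractions and one division per interval, which is consistent with the abstract's claim that no DSP board is needed.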


Pitch Modification based on a Voice Source Model (음원 모델에 기초한 합성음의 피치 조절)

  • Choi, Yong-Jin;Yeo, Su-Jin;Kim, Jin-Young;Sung, Koeng-Mo
    • Speech Sciences / v.3 / pp.132-147 / 1998
  • Previously developed methods for pitch modification have not been based on a voice source model. Therefore, the synthesized speech often sounds unnatural although it may be highly intelligible. The purpose of this paper is to analyze how the voice source signal changes with the pitch period and to establish a pitch-modification rule based on the result of this analysis. Using the excitation waveform, we examine how the intervals of the closing phase, closed phase, and open phase change as the pitch increases. Compared with previous methods, which operate directly on the speech signal, the pitch modification method based on a voice source model shows high intelligibility and naturalness. This study may benefit applications such as speaker identification and voice color conversion. The proposed method is therefore expected to provide high-quality synthetic speech.


Voice Source Modeling Using Harmonic Compensated LF Model (LF 모델에 고조파 성분을 보상한 음원 모델링)

  • 이건웅;김태우;홍재근
    • Proceedings of the IEEK Conference / 1998.10a / pp.1247-1250 / 1998
  • In speech synthesis, the LF model is widely used as the excitation signal in voice source coding systems. However, the LF model does not represent the harmonic frequencies of the excitation signal. We propose an effective method that uses sinusoidal functions to represent the harmonics of the voice source signal. The proposed method achieves a more exact voice source waveform and better synthesized speech quality than the LF model.


EFFICIENCY OF SPEECH FEATURES (음성 특징의 효율성)

  • 황규웅
    • Proceedings of the Acoustical Society of Korea Conference / 1995.06a / pp.225-227 / 1995
  • This paper compares waveform, cepstrum, and spline wavelet features using nonlinear discriminant analysis. This measure shows the efficiency of speech parameterization better than the older linear separability criteria and can be used to measure the efficiency of each layer of a given system. The spline wavelet transform has a larger gap among classes, and the cepstrum is clustered better than the spline wavelet feature. Neither feature has good properties for classification, and we will compare the Gabor wavelet transform, Mel cepstrum, delta cepstrum, etc.


Real Time Implementation of a Korean Speech Synthesizer (한국어 음성합성기의 실시간 구현에 관한 연구)

  • 임광일;이규태;조철우;이우선;신인철;이태원
    • Journal of the Korean Institute of Telematics and Electronics / v.25 no.2 / pp.176-181 / 1988
  • In this paper, an LPC speech synthesizer with multipulse excitation is implemented using the general-purpose DSP μPD7720. Because the amplitudes and positions of the pulses are used as the driving function for the synthesis filter, the voiced/unvoiced decision and pitch period detection can be omitted. The synthesizer is implemented with a DSP device that communicates with the main computer by interrupts and with the D/A converter by DMA. Comparison of the synthetic and original waveforms, along with a listening test, demonstrates the validity of this system.


On Detecting the Steady State Segments of Speech Waveform by using the Normalized AMDF (규준화된 AMDF 이용한 음성파형의안정상태 구간검출)

  • Bae, Myung-Jin;Kim, Ul-Je;Ahn, Sou-Guil
    • The Journal of the Acoustical Society of Korea / v.10 no.3 / pp.44-50 / 1991
  • To recognize continuous speech, it is necessary to segment the connected acoustic signal into phonetic units. In this paper, we propose a new normalized AMDF as a parameter for detecting the transition regions in continuous speech. The suggested parameter represents the rate of change of the magnitude of the speech signal. By comparing this value with that of the adjacent frames, the state of each frame can be classified on a scale between steady state and transient state.
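
The AMDF (average magnitude difference function) at lag k is the mean of |x[n] - x[n+k]| over a frame. The abstract does not say how the authors normalize it. The sketch below assumes division by the frame's mean absolute amplitude, which makes the value independent of signal level. Both function names are hypothetical, and this normalization is an illustrative assumption, not the paper's definition.

```python
def amdf(frame, lag):
    """Average magnitude difference of a frame at the given lag."""
    n = len(frame) - lag
    if lag < 0 or n <= 0:
        raise ValueError("lag must satisfy 0 <= lag < len(frame)")
    return sum(abs(frame[i] - frame[i + lag]) for i in range(n)) / n


def normalized_amdf(frame, lag):
    """AMDF divided by the frame's mean absolute amplitude.

    Assumed normalization (not taken from the paper): it removes the
    dependence on overall signal level so frames can be compared.
    """
    mean_magnitude = sum(abs(x) for x in frame) / len(frame)
    if mean_magnitude == 0:
        return 0.0  # silent frame
    return amdf(frame, lag) / mean_magnitude
```

With a level-independent value per frame, consecutive frames can be compared directly: a small change between neighbors suggests a steady-state segment, while a large change suggests a transition.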


Development of an algorithm for the control of prosodic factors to synthesize unlimited isolated words in the time domain (시간 영역에서의 무제한 고립어 합성을 위한 운율 요소 제어용 알고리즘 개발)

  • 강찬희
    • Journal of the Korean Institute of Telematics and Electronics C / v.35C no.7 / pp.59-68 / 1998
  • This paper develops an algorithm for unlimited Korean speech synthesis. We present the results of controlling prosodic factors in the time domain with isolated words as the basic synthesis unit. Using a new pitch-synchronous, parametric speech synthesis method in the time domain, we mainly present the results of controlling prosodic factors such as pitch periods, energy envelopes, and durations, together with an evaluation of the synthetic speech quality. In synthesis, connected words can be synthesized by controlling a continuous, unified prosody, which improves naturalness. In the experiments, discontinuities of pitch and zeroing of energy at the junctions of speech waveforms were also reduced. In particular, it became possible to synthesize speech with unlimited durations and tones. The method also improves noisiness and clarity by reducing the degradation caused by phase distortion due to discontinuities where waveforms are connected.
