• Title/Summary/Keyword: Speech Signal

Pattern Recognition Methods for Emotion Recognition with speech signal

  • Park Chang-Hyun;Sim Kwee-Bo
    • International Journal of Fuzzy Logic and Intelligent Systems / v.6 no.2 / pp.150-154 / 2006
  • In this paper, we apply several pattern recognition algorithms to an emotion recognition system based on speech signals and compare the results. First, emotional speech databases are needed, and the speech features for emotion recognition are determined in the database analysis step. Second, recognition algorithms are applied to these speech features. The algorithms tried are an artificial neural network, Bayesian learning, Principal Component Analysis, and the LBG algorithm. The performance gap between these methods is then presented in the experimental results section.

Designing of efficient super-wide bandwidth extension system using enhanced parameter estimation in time domain (시간 영역에서 개선된 파라미터 추론을 통한 효율적인 초광대역 확장 시스템 설계)

  • Jeon, Jong-jeon
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2018.10a / pp.431-433 / 2018
  • This paper proposes a system that offers super-wideband speech, produced by an artificial bandwidth extension technique from a wideband speech signal in the time domain. A wideband excitation signal and line spectrum pairs (LSP) are extracted based on the source-filter model in the time domain. The two parameters are each extended by their own bandwidth extension algorithm, and the super-wideband speech parameters are then estimated and synthesized. A subjective test shows that the super-wideband speech has better quality than the wideband speech signal.

Implementation of Dual Rate G.723 ADPCM Speech codec (16Kbps와 40Kbps의 Dual Rate G.723 ADPCM 음성 codec 구현)

  • Kim, Jae-Ohe;Han, Kyong-Ho
    • Proceedings of the KIEE Conference / 1998.07g / pp.2480-2482 / 1998
  • This paper describes the implementation of a dual-rate ADPCM codec using the G.723 16Kbps and 40Kbps speech codec algorithms. For small signals, the low-rate 16Kbps coding algorithm shows the same SNR as the high-rate 40Kbps coding algorithm, whereas for large signals it shows a lower SNR. To obtain a good trade-off between data rate and synthesized speech quality, we applied the low 16Kbps rate to small signals and the high 40Kbps rate to large signals. Various threshold values for determining the rate are tested to find a good trade-off between data rate and speech quality. The low-pass filter effect of speech input and output devices is also simulated at several cut-off frequencies. The simulation results show good speech quality at a low rate compared with 16Kbps and 40Kbps.
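
The threshold-based rate switching described in this abstract can be sketched as follows. This is only an illustration: the abstract does not say how "small" and "large" signals are measured, so the mean-square frame energy, the function names `frame_energy` and `select_rate_kbps`, and the threshold value are all assumptions, and no ADPCM encoding is performed here.

```python
def frame_energy(frame):
    """Mean-square energy of one frame of samples (assumed signal-size measure)."""
    return sum(s * s for s in frame) / len(frame)


def select_rate_kbps(frame, threshold):
    """Pick the low rate for small signals and the high rate for large ones."""
    return 40 if frame_energy(frame) > threshold else 16


# Toy frames; a real coder would use frames of actual speech samples.
quiet = [0.01, -0.02, 0.01, 0.0]
loud = [0.5, -0.6, 0.4, -0.5]

# Hypothetical threshold; the paper tests several values.
rates = [select_rate_kbps(f, threshold=0.01) for f in (quiet, loud)]
print(rates)
```

Raising the threshold sends more frames to the 16Kbps coder, lowering the average data rate at the cost of quality on louder frames.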

Speech Parameters for the Robust Emotional Speech Recognition (감정에 강인한 음성 인식을 위한 음성 파라메터)

  • Kim, Weon-Goo
    • Journal of Institute of Control, Robotics and Systems / v.16 no.12 / pp.1137-1142 / 2010
  • This paper studies speech parameters that are less affected by human emotion, for the development of a robust speech recognition system. For this purpose, the effect of emotion on the speech recognition system and robust speech parameters were studied using a speech database containing various emotions. In this study, the mel-cepstral coefficient, delta-cepstral coefficient, RASTA mel-cepstral coefficient, and frequency-warped mel-cepstral coefficient were used as feature parameters, and the CMS (Cepstral Mean Subtraction) method was used as a signal bias removal technique. Experimental results showed that the HMM-based speaker-independent word recognizer using the vocal tract length normalized mel-cepstral coefficient, its derivatives, and CMS as signal bias removal achieved the best performance, a word error rate of 0.78%. This corresponds to about a 50% word error reduction compared to the baseline system using the mel-cepstral coefficient, its derivatives, and CMS.
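
Cepstral Mean Subtraction, named in this abstract as the bias removal step, can be illustrated with a minimal sketch. It assumes the common per-utterance form of CMS, in which the mean of each cepstral coefficient over all frames is subtracted from every frame; the function name and the toy values are hypothetical and are not taken from the paper.

```python
def cepstral_mean_subtraction(frames):
    """Subtract the per-coefficient mean, computed over all frames of one utterance."""
    if not frames:
        return []
    n_frames = len(frames)
    n_coeffs = len(frames[0])
    # Mean of each cepstral coefficient across the utterance.
    means = [sum(f[k] for f in frames) / n_frames for k in range(n_coeffs)]
    return [[f[k] - means[k] for k in range(n_coeffs)] for f in frames]


# Toy "cepstra": two frames with two coefficients each.
cepstra = [[1.0, 2.0], [3.0, 6.0]]
normalized = cepstral_mean_subtraction(cepstra)
print(normalized)
```

A constant offset added to every frame, such as a fixed channel bias in the cepstral domain, is cancelled by this subtraction, which is why it serves as bias removal.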

Adaptive noise cancellation algorithm reducing path misadjustment due to speech signal (음성신호로 인한 잡음전달경로의 오조정을 감소시킨 적응잡음제거 알고리듬)

  • 박장식;김형순;김재호;손경식
    • The Journal of Korean Institute of Communications and Information Sciences / v.21 no.5 / pp.1172-1179 / 1996
  • A general adaptive noise canceller (ANC) suffers from misadjustment of the adaptive filter weights because of gradient-estimate noise at steady state. In this paper, an adaptive noise cancellation algorithm is proposed with a speech detector that distinguishes speech from silence and from the adaptation-transient region. The speech detector uses the property that an adaptive prediction-error filter can filter out highly correlated speech. To detect the speech region, the estimation error, which is the output of the adaptive filter, is applied to the adaptive prediction-error filter. When a speech signal appears at the input of the adaptive prediction-error filter, the ratio of input and output energy of the filter becomes relatively low. The ratio becomes large when white noise appears at the input. The speech region is therefore detected from this ratio. The sign algorithm is applied in the speech region to prevent the weights from being perturbed by the output speech of the ANC. In computer simulations, the proposed algorithm improves segmental SNR and SNR by up to about 4 dB and 11 dB, respectively.
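
The detector idea in this abstract, that a prediction-error filter suppresses correlated signals but not white noise, can be sketched as follows. This is a rough illustration under stated assumptions, not the paper's algorithm: the predictor is adapted with NLMS, the ratio is taken as output energy over input energy, a sine tone stands in for correlated speech, and all names and parameter values are hypothetical.

```python
import math
import random


def prediction_error(x, order=4, mu=0.5, eps=1e-8):
    """Run an NLMS-adapted linear predictor over x and return its error signal."""
    w = [0.0] * order
    err = []
    for n in range(len(x)):
        # Previous `order` samples, zero-padded at the start of the signal.
        past = [x[n - k] if n - k >= 0 else 0.0 for k in range(1, order + 1)]
        pred = sum(wk * pk for wk, pk in zip(w, past))
        e = x[n] - pred
        norm = sum(p * p for p in past) + eps
        w = [wk + mu * e * pk / norm for wk, pk in zip(w, past)]
        err.append(e)
    return err


def energy_ratio(x):
    """Output (prediction-error) energy divided by input energy."""
    e = prediction_error(x)
    return sum(v * v for v in e) / sum(v * v for v in x)


random.seed(0)
tone = [math.sin(2 * math.pi * 500 * n / 8000) for n in range(2000)]  # correlated
noise = [random.gauss(0.0, 1.0) for _ in range(2000)]                 # white

tone_ratio = energy_ratio(tone)
noise_ratio = energy_ratio(noise)
print(tone_ratio < noise_ratio)
```

Comparing the ratio against a threshold gives a simple correlated-versus-white decision per block; the threshold itself would have to be tuned on real data.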

Detection and Synthesis of Transition Parts of The Speech Signal

  • Kim, Moo-Young
    • The Journal of Korean Institute of Communications and Information Sciences / v.33 no.3C / pp.234-239 / 2008
  • For efficient coding and transmission, the speech signal can be classified into three distinct classes: voiced, unvoiced, and transition. In low bit rate coding below 4 kbit/s, conventional sinusoidal transform coders synthesize high-quality speech for the purely voiced and unvoiced classes, but not for the transition class. The transition class, which includes plosive sounds and abrupt voiced onsets, lacks periodicity, so it is often classified and synthesized as the unvoiced class. In this paper, an efficient algorithm for transition class detection is proposed, which demonstrates superior detection performance not only for clean speech but also for noisy speech. For a detected transition frame, phase information is transmitted instead of magnitude information for speech synthesis. A listening test showed that the proposed algorithm produces better speech quality than the conventional one.

Neural Network Approaches and Trends for Speech Recognition (음성 인식을 위한 신경회로망 접근과 동향)

  • 김순협
    • Proceedings of the Acoustical Society of Korea Conference / 1995.06a / pp.33-41 / 1995
  • We propose a neural network approach to signal processing, especially speech signal processing, and review the algorithms of several neural networks used in many application fields of speech processing. Finally, we investigate the trends in neural network methods through three conference journals and the ASK journal in 1994.

Design of a Variable Bit Rate Speech Coder Based on One-dimensional SPIHT (1차원 SPIHT를 이용한 가변 비트율 음성 부호기의 설계)

  • Na, Hoon;Jeong, Dae-Gwon
    • The Journal of the Acoustical Society of Korea / v.22 no.6 / pp.443-451 / 2003
  • Because a codebook-based CELP coder models its excitation signal according to one of several bit rates pre-assigned to codebooks and synthesizes the speech signal using those codebooks, it cannot encode speech at an arbitrary bit rate in one encoder. The proposed variable bit rate speech coder encodes the excitation signal based on the bit rate assigned to the present frame of speech, using one-dimensional SPIHT and the wavelet transform. It also does not need to model the excitation signal (or codebook) as one of a set of types, as a CELP coder does, and can encode the excitation signal at various bit rates, according to user requirements, without exact pitch information. As a result, since the coder has no codebook structure, it has relatively low complexity and provides equal or better speech quality compared to the G.729 and G.723.1 coders.

A Study of Peak Finding Algorithms for the Autocorrelation Function of Speech Signal

  • So, Shin-Ae;Lee, Kang-Hee;You, Kwang-Bock;Lim, Ha-Young;Park, Ji Su
    • Journal of the Korea Society of Computer and Information / v.21 no.12 / pp.131-137 / 2016
  • In this paper, peak finding algorithms for the Autocorrelation Function (ACF), which is widely used for detecting the pitch of voiced signals, are proposed. It is well known that estimating the fundamental frequency (F0) of a speech signal is both a very important and a quite difficult task. The proposed algorithms are implemented using several characteristics of the ACF, such as monotonically increasing behavior. Thus, the proposed algorithms may be able to estimate the fundamental frequency reliably and correctly as long as the autocorrelation function of the speech signal is accurate. Since the proposed algorithms may reduce computational complexity, they can be applied to real-time processing. Speech data composed of Korean emotion-expressing words is used to evaluate their performance. The pitches are measured to compare the performance of the proposed algorithms.
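
For context, the baseline that such peak finding algorithms refine is a plain search for the largest ACF value within a plausible pitch-lag range. The sketch below shows only that baseline, not the algorithms proposed in the paper; the function names, the 50 to 400 Hz search range, and the synthetic test tone are assumptions made for illustration.

```python
import math


def autocorrelation(x, max_lag):
    """Autocorrelation of x (unnormalized sum of products) for lags 0..max_lag."""
    n = len(x)
    return [sum(x[i] * x[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]


def estimate_f0(x, sample_rate, f0_min=50.0, f0_max=400.0):
    """Estimate F0 as sample_rate / lag of the largest ACF value in the search range."""
    lag_min = int(sample_rate / f0_max)
    lag_max = int(sample_rate / f0_min)
    acf = autocorrelation(x, lag_max)
    best_lag = max(range(lag_min, lag_max + 1), key=lambda lag: acf[lag])
    return sample_rate / best_lag


# Synthetic 100 Hz tone sampled at 8 kHz as a stand-in for a voiced frame.
sample_rate = 8000
tone = [math.sin(2 * math.pi * 100 * n / sample_rate) for n in range(800)]
f0 = estimate_f0(tone, sample_rate)
print(f0)
```

On real speech the largest value often falls on a multiple or fraction of the true period, which is the kind of error more careful peak selection is meant to reduce.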

On a Pitch Extraction of Speech Signal using Residual Signal of the Uniform Quantizer (균일양자화기의 잔여신호를 이용한 음성신호의 피치검출)

  • Bae, Myung-Jin;Han, Ki-Cheon;Cha, Jin-Jong
    • The Journal of the Acoustical Society of Korea / v.16 no.2 / pp.36-40 / 1997
  • In speech signal processing, it is necessary and important to detect the pitch exactly. The pitch extraction algorithms proposed so far have difficulty detecting the pitch exactly over a wide range of speech signals. In this paper, we therefore propose a new pitch detection algorithm that finds the fundamental period of the speech signal in the residual signal quantized by a uniform quantizer, as in PCM. The proposed method shows a low gross error rate, averaging 0.25% for clean speech and 3.39% at an SNR of 0 dB. It also yields pitch contours with improved detection accuracy in transient phonemes and noisy environments.
