• Title/Summary/Keyword: Speech signal analysis

Search Result 275, Processing Time 0.033 seconds

Spectral Analysis and Modeling of Speech Signal - Analysis by FFT and LP Analysis - (음성신호의 스텍트럼해석 및 모델링 - FFT와 선형예측분석법에 의한 음성신호분석 -)

  • 조철우
    • Proceedings of the Acoustical Society of Korea Conference
    • /
    • 1998.06c
    • /
    • pp.393-398
    • /
    • 1998
  • 본 논문에서는 음성신호처리의 기초적인 해석법인 FFT와 LP분석법에 대하여 기본적인 이론과 함께 분석과정에서 알아두어야 할 사항들을 정리한다. 아울러 이러한 분석을 실제 음성신호를 대상으로 행함에 있어서 주의해야 할 점들을 실제음성을 처리한 그림과 함께 설명한다.

  • PDF

Development of an Optimized Feature Extraction Algorithm for Throat Signal Analysis

  • Jung, Young-Giu;Han, Mun-Sung;Lee, Sang-Jo
    • ETRI Journal
    • /
    • v.29 no.3
    • /
    • pp.292-299
    • /
    • 2007
  • In this paper, we present a speech recognition system using a throat microphone. The use of this kind of microphone minimizes the impact of environmental noise. Due to the absence of high frequencies and the partial loss of formant frequencies, previous systems using throat microphones have shown a lower recognition rate than systems which use standard microphones. To develop a high performance automatic speech recognition (ASR) system using only a throat microphone, we propose two methods. First, based on Korean phonological feature theory and a detailed throat signal analysis, we show that it is possible to develop an ASR system using only a throat microphone, and propose conditions of the feature extraction algorithm. Second, we optimize the zero-crossing with peak amplitude (ZCPA) algorithm to guarantee the high performance of the ASR system using only a throat microphone. For ZCPA optimization, we propose an intensification of the formant frequencies and a selection of cochlear filters. Experimental results show that this system yields a performance improvement of about 4% and a reduction in time complexity of 25% when compared to the performance of a standard ZCPA algorithm on throat microphone signals.

  • PDF

Statistical analysis on long-term change of jitter component on continuous speech signal (음성신호의 Jitter 성분의 장시간 변화에 관한 통계적 분석)

  • Jo, Cheolwoo
    • Phonetics and Speech Sciences
    • /
    • v.12 no.4
    • /
    • pp.73-80
    • /
    • 2020
  • In this study, a method for measuring the jitter component in continuous speech is presented. In the conventional jitter measurement method, pitch variabilities are commonly measured from the sustained vowels. In the case of continuous speech, such as a spoken sentence, distortion occurs with the existing measurement method owing to the influence of prosody information according to the sentence. Therefore, we propose a method to reduce the pitch fluctuations of prosody information in continuous speech. To remove this pitch fluctuation component, a curve representing the fluctuation is obtained via polynomial interpolation for the pitch track in the analysis interval, and the shift is removed according to the curve. Subsequently, the variability of the pitch frequency is obtained by a method of measuring jitter from the trajectory of the pitch from which the shift is removed. To measure the effects of the proposed method, parameter values before and after the operations are compared using samples from the Kay Pentax MEEI database. The statistical analysis of the experimental results showed that jitter components from the continuous speech can be measured effectively by proposed method and the values are comparable to the parameters of sustained vowel from the same speaker.

Pitch Extraction of Speech Signals by the Harmonics analysis (고조파 분석에 의한 음성신호의 피치 검출)

  • Kim, Kee-Hee;Choi, Jung-Ah;Bae, Myung-Jin;Ann, Sou-Guil
    • Proceedings of the KIEE Conference
    • /
    • 1987.07b
    • /
    • pp.1610-1614
    • /
    • 1987
  • The harmonies of the fundamental frequency in speech signal make a minute line spectrum in frequency domain. In this paper, we propose a new algorithm to detect a pitch interval in voiced sound based on the fact that the number of harmonies can represent the period of the pitch in the time domain.

  • PDF

Modular Fuzzy Neural Controller Driven by Voice Commands

  • Izumi, Kiyotaka;Lim, Young-Cheol
    • 제어로봇시스템학회:학술대회논문집
    • /
    • 2001.10a
    • /
    • pp.32.3-32
    • /
    • 2001
  • This paper proposes a layered protocol to interpret voice commands of the user´s own language to a machine, to control it in real time. The layers consist of speech signal capturing layer, lexical analysis layer, interpretation layer and finally activation layer, where each layer tries to mimic the human counterparts in command following. The contents of a continuous voice command are captured by using Hidden Markov Model based speech recognizer. Then the concepts of Artificial Neural Network are devised to classify the contents of the recognized voice command ...

  • PDF

The Speech Recognition Using the Diffusion Network (확산망을 이용한 음성인식)

  • 허만택
    • Proceedings of the Acoustical Society of Korea Conference
    • /
    • 1996.10a
    • /
    • pp.70-75
    • /
    • 1996
  • In this paper, the pre-precessing method for the recognition of single vowels by use of spectrum envelope is presented , we use new method of an extrating spectrum envelope using the diffusion filter bank. We reduced the total processing time, and got higher enhancement of discrimination . By getting 88.3% of average recognition rate for single vowels of real voice through computer simulation, we confirmed it to be useful for speech recongition which use spectrum analysis for voice signal to have many frequency components.

  • PDF

Design of a variable rate speech codec for the W-CDMA system (W-CDMA 시스템을 위한 가변율 음성코덱 설계)

  • 정우성
    • Proceedings of the Acoustical Society of Korea Conference
    • /
    • 1998.08a
    • /
    • pp.142-147
    • /
    • 1998
  • Recently, 8 kb/s CS-ACELP coder of G.729 is atandardized by ITU-T SG15 and it has been reported that the speech quality of G729 is better than or equal to that of 32kb/s ADPCM. However G.729 is the fixed rate speech coder, and it does not consider the property of voice activity in mutual conversation. If we use the voice activity, we can reduce the average bit rate in half without any degradations of the speech quality. In this paper, we propose an efficient variable rate algorithm for G.729. The variable rate algorithm consists of two main subjects, the rate determination algorithm and algorithm, we combine the energy-thresholding method, the phonetic segmentation method by integration of various feature parameters obtained through the analysis procedure, and the variable hangover period method. Through the analysis of noise features, the 1 kb/s sub rate coder is designed for coding the background noise signal. So, we design the 4 kb/s sub rate coder for the unvoiced parts. The performance of the variable rate algorithm is evaluated by the comparison of speed quality and average bit rate with G.729. Subjective quality test is also done by MOS test. Conclusively, it is verified that the proposed variable rate CS-ACELP coder produced the same speech quality as G.729, at the average bit rate of 4.4 kb/s.

  • PDF

VHDL Implementation of an LPC Analysis Algorithm (LPC 분석 알고리즘의 VHDL 구현)

  • 선우명훈;조위덕
    • Journal of the Korean Institute of Telematics and Electronics B
    • /
    • v.32B no.1
    • /
    • pp.96-102
    • /
    • 1995
  • This paper presents the VHSIC Hardware Description Language(VHDL) implementation of the Fixed Point Covariance Lattice(FLAT) algorithm for an Linear Predictive Coding(LPC) analysis and its related algorithms, such as the forth order high pass Infinite Impulse Response(IIR) filter, covariance matrix calculation, and Spectral Smoothing Technique(SST) in the Vector Sum Exited Linear Predictive(VSELP) speech coder that has been Selected as the standard speech coder for the North America and Japanese digital cellular. Existing Digital Signal Processor(DSP) chips used in digital cellular phones are derived from general purpose DSP chips, and thus, these DSP chips may not be optimal and effective architectures are to be designed for the above mentioned algorithms. Then we implemented the VHDL code based on the C code, Finally, we verified that VHDL results are the same as C code results for real speech data. The implemented VHDL code can be used for performing logic synthesis and for designing an LPC Application Specific Integrated Circuit(ASOC) chip and DsP chips. We first developed the C language code to investigate the correctness of algorithms and to compare C code results with VHDL code results block by block.

  • PDF

A study on the recognition system of Korean phenemes using filter-Bank analysis (필터뱅크 분석법을 사용한 한국어 음소의 인식에 관한 연구)

  • 남문현;주상규
    • 제어로봇시스템학회:학술대회논문집
    • /
    • 1987.10b
    • /
    • pp.473-478
    • /
    • 1987
  • The purpose of this study is to design a phoneme-class recognition system for Korean language using filter-bank analysis and zero crossing rate method. First, the speech signals are separated in 16 bandpass filters to obtain short-time spectrum of speech signals, and digitized by 16-ch A/D converter. And then, with the set of features which extracted from patterns of ratios of each channel energy level to overall energy level, the decision rules are made for recognize unknown speech signal. In this experiment, the recognition rate was about 93.1 percent for 7 vowels under multitalker environment and 74.4 percent for 10 initial sounds at single speaker.

  • PDF

A Study on a Improvement of the Speech Quality with Variable Window in CELP Vocoder (가변 윈도우를 이용한 CELP 부호화기의 음질 향상에 관한 연구)

  • Ju, Sang-Gyu
    • Proceedings of the KAIS Fall Conference
    • /
    • 2010.05a
    • /
    • pp.265-268
    • /
    • 2010
  • There have been proposed two types of low bit rate vocoder upto now : One is MBE type using the spectrum modeling and another is CELP type using the hybrid coding method. CELP type vocoder has mainly studied between them. Specially, much of intensity is concentrated in CELP vocoder due to the emergence of Internet Phone and PCS in a domestic. In order to improve the speech quality in CELP vocoder, in this paper, we proposed a new spectrum analysis algorithm with variable window. In CELP vocoder, the spectrum of the synthesised speech signal is distorted because the fixed size windows is used for spectrum analysis. So we have measured the spectral leakage and in order to minimize the spectral leakage have adjusted the window size. Applying this method G.723.1 ACELP, we can get SD(Spectral Distortion) reduction 0.084(dB), residual energy reduction 6.3% and MOS(Mean Opinion Score) improvement 0.1.

  • PDF