• Title/Summary/Keyword: Linear Predictive Coding Coefficient

Search Result 16, Processing Time 0.03 seconds

Neural-network-based Driver Drowsiness Detection System Using Linear Predictive Coding Coefficients and Electroencephalographic Changes (선형예측계수와 뇌파의 변화를 이용한 신경회로망 기반 운전자의 졸음 감지 시스템)

  • Chong, Ui-Pil;Han, Hyung-Seob
    • Journal of the Institute of Convergence Signal Processing
    • /
    • v.13 no.3
    • /
    • pp.136-141
    • /
    • 2012
  • One of the main reasons for serious road accidents is driving while drowsy. For this reason, drowsiness detection and warning system for drivers has recently become a very important issue. Monitoring physiological signals provides the possibility of detecting features of drowsiness and fatigue of drivers. One of the effective signals is to measure electroencephalogram (EEG) signals and electrooculogram (EOG) signals. The aim of this study is to extract drowsiness-related features from a set of EEG signals and to classify the features into three states: alertness, drowsiness, sleepiness. This paper proposes a neural-network-based drowsiness detection system using Linear Predictive Coding (LPC) coefficients as feature vectors and Multi-Layer Perceptron (MLP) as a classifier. Samples of EEG data from each predefined state were used to train the MLP program by using the proposed feature extraction algorithms. The trained MLP program was tested on unclassified EEG data and subsequently reviewed according to manual classification. The classification rate of the proposed system is over 96.5% for only very small number of samples (250ms, 64 samples). Therefore, it can be applied to real driving incident situation that can occur for a split second.

Performance of Vocal Tract Area Estimation from Deaf and Normal Children's Speech (청각장애아동과 건청아동의 성도면적 추정 성능)

  • Kim Se-Hwan;Kim Nam;Kwon Oh-Wook
    • MALSORI
    • /
    • no.56
    • /
    • pp.159-172
    • /
    • 2005
  • This paper analyzes the vocal tract area estimation algorithm used as a part of a speech analysis program to help deaf children correct their pronunciations by comparing their vocal tract shape with normal children's. Assuming that a vocal tract is a concatenation of cylinder tubes with a different cross section, we compute the relative vocal tract area of each tube using the reflection coefficients obtained from linear predictive coding. Then, we obtain the absolute vocal tract area by computing the height of lip opening with a formula modified for children's speech. Using the speech data for five Korean vowels (/a/, /e/, /i/, /o/, and /u/), we investigate the effects of the sampling frequency, frame size, and model order on the estimated vocal tract shape. We compare the vocal tract shapes obtained from deaf and normal children's speech.

  • PDF

GMM-Based Gender Identification Employing Group Delay (Group Delay를 이용한 GMM기반의 성별 인식 알고리즘)

  • Lee, Kye-Hwan;Lim, Woo-Hyung;Kim, Nam-Soo;Chang, Joon-Hyuk
    • The Journal of the Acoustical Society of Korea
    • /
    • v.26 no.6
    • /
    • pp.243-249
    • /
    • 2007
  • We propose an effective voice-based gender identification using group delay(GD) Generally, features for speech recognition are composed of magnitude information rather than phase information. In our approach, we address a difference between male and female for GD which is a derivative of the Fourier transform phase. Also, we propose a novel way to incorporate the features fusion scheme based on a combination of GD and magnitude information such as mel-frequency cepstral coefficients(MFCC), linear predictive coding (LPC) coefficients, reflection coefficients and formant. The experimental results indicate that GD is effective in discriminating gender and the performance is significantly improved when the proposed feature fusion technique is applied.

A MFCC-based CELP Speech Coder for Server-based Speech Recognition in Network Environments (네트워크 환경에서 서버용 음성 인식을 위한 MFCC 기반 음성 부호화기 설계)

  • Lee, Gil-Ho;Yoon, Jae-Sam;Oh, Yoo-Rhee;Kim, Hong-Kook
    • MALSORI
    • /
    • no.54
    • /
    • pp.27-43
    • /
    • 2005
  • Existing standard speech coders can provide speech communication of high quality while they degrade the performance of speech recognition systems that use the reconstructed speech by the coders. The main cause of the degradation is that the spectral envelope parameters in speech coding are optimized to speech quality rather than to the performance of speech recognition. For example, mel-frequency cepstral coefficient (MFCC) is generally known to provide better speech recognition performance than linear prediction coefficient (LPC) that is a typical parameter set in speech coding. In this paper, we propose a speech coder using MFCC instead of LPC to improve the performance of a server-based speech recognition system in network environments. However, the main drawback of using MFCC is to develop the efficient MFCC quantization with a low-bit rate. First, we explore the interframe correlation of MFCCs, which results in the predictive quantization of MFCC. Second, a safety-net scheme is proposed to make the MFCC-based speech coder robust to channel error. As a result, we propose a 8.7 kbps MFCC-based CELP coder. It is shown from a PESQ test that the proposed speech coder has a comparable speech quality to 8 kbps G.729 while it is shown that the performance of speech recognition using the proposed speech coder is better than that using G.729.

  • PDF

Speaker Recognition using LPC cepstrum Coefficients and Neural Network (LPC 켑스트럼 계수와 신경회로망을 사용한 화자인식)

  • Choi, Jae-Seung
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.15 no.12
    • /
    • pp.2521-2526
    • /
    • 2011
  • This paper proposes a speaker recognition algorithm using a perceptron neural network and LPC (Linear Predictive Coding) cepstrum coefficients. The proposed algorithm first detects the voiced sections at each frame. Then, the LPC cepstrum coefficients which have speaker characteristics are obtained by the linear predictive analysis for the detected voiced sections. To classify the obtained LPC cepstrum coefficients, a neural network is trained using the LPC cepstrum coefficients. In this experiment, the performance of the proposed algorithm was evaluated using the speech recognition rates based on the LPC cepstrum coefficients and the neural network.

A Design of Speech Feature Vector Extractor using TMS320C31 DSP Chip (TMS DSP 칩을 이용한 음성 특징 벡터 추출기 설계)

  • 예병대;이광명;성광수
    • Proceedings of the IEEK Conference
    • /
    • 2003.07e
    • /
    • pp.2212-2215
    • /
    • 2003
  • In this paper, we proposed speech feature vector extractor for embedded system using TMS 320C31 DSP chip. For this extractor, we used algorithm using cepstrum coefficient based on LPC(Linear Predictive Coding) that is reliable algorithm to be is widely used for speech recognition. This system extract the speech feature vector in real time, so is used the mobile system, such as cellular phones, PDA, electronic note, and so on, implemented speech recognition.

  • PDF

Enhanced Spectral Envelope Coding Scheme Using Inter-frame Correlation for G.729.1 (G.729.1 코더에서 프레임 간의 상호상관 관계를 이용한 개선된 스펙트럼 포락 코딩 방법)

  • Cho, Keun-Seok;Sung, Jong-Mo;Hahn, Min-Soo;Kim, Young-Il;Jeong, Sang-Bae
    • Phonetics and Speech Sciences
    • /
    • v.1 no.4
    • /
    • pp.97-103
    • /
    • 2009
  • This paper describes a new algorithm for encoding spectral envelope in the time domain alias cancellation (TDAC) part of G.729.1. The spectral envelope and modified discrete cosine transform (MDCT) coefficients of the weighted code-excited linear predictive (CELP) coding error in lower-band and the higher-band input signal are encoded in the TDAC part. In order to reduce allocation bits for spectral envelope coding, a new algorithm using sub-band correlation between adjacent frames is proposed. In addition, to improve the quality of decoded signals, two bit allocation strategies using reduced bits from the proposed algorithm are proposed. The performance of the proposed algorithm is evaluated in terms of objective quality and bit reduction rates. Experimental results show that the proposed algorithm increases the quality of sounds significantly.

  • PDF

A LSF Quantizer for the Wideband Speech Using the Predictive VQ-Pyramid VQ (예측 VQ-Pyramid VQ를 이용한 광대역 음성용 LSF 양자학기 설계)

  • 이강은;이인성;강상원
    • The Journal of the Acoustical Society of Korea
    • /
    • v.23 no.4
    • /
    • pp.333-339
    • /
    • 2004
  • This Paper proposes the vector quantizer-pyramid vector quantizer(VQ-PVQ) structure. Also both predictive structure and safety-net concept are combined into the VQ-PVQ to quantize the IPC parameter of wideband speech codec. The Performance is compared to the LPC vector quantizer used in the AMR-WB(ITU-T G.722.2). demonstrating reduction in both spectral distortion and encoding memory.

Spoken digit recognition Using the ZCR and PARCOR Coefficient (ZCR과 PARCOR 계수를 이용한 숫자음성 인식)

  • 김학윤
    • Proceedings of the Acoustical Society of Korea Conference
    • /
    • 1985.10a
    • /
    • pp.75-78
    • /
    • 1985
  • 본 연구는 시간 영역의 parament를 이용하여 한국어 숫자음(영, 일, 이, 삼, 사, 오, 육, 칠, 팔, 구)을 인식했다. 입력 음성 신호 X(n)의 Beginning Point와 Ending point를 ZCR(Zero-crossing Rate), Magnitude, Energy, Autocorrelation을 이용 Beginning point와 Ending point를 구하고 자음부의 인식은 위 계수들을 이용하여 행했다. 또, 유성음 부분에서는 PARCOR(Partial Autocorrelation), LPC(Linear Predictive Coding)를 이용 모음부와 유성자음을 인식하여 모음을 6개 부류(ㅏ, ㅑ, ㅗ, ㅜ, ㅠ, ㅣ)로 구분 인식했다. 이 방법에 의하면 입력 음성 신호 X(n)의 B.P(Beginning Point)와 E.P(Ending Point)를 쉽게 추출 가능하며 또한 각 Parameter를 이용하여 94.4%의 인식율을 얻었다.

  • PDF

A Practical Implementation of the LTJ Adaptive Filter and Its Application to the Adaptive Echo Canceller (LTJ 적응필터의 실용적 구현과 적응반향제거기에 대한 적용)

  • Yoo, Jae-Ha
    • Speech Sciences
    • /
    • v.11 no.2
    • /
    • pp.227-235
    • /
    • 2004
  • In this paper, we proposed a new practical implementation method of the lattice transversal joint (LTJ) adaptive filter using speech codec's information. And it was applied to the adaptive echo cancellation problem to verify the efficiency of the proposed method. Realtime implementation of the LTJ adaptive filter is very difficult due to high computational complexity for the filter coefficients compensation. However, in case of using speech codec, complexity can be reduced since linear predictive coding (LPC) coefficients are updated each frame or sub-frame instead of every sample. Furthermore, LPC coefficients can be acquired from speech decoder and transformed to the reflection coefficients. Therefore, the computational complexity for updates of the reflection coefficients can be reduced. The effectiveness of the proposed LTJ adaptive filter was verified by the experiments about convergence and tracking performance of the adaptive echo canceller.

  • PDF