• Title/Summary/Keyword: linear predictive


A Study on the Phonemic Segmentation of an Initial Affricate (초성파찰음의 음소분류에 관한 연구)

  • Kim, Ki-Woon;Lee, Ki-Young;Bae, Chul-Soo;Choi, Kap-Seok
    • Proceedings of the KIEE Conference / 1988.07a / pp.33-36 / 1988
  • In this paper, the starting point of an affricate is detected from the first predictor coefficient of a 12-pole linear predictive coding (LPC) analysis, and phonemic segmentation is performed by measuring the short-time energy and zero-crossing rate. With this segmentation method, the duration of aspiration can be measured to determine whether an aspirate is present.

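The two frame-level measures named in the abstract above, short-time energy and zero-crossing rate, can be sketched as follows. This is a minimal illustration on a toy signal, not the authors' implementation: the frame length, hop size, function names, and sample values are all assumptions, and the LPC-based onset detection is not covered.

```python
def frame_signal(samples, frame_len, hop):
    """Split a sequence into frames; a trailing partial frame is dropped."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, hop)]


def short_time_energy(frame):
    """Mean squared amplitude of one frame."""
    return sum(x * x for x in frame) / len(frame)


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


# Toy signal (assumed values): a loud, slowly varying segment followed by
# a quiet, rapidly alternating segment that stands in for frication noise.
voiced_like = [1.0, 1.0, 1.0, 1.0, -1.0, -1.0, -1.0, -1.0]
noise_like = [0.1, -0.1, 0.1, -0.1, 0.1, -0.1, 0.1, -0.1]

frames = frame_signal(voiced_like + noise_like, frame_len=8, hop=8)
features = [(short_time_energy(f), zero_crossing_rate(f)) for f in frames]
```

In this toy case the first frame has high energy and a low zero-crossing rate, while the second has low energy and a high zero-crossing rate, which is the kind of contrast a segmentation rule could threshold on.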

Design of Receding Horizon Control for Boiler-Turbine Systems (보일러-터빈 시스템을 위한 이동구간 예측제어기 설계)

  • Lee, Young-I.;Lee, Gi-Won
    • Proceedings of the KIEE Conference / 1997.07b / pp.441-445 / 1997
  • In this paper, we suggest a design scheme for receding horizon predictive control (RHPC) of boiler-turbine systems whose dynamics are given by nonlinear equations. The RHPC is designed for linear state-space models obtained at a nominal operating point of the boiler-turbine system. The boiler is operated in a sliding pressure mode, in which the reference value of the drum pressure changes according to the electrical power generation. The reference values of the system outputs are prefiltered before they are fed to the RHPC in order to compensate for the linearization errors. Simulation results show that the proposed controller provides acceptable performance for both 'steep and small changes' and 'slow and large changes' of power demand, and yields the effect of a modest coordination of conventional PID schemes such as boiler-following and turbine-following control.

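The receding horizon idea in the abstract above can be illustrated with a deliberately small sketch: a scalar linear model (standing in for a linearization about a nominal operating point), an unconstrained quadratic cost, and a backward Riccati recursion that is re-solved at every step so that only the first input is applied. The model values, weights, and function names are assumptions for illustration; the paper's boiler-turbine model, constraints, and reference prefilter are not reproduced.

```python
def first_gain(a, b, q, r, horizon):
    """Backward Riccati recursion over the horizon for x+ = a*x + b*u.

    Returns the feedback gain for the first step of the horizon.
    The terminal weight is assumed equal to the stage weight q.
    """
    p = q
    k = 0.0
    for _ in range(horizon):
        k = a * b * p / (r + b * b * p)
        p = q + a * a * p - a * b * p * k
    return k


def simulate(a, b, q, r, horizon, x0, steps):
    """Closed loop: re-solve the finite-horizon problem each step and
    apply only the first input of the optimal sequence."""
    x = x0
    states = [x0]
    for _ in range(steps):
        u = -first_gain(a, b, q, r, horizon) * x
        x = a * x + b * u
        states.append(x)
    return states


# Assumed example: an open-loop unstable deviation model (a > 1).
trajectory = simulate(a=1.2, b=1.0, q=1.0, r=1.0, horizon=10, x0=1.0, steps=20)
```

Because the model here is time-invariant, the gain is the same at every step; the loop still recomputes it to mirror the receding horizon structure, in which the optimization window moves forward with each sample.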

Speech Recognition of Multi-Syllable Words Using Soft Computing Techniques (소프트컴퓨팅 기법을 이용한 다음절 단어의 음성인식)

  • Lee, Jong-Soo;Yoon, Ji-Won
    • Transactions of the Society of Information Storage Systems / v.6 no.1 / pp.18-24 / 2010
  • The performance of speech recognition depends mainly on uncertain factors such as the speaker's condition and environmental effects. The present study deals with the recognition of a number of multi-syllable isolated Korean words using soft computing techniques such as a back-propagation neural network, a fuzzy inference system, and a fuzzy neural network. Feature patterns for speech recognition are analyzed as thirty frames of 12th-order coefficients, normalized using linear predictive coding and cepstrum analysis. Using four speech recognizer models, experiments are conducted for both single-speaker and multiple-speaker cases. In this study, the recognizers based on combined fuzzy logic and back-propagation neural network and on the fuzzy neural network show better recognition performance.

A VOWEL TRAJECTORY DISPLAY FOR SPEECH TRAINING

  • Kido, Ken'iti;Tanahashi, Kenji;Ohuchi, Yasuhiro
    • Proceedings of the Acoustical Society of Korea Conference / 1994.06a / pp.971-976 / 1994
  • A speech display system is developed for the evaluation and training of speech utterances. The speech is analyzed by a linear predictive technique every 5 ms, and the frequencies of the lowest two local spectral peaks, P1 and P2, are extracted. The vowel trajectory is displayed using those frequencies on the P1-P2 plane. In most cases, P1 and P2 correspond to the first and second formants, but for indistinct utterances the correspondence between the local spectral peaks and the formants tends to break down. The system is therefore considered useful for the evaluation of speech quality. Examples of words uttered by normal speakers and by patients with difficulty in utterance are compared with each other to discuss the effectiveness of the system.

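One common way to obtain the local spectral peaks described above is to fit an all-pole model with the autocorrelation method (Levinson-Durbin recursion) and pick the local maxima of the resulting envelope. The sketch below follows that route; the abstract does not state which LPC variant, model order, or peak picker the authors used, so the order, grid size, test signal, and function names are assumptions.

```python
import cmath
import math


def autocorrelation(frame, max_lag):
    """Autocorrelation r[0..max_lag] of one frame."""
    return [sum(frame[n] * frame[n - lag] for n in range(lag, len(frame)))
            for lag in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the LPC normal equations.

    Returns a[0..order] with a[0] = 1 for the prediction-error filter
    A(z) = 1 + a[1] z^-1 + ... + a[order] z^-order.
    """
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        err *= 1.0 - k * k
    return a


def envelope_peaks(a, sample_rate, n_points=512):
    """Frequencies in Hz of the local maxima of the envelope 1/|A(e^jw)|."""
    mags = []
    for i in range(n_points):
        w = math.pi * i / n_points
        resp = sum(c * cmath.exp(-1j * w * k) for k, c in enumerate(a))
        mags.append(1.0 / abs(resp))
    return [i * sample_rate / (2 * n_points)
            for i in range(1, n_points - 1)
            if mags[i - 1] < mags[i] > mags[i + 1]]


# Assumed test frame: two sinusoids sampled at 8 kHz, 30 ms long.
fs = 8000
frame = [math.sin(2 * math.pi * 700 * n / fs)
         + 0.5 * math.sin(2 * math.pi * 1800 * n / fs) for n in range(240)]
coeffs = levinson_durbin(autocorrelation(frame, 4), 4)
lowest_two = envelope_peaks(coeffs, fs)[:2]  # candidates for P1 and P2
```

Repeating this on frames spaced 5 ms apart and plotting each (P1, P2) pair would give a trajectory of the kind the abstract describes.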

HMM-Based Automatic Speech Recognition using EMG Signal

  • Lee Ki-Seung
    • Journal of Biomedical Engineering Research / v.27 no.3 / pp.101-109 / 2006
  • It is known that there is a strong relationship between human voices and the movements of the articulatory facial muscles. In this paper, we use this knowledge to implement an automatic speech recognition scheme that relies solely on surface electromyogram (EMG) signals. The EMG signals were acquired from three articulatory facial muscles. As a preliminary step, 10 Korean digits were used as the recognition targets. Various feature parameters, including filter bank outputs, linear predictive coefficients, and cepstrum coefficients, were evaluated to find appropriate parameters for EMG-based speech recognition. The sequence of EMG signals for each word is modelled by a hidden Markov model (HMM) framework. A continuous word recognition approach was investigated in this work; hence, the model for each word is obtained by concatenating subword models, and embedded re-estimation techniques were employed in the training stage. The findings indicate that such a system may be able to recognize speech with an accuracy of up to 90% when mel-filter bank outputs are used as the feature parameters.

Enhanced Maximum Voiced Frequency Estimation Scheme for HTS Using Two-Band Excitation Model

  • Park, Jihoon;Hahn, Minsoo
    • ETRI Journal / v.37 no.6 / pp.1211-1219 / 2015
  • In a hidden Markov model-based speech synthesis system using a two-band excitation model, the maximum voiced frequency (MVF) is the most important excitation parameter because the synthetic speech quality depends on it. This paper proposes an enhanced MVF estimation scheme based on a peak picking method. In the proposed scheme, both local peaks and peak lobes are picked from the spectrum of a linear predictive residual signal. The average of the normalized distances of local peaks and peak lobes is calculated and used as a feature to estimate the MVF. Results of both objective and subjective tests show that the proposed scheme improves the synthetic speech quality compared with a conventional scheme, in a mobile device as well as in a PC environment.
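The linear predictive residual mentioned above is what remains after inverse filtering a signal with its predictor. A minimal sketch, assuming the predictor coefficients are already known (here they are hypothetical values used both to synthesize and to analyze a toy signal); the peak and peak-lobe picking on the residual spectrum is not shown.

```python
def predict(history, predictor, n):
    """Linear prediction of sample n from earlier samples (zeros before n = 0)."""
    return sum(c * history[n - k]
               for k, c in enumerate(predictor, start=1) if n - k >= 0)


def synthesize(excitation, predictor):
    """All-pole synthesis: s[n] = e[n] + sum_k predictor[k-1] * s[n-k]."""
    signal = []
    for n, e in enumerate(excitation):
        signal.append(e + predict(signal, predictor, n))
    return signal


def lp_residual(signal, predictor):
    """Inverse filtering: e[n] = s[n] - sum_k predictor[k-1] * s[n-k]."""
    return [s - predict(signal, predictor, n) for n, s in enumerate(signal)]


# Assumed toy case: an impulse train through a stable two-pole filter.
predictor = [1.3, -0.6]
excitation = [1.0 if n % 5 == 0 else 0.0 for n in range(20)]
speech_like = synthesize(excitation, predictor)
residual = lp_residual(speech_like, predictor)
```

With the exact predictor, the residual reproduces the impulse train up to rounding error, which is why the residual spectrum is a natural place to look for harmonic structure.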

Compression of Electrocardiogram Using MPE-LPC (MPE-LPC를 이용한 심전도 신호의 압축)

  • 이태진;김원기;차일환;윤대희
    • Journal of the Korean Institute of Telematics and Electronics B / v.28B no.11 / pp.866-875 / 1991
  • In this paper, multi-pulse excited linear predictive coding (MPE-LPC), in which the correlation-eliminated residual signal is modeled by a few pulses, is shown to be effective for the compression of electrocardiogram (ECG) data, and a more efficient scheme for faithful reconstruction of the ECG is proposed. The reconstruction characteristics of the QRS complexes and the P and T waves are improved using adaptive pulse allocation (APA), and the compression ratio (CR) can be changed by controlling the number of modeling pulses. The performance of the proposed method was evaluated using 10 normal and 10 abnormal ECG data sets. The proposed method performed better than the variable-threshold amplitude zone time epoch coding (AZTEC) algorithm and the scan-along polygonal approximation (SAPA) algorithm at the same CR. With the CR in the range of 8:1 to 14:1, ECG data could be compressed efficiently.


An Image Coding Technique Using the Image Segmentation (영상 영역화를 이용한 영상 부호화 기법)

  • 정철호;이상욱;박래홍
    • Journal of the Korean Institute of Telematics and Electronics / v.24 no.5 / pp.914-922 / 1987
  • This paper investigates an image coding technique based on segmentation, which uses a simplified description of the regions composing an image. The proposed coding technique consists of 3 stages: segmentation, contour coding, and texture coding. Emphasis is given to texture coding in order to improve image quality. A split-and-merge method is employed for segmentation. In the texture coding, linear predictive coding (LPC), along with an approximation technique based on a two-dimensional polynomial function, is used to encode texture components. The appropriate texture coding technique is selected depending on the region size and the mean square error between the original and reconstructed images. A computer simulation on natural images indicates that acceptable image quality can be obtained at compression ratios as high as 15-25. Compared with discrete cosine transform coding, the most typical first-generation coding technique, the proposed scheme gives better quality at compression ratios higher than 15-20.


Statistical Error Compensation Techniques for Spectral Quantization

  • Choi, Seung-Ho;Kim, Hong-Kook
    • Speech Sciences / v.11 no.4 / pp.17-28 / 2004
  • In this paper, we propose a statistical approach to improve the performance of spectral quantization in speech coders. The proposed techniques compensate for the distortion in a decoded line spectrum pair (LSP) vector using a statistical mapping function between the decoded LSP vector and its corresponding original LSP vector. We first develop two codebook-based probabilistic matching (CBPM) methods based on linear mapping functions, each under a different assumption about the distribution of LSP vectors. In addition, we propose an iterative procedure for the two CBPMs. We apply the proposed techniques to a predictive vector quantizer used in the IS-641 speech coder. The experimental results show that the proposed techniques reduce the average spectral distortion by around 0.064 dB.

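Spectral distortion figures such as the one quoted above are commonly computed as the root-mean-square difference, in dB, between the log power spectra of the original and decoded all-pole models. The sketch below uses that common definition on LPC coefficient vectors. It is an assumption that the paper uses this form; the LSP-to-LPC conversion, any band limiting, and the averaging over frames are omitted, and the function names and coefficient values are illustrative.

```python
import cmath
import math


def lpc_log_spectrum(a, n_points=256):
    """10*log10(1/|A(e^jw)|^2) in dB on a uniform grid over [0, pi)."""
    spectrum = []
    for i in range(n_points):
        w = math.pi * i / n_points
        resp = sum(c * cmath.exp(-1j * w * k) for k, c in enumerate(a))
        spectrum.append(-20.0 * math.log10(abs(resp)))
    return spectrum


def spectral_distortion(a_ref, a_dec, n_points=256):
    """RMS difference in dB between two all-pole log spectra."""
    ref = lpc_log_spectrum(a_ref, n_points)
    dec = lpc_log_spectrum(a_dec, n_points)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(ref, dec)) / n_points)


# Assumed example: an original predictor-error filter and a slightly
# perturbed "decoded" version of it.
a_original = [1.0, -1.3, 0.6]
a_decoded = [1.0, -1.28, 0.61]
sd_db = spectral_distortion(a_original, a_decoded)
```

Averaging `sd_db` over many frames gives the kind of average spectral distortion that quantizer comparisons report.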

NEURAL NETWORK DYNAMIC IDENTIFICATION OF A FERMENTATION PROCESS

  • Syu, Mei-J.;Tsao, G.T.
    • Proceedings of the Korean Institute of Intelligent Systems Conference / 1993.06a / pp.1021-1024 / 1993
  • System identification is a major component of a control system. In biosystems, which are nonlinear and dynamic, precise identification would be very helpful for implementing a control system, but such nonlinear systems are difficult to identify precisely. The measurable data on products from 2,3-butanediol fermentation could not be included in a process model based on a kinetic approach. Meanwhile, a predictive capability is required in developing a control system. A neural network (NN) dynamic identifier with a by/(1+ t ) transfer function was therefore designed to predict this fermentation. This modified inverse NN identifier differs from traditional models in that it is able not only to observe but also to predict the system. A moving window, with a dimension of 11 and a fixed data size of seven, was designed. One-step-ahead identification/prediction by an 11-3-1 BPNN is demonstrated. Even under process faults, this neural network is still able to perform several-step-ahead prediction.

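A moving window for one-step-ahead prediction can be sketched as pairing each run of consecutive samples with the sample that follows it. Here the "dimension of 11" is read as 11 network inputs, which matches the 11-3-1 network in the abstract; that reading, the function name, and the toy series are assumptions, and the "fixed data size of seven" is not modeled.

```python
def sliding_windows(series, window=11):
    """Pair each run of `window` consecutive samples with the next sample."""
    return [(series[i:i + window], series[i + window])
            for i in range(len(series) - window)]


# Assumed toy measurement series with 15 samples.
series = [0.1 * t for t in range(15)]
pairs = sliding_windows(series)

# Each pair is (11 inputs, 1 target), ready to train a one-step predictor.
inputs, target = pairs[0]
```

Several-step-ahead prediction could then be approximated by feeding each predicted value back into the window in place of a measurement, though the abstract does not say how the authors did this.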