• Title/Summary/Keyword: linear predictive coding

Audio Watermarking Using Independent Component Analysis

  • Seok, Jong-Won
    • Journal of information and communication convergence engineering
    • /
    • v.10 no.2
    • /
    • pp.175-180
    • /
    • 2012
  • This paper presents a blind watermark detection scheme for an additive watermark embedding model. The proposed estimation-correlation-based watermark detector first estimates the embedded watermark by exploiting the non-Gaussianity of real-world audio signals and the mutual independence between the host signal and the embedded watermark; a correlation-based detector then determines the presence or absence of the watermark. For watermark estimation, blind source separation (BSS) based on independent component analysis (ICA) is used. A low watermark-to-signal ratio (WSR) is one of the limitations of blind detection with the additive embedding model. The proposed detector uses two-stage processing to improve the WSR at the blind detector: the first stage removes the audio spectrum from the watermarked audio signal using linear predictive (LP) filtering, and the second stage uses the residue from the LP filtering stage to estimate the embedded watermark using BSS based on ICA. Simulation results show that the proposed detector performs significantly better than existing estimation-correlation-based detection schemes.
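
The first stage described in this abstract, removing the spectral envelope by LP filtering and keeping the residue, can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes the autocorrelation method with the Levinson-Durbin recursion, and the function names `autocorr`, `levinson_durbin`, and `lp_residual` are hypothetical. The ICA stage is not shown.

```python
def autocorr(x, max_lag):
    """Biased autocorrelation r[0..max_lag] of a finite sequence (no normalization)."""
    n = len(x)
    return [sum(x[i] * x[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the LP normal equations for predictor coefficients a[1..order].

    Convention (an assumption of this sketch): x_hat[n] = sum_k a[k] * x[n-k].
    Returns (coefficients, final prediction error energy).
    """
    a = [0.0] * (order + 1)
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0.0:          # degenerate signal, stop early
            break
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / err           # reflection coefficient
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        err *= (1.0 - k * k)
    return a[1:], err


def lp_residual(x, coeffs):
    """Residue e[n] = x[n] - sum_k a[k] * x[n-k], with zero initial history."""
    p = len(coeffs)
    res = []
    for n in range(len(x)):
        pred = sum(coeffs[k] * x[n - 1 - k] for k in range(p) if n - 1 - k >= 0)
        res.append(x[n] - pred)
    return res


if __name__ == "__main__":
    # Toy input: impulse response of a one-pole filter, which LP models well.
    x = [0.9 ** n for n in range(200)]
    coeffs, err = levinson_durbin(autocorr(x, 2), 2)
    e = lp_residual(x, coeffs)
    print(coeffs, sum(v * v for v in e))
```

For a strongly correlated input, the residue carries far less energy than the signal, which is why a weak additive component stands out more clearly in the residue than in the original waveform.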

Enhanced Spectral Envelope Coding Scheme Using Inter-frame Correlation for G.729.1 (G.729.1 코더에서 프레임 간의 상호상관 관계를 이용한 개선된 스펙트럼 포락 코딩 방법)

  • Cho, Keun-Seok;Sung, Jong-Mo;Hahn, Min-Soo;Kim, Young-Il;Jeong, Sang-Bae
    • Phonetics and Speech Sciences
    • /
    • v.1 no.4
    • /
    • pp.97-103
    • /
    • 2009
  • This paper describes a new algorithm for encoding the spectral envelope in the time domain alias cancellation (TDAC) part of G.729.1. The TDAC part encodes the spectral envelope and the modified discrete cosine transform (MDCT) coefficients of both the weighted code-excited linear predictive (CELP) coding error in the lower band and the higher-band input signal. To reduce the number of bits allocated to spectral envelope coding, a new algorithm using sub-band correlation between adjacent frames is proposed. In addition, to improve the quality of the decoded signals, two bit allocation strategies that reuse the bits saved by the proposed algorithm are proposed. The performance of the proposed algorithm is evaluated in terms of objective quality and bit reduction rate. Experimental results show that the proposed algorithm significantly improves sound quality.

A MFCC-based CELP Speech Coder for Server-based Speech Recognition in Network Environments (네트워크 환경에서 서버용 음성 인식을 위한 MFCC 기반 음성 부호화기 설계)

  • Lee, Gil-Ho;Yoon, Jae-Sam;Oh, Yoo-Rhee;Kim, Hong-Kook
    • MALSORI
    • /
    • no.54
    • /
    • pp.27-43
    • /
    • 2005
  • Existing standard speech coders provide high-quality speech communication, but they degrade the performance of speech recognition systems that use the speech reconstructed by the coders. The main cause of the degradation is that the spectral envelope parameters in speech coding are optimized for speech quality rather than for speech recognition performance. For example, the mel-frequency cepstral coefficient (MFCC) is generally known to provide better speech recognition performance than the linear prediction coefficient (LPC), which is a typical parameter set in speech coding. In this paper, we propose a speech coder that uses MFCC instead of LPC to improve the performance of a server-based speech recognition system in network environments. The main difficulty in using MFCC is developing an efficient MFCC quantizer at a low bit rate. First, we explore the interframe correlation of MFCCs, which leads to predictive quantization of MFCC. Second, a safety-net scheme is proposed to make the MFCC-based speech coder robust to channel errors. As a result, we propose an 8.7 kbps MFCC-based CELP coder. A PESQ test shows that the proposed speech coder has speech quality comparable to 8 kbps G.729, while speech recognition performance using the proposed coder is better than that using G.729.

A Design of Speech Feature Vector Extractor using TMS320C31 DSP Chip (TMS DSP 칩을 이용한 음성 특징 벡터 추출기 설계)

  • 예병대;이광명;성광수
    • Proceedings of the IEEK Conference
    • /
    • 2003.07e
    • /
    • pp.2212-2215
    • /
    • 2003
  • In this paper, we propose a speech feature vector extractor for embedded systems using the TMS320C31 DSP chip. The extractor uses cepstrum coefficients based on LPC (Linear Predictive Coding), a reliable algorithm that is widely used for speech recognition. The system extracts the speech feature vector in real time, so it can be used in mobile systems that implement speech recognition, such as cellular phones, PDAs, and electronic notebooks.
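
LPC-based cepstrum coefficients of the kind this abstract mentions are commonly obtained from the predictor coefficients by a short recursion. The sketch below shows that standard recursion under stated assumptions: the all-pole model is H(z) = 1 / (1 - sum_k a[k] z^-k), the gain term c[0] is omitted, and the function name `lpc_to_cepstrum` is hypothetical. The abstract does not say which variant the authors ran on the DSP.

```python
def lpc_to_cepstrum(a, n_ceps):
    """Cepstral coefficients c[1..n_ceps] of H(z) = 1 / (1 - sum_k a[k] z^-k).

    `a` is a list of predictor coefficients, where a[0] holds a_1.
    Recursion: c_n = a_n + sum_{k=1}^{n-1} (k/n) * c_k * a_{n-k},
    with a_n taken as 0 for n > p.
    """
    p = len(a)
    c = [0.0] * (n_ceps + 1)          # c[0] is unused (gain term omitted)
    for n in range(1, n_ceps + 1):
        acc = a[n - 1] if n <= p else 0.0
        # Only terms with 1 <= n - k <= p contribute.
        for k in range(max(1, n - p), n):
            acc += (k / n) * c[k] * a[n - k - 1]
        c[n] = acc
    return c[1:]


if __name__ == "__main__":
    # One-pole check: ln(1 / (1 - 0.9 z^-1)) has coefficients 0.9**n / n.
    print(lpc_to_cepstrum([0.9], 4))
```

The recursion needs only multiply-accumulate operations on a handful of coefficients per frame, which suits a real-time fixed-budget target.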

A Study on Korean Speech Analysis using Walsh Transform (Walsh변환을 이용한 한국어 숫자음 음성분석에 관한 연구)

  • 김계현;김준현
    • The Transactions of the Korean Institute of Electrical Engineers
    • /
    • v.37 no.4
    • /
    • pp.251-256
    • /
    • 1988
  • This work describes a speech analysis of the Korean numbers ('1'-'10') spoken by several speakers, using the Fast Walsh Transform (FWHT) method. FWHT involves only addition and subtraction operations, so it is faster and needs less memory than FFT (Fast Fourier Transform) or LPC (Linear Predictive Coding) analysis. We investigated whether the FWHT method can find speaker-independent features (features that give the same cue for a word regardless of the speaker). In the experiment, 70% of the utterances of the same word (the Korean number '2') spoken by several speakers showed similar patterns.
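
The claim that the transform needs only additions and subtractions can be seen in a minimal in-place butterfly sketch. The function name `fwht` is hypothetical, the output is unnormalized and in natural (Hadamard) order, and the input length is assumed to be a power of two; the paper's own ordering and scaling are not given in the abstract.

```python
def fwht(x):
    """Unnormalized fast Walsh-Hadamard transform in natural (Hadamard) order.

    Uses only additions and subtractions; len(x) must be a power of two.
    """
    a = list(x)
    n = len(a)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    h = 1
    while h < n:
        # Combine pairs of elements that are h apart within blocks of size 2h.
        for start in range(0, n, 2 * h):
            for i in range(start, start + h):
                u, v = a[i], a[i + h]
                a[i], a[i + h] = u + v, u - v
        h *= 2
    return a


if __name__ == "__main__":
    print(fwht([1, 0, 1, 0, 0, 1, 1, 0]))  # [4, 2, 0, -2, 0, 2, 0, 2]
```

Applying `fwht` twice returns the input scaled by its length, so the same routine serves as the inverse after dividing by `len(x)`.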

A LSF Quantizer for the Wideband Speech Using the Predictive VQ-Pyramid VQ (예측 VQ-Pyramid VQ를 이용한 광대역 음성용 LSF 양자화기 설계)

  • 이강은;이인성;강상원
    • The Journal of the Acoustical Society of Korea
    • /
    • v.23 no.4
    • /
    • pp.333-339
    • /
    • 2004
  • This paper proposes a vector quantizer-pyramid vector quantizer (VQ-PVQ) structure. A predictive structure and the safety-net concept are both combined with the VQ-PVQ to quantize the LPC parameters of a wideband speech codec. The performance is compared to that of the LPC vector quantizer used in AMR-WB (ITU-T G.722.2), demonstrating a reduction in both spectral distortion and encoding memory.

Electroencephalogram-Based Driver Drowsiness Detection System Using Errors-In-Variables(EIV) and Multilayer Perceptron(MLP) (EIV와 MLP를 이용한 뇌파 기반 운전자의 졸음 감지 시스템)

  • Han, Hyungseob;Song, Kyoung-Young
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.39C no.10
    • /
    • pp.887-895
    • /
    • 2014
  • Drowsy driving accounts for a large proportion of all car accidents. For this reason, drowsiness detection and warning systems for drivers have recently become a very important issue. Monitoring physiological signals makes it possible to detect features of driver drowsiness and fatigue. Many studies have reported that measuring electroencephalogram (EEG) signals is an effective way to detect driver fatigue and drowsiness. The aim of this study is to extract drowsiness-related features from a set of EEG signals and to classify the features into three states: alertness, transition, and drowsiness. This paper proposes a drowsiness detection system that uses errors-in-variables (EIV) to extract feature vectors and a multilayer perceptron (MLP) for classification. The proposed method is evaluated for robustness to noise and compared with the previous method, which uses linear predictive coding (LPC) combined with MLP. From the evaluation results, we conclude that the proposed scheme outperforms the previous one in the low signal-to-noise ratio regime.

A New Vocoder based on AMR 7.4Kbit/s Mode for Speaker Dependent System (화자 의존 환경의 AMR 7.4Kbit/s모드에 기반한 보코더)

  • Min, Byung-Jae;Park, Dong-Chul
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.33 no.9C
    • /
    • pp.691-696
    • /
    • 2008
  • This paper proposes a new Code Excited Linear Predictive (CELP) vocoder based on the Adaptive Multi Rate (AMR) 7.4 kbit/s mode. The proposed vocoder achieves a better compression rate in a Speaker Dependent Coding System (SDSC) environment and can be used efficiently in systems that need only one person's speech, such as OGM (Outgoing Message) and TTS (Text To Speech). To enhance the compression rate of the coder, a new Line Spectral Pairs (LSP) codebook is built using the Centroid Neural Network (CNN) algorithm. Compared with the original AMR 7.4 kbit/s coder, the new coder shows a 27% higher compression rate while preserving synthesized speech quality in terms of Mean Opinion Score (MOS).

Performance Improvement of Double Talk Detection before Convergence of the Echo Canceller by Using Linear Predictive Coding Filter Gain of the Primary Input Signal (주입력신호의 LPC 필터 이득을 이용한 반향제거기의 수렴전 동시통화검출 성능 개선)

  • Yoo, Jae-Ha
    • Journal of the Korean Institute of Intelligent Systems
    • /
    • v.24 no.6
    • /
    • pp.628-633
    • /
    • 2014
  • This paper proposes a method for improving the performance of the conventional double talk detection method that can operate before convergence of the echo canceller. The proposed method estimates the coefficients of the linear predictive coding (LPC) filter from the primary input signal. The time-varying threshold for double talk detection is determined from the LPC filter gain of the primary input signal. The proposed method reduces both the false detection rate (single talk wrongly detected as double talk) and the double talk detection delay. Computer simulations were performed using long-term real speech signals. The results show that the proposed method improves on the conventional method by lowering the false detection rate and shortening the detection delay.

Compression of Electrocardiogram Using MPE-LPC (MPE-LPC를 이용한 심전도 신호의 압축)

  • 이태진;김원기;차일환;윤대희
    • Journal of the Korean Institute of Telematics and Electronics B
    • /
    • v.28B no.11
    • /
    • pp.866-875
    • /
    • 1991
  • In this paper, multi-pulse excited linear predictive coding (MPE-LPC), in which the decorrelated residual signal is modeled by a few pulses, is shown to be effective for the compression of electrocardiogram (ECG) data, and a more efficient scheme for faithful reconstruction of the ECG is proposed. The reconstruction characteristics of QRS complexes and P and T waves are improved using adaptive pulse allocation (APA), and the compression ratio (CR) can be changed by controlling the number of modeling pulses. The performance of the proposed method was evaluated using 10 normal and 10 abnormal ECG records. At the same CR, the proposed method performed better than the variable-threshold amplitude zone time epoch coding (AZTEC) algorithm and the scan-along polygonal approximation (SAPA) algorithm. With the CR in the range of 8:1 to 14:1, ECG data could be compressed efficiently.
