Search | Korea Science

A MFCC-based CELP Speech Coder for Server-based Speech Recognition in Network Environments (네트워크 환경에서 서버용 음성 인식을 위한 MFCC 기반 음성 부호화기 설계)

Lee, Gil-Ho;Yoon, Jae-Sam;Oh, Yoo-Rhee;Kim, Hong-Kook
- MALSORI / no.54 / pp.27-43 / 2005
Existing standard speech coders provide high-quality speech communication, but they degrade the performance of speech recognition systems that use the speech reconstructed by the coders. The main cause of the degradation is that the spectral envelope parameters in speech coding are optimized for speech quality rather than for speech recognition performance. For example, mel-frequency cepstral coefficients (MFCCs) are generally known to provide better speech recognition performance than linear prediction coefficients (LPCs), which are a typical parameter set in speech coding. In this paper, we propose a speech coder that uses MFCCs instead of LPCs to improve the performance of a server-based speech recognition system in network environments. The main challenge in using MFCCs is developing an efficient MFCC quantization scheme at a low bit rate. First, we explore the interframe correlation of MFCCs, which leads to predictive quantization of the MFCCs. Second, a safety-net scheme is proposed to make the MFCC-based speech coder robust to channel errors. As a result, we propose an 8.7 kbps MFCC-based CELP coder. A PESQ test shows that the proposed speech coder has speech quality comparable to that of 8 kbps G.729, while speech recognition performance using the proposed coder is better than that using G.729.
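
The idea of predictive quantization with a safety net can be sketched as follows. This is a minimal illustration, not the coder from the paper: the scalar quantizers, the prediction factor `rho`, the step sizes, and the function names are all assumptions chosen for readability. Each frame is coded twice, once as a finely quantized residual against a prediction from the previous reconstructed frame and once by a coarse memoryless quantizer, and the encoder keeps whichever reconstruction is closer. Frames coded in safety-net mode do not depend on the previous frame, which is what limits error propagation after a channel error.

```python
def quantize(values, step, max_index=4):
    """Uniform scalar quantizer with a bounded index range (saturates outside it)."""
    out = []
    for v in values:
        idx = max(-max_index, min(max_index, round(v / step)))
        out.append(idx * step)
    return out


def sq_error(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def encode_frame(frame, prev_recon, rho=0.7, fine_step=0.25, coarse_step=1.0):
    """Return (mode, reconstruction) for one feature frame.

    rho, fine_step and coarse_step are illustrative values, not taken from the paper.
    """
    # Predictive path: quantize the residual against a first-order prediction.
    pred = [rho * p for p in prev_recon]
    resid = [x - p for x, p in zip(frame, pred)]
    rec_pred = [p + q for p, q in zip(pred, quantize(resid, fine_step))]

    # Safety-net path: quantize the frame itself, ignoring the previous frame.
    rec_safe = quantize(frame, coarse_step)

    if sq_error(frame, rec_pred) <= sq_error(frame, rec_safe):
        return "predictive", rec_pred
    return "safety-net", rec_safe
```

With a previous reconstruction of `[1.0, -2.0]`, a slowly varying frame such as `[0.8, -1.5]` is coded predictively, while an abrupt change such as `[4.0, 3.0]` saturates the residual quantizer and falls back to the safety net.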

GMM-Based Gender Identification Employing Group Delay (Group Delay를 이용한 GMM기반의 성별 인식 알고리즘)

Lee, Kye-Hwan;Lim, Woo-Hyung;Kim, Nam-Soo;Chang, Joon-Hyuk
- The Journal of the Acoustical Society of Korea / v.26 no.6 / pp.243-249 / 2007
We propose an effective voice-based gender identification method using group delay (GD). Generally, features for speech recognition are composed of magnitude information rather than phase information. In our approach, we exploit the difference between male and female speakers in GD, which is the negative derivative of the Fourier transform phase. We also propose a novel feature fusion scheme based on a combination of GD and magnitude information such as mel-frequency cepstral coefficients (MFCCs), linear predictive coding (LPC) coefficients, reflection coefficients, and formants. The experimental results indicate that GD is effective in discriminating gender and that performance is significantly improved when the proposed feature fusion technique is applied.
https://doi.org/10.7776/ASK.2007.26.6.243
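
As a rough illustration of the group delay feature, the sketch below computes it for a short frame without phase unwrapping, using the standard identity that the group delay equals the real part of DFT{n·x[n]} / DFT{x[n]}. The function names, the naive O(N²) DFT, and the handling of near-zero spectral bins are assumptions made to keep the example self-contained; the abstract does not describe the paper's actual feature extraction pipeline.

```python
import cmath


def dft(x):
    """Naive discrete Fourier transform of a real or complex sequence."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def group_delay(x, eps=1e-12):
    """Group delay (in samples) at each DFT bin.

    Uses tau[k] = Re(Y[k] / X[k]) with Y = DFT{n * x[n]}.
    Bins where |X[k]| is (almost) zero are set to 0.0 as a simple convention.
    """
    spectrum = dft(x)
    weighted = dft([t * v for t, v in enumerate(x)])
    return [
        (y / s).real if abs(s) > eps else 0.0
        for s, y in zip(spectrum, weighted)
    ]
```

For a unit impulse delayed by three samples, `group_delay([0, 0, 0, 1, 0, 0, 0, 0])` is 3 at every bin, as expected for a pure delay.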

Audio Watermarking Using Independent Component Analysis

Seok, Jong-Won
- Journal of information and communication convergence engineering / v.10 no.2 / pp.175-180 / 2012
This paper presents a blind watermark detection scheme for an additive watermark embedding model. The proposed estimation-correlation-based watermark detector first estimates the embedded watermark by exploiting the non-Gaussianity of real-world audio signals and the mutual independence between the host signal and the embedded watermark; a correlation-based detector then determines the presence or absence of the watermark. For watermark estimation, blind source separation (BSS) based on independent component analysis (ICA) is used. A low watermark-to-signal ratio (WSR) is one of the limitations of blind detection with the additive embedding model. The proposed detector uses two-stage processing to improve the WSR at the blind detector: the first stage removes the audio spectrum from the watermarked audio signal using linear predictive (LP) filtering, and the second stage uses the residue from the LP filtering stage to estimate the embedded watermark using BSS based on ICA. Simulation results show that the proposed detector performs significantly better than existing estimation-correlation-based detection schemes.
https://doi.org/10.6109/jicce.2012.10.2.175

Compression of Electrocardiogram Using MPE-LPC (MPE-LPC를 이용한 심전도 신호의 압축)

이태진;김원기;차일환;윤대희
- Journal of the Korean Institute of Telematics and Electronics B / v.28B no.11 / pp.866-875 / 1991
In this paper, multi-pulse excited linear predictive coding (MPE-LPC), in which the decorrelated residual signal is modeled by a few pulses, is shown to be effective for the compression of electrocardiogram (ECG) data, and a more efficient scheme for faithful reconstruction of the ECG is proposed. The reconstruction characteristics of the QRS complexes and the P and T waves are improved using adaptive pulse allocation (APA), and the compression ratio (CR) can be changed by controlling the number of modeling pulses. The performance of the proposed method was evaluated using 10 normal and 10 abnormal ECG recordings. At the same CR, the proposed method performed better than the variable-threshold amplitude zone time epoch coding (AZTEC) algorithm and the scan-along polygonal approximation (SAPA) algorithm. With the CR in the range of 8:1 to 14:1, ECG data could be compressed efficiently.

Color Data Compression for Three-dimensional Mesh Models Using Connectivity and Geometry Information (연결성 정보와 기하학 정보를 이용한 삼차원 메쉬 모델의 색상 정보 압축 방법)

Yoon, Young-Suk;Kim, Sung-Yeol;Ho, Yo-Sung
- Proceedings of the IEEK Conference / 2006.06a / pp.745-746 / 2006
In this paper, we propose a new predictive coding scheme for the color data of three-dimensional (3-D) mesh models. We exploit connectivity and geometry information to improve coding efficiency. After ordering all vertices in a 3-D mesh model with a vertex traversal technique, we employ a geometry predictor to compress the color data. The predicted color is obtained as a weighted sum of the reconstructed colors of adjacent vertices, with weights based on both the angles and the distances between the current vertex and its adjacent vertices.

A New Predictive EC Algorithm for Reduction of Memory Size and Bandwidth Requirements in Wavelet Transform (웨이블릿 변환의 메모리 크기와 대역폭 감소를 위한 Prediction 기반의 Embedded Compression 알고리즘)

Choi, Woo-Soo;Son, Chang-Hoon;Kim, Ji-Won;Na, Seong-Yu;Kim, Young-Min
- Journal of Korea Multimedia Society / v.14 no.7 / pp.917-923 / 2011
In this paper, a new prediction-based embedded compression (EC) codec algorithm for the JPEG2000 encoder system is proposed to reduce excessive memory requirements. Compared with the direct implementation of the DWT engine in this paper, the EC technique can reduce the memory requirement for intermediate low-frequency coefficients by 50% during multiple discrete wavelet transform (DWT) stages. The LOCO-I predictor and MAP are widely used in many lossless picture compression codecs. The proposed EC algorithm uses these predictors, which are very simple but surprisingly effective. The predictive EC scheme adopts forward adaptive quantization and fixed-length coding to encode the prediction error. Simulation results show that our LOCO-I and MAP based EC codecs incur PSNR degradation of only 0.48 and 0.26 dB on average, respectively. The proposed algorithm improves the average PSNR by 1.39 dB compared to the previous work in [9].
https://doi.org/10.9717/kmms.2011.14.7.917
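
For reference, the fixed predictor used in LOCO-I is the median edge detector (MED), which can be sketched in a few lines. The helper that applies it to a whole 2-D array and treats out-of-image neighbours as 0 is an assumption for illustration only; it is not the boundary handling of any particular codec, and the quantization and fixed-length coding stages described in the abstract are not shown.

```python
def med_predict(left, above, upper_left):
    """Median edge detector (MED) predictor from LOCO-I."""
    if upper_left >= max(left, above):
        return min(left, above)
    if upper_left <= min(left, above):
        return max(left, above)
    return left + above - upper_left


def prediction_errors(img):
    """Prediction residuals for a 2-D list of integers.

    Neighbours that fall outside the image are taken as 0 (an illustrative choice).
    """
    def get(r, c):
        return img[r][c] if r >= 0 and c >= 0 else 0

    height, width = len(img), len(img[0])
    return [
        [
            img[r][c] - med_predict(get(r, c - 1), get(r - 1, c), get(r - 1, c - 1))
            for c in range(width)
        ]
        for r in range(height)
    ]
```

On a flat region the residuals are zero everywhere except at the first sample, which is why such a simple predictor is effective on smooth low-frequency coefficients.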

Quadtree Image Compression Using Edge-Based Decomposition and Predictive Coding of Leaf Nodes (에지-기반 분할과 잎 노드의 예측부호화를 적용한 쿼드트리 영상 압축)

Jang, Ho-Seok;Jung, Kyeong-Hoon;Kim, Ki-Doo;Kang, Dong-Wook
- Journal of Broadcast Engineering / v.15 no.1 / pp.133-143 / 2010
This paper proposes a quadtree image compression method that encodes images efficiently and produces natural-looking compressed images. The proposed method uses edge-based quadtree decomposition to preserve significant edge lines, and it uses a predictive coding scheme to exploit the high correlation among leaf node blocks. Simulation results with $256\times256$ grayscale images verify that the proposed method yields about 25 percent better coding efficiency than JPEG. The proposed method also provides more natural compressed images because it is free from the ringing effect that appears in images compressed by fixed-block-based encoders such as JPEG.
https://doi.org/10.5909/JBE.2010.15.1.133
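
A quadtree decomposition can be sketched as a short recursion. The example below is only a stand-in for the method in the abstract: it splits a block whenever its intensity range exceeds a threshold, whereas the paper uses an edge-based criterion whose details are not given here. The function name, the threshold, and the assumption of a square image with a power-of-two side are all illustrative.

```python
def quadtree_leaves(img, x=0, y=0, size=None, threshold=10, min_size=1):
    """Return leaf blocks as (x, y, size) tuples.

    A block is split into four quadrants when max - min of its pixels
    exceeds `threshold` (a simple homogeneity test, assumed for illustration).
    """
    if size is None:
        size = len(img)  # assumes a square image with a power-of-two side
    block = [img[r][c] for r in range(y, y + size) for c in range(x, x + size)]
    if size <= min_size or max(block) - min(block) <= threshold:
        return [(x, y, size)]

    half = size // 2
    leaves = []
    for dy in (0, half):
        for dx in (0, half):
            leaves += quadtree_leaves(img, x + dx, y + dy, half, threshold, min_size)
    return leaves
```

A flat 4×4 image yields a single leaf, while a 4×4 image with one bright pixel in a corner yields seven leaves: three 2×2 blocks and four 1×1 blocks around the detail.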

Electroencephalogram-Based Driver Drowsiness Detection System Using Errors-In-Variables(EIV) and Multilayer Perceptron(MLP) (EIV와 MLP를 이용한 뇌파 기반 운전자의 졸음 감지 시스템)

Han, Hyungseob;Song, Kyoung-Young
- The Journal of Korean Institute of Communications and Information Sciences / v.39C no.10 / pp.887-895 / 2014
Drowsy driving accounts for a large proportion of car accidents. For this reason, drowsiness detection and warning systems for drivers have recently become a very important issue. Monitoring physiological signals makes it possible to detect features of driver drowsiness and fatigue. Many studies have reported that measuring electroencephalogram (EEG) signals is an effective way to detect driver fatigue and drowsiness. The aim of this study is to extract drowsiness-related features from a set of EEG signals and to classify the features into three states: alertness, transition, and drowsiness. This paper proposes a drowsiness detection system that uses errors-in-variables (EIV) for feature vector extraction and a multilayer perceptron (MLP) for classification. The proposed method is evaluated for robustness to noise and compared with a previous method that combines linear predictive coding (LPC) with an MLP. From the evaluation results, we conclude that the proposed scheme outperforms the previous one in the low signal-to-noise ratio regime.
https://doi.org/10.7840/kics.2014.39C.10.887

Adaptive Predictive Coding with Two-Level Quantizer for Image (이진 양자화에 의한 영상신호의 적응 예측 부호화)

Kim, Yong-Woo;Kim, Nam-Chul
- Proceedings of the KIEE Conference / 1987.07b / pp.1422-1426 / 1987
This paper presents an adaptive DPCM scheme, amenable to simple hardware implementation, for encoding monochrome images at a transmission rate of exactly 1 bit/pel. The system is mainly composed of a compensated mean predictor and an adaptive two-level quantizer with backward estimation. The predictor is a kind of two-dimensional ARMA predictor in which a moving-average part is added to the conventional mean predictor. The quantizer adapts to the local statistics of its input without overhead information. To reduce annoying granular noise in the reconstructed image, a Lee filter is applied after reconstruction at the receiver.
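
The principle of a two-level quantizer with backward adaptation can be illustrated with a 1-D sketch. This is not the paper's system: it uses a plain previous-sample predictor instead of the compensated mean (2-D ARMA) predictor, and the step adaptation rule and its constants are assumptions. What it does show is that the step size is updated only from bits already transmitted, so the decoder can track the encoder at exactly 1 bit per sample with no side information.

```python
def _step_update(step, bit, prev_bit, grow=1.5, shrink=0.5, lo=1.0, hi=64.0):
    """Backward step adaptation: grow on repeated bits, shrink on alternation."""
    if prev_bit is None:
        return step
    step = step * grow if bit == prev_bit else step * shrink
    return min(max(step, lo), hi)


def encode(samples, step0=4.0):
    """Return (bits, encoder-side reconstruction) at 1 bit per sample."""
    bits, recon = [], []
    pred, step, prev_bit = 0.0, step0, None
    for x in samples:
        bit = 1 if x >= pred else 0          # two-level quantizer: sign of the error
        step = _step_update(step, bit, prev_bit)
        pred += step if bit else -step       # reconstructed sample = next prediction
        bits.append(bit)
        recon.append(pred)
        prev_bit = bit
    return bits, recon


def decode(bits, step0=4.0):
    """Rebuild the signal from the bit stream alone."""
    recon = []
    pred, step, prev_bit = 0.0, step0, None
    for bit in bits:
        step = _step_update(step, bit, prev_bit)
        pred += step if bit else -step
        recon.append(pred)
        prev_bit = bit
    return recon
```

Because both sides run the same update on the same bits, `decode(bits)` reproduces the encoder's reconstruction exactly.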

Spatially Scalable Kronecker Compressive Sensing of Still Images (공간 스케일러블 Kronecker 정지영상 압축 센싱)

Nguyen, Canh Thuong;Jeon, Byeungwoo
- Journal of the Institute of Electronics and Information Engineers / v.52 no.10 / pp.118-128 / 2015
Compressive sensing (CS) faces two challenges: computationally complex reconstruction and low coding efficiency. As a solution, this paper presents a novel spatially scalable Kronecker two-layer compressive sensing framework that facilitates reconstruction at up to three spatial resolutions as well as much improved CS coding performance. We propose a dual-resolution sensing matrix, based on the quincunx sampling grid, which is applied to the base layer. This sensing matrix provides a fast preview of the low-resolution image at the encoder side, which is used for predictive coding. The enhancement layer is encoded as the residual between the acquired measurements and the predicted measurements. The low-resolution image is reconstructed from the base layer only, while the high-resolution image is jointly reconstructed using both layers. Experimental results validate that the proposed scheme outperforms both conventional single-layer and previous multi-resolution schemes, especially at high bitrates such as 2.0 bpp, with average PSNR gains of 5.75 dB and 5.05 dB, respectively.
https://doi.org/10.5573/ieie.2015.52.10.118
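
The property that makes Kronecker sensing attractive for images can be checked numerically: sensing the image separably as A·X·Bᵀ gives the same measurements as applying the large matrix B ⊗ A to the column-stacked image, without ever forming that matrix. The sketch below verifies this identity with small integer matrices. The matrices and helper names are arbitrary assumptions; the paper's dual-resolution quincunx sensing matrix is not reproduced.

```python
def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def transpose(m):
    return [list(row) for row in zip(*m)]


def kron(b, a):
    """Kronecker product b ⊗ a."""
    return [[bv * av for bv in brow for av in arow] for brow in b for arow in a]


def vec(m):
    """Stack the columns of a matrix into one flat list."""
    return [m[i][j] for j in range(len(m[0])) for i in range(len(m))]


def measure_separable(a, x, b):
    """Measurements via Y = A X B^T, returned as vec(Y)."""
    return vec(matmul(matmul(a, x), transpose(b)))


def measure_kronecker(a, x, b):
    """Measurements via (B ⊗ A) vec(X)."""
    column = [[v] for v in vec(x)]
    return [row[0] for row in matmul(kron(b, a), column)]
```

For example, with `A = [[1, 0, 2], [0, 1, 1]]`, `X = [[1, 2], [3, 4], [5, 6]]`, and `B = [[1, 0], [1, 1]]`, both functions return `[11, 8, 25, 18]`.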
