Search | Korea Science

Speech/Mixed Content Signal Classification Based on GMM Using MFCC (MFCC를 이용한 GMM 기반의 음성/혼합 신호 분류)

Kim, Ji-Eun; Lee, In-Sung
- Journal of the Institute of Electronics and Information Engineers / v.50 no.2 / pp.185-192 / 2013
This paper proposes a method to improve the classification of speech and mixed-content signals for the MPEG USAC (Unified Speech and Audio Coding) standard, using MFCC features with a GMM-based probability model. The Gaussian mixture model (GMM) is used for effective pattern recognition, and its optimal parameters are extracted with the expectation-maximization (EM) algorithm. The proposed classification algorithm has two main parts: the first extracts the optimal GMM parameters, and the second distinguishes speech from mixed-content signals using MFCC feature parameters. The proposed algorithm performs better than the conventional classification scheme implemented in USAC.
https://doi.org/10.5573/ieek.2013.50.2.185
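
The two stages named in the abstract above (EM parameter estimation, then likelihood-based classification) can be sketched as follows. This is a minimal illustration only: it assumes scalar features in place of MFCC vectors and two mixture components per class, and the names `gauss`, `fit_gmm`, `log_likelihood` and `classify` are hypothetical, not taken from the paper or from USAC.

```python
import math

def gauss(x, mu, var):
    """Density of a univariate Gaussian N(mu, var) at x."""
    return math.exp(-(x - mu) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def fit_gmm(data, k=2, iters=50):
    """Fit a 1-D, k-component GMM with EM; returns (weights, means, variances)."""
    srt, n = sorted(data), len(data)
    # Deterministic initialisation: spread the means over the sorted data.
    means = [srt[(2 * j + 1) * n // (2 * k)] for j in range(k)]
    mean_all = sum(data) / n
    var_all = max(sum((x - mean_all) ** 2 for x in data) / n, 1e-6)
    variances, weights = [var_all] * k, [1.0 / k] * k
    for _ in range(iters):
        # E-step: responsibility of each component for each sample.
        resp = []
        for x in data:
            p = [w * gauss(x, m, v) for w, m, v in zip(weights, means, variances)]
            total = sum(p) or 1e-300
            resp.append([pj / total for pj in p])
        # M-step: re-estimate weights, means and (floored) variances.
        for j in range(k):
            nj = sum(r[j] for r in resp) + 1e-12
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            variances[j] = max(
                sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj, 1e-6)
    return weights, means, variances

def log_likelihood(frames, model):
    """Total log-likelihood of the frames under a fitted GMM."""
    w, m, v = model
    return sum(
        math.log(sum(wj * gauss(x, mj, vj) for wj, mj, vj in zip(w, m, v)) + 1e-300)
        for x in frames)

def classify(frames, speech_model, mixed_model):
    """Pick the class whose GMM gives the frames the higher likelihood."""
    if log_likelihood(frames, speech_model) >= log_likelihood(frames, mixed_model):
        return "speech"
    return "mixed"
```

In use, one model would be fitted per class with `fit_gmm` on labelled training features, and each incoming segment passed to `classify` together with both models.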

Sound System Analysis for Health Smart Home

CASTELLI Eric; ISTRATE Dan; NGUYEN Cong-Phuong
- Proceedings of the IEEK Conference / summer / pp.237-243 / 2004
This paper presents a multichannel smart sound sensor capable of detecting and identifying sound events in noisy conditions. Sound information extraction is a complex task; the main difficulty lies in extracting high-level information from a one-dimensional signal. The sensor takes as input data collected by 5 microphones and sends its output over a network. To work in real time, the sound analysis is divided into three steps: sound event detection on each channel, fusion of simultaneous events, and sound identification. The event detection module finds impulsive signals in the noise and extracts them from the signal flow. The sensor must identify not only impulsive signals but also the presence of speech in a noisy environment. The classification module runs as a parallel task on the channel chosen by the data fusion process. It assigns the detected event to one of seven predefined sound classes using a Gaussian Mixture Model (GMM) method. Mel Frequency Cepstral Coefficients are used in combination with new features such as zero crossing rate, centroid, and roll-off point. The sensor is part of a medical telemonitoring project that aims to detect serious accidents.
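
The three additional features named in the abstract above (zero crossing rate, spectral centroid, roll-off point) can be computed per frame roughly as follows. This is a sketch under stated assumptions: the definitions are the common textbook ones, the 0.85 roll-off fraction is an assumed default, and the function names are hypothetical; the paper's exact definitions may differ.

```python
import cmath
import math

def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)

def magnitude_spectrum(frame):
    """Naive DFT magnitudes for bins 0..N/2 (O(N^2), fine for short frames)."""
    n = len(frame)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame)))
        for k in range(n // 2 + 1)
    ]

def spectral_centroid(mags, sample_rate, n):
    """Magnitude-weighted mean frequency in Hz; n is the frame length."""
    total = sum(mags)
    if total == 0:
        return 0.0
    return sum(k * sample_rate / n * m for k, m in enumerate(mags)) / total

def rolloff_point(mags, sample_rate, n, fraction=0.85):
    """Lowest frequency (Hz) below which `fraction` of the total magnitude lies."""
    target = fraction * sum(mags)
    acc = 0.0
    for k, m in enumerate(mags):
        acc += m
        if acc >= target:
            return k * sample_rate / n
    return (len(mags) - 1) * sample_rate / n
```

These scalars would typically be appended to the MFCC vector of the same frame before the GMM is trained or evaluated.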

A Statistically Model-Based Adaptive Technique to Unsupervised Segmentation of MR Images (자기공명영상의 비지도 분할을 위한 통계적 모델기반 적응적 방법)

Kim, Tae-Woo
- The Transactions of the Korea Information Processing Society / v.7 no.1 / pp.286-295 / 2000
We present a novel statistically adaptive method using the Minimum Description Length (MDL) principle for unsupervised segmentation of magnetic resonance (MR) images. In the method, Markov random field (MRF) modeling of tissue regions accounts for random noise. Intensity measurements in the local region defined by a window are modeled by a finite Gaussian mixture, which accounts for image inhomogeneities. The segmentation algorithm is based on the iterated conditional modes (ICM) algorithm, which approximately finds the maximum a posteriori (MAP) estimate, and it estimates model parameters on the local region. The size of the window for parameter estimation and segmentation is estimated from the image using the MDL principle. In the experiments, the technique reflected the image characteristics of the local region well and gave better results than conventional methods, especially in segmenting MR images with inhomogeneities.
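
The ICM step mentioned in the abstract above can be illustrated with a small sketch. It assumes fixed class means, one shared variance, a 4-neighbour Potts prior with weight `beta`, and a hypothetical function name `icm_segment`; the paper's local, window-based parameter estimation and MDL window selection are not reproduced here.

```python
def icm_segment(image, means, var=1.0, beta=1.0, sweeps=5):
    """Label pixels by iterated conditional modes under a Potts MRF prior."""
    h, w = len(image), len(image[0])
    classes = range(len(means))
    # Initialise with the maximum-likelihood label (nearest class mean).
    labels = [[min(classes, key=lambda c: (image[y][x] - means[c]) ** 2)
               for x in range(w)] for y in range(h)]
    for _ in range(sweeps):
        changed = False
        for y in range(h):
            for x in range(w):
                neighbours = [labels[ny][nx]
                              for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1))
                              if 0 <= ny < h and 0 <= nx < w]

                def energy(c):
                    # Gaussian data term plus a penalty per disagreeing neighbour.
                    data = (image[y][x] - means[c]) ** 2 / (2 * var)
                    prior = beta * sum(1 for n in neighbours if n != c)
                    return data + prior

                best = min(classes, key=energy)
                if best != labels[y][x]:
                    labels[y][x] = best
                    changed = True
        if not changed:
            break  # a full sweep without changes means a local optimum
    return labels
```

Each pixel update can only lower the posterior energy, which is why ICM converges quickly but only to a local MAP estimate.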

Fast Sequential Probability Ratio Test Method to Obtain Consistent Results in Speaker Verification (화자확인에서 일정한 결과를 얻기 위한 빠른 순시 확률비 테스트 방법)

Kim, Eun-Young; Seo, Chang-Woo; Jeon, Sung-Chae
- Phonetics and Speech Sciences / v.2 no.2 / pp.63-68 / 2010
A new version of the sequential probability ratio test (SPRT), which has been studied for utterance-length control, is proposed to obtain uniform response results in speaker verification (SV). Although SPRTs give fast responses in SV tests, performance may vary with the composition of consonants and vowels in the sentences used. This paper proposes a fast sequential probability ratio test (FSPRT) method that performs consistently regardless of the composition of the spoken sentences. When generating frames, the FSPRT first runs the SV test on non-overlapping frames only; if the results do not satisfy the decision criteria, it then sequentially uses overlapping frames. Proceeding in this way, the test is not affected by sentence composition, so both fast responses and consistent performance can be obtained. Experimental results, reported as equal error rates (EER), show that the FSPRT performs better than the SPRT method with lower complexity.
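
For orientation, the classical (Wald) SPRT that the abstract above builds on can be sketched as below. This is the standard test, not the paper's FSPRT: the per-frame log-likelihood ratios are assumed to be given, the error targets `alpha` and `beta` are assumed defaults, and the function name `sprt` is hypothetical.

```python
import math

def sprt(frame_llrs, alpha=0.01, beta=0.01):
    """Wald's SPRT over per-frame log-likelihood ratios.

    alpha: target false-acceptance rate, beta: target false-rejection rate.
    Returns (decision, number_of_frames_used).
    """
    upper = math.log((1 - beta) / alpha)   # accept the claimed speaker
    lower = math.log(beta / (1 - alpha))   # reject the claimed speaker
    total = 0.0
    for used, llr in enumerate(frame_llrs, start=1):
        total += llr
        if total >= upper:
            return "accept", used
        if total <= lower:
            return "reject", used
    return "undecided", len(frame_llrs)
```

The test stops as soon as the accumulated evidence crosses either boundary, which is what makes the response time depend on the content of the utterance.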

Automatic Extraction of UV patterns for Paper Money Inspection (지폐검사를 위한 UV 패턴의 자동추출)

Lee, Geon-Ho; Park, Tae-Hyoung
- Journal of the Korean Institute of Intelligent Systems / v.21 no.3 / pp.365-371 / 2011
Most recently issued paper money includes security patterns that can be identified only under ultraviolet (UV) illumination. We propose a method for automatically extracting UV patterns for paper money inspection systems. The image acquired with a camera under UV illumination is converted to input data through preprocessing. The Gaussian mixture model (GMM) and the split-and-merge expectation maximization (SMEM) algorithm are then applied to segment the image represented by the input data. To extract the UV pattern from the segmented image, we develop a criterion using the area of covariance vector and the weight value. Experimental results on various paper money are presented to verify the usefulness of the proposed method.
https://doi.org/10.5391/JKIIS.2011.21.3.365

Analysis of Human Activity Using Motion Vector and GPU (움직임 벡터와 GPU를 이용한 인간 활동성 분석)

Kim, Sun-Woo; Choi, Yeon-Sung
- The Journal of the Korea institute of electronic communication sciences / v.9 no.10 / pp.1095-1102 / 2014
This paper proposes an approach that uses a GPU and motion vectors to analyze human activity in a real-time surveillance system. The most important step is detecting the blob (human) in the foreground. For detection we use an adaptive Gaussian mixture, a weighted subtraction image for salient motion, and motion vectors. We then use the motion vectors for human activity analysis. Human activities are recognized and classified into meta-classes: {Active, Inactive}, {Position Moving, Fixed Moving}, {Walking, Running}. We created approximately 300 conditions for the simulation. The results showed a high success rate of about 86~98%. They also showed that, in the high-resolution experiment, the proposed GPU-based method was over 10 times faster than the CPU-based method.
https://doi.org/10.13067/JKIECS.2014.9.10.1095

Statistical Inference in Non-Identifiable and Singular Statistical Models

Amari, Shun-ichi;Amari, Shun-ichi;Tomoko Ozeki
- Journal of the Korean Statistical Society / v.30 no.2 / pp.179-192 / 2001
When a statistical model has a hierarchical structure, such as multilayer perceptrons in neural networks or a Gaussian mixture density representation, the model includes distributions with unidentifiable parameters when the structure becomes redundant. Since the exact structure is unknown, we need to carry out statistical estimation or learning of parameters in such a model. From the geometrical point of view, distributions specified by unidentifiable parameters become singular points in the parameter space. The problem has been noted in many statistical models, and the strange behavior of the likelihood ratio statistic when the null hypothesis is at a singular point has been analyzed. The present paper studies the asymptotic behavior of the maximum likelihood estimator and the Bayesian predictive estimator using a simple cone model, and shows that they are completely different from those in regular statistical models, where the Cramér-Rao paradigm holds. At singularities, the Fisher information metric degenerates, implying that the Cramér-Rao paradigm no longer holds and that classical model selection theory such as AIC and MDL cannot be applied. This paper is a first step toward establishing a new theory for analyzing the accuracy of estimation or learning around singularities.
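
A standard illustration of such a singularity (not taken from the paper, which uses a cone model) is a two-component Gaussian mixture with mixing weight $w$ and shifted mean $\mu$, where $\varphi$ denotes the standard normal density:

$$p(x; w, \mu) = (1 - w)\,\varphi(x) + w\,\varphi(x - \mu)$$

$$\frac{\partial \log p}{\partial \mu} = \frac{w\,(x - \mu)\,\varphi(x - \mu)}{p(x; w, \mu)}, \qquad \frac{\partial \log p}{\partial w} = \frac{\varphi(x - \mu) - \varphi(x)}{p(x; w, \mu)}$$

At $w = 0$ the first score is zero for every $x$, so $\mu$ is unidentifiable; at $\mu = 0$ the second score is zero, so $w$ is unidentifiable. In both cases the Fisher information matrix $G = \mathbb{E}\left[\nabla \log p \, \nabla \log p^{\top}\right]$ has a zero row and column, hence $\det G = 0$ and the Cramér-Rao bound is undefined there.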

Adaptive Threshold Detection Using Expectation-Maximization Algorithm for Multi-Level Holographic Data Storage (멀티레벨 홀로그래픽 저장장치를 위한 적응 EM 알고리즘)

Kim, Jinyoung; Lee, Jaejin
- The Journal of Korean Institute of Communications and Information Sciences / v.37A no.10 / pp.809-814 / 2012
We propose an adaptive threshold detection algorithm for multi-level holographic data storage based on the expectation-maximization (EM) method. The signal intensities passed through the four-level holographic channel are modeled as a four-component Gaussian mixture with unknown DC offsets, and the threshold levels are estimated based on the maximum likelihood criterion. We compare the bit error rate (BER) performance of the proposed algorithm with that of the non-adaptive threshold detection algorithm for various levels of DC offset and misalignment. The proposed algorithm shows consistently acceptable performance when the DC offset variance is fixed or the misalignments are lower than 20%. When the DC offset varies with each page, the BER of the proposed method is acceptable when the misalignments are lower than 10% and the DC offset variance is 0.001.
https://doi.org/10.7840/kics.2012.37A.10.809
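
Once the mixture parameters have been estimated, maximum-likelihood thresholds between adjacent levels can be placed where the neighbouring Gaussian densities are equal. The sketch below assumes equal priors, distinct level means, and parameters already supplied by some EM step; the function names `ml_threshold` and `thresholds` are hypothetical and the paper's DC-offset handling is not modelled.

```python
import math

def ml_threshold(m1, v1, m2, v2):
    """Point where N(m1, v1) and N(m2, v2) have equal density (equal priors).

    Assumes m1 != m2. With equal variances this is the midpoint of the means.
    """
    # Equal log-densities give a quadratic a*x^2 + b*x + c = 0.
    a = 1 / v1 - 1 / v2
    b = -2 * (m1 / v1 - m2 / v2)
    c = m1 ** 2 / v1 - m2 ** 2 / v2 + math.log(v1 / v2)
    if abs(a) < 1e-12:
        return -c / b
    disc = math.sqrt(b * b - 4 * a * c)
    roots = [(-b + disc) / (2 * a), (-b - disc) / (2 * a)]
    mid = (m1 + m2) / 2
    # Of the two crossings, keep the one nearest the midpoint of the means.
    return min(roots, key=lambda r: abs(r - mid))

def thresholds(means, variances):
    """Decision thresholds between adjacent levels, sorted by level mean."""
    order = sorted(range(len(means)), key=lambda i: means[i])
    return [ml_threshold(means[i], variances[i], means[j], variances[j])
            for i, j in zip(order, order[1:])]
```

For a four-level channel this yields three thresholds, which would be recomputed whenever the estimated means or variances change.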

Error Estimation Based on the Bhattacharyya Distance for Classifying Multimodal Data (Multimodal 데이터에 대한 분류 에러 예측 기법)

Choe, Ui-Seon; Kim, Jae-Hui; Lee, Cheol-Hui
- Journal of the Institute of Electronics Engineers of Korea SP / v.39 no.2 / pp.147-154 / 2002
In this paper, we propose an error estimation method based on the Bhattacharyya distance for multimodal data. First, we try to find the empirical relationship between the classification error and the Bhattacharyya distance. Then, we investigate the possibility of deriving an error estimation equation based on the Bhattacharyya distance for multimodal data. We assume that the distribution of multimodal data can be approximated as a mixture of several Gaussian distributions. Experimental results with remotely sensed data showed that there is a strong relationship between the Bhattacharyya distance and the classification error, and that it is possible to predict the classification error using the Bhattacharyya distance for multimodal data.
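
As background for the abstract above, the Bhattacharyya distance between two univariate Gaussians and the classical upper bound it gives on the two-class Bayes error can be sketched as follows. This is the textbook single-Gaussian case, not the paper's empirical error equation for mixtures, and the function names are hypothetical.

```python
import math

def bhattacharyya_distance(m1, v1, m2, v2):
    """Bhattacharyya distance between N(m1, v1) and N(m2, v2)."""
    mean_term = 0.25 * (m1 - m2) ** 2 / (v1 + v2)
    var_term = 0.5 * math.log((v1 + v2) / (2 * math.sqrt(v1 * v2)))
    return mean_term + var_term

def error_upper_bound(distance, p1=0.5, p2=0.5):
    """Bhattacharyya bound on the Bayes error: sqrt(p1 * p2) * exp(-distance)."""
    return math.sqrt(p1 * p2) * math.exp(-distance)
```

The bound decreases monotonically with the distance, which is the kind of relationship between distance and classification error that the abstract describes.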

A Study on User Authentication for Wireless Communication Security in the Telematics Environment (텔레메틱스 환경에서 무선통신 보안을 위한 사용자 인증에 관한 연구)

Kim, Hyoung-Gook
- The Journal of The Korea Institute of Intelligent Transport Systems / v.9 no.2 / pp.104-109 / 2010
This paper proposes a user authentication technology that protects against wiretapping and attacks in the telematics environment, in which users in a vehicle can access Internet services over a local area network via a mobile device. In the proposed technology, the packet speech data is encrypted with a speech-based biometric key generated from the user's speech signal. The encrypted data packet is then sent to the information communication server (ICS). At the ICS, the user's speech feature is reconstructed from the encrypted data packet and compared with the preregistered speech-based biometric key for user authentication. Based on an implementation of the proposed communication method, we confirm that it is secure against various attack methods.

