• Title/Summary/Keyword: Mixture of Gaussian

Multi-layer Speech Processing System for Point-Of-Interest Recognition in the Car Navigation System (차량용 항법장치에서의 관심지 인식을 위한 다단계 음성 처리 시스템)

  • Bhang, Ki-Duck;Kang, Chul-Ho
    • Journal of Korea Multimedia Society / v.12 no.1 / pp.16-25 / 2009
  • In the car environment, where safety is the first priority, a large-vocabulary isolated word recognition system for the POI domain is required as the optimal HMI technique. On a telematics terminal with tightly limited processing time and memory capacity, general speech recognition methods cannot process more than 100,000 words. We therefore propose a phoneme recognizer that uses a phonetic GMM, together with a PDM Levenshtein distance, in a multi-layer architecture for POI recognition on the telematics terminal. With the proposed methods, we obtained high performance on a telematics terminal with low processing speed and small memory capacity: a maximum recognition rate of 94.8% in an indoor environment and 92.4% in car navigation environments.
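
The matching stage described above can be illustrated with a plain Levenshtein distance over phoneme sequences. This is a minimal sketch, not the PDM variant from the paper (which the abstract does not define): every edit costs 1, and the toy lexicon and phoneme symbols are hypothetical.

```python
def levenshtein(a, b):
    """Edit distance between two sequences (insert, delete, substitute all cost 1)."""
    prev = list(range(len(b) + 1))          # distances from the empty prefix of a
    for i, x in enumerate(a, start=1):
        curr = [i]                          # deleting the first i symbols of a
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def best_match(recognized, lexicon):
    """Return the lexicon entry whose phoneme sequence is closest to the input."""
    return min(lexicon, key=lambda name: levenshtein(recognized, lexicon[name]))


# Hypothetical toy POI lexicon: name -> phoneme sequence.
LEXICON = {
    "city hall": ["s", "i", "t", "i", "h", "o", "l"],
    "city mall": ["s", "i", "t", "i", "m", "o", "l"],
    "airport": ["e", "r", "p", "o", "t"],
}

if __name__ == "__main__":
    noisy = ["s", "i", "t", "i", "h", "a", "l"]   # one phoneme misrecognized
    print(best_match(noisy, LEXICON))
```

A weighted variant would replace the constant substitution cost with a lookup in a phoneme confusion table, so that acoustically similar phonemes are cheaper to swap.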

Fire-Smoke Detection Based on Video using Dynamic Bayesian Networks (동적 베이지안 네트워크를 이용한 동영상 기반의 화재연기감지)

  • Lee, In-Gyu;Ko, Byung-Chul;Nam, Jae-Yeol
    • The Journal of Korean Institute of Communications and Information Sciences / v.34 no.4C / pp.388-396 / 2009
  • This paper proposes a new fire-smoke detection method that uses features extracted from camera images and a pattern recognition technique. First, moving regions are detected by analyzing the frame difference between two consecutive images, and candidate smoke regions are generated by applying a smoke color model. A smoke region generally has a few characteristics, such as similar color, simple texture, and upward motion. From these characteristics, we extract brightness, wavelet high frequency, and motion vector as features. Probability density functions of the three features are also generated from training data. The probabilistic models of the smoke region are then applied to the observation nodes of our proposed Dynamic Bayesian Network (DBN) to take temporal continuity into account. The proposed algorithm was successfully applied to various fire-smoke tasks, covering not only forest smoke but also real-world smoke, and showed better detection performance than the previous method.
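
The first step of this pipeline, frame differencing, can be sketched as follows. The frames are assumed to be grayscale images stored as nested lists, and the threshold of 20 gray levels is a hypothetical value, not one taken from the paper.

```python
def moving_mask(prev_frame, curr_frame, threshold=20):
    """Mark pixels whose gray level changed by more than `threshold`
    between two consecutive frames (True = moving)."""
    return [
        [abs(c - p) > threshold for p, c in zip(prev_row, curr_row)]
        for prev_row, curr_row in zip(prev_frame, curr_frame)
    ]


if __name__ == "__main__":
    prev = [[10, 10], [10, 10]]
    curr = [[10, 50], [12, 10]]
    print(moving_mask(prev, curr))
```

Only the pixels flagged here would be passed on to the smoke color model to form candidate regions.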

A Study on Lip Detection based on Eye Localization for Visual Speech Recognition in Mobile Environment (모바일 환경에서의 시각 음성인식을 위한 눈 정위 기반 입술 탐지에 대한 연구)

  • Gyu, Song-Min;Pham, Thanh Trung;Kim, Jin-Young;Taek, Hwang-Sung
    • Journal of the Korean Institute of Intelligent Systems / v.19 no.4 / pp.478-484 / 2009
  • Automatic speech recognition (ASR) is an attractive technique in an age that seeks convenience. Although many approaches to ASR have been proposed, performance is still poor in noisy environments. State-of-the-art speech recognition now uses not only audio information but also visual information. In this paper, we present a novel lip detection method for visual speech recognition in a mobile environment. To apply visual information to speech recognition, we need to extract exact lip regions. Because eye detection is easier than lip detection, we first detect the positions of the left and right eyes and then roughly locate the lip region. We then apply K-means clustering to divide that region into groups, and the two lip corners and the lip center are detected by choosing the biggest of the clustered groups. Finally, we show the effectiveness of the proposed method through experiments on the Samsung AVSR database.
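
The clustering step can be sketched with a small K-means (Lloyd's algorithm) implementation. The abstract does not say which features are clustered or how the corners are read off the biggest group, so this sketch assumes 2-D points, caller-supplied initial centers, and a hypothetical rule that takes the leftmost and rightmost points of the largest cluster as the lip corners.

```python
def kmeans(points, centers, iterations=20):
    """Lloyd's algorithm on 2-D points; returns (centers, labels)."""
    centers = [tuple(c) for c in centers]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest center.
        labels = [
            min(range(len(centers)),
                key=lambda k: (p[0] - centers[k][0]) ** 2 + (p[1] - centers[k][1]) ** 2)
            for p in points
        ]
        # Update step: each center moves to the mean of its members.
        new_centers = []
        for k in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                new_centers.append((sum(p[0] for p in members) / len(members),
                                    sum(p[1] for p in members) / len(members)))
            else:
                new_centers.append(centers[k])   # keep an empty cluster's center
        if new_centers == centers:
            break
        centers = new_centers
    return centers, labels


def largest_cluster(points, labels, k):
    """Return the points that belong to the most populated cluster."""
    sizes = [labels.count(c) for c in range(k)]
    biggest = sizes.index(max(sizes))
    return [p for p, lab in zip(points, labels) if lab == biggest]


def lip_landmarks(cluster):
    """Hypothetical rule: corners are the extreme points in x, center is their midpoint."""
    left = min(cluster, key=lambda p: p[0])
    right = max(cluster, key=lambda p: p[0])
    center = ((left[0] + right[0]) / 2, (left[1] + right[1]) / 2)
    return left, right, center
```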

A Forest Fire Detection Algorithm Using Image Information (영상정보를 이용한 산불 감지 알고리즘)

  • Seo, Min-Seok;Lee, Choong Ho
    • Journal of the Institute of Convergence Signal Processing / v.20 no.3 / pp.159-164 / 2019
  • Detecting wildfire using only the color information in an image is very difficult. This paper proposes an algorithm that detects the forest fire area by analyzing the color and motion of regions in video that includes a forest fire. The proposed algorithm removes the background region using a Gaussian mixture based background segmentation algorithm, which does not depend on the lighting conditions. In addition, the RGB channels are converted to HSV channels to extract flame candidates based on color. While the extracted flame candidates are labeled and tracked, a candidate whose area moves is judged not to be a flame. If a flame candidate area extracted in this way stays in the same position for more than 2 minutes, it is regarded as flame. Experimental results with the implemented algorithm confirmed its validity.
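
The color test and the two-minute persistence rule can be sketched as follows. The HSV thresholds are hypothetical placeholders (the abstract gives none), and candidate positions are compared for exact equality, which a real tracker would relax to a tolerance.

```python
import colorsys


def is_flame_colored(r, g, b, max_hue=1 / 6, min_sat=0.4, min_val=0.6):
    """Check whether an 8-bit RGB pixel falls in a red-to-yellow, bright,
    saturated HSV range. All three thresholds are assumed values."""
    h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
    return h <= max_hue and s >= min_sat and v >= min_val


def confirm_flames(observations, hold_seconds=120.0):
    """observations: time-ordered (timestamp_seconds, candidate_id, position) tuples.
    Returns the ids of candidates that stayed at one position for more than
    hold_seconds; a candidate that moves restarts its timer."""
    first_seen = {}      # candidate_id -> (position, time first seen there)
    confirmed = set()
    for t, cid, pos in observations:
        if cid not in first_seen or first_seen[cid][0] != pos:
            first_seen[cid] = (pos, t)           # new or moved candidate
        elif t - first_seen[cid][1] > hold_seconds:
            confirmed.add(cid)
    return confirmed
```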

Safety Robust Speaker Recognition Against Utterance Variations (발성변화에 강인한 화자 인식에 관한 연구)

  • Lee Ki-Yong
    • Journal of Internet Computing and Services / v.5 no.2 / pp.69-73 / 2004
  • A speaker model in a speaker recognition system is to be trained from a large data set gathered in multiple sessions. A large data set requires a large amount of memory and computation, and moreover it is practically hard to make users utter the data in several sessions. Incremental adaptation methods have recently been proposed to address these problems. However, a data set gathered from multiple sessions is vulnerable to outliers caused by irregular utterance variations and the presence of noise, which result in an inaccurate speaker model. In this paper, we propose an incremental robust adaptation method that minimizes the influence of outliers on a Gaussian mixture model based speaker model. The robust adaptation is obtained from an incremental version of M-estimation. The speaker model is initially trained from a small amount of data and is adapted recursively with the data available in each session. Experimental results on a data set gathered over seven months show that the proposed method is robust against outliers.
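
The idea of incremental M-estimation can be illustrated on the simplest possible case, a single scalar mean. This sketch assumes Huber weights with the conventional tuning constant 1.345 and a fixed scale; it is not the paper's GMM adaptation, only a demonstration of how down-weighting large residuals limits the pull of an outlier during a recursive update.

```python
def huber_weight(residual, c=1.345):
    """Huber weight: 1 for small standardized residuals, c/|r| for large ones."""
    a = abs(residual)
    return 1.0 if a <= c else c / a


class RobustMean:
    """Recursively updated mean in which each sample is weighted by its Huber weight."""

    def __init__(self, mean, scale, weight_sum):
        self.mean = mean              # current estimate (e.g. from initial training)
        self.scale = scale            # assumed fixed spread used to standardize residuals
        self.weight_sum = weight_sum  # total weight accumulated so far

    def update(self, x):
        w = huber_weight((x - self.mean) / self.scale)
        self.weight_sum += w
        self.mean += (w / self.weight_sum) * (x - self.mean)
        return w


if __name__ == "__main__":
    est = RobustMean(mean=0.0, scale=1.0, weight_sum=10.0)
    est.update(100.0)    # gross outlier: receives a very small weight
    print(est.mean)      # stays close to 0; an unweighted mean would jump to about 9.1
```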

Clustering Analysis of Science and Engineering College Students' understanding on Probability and Statistics (Robust PCA를 활용한 이공계 대학생의 확률 및 통계 개념 이해도 분석)

  • Yoo, Yongseok
    • Journal of Convergence for Information Technology / v.12 no.3 / pp.252-258 / 2022
  • In this study, we propose a method for analyzing students' understanding of probability and statistics in small lectures at universities. A computer-based test for probability and statistics was performed on 95 science and engineering college students. After dividing the students' responses into 7 clusters using the Robust PCA and the Gaussian mixture model, the achievement of each subject was analyzed for each cluster. High-ranking clusters generally showed high achievement on most topics except for statistical estimation, and low-achieving clusters showed strengths and weaknesses on different topics. Compared to the widely used PCA-based dimension reduction followed by clustering analysis, the proposed method showed each group's characteristics more clearly. The characteristics of each cluster can be used to develop an individualized learning strategy.

A Real-time Audio Surveillance System Detecting and Localizing Dangerous Sounds for PTZ Camera Surveillance (PTZ 카메라 감시를 위한 실시간 위험 소리 검출 및 음원 방향 추정 소리 감시 시스템)

  • Nguyen, Viet Quoc;Kang, HoSeok;Chung, Sun-Tae;Cho, Seongwon
    • Journal of Korea Multimedia Society / v.16 no.11 / pp.1272-1280 / 2013
  • In this paper, we propose an audio surveillance system that detects and localizes dangerous sounds in real time. The location of a dangerous sound can be used to direct a PTZ camera so that it captures a snapshot of the sound source area and sends it to clients instantly. The proposed system first detects foreground sounds using an adaptive Gaussian mixture background sound model and classifies each one into one of the pre-trained classes of dangerous foreground sounds. For detected dangerous sounds, a sound source localization algorithm based on the Dual delay-line algorithm is applied to localize the sound sources. Finally, the proposed system orients a PTZ camera towards the dangerous sound source region and takes a snapshot of that region. Experimental results show that the proposed system detects dangerous foreground sounds stably and classifies them into the correct classes with a precision of 79%, while the sound source localization estimates the orientation of the sound source with an acceptably small error.
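
Foreground detection against an adaptive background model can be sketched with a single running Gaussian over a scalar feature such as frame energy. This is a deliberate simplification of the adaptive Gaussian mixture named in the abstract; the learning rate, the 2.5-sigma threshold, and the variance floor are hypothetical values.

```python
class AdaptiveGaussianBackground:
    """Single adaptive Gaussian over one scalar feature (a simplified
    stand-in for an adaptive Gaussian mixture background model)."""

    def __init__(self, mean, var, alpha=0.05, k=2.5, min_var=1e-3):
        self.mean = mean
        self.var = var
        self.alpha = alpha      # learning rate (assumed)
        self.k = k              # threshold in standard deviations (assumed)
        self.min_var = min_var  # floor that keeps the model from collapsing

    def observe(self, x):
        """Return True if x is foreground; adapt the model only on background."""
        d = x - self.mean
        is_foreground = d * d > (self.k ** 2) * self.var
        if not is_foreground:
            self.mean += self.alpha * d
            self.var = max((1 - self.alpha) * self.var + self.alpha * d * d,
                           self.min_var)
        return is_foreground


if __name__ == "__main__":
    model = AdaptiveGaussianBackground(mean=10.0, var=1.0)
    print([model.observe(x) for x in [10.2, 9.8, 10.1, 25.0, 10.0]])
```

Skipping the update on foreground frames keeps a loud event from being absorbed into the background; a mixture version would maintain several such Gaussians with weights.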

EM Algorithm for Designing Soft-Decision Binary Error Correction Codes of MLC NAND Flash Memory (멀티 레벨 낸드 플래시 메모리용 연판정 복호를 수행하는 이진 ECC 설계를 위한 EM 알고리즘)

  • Kim, Sung-Rae;Shin, Dong-Joon
    • The Journal of Korean Institute of Communications and Information Sciences / v.39A no.3 / pp.127-139 / 2014
  • In this paper, we present two signal processing techniques for designing binary error correction codes for Multi-Level Cell (MLC) NAND flash memory. MLC NAND flash memory stores a non-binary symbol in each cell and shows an asymmetric channel LLR l-density, which makes it difficult to design soft-decision binary error correction codes such as LDPC codes and Polar codes. Therefore, we apply density mirroring and the EM algorithm to approximate the MLC NAND flash memory channel by a binary-input memoryless channel. Density mirroring processes the channel LLRs so that they roughly satisfy the all-zero codeword assumption, and the EM algorithm is then applied to the l-density after density mirroring to approximate it by a mixture of symmetric Gaussian densities. These two signal processing techniques make it possible to use conventional code design algorithms, such as density evolution and the EXIT chart, for the MLC NAND flash memory channel.
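
The EM step can be sketched for a generic one-dimensional Gaussian mixture. This is an unconstrained textbook EM under assumed initial parameters: it does not perform density mirroring, it does not enforce the symmetry condition on the Gaussian components that the paper relies on, and it is not numerically hardened (a production version would work in the log domain).

```python
import math


def normal_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)


def em_gmm_1d(data, means, variances, weights, iterations=50, min_var=1e-6):
    """Fit a 1-D Gaussian mixture by EM, starting from the given parameters."""
    k = len(means)
    means, variances, weights = list(means), list(variances), list(weights)
    for _ in range(iterations):
        # E-step: responsibility of each component for each sample.
        resp = []
        for x in data:
            p = [weights[j] * normal_pdf(x, means[j], variances[j]) for j in range(k)]
            total = sum(p)
            resp.append([pj / total for pj in p])
        # M-step: re-estimate weight, mean, and variance of each component.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var_j = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(var_j, min_var)   # floor avoids a degenerate component
            weights[j] = nj / len(data)
    return means, variances, weights


if __name__ == "__main__":
    samples = [-2.2, -2.0, -1.8, 1.8, 2.0, 2.2]   # toy LLR-like values
    print(em_gmm_1d(samples, [-1.0, 1.0], [1.0, 1.0], [0.5, 0.5]))
```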

Real Time Abandoned and Removed Objects Detection System (실시간 방치 및 제거 객체 검출 시스템)

  • Jeong, Cheol-Jun;Ahn, Tae-Ki;Park, Jong-Hwa;Park, Goo-Man
    • Journal of Broadcast Engineering / v.16 no.3 / pp.462-470 / 2011
  • We propose a real-time object tracking system that detects abandoned or disappeared objects. Because these events are caused by humans, we use a tracking-based algorithm. After background subtraction with a Gaussian mixture model, shadow removal is applied for accurate object detection. A static object is classified as either an abandoned object or a disappeared object. We assign a monitoring time to each static object to handle the situation in which it is overlapped by another object. We obtain more accurate detection by using a region growing method. We implemented our algorithm on a DSP processor and obtained excellent results in the experiments.

Noise Robust Speaker Verification Using Sub-Band Weighting (서브밴드 가중치를 이용한 잡음에 강인한 화자검증)

  • Kim, Sung-Tak;Ji, Mi-Kyong;Kim, Hoi-Rin
    • The Journal of the Acoustical Society of Korea / v.28 no.3 / pp.279-284 / 2009
  • Speaker verification determines whether the claimed speaker is accepted based on the score of the test utterance. In recent years, methods based on Gaussian mixture models and a universal background model have been the dominant approaches to text-independent speaker verification. Speaker verification systems based on these methods perform very well under laboratory conditions. In real situations, however, their performance degrades dramatically. The feature recombination method was proposed to overcome this degradation, but it has the drawback that whole sub-band feature vectors are used to compute the likelihood scores. To deal with this drawback, our previous research proposed a modified feature recombination method that can use each sub-band likelihood score independently. In this paper, we propose a sub-band weighting method based on the sub-band signal-to-noise ratio, combined with the previously proposed modified feature recombination. The proposed method reduces errors by 28% compared with the conventional feature recombination method.
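
Combining independent sub-band scores with SNR-dependent weights can be sketched as follows. The abstract does not give the weighting formula, so the mapping used here (linear-scale SNR normalized to sum to one) is purely an assumption for illustration, as are the function names.

```python
def snr_weights(snr_db):
    """Map per-sub-band SNRs in dB to weights that sum to 1.
    Assumed mapping: proportional to linear-scale SNR."""
    linear = [10 ** (s / 10.0) for s in snr_db]
    total = sum(linear)
    return [v / total for v in linear]


def weighted_score(subband_scores, snr_db):
    """Weighted sum of per-sub-band likelihood scores; cleaner bands count more."""
    weights = snr_weights(snr_db)
    return sum(w * s for w, s in zip(weights, subband_scores))


if __name__ == "__main__":
    # Two sub-bands: the first is clean (10 dB), the second is noisy (0 dB).
    print(weighted_score([1.0, -1.0], [10.0, 0.0]))
```

With this mapping, a noisy sub-band whose score disagrees with the clean ones has little influence on the final accept or reject decision.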