• Title/Summary/Keyword: Model recognition

Performance Improvement of Continuous Digits Speech Recognition Using the Transformed Successive State Splitting and Demi-syllable Pair (반음절쌍과 변형된 연쇄 상태 분할을 이용한 연속 숫자 음 인식의 성능 향상)

  • Seo Eun-Kyoung;Choi Gab-Keun;Kim Soon-Hyob;Lee Soo-Jeong
    • Journal of Korea Multimedia Society / v.9 no.1 / pp.23-32 / 2006
  • This paper describes the optimization of a language model and an acoustic model to improve speech recognition of Korean unit digits. Because the language model is a finite state network (FSN) built on disyllables, its recognition errors were reduced by analyzing the grammatical features of Korean unit digits. The acoustic model uses demi-syllable pairs to reduce recognition errors caused by inaccurate segmentation of phones or monosyllables, which results from short pronunciation time and articulation. We used the K-means clustering algorithm with transformed successive state splitting at the feature level to model the features of the recognition unit efficiently. In experiments, the proposed language model raised the recognition rate by 10.5%, the acoustic model with demi-syllable pairs raised it by 12.5%, and transformed successive state splitting raised it by 1.5%.

Radar target recognition using Gaussian mixture model over wide-angular region (Gaussian Mixture Model을 이용한 넓은 관측각에서의 효율적인 레이더 표적인식)

  • 서동규;김경태;김효태
    • Proceedings of the IEEK Conference / 2002.06a / pp.195-198 / 2002
  • A one-dimensional radar signature, such as a range profile, is highly dependent on the aspect angle. Radar target recognition over a wide angular region is therefore a very difficult task. In this paper, we propose a Bayes classifier with a Gaussian mixture model for radar target recognition over a wide angular region, and we compare the proposed technique with the subclass-based radar target recognition from the literature in terms of the probability of correct classification.

DYNAMICALLY LOCALIZED SELF-ORGANIZING MAP MODEL FOR SPEECH RECOGNITION

  • KyungMin NA
    • Proceedings of the Acoustical Society of Korea Conference / 1994.06a / pp.1052-1057 / 1994
  • The dynamically localized self-organizing map model (DLSMM) is a new speech recognition model based on the well-known self-organizing map algorithm and the dynamic programming technique. The DLSMM can efficiently normalize the temporal and spatial characteristics of a speech signal at the same time. In particular, the proposed model can use contextual information of speech. In experiments on a ten Korean digits recognition task, the DLSMM with contextual information showed a higher recognition rate than predictive neural network models.

Illumination Robust Face Recognition using Ridge Regressive Bilinear Models (Ridge Regressive Bilinear Model을 이용한 조명 변화에 강인한 얼굴 인식)

  • Shin, Dong-Su;Kim, Dai-Jin;Bang, Sung-Yang
    • Journal of KIISE:Software and Applications / v.34 no.1 / pp.70-78 / 2007
  • The performance of face recognition is greatly affected by illumination because intra-person variation under different lighting conditions can be much larger than inter-person variation. In this paper, we propose an illumination-robust face recognition method that separates the identity factor and the illumination factor using symmetric bilinear models. The translation procedure in the bilinear model requires repeated matrix inversion to reach the identity and illumination factors. This computation sometimes fails to converge when the observation contains noisy information. To alleviate this problem, we suggest a ridge regressive bilinear model that incorporates ridge regression into the bilinear model. This combination has two advantages: it makes the bilinear model more stable by appropriately shrinking the range of the identity and illumination factors, and it improves recognition performance by effectively reducing insignificant factors. Experimental results show that the ridge regressive bilinear model significantly outperforms existing methods such as the eigenface, the quotient image, and the bilinear model in terms of recognition rate under a variety of illuminations.
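
For orientation, the generic ridge regression estimate that the abstract refers to can be written as below. This is the textbook form, not the paper's exact formulation: the symbols $X$ (design matrix), $y$ (observation), $w$ (coefficients), and $\lambda$ (regularization weight) are assumptions introduced here for illustration.

```latex
\hat{w}
  = \arg\min_{w} \; \lVert y - Xw \rVert_2^2 + \lambda \lVert w \rVert_2^2
  = \left( X^{\top} X + \lambda I \right)^{-1} X^{\top} y ,
  \qquad \lambda > 0 .
```

For any $\lambda > 0$ the matrix $X^{\top} X + \lambda I$ is positive definite and hence invertible, which is one way to see why adding the ridge term can stabilize a procedure that relies on repeated matrix inversion; the penalty also shrinks the coefficients toward zero.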

Deep Learning Model Selection Platform for Object Detection (사물인식을 위한 딥러닝 모델 선정 플랫폼)

  • Lee, Hansol;Kim, Younggwan;Hong, Jiman
    • Smart Media Journal / v.8 no.2 / pp.66-73 / 2019
  • Recently, object recognition technology using computer vision has attracted attention as a replacement for sensor-based object recognition technology. Sensor-based object recognition is often difficult to commercialize because it requires expensive sensors. Computer-vision-based object recognition, on the other hand, can replace those sensors with inexpensive cameras. Moreover, real-time recognition has become viable thanks to advances in CNNs, which are being actively introduced into other fields such as IoT and autonomous vehicles. However, applying an object recognition model demands expert knowledge of deep learning to select and train the model, which makes it challenging for non-experts. Therefore, in this paper, we analyze the structure of deep-learning-based object recognition models and propose a platform that automatically selects a deep learning object recognition model based on the user's desired conditions. We also show, through experiments on different models, why an object recognition model needs to be selected on the basis of statistics.

Improvement of Gesture Recognition using 2-stage HMM (2단계 히든마코프 모델을 이용한 제스쳐의 성능향상 연구)

  • Jung, Hwon-Jae;Park, Hyeonjun;Kim, Donghan
    • Journal of Institute of Control, Robotics and Systems / v.21 no.11 / pp.1034-1037 / 2015
  • In recent years, various methods have been developed in the field of robotics to create an intimate relationship between people and robots. These methods include speech, vision, and biometrics recognition as well as gesture-based interaction. These recognition technologies are used in various wearable devices, smartphones, and other electronic devices for convenience. Among them, gesture recognition is the most commonly used and the most appropriate for wearable devices. Gesture recognition can be classified as contact or noncontact gesture recognition. This paper proposes contact gesture recognition with IMU and EMG sensors that applies the hidden Markov model (HMM) twice. In the first stage, several simple behaviors are combined into main gestures; this is the same as the standard HMM process, which is well known in pattern recognition. In the second stage, the sequence of main gestures produced by the first stage forms higher-order gestures. In this way, more natural and intelligent gestures can be built from simple gestures. This advanced process can play a larger role in gesture-recognition-based UX for many wearable and smart devices.

Style-Specific Language Model Adaptation using TF*IDF Similarity for Korean Conversational Speech Recognition

  • Park, Young-Hee;Chung, Min-Hwa
    • The Journal of the Acoustical Society of Korea / v.23 no.2E / pp.51-55 / 2004
  • In this paper, we propose a style-specific language model adaptation scheme using n-gram based tf*idf similarity for Korean spontaneous speech recognition. Korean spontaneous speech shows distinctive style-specific characteristics such as filled pauses, word omission, and contraction, which are related to function words and depend on the preceding or following words. To reflect these style-specific characteristics and to overcome the lack of data for training the language model, we estimate an in-domain dependent n-gram model by relevance weighting of out-of-domain text data according to their n-gram based tf*idf similarity, where the in-domain language model includes a disfluency model. Recognition results show that n-gram based tf*idf similarity weighting effectively reflects style differences.
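
The relevance weighting described in this abstract can be illustrated with a minimal sketch. It assumes cosine similarity over bigram tf*idf vectors and a smoothed idf; the abstract does not specify the paper's exact formulas, and the function names (`ngrams`, `tfidf_vectors`, `cosine`, `relevance_weights`) are hypothetical.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return all contiguous n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def tfidf_vectors(docs, n=2):
    """Turn tokenized documents into sparse {n-gram: tf*idf} dictionaries."""
    counts = [Counter(ngrams(doc, n)) for doc in docs]
    # Document frequency: number of documents that contain each n-gram.
    doc_freq = Counter(gram for c in counts for gram in c)
    num_docs = len(docs)
    vectors = []
    for c in counts:
        total = sum(c.values())
        vec = {}
        for gram, k in c.items():
            tf = k / total
            # Smoothed idf (an assumption, not taken from the paper); always positive.
            idf = math.log((1 + num_docs) / (1 + doc_freq[gram])) + 1.0
            vec[gram] = tf * idf
        vectors.append(vec)
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b[g] for g, w in a.items() if g in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def relevance_weights(in_domain, out_docs, n=2):
    """Weight each out-of-domain document by its similarity to the in-domain text."""
    vectors = tfidf_vectors([in_domain] + out_docs, n)
    return [cosine(vectors[0], v) for v in vectors[1:]]


in_domain = "uh i mean we could go there".split()
out_docs = [
    "i mean we could go later".split(),                   # conversational style
    "the committee approved the annual budget".split(),   # formal style
]
weights = relevance_weights(in_domain, out_docs)
print(weights)  # the first weight is positive, the second is 0.0 (no shared bigrams)
```

In an adaptation scheme of this kind, such weights would scale the n-gram counts of each out-of-domain text before they are merged into the in-domain model; that merging step is not shown here.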

A Robust Speech Recognition Method Combining the Model Compensation Method with the Speech Enhancement Algorithm (음질향상 기법과 모델보상 방식을 결합한 강인한 음성인식 방식)

  • Kim, Hee-Keun;Chung, Yong-Joo;Bae, Keun-Seung
    • Speech Sciences / v.14 no.2 / pp.115-126 / 2007
  • There have been many research efforts to improve the performance of speech recognizers in noisy conditions. Among them, the model compensation method and the speech enhancement approach have been widely used. In this paper, we propose combining the two approaches to further improve recognition rates for noisy speech. For speech enhancement, the minimum mean square error short-time spectral amplitude (MMSE-STSA) estimator is adopted, and parallel model combination (PMC) and Jacobian adaptation (JA) are used as the model compensation approaches. The experimental results show that the hybrid approach, which applies the model compensation methods to the enhanced speech, produces better results than either approach alone.

Acoustic and Pronunciation Model Adaptation Based on Context dependency for Korean-English Speech Recognition (한국인의 영어 인식을 위한 문맥 종속성 기반 음향모델/발음모델 적응)

  • Oh, Yoo-Rhee;Kim, Hong-Kook;Lee, Yeon-Woo;Lee, Seong-Ro
    • MALSORI / v.68 / pp.33-47 / 2008
  • In this paper, we propose a hybrid acoustic and pronunciation model adaptation method based on context dependency for Korean-English speech recognition. The proposed method works as follows. First, to derive pronunciation variant rules, an n-best phoneme sequence is obtained by phone recognition. Second, we classify each rule as either context independent (CI) or context dependent (CD). To this end, we assume that differences in phoneme structure between Korean and English cause CI pronunciation variabilities, while coarticulation effects are related to CD pronunciation variabilities. Finally, we perform acoustic model adaptation for CI pronunciation variabilities and pronunciation model adaptation for CD pronunciation variabilities. Korean-English speech recognition experiments show that the average word error rate (WER) decreases by 36.0% compared to a baseline without any adaptation. In addition, the proposed method has a lower average WER than either the acoustic model adaptation or the pronunciation model adaptation alone.

Performance Analysis of Face Image Recognition System Using ART Model and Multi-layer Perceptron (ART와 다층 퍼셉트론을 이용한 얼굴인식 시스템의 성능분석)

  • 김영일;안민옥
    • Journal of the Korean Institute of Telematics and Electronics B / v.30B no.2 / pp.69-77 / 1993
  • An automatic image recognition system is essential for better man-to-machine interaction. Because of the noise and deformation introduced by sensor operation, it is not simple to build an image recognition system even for fixed images. In this paper, a neural network, which has been reported to be adequate for pattern recognition tasks, is applied to fixed and variational (rotation, size, and position variation of the fixed image) recognition in the hope of overcoming the problems of conventional pattern recognition techniques. In the fixed image recognition system, the ART model is trained with face images obtained by a camera. When recognizing an matching score. In the test, with a vigilance level of 0.6 - 0.8, the system achieved a 100% correct face recognition rate. In the variational image recognition system, 65 invariant moment feature sets are taken from thirteen persons; 39 are used to train the multi-layer perceptron and the other 26 are used for testing. The result shows a 92.5% recognition rate.
