• Title/Summary/Keyword: state recognition

Search Results: 1,016

Korean Continuous Speech Recognition Using Discrete Duration Control Continuous HMM (이산 지속시간제어 연속분포 HMM을 이용한 연속 음성 인식)

  • Lee, Jong-Jin;Kim, Soo-Hoon;Hur, Kang-In
    • The Journal of the Acoustical Society of Korea / v.14 no.1 / pp.81-89 / 1995
  • In this paper, we report a continuous speech recognition system using a continuous HMM with discrete duration control and regression coefficients. We also perform a recognition experiment on 25 sentences of robot control commands using the One-Pass DP method with finite state automata (FSN) context control. In the experiment on 4 connected spoken digits, the recognition rate is 93.8% when the discrete duration control and the regression coefficients are included, and 80.7% when they are not. In the experiment on the 25 robot control command sentences, the recognition rate is 90.9% without the FSN and 98.4% with it.
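The finite-state-network constraint above can be sketched as a word-level automaton: the One-Pass DP decoder only scores word sequences the network accepts. The toy grammar below is an invented illustration, not the paper's actual 25-sentence command grammar.

```python
# Toy FSN for robot control commands (hypothetical grammar).
# Each entry maps (current state, word) -> next state; a sequence is
# valid only if it drives the automaton from "start" to "end".
FSN = {
    ("start", "move"): "dir",
    ("start", "turn"): "dir",
    ("dir", "left"): "speed",
    ("dir", "right"): "speed",
    ("speed", "fast"): "end",
    ("speed", "slow"): "end",
}

def accepts(words):
    """Return True if the word sequence is accepted by the FSN."""
    state = "start"
    for w in words:
        key = (state, w)
        if key not in FSN:
            return False
        state = FSN[key]
    return state == "end"
```

In a decoder, hypotheses that fall outside the network are pruned, which is what raises the reported rate from 90.9% to 98.4%.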

  • PDF

Implementation of HMM Based Speech Recognizer with Medium Vocabulary Size Using TMS320C6201 DSP (TMS320C6201 DSP를 이용한 HMM 기반의 음성인식기 구현)

  • Jung, Sung-Yun;Son, Jong-Mok;Bae, Keun-Sung
    • The Journal of the Acoustical Society of Korea / v.25 no.1E / pp.20-24 / 2006
  • In this paper, we focus on the real-time implementation of a speech recognition system with a medium-sized vocabulary, considering its application to a mobile phone. First, we developed a PC-based variable-vocabulary word recognizer keeping the program memory and total acoustic model size as small as possible. To reduce the memory size of the acoustic models, linear discriminant analysis and phonetic tied mixtures were applied in the feature selection process and in training the HMMs, respectively. In addition, a state-based Gaussian selection method with real-time cepstral normalization was used to reduce the computational load and for robust recognition. Then, we verified the real-time operation of the implemented recognition system on the TMS320C6201 EVM board. The implemented system uses about 610 kbytes of memory including both program and data memory. The recognition rate was 95.86% for the ETRI 445DB, and 96.4%, 97.92%, and 87.04% for three kinds of name databases collected through mobile phones.
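Real-time cepstral normalization of the kind mentioned above can be approximated with a running-mean update, so each frame is normalized using only past context rather than the whole utterance. The smoothing factor is an assumed value, not one taken from the paper.

```python
import numpy as np

def realtime_cmn(cepstra, alpha=0.995):
    """Streaming cepstral mean normalization (sketch).

    cepstra: array of shape (n_frames, n_coeffs). Instead of subtracting
    the utterance-wide mean (which requires the full utterance), an
    exponential running mean is updated frame by frame, making the
    normalization usable in a real-time recognizer.
    """
    mean = np.zeros(cepstra.shape[1])
    out = np.empty_like(cepstra, dtype=float)
    for t, frame in enumerate(cepstra):
        mean = alpha * mean + (1.0 - alpha) * frame  # running channel estimate
        out[t] = frame - mean
    return out
```

The trade-off is a short adaptation period at the start of each utterance before the running mean converges.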

A Robust and Device-Free Daily Activities Recognition System using Wi-Fi Signals

  • Ding, Enjie;Zhang, Yue;Xin, Yun;Zhang, Lei;Huo, Yu;Liu, Yafeng
    • KSII Transactions on Internet and Information Systems (TIIS) / v.14 no.6 / pp.2377-2397 / 2020
  • Human activity recognition is widely used in smart homes, health care, and indoor monitoring. Traditional approaches all need hardware installation or wearable sensors, which incurs additional costs and imposes many restrictions on usage. Therefore, this paper presents a novel device-free activity recognition system based on advanced wireless technologies. Fine-grained channel state information (CSI) in the wireless channel is employed as the indicator of human activities. To improve accuracy, both the amplitude and phase information of CSI are extracted and shaped into feature vectors for activity recognition. In addition, we discuss the classification accuracy of different features and select the most stable ones for the feature matrix. Our experimental evaluation in two laboratories of different sizes demonstrates that the proposed scheme achieves an average accuracy over 95% and 90% in the two scenarios.
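Extracting amplitude and phase features from complex CSI samples can be sketched as below. The linear phase detrending is a common sanitation step for sampling-offset ramps; the paper's exact preprocessing and feature set may differ.

```python
import numpy as np

def csi_features(csi):
    """Build a feature vector from complex CSI (illustrative sketch).

    csi: complex array of shape (n_packets, n_subcarriers).
    Amplitude is taken directly; phase is unwrapped along packets and a
    linear trend across subcarriers is removed (rough offset sanitation).
    Per-subcarrier statistics are then stacked into one feature vector.
    """
    amp = np.abs(csi)
    phase = np.unwrap(np.angle(csi), axis=0)
    k = np.arange(csi.shape[1])
    # remove the linear phase ramp across subcarriers for each packet
    slope = (phase[:, -1] - phase[:, 0]) / (k[-1] - k[0])
    phase = phase - slope[:, None] * k[None, :]
    return np.hstack([amp.mean(axis=0), amp.std(axis=0), phase.std(axis=0)])
```

Feature vectors of this form would then be assembled into the feature matrix that the paper's classifier consumes.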

Three-Dimensional Shape Recognition and Classification Using Local Features of Model Views and Sparse Representation of Shape Descriptors

  • Kanaan, Hussein;Behrad, Alireza
    • Journal of Information Processing Systems / v.16 no.2 / pp.343-359 / 2020
  • In this paper, a new algorithm is proposed for three-dimensional (3D) shape recognition using local features of model views and their sparse representation. The algorithm starts with the normalization of 3D models and the extraction of 2D views from uniformly distributed viewpoints. The 2D views are then stacked over each other to form view cubes. The algorithm employs the descriptors of 3D local features in the view cubes, obtained after applying Gabor filters in various directions, as the initial features for 3D shape recognition. In the training stage, we store some 3D local features to build a prototype dictionary of local features. To extract an intermediate feature vector, we measure the similarity between the local descriptors of a shape model and the local features of the prototype dictionary. We represent the intermediate feature vectors of the 3D models in the sparse domain to obtain the final descriptors of the models. Finally, support vector machine classifiers are used to recognize the 3D models. Experimental results on the Princeton Shape Benchmark database showed an average recognition rate of 89.7% using 20 views. We compared the proposed approach with state-of-the-art approaches, and the results showed the effectiveness of the proposed algorithm.
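The sparse-domain step can be illustrated with a generic orthogonal matching pursuit routine: the intermediate feature vector is coded over a dictionary, and the sparse code serves as the final descriptor. This is a standard OMP sketch under assumed inputs, not the paper's specific solver.

```python
import numpy as np

def omp(D, y, n_nonzero=3):
    """Orthogonal matching pursuit (generic sketch).

    D: (d, n) dictionary with (ideally unit-norm) columns, y: (d,) feature
    vector. Greedily picks the atom most correlated with the residual,
    refits the coefficients on the chosen support, and repeats.
    """
    residual, support = y.copy(), []
    for _ in range(n_nonzero):
        support.append(int(np.argmax(np.abs(D.T @ residual))))
        x_s, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        residual = y - D[:, support] @ x_s
    x = np.zeros(D.shape[1])
    x[support] = x_s
    return x
```

The resulting sparse vector `x` is what would be fed to the SVM classifiers as the model's descriptor.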

A Consecutive Motion and Situation Recognition Mechanism to Detect a Vulnerable Condition Based on Android Smartphone

  • Choi, Hoan-Suk;Lee, Gyu Myoung;Rhee, Woo-Seop
    • International Journal of Contents / v.16 no.3 / pp.1-17 / 2020
  • Human motion recognition is essential for user-centric services such as surveillance-based security, elderly condition monitoring, exercise tracking, daily calorie expenditure analysis, etc. It is typically based on the analysis of movement data such as the acceleration and angular velocity of a target user. Existing motion recognition studies are only intended to measure basic information (e.g., the user's stride, number of steps, speed) or to recognize a single motion (e.g., sitting, running, walking). Thus, a new mechanism is required to identify transitions between single motions, so as to assess a user's consecutive motion more accurately and to recognize the body and surrounding situations arising from the motion. In this paper, we collect human movement data through Android smartphones in real time for five target single motions and propose a mechanism that recognizes a consecutive motion, including transitions among the various motions, and the resulting situation, using a state transition model to check whether a vulnerable (life-threatening) condition, especially for the elderly, has occurred. Through implementation and experiments, we demonstrate that the proposed mechanism recognizes a consecutive motion and the user's situation accurately and quickly. In the recognition experiment on a mixed sequence resembling daily motions, the proposed adaptive weighting method showed improvements of 4% (holding time = 15 sec), 88% (30 sec), and 6.5% (60 sec) compared to the static method.
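A state-transition check of the kind described above can be sketched as follows: single-motion labels arrive over time, and a vulnerable condition is flagged when an abrupt transition into lying persists beyond a holding time. The label set and thresholds are illustrative assumptions, not the paper's exact model.

```python
# Hypothetical single-motion labels; the paper uses five target motions.
VALID = {"standing", "walking", "running", "sitting", "lying"}

def detect_vulnerable(motions, holding_time=15, step=1):
    """Flag a vulnerable condition from a sequence of motion labels.

    motions: labels sampled every `step` seconds. A fall-like event is
    modeled as a direct transition from an upright motion to "lying";
    if lying then persists for `holding_time` seconds, the condition
    is considered vulnerable.
    """
    lying_for = 0
    prev = None
    for m in motions:
        if m not in VALID:
            raise ValueError(f"unknown motion: {m}")
        if m == "lying" and prev in ("standing", "walking", "running"):
            lying_for = step            # abrupt transition into lying
        elif m == "lying" and lying_for > 0:
            lying_for += step           # lying persists
        else:
            lying_for = 0               # any other motion resets the timer
        if lying_for >= holding_time:
            return True
        prev = m
    return False
```

Going through "sitting" first (a normal way to lie down) does not trigger the flag, which is the point of modeling transitions rather than single motions.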

Gesture-Based Emotion Recognition by 3D-CNN and LSTM with Keyframes Selection

  • Ly, Son Thai;Lee, Guee-Sang;Kim, Soo-Hyung;Yang, Hyung-Jeong
    • International Journal of Contents / v.15 no.4 / pp.59-64 / 2019
  • In recent years, emotion recognition has been an interesting and challenging topic. Compared to facial expressions and the speech modality, gesture-based emotion recognition has not received much attention, with only a few efforts using traditional hand-crafted methods. These approaches require major computational costs and offer few opportunities for improvement, as most of the scientific community is conducting research based on deep learning techniques. In this paper, we propose an end-to-end deep learning approach for classifying emotions based on bodily gestures. In particular, informative keyframes are first extracted from raw videos as input for the 3D-CNN deep network. The 3D-CNN exploits the short-term spatiotemporal information of gesture features from the selected keyframes, and convolutional LSTM networks learn long-term features from the 3D-CNN outputs. The experimental results on the FABO dataset exceed most traditional methods' results and achieve state-of-the-art results among deep learning-based techniques for gesture-based emotion recognition.
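Keyframe selection can be sketched with a cheap frame-difference criterion: frames that change most from their predecessor are kept as the "informative" ones. This is a generic proxy for informativeness; the paper's actual selection criterion may differ.

```python
import numpy as np

def select_keyframes(frames, k):
    """Pick k keyframes from a video by frame-difference energy (sketch).

    frames: array of shape (T, H, W) grayscale frames. Each frame is
    scored by the mean absolute difference from the previous frame,
    the top-k scores are kept, and the indices are sorted so temporal
    order is preserved for the 3D-CNN input.
    """
    diffs = np.abs(np.diff(frames.astype(float), axis=0)).mean(axis=(1, 2))
    scores = np.concatenate([[0.0], diffs])  # the first frame scores 0
    idx = np.argsort(scores)[-k:]
    return np.sort(idx)
```

The selected frame stack would then be fed to the 3D-CNN, whose feature sequence the convolutional LSTM consumes.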

Facial Gender Recognition via Low-rank and Collaborative Representation in An Unconstrained Environment

  • Sun, Ning;Guo, Hang;Liu, Jixin;Han, Guang
    • KSII Transactions on Internet and Information Systems (TIIS) / v.11 no.9 / pp.4510-4526 / 2017
  • Most available methods of facial gender recognition work well under constrained conditions, but their performance decreases significantly when they are deployed in unconstrained environments. In this paper, a method via low-rank and collaborative representation is proposed for facial gender recognition in the wild. First, low-rank decomposition is applied to the face image to minimize the negative effects caused by the various corruptions and dynamic illuminations of an unconstrained environment. Then, we employ collaborative representation as the classifier, which uses the much weaker l2-norm sparsity constraint to achieve similar classification results with significantly lower complexity. The proposed method combines low-rank decomposition and collaborative representation into an organic whole to solve the task of facial gender recognition under unconstrained environments. Extensive experiments on three benchmarks, including AR, CAS-PERL, and YouTube, are conducted to show the effectiveness of the proposed method. Compared with several state-of-the-art algorithms, our method has overwhelming superiority in terms of accuracy and robustness.
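The collaborative representation classifier admits a closed-form sketch: the test sample is coded over all training samples with an l2 (ridge) penalty, and the class whose atoms reconstruct it with the smallest residual wins. The regularization weight below is an assumed value.

```python
import numpy as np

def crc_classify(D, labels, y, lam=0.01):
    """Collaborative representation classification (generic sketch).

    D: (d, n) matrix whose columns are training samples, labels: length-n
    class labels, y: (d,) test sample. Solves the ridge problem
    min ||y - Dx||^2 + lam ||x||^2 in closed form, then assigns the class
    with the smallest class-wise reconstruction residual.
    """
    n = D.shape[1]
    x = np.linalg.solve(D.T @ D + lam * np.eye(n), D.T @ y)
    best, best_res = None, np.inf
    for c in set(labels):
        mask = np.array([l == c for l in labels])
        res = np.linalg.norm(y - D[:, mask] @ x[mask])
        if res < best_res:
            best, best_res = c, res
    return best
```

The closed-form solve is why the l2-constrained coding is so much cheaper than l1-based sparse coding while classifying similarly.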

Pattern Recognition of PD by Particles in GIS (GIS내 파티클에 의한 PD의 패턴인식)

  • 곽희로;이동준
    • Journal of the Korean Institute of Illuminating and Electrical Installation Engineers / v.17 no.1 / pp.31-36 / 2003
  • This paper describes the quantitative analysis and pattern recognition of partial discharge signals generated by particles in GIS. Four particle states were simulated. The partial discharge signals from each state were measured, and the Φ-Q-N distribution of the signals was displayed, followed by the Φ-Q, Φ-Qm, Φ-N, and Q-N distributions. Each distribution can be quantitatively represented by statistical parameters, and these parameters were used as input data for pattern recognition. As a result, it was found that the forms of the distributions differed according to the particle states. The recognition rate using a neural network was about 92%, and the accuracy improved as the number of input data increased.
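Statistical parameters of a discharge distribution typically mean its moments; a minimal sketch, assuming the usual choices of mean, standard deviation, skewness, and excess kurtosis (the abstract does not list the paper's exact parameter set):

```python
import numpy as np

def distribution_parameters(values):
    """Moment-based parameters of a PD pulse distribution (sketch).

    values: samples from one distribution (e.g., pulse counts per phase
    window in a Φ-N distribution). Returns the standard moments that
    would serve as neural-network input features.
    """
    v = np.asarray(values, dtype=float)
    mu, sigma = v.mean(), v.std()
    z = (v - mu) / sigma
    return {
        "mean": mu,
        "std": sigma,
        "skewness": (z ** 3).mean(),          # asymmetry of the shape
        "kurtosis": (z ** 4).mean() - 3.0,    # excess peakedness
    }
```

One such parameter vector per distribution (Φ-Q, Φ-Qm, Φ-N, Q-N) would form the input layer of the recognition network.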

Attention Deep Neural Networks Learning based on Multiple Loss functions for Video Face Recognition (비디오 얼굴인식을 위한 다중 손실 함수 기반 어텐션 심층신경망 학습 제안)

  • Kim, Kyeong Tae;You, Wonsang;Choi, Jae Young
    • Journal of Korea Multimedia Society / v.24 no.10 / pp.1380-1390 / 2021
  • Video face recognition (FR) is one of the most popular research topics in the field of computer vision due to its variety of applications. In particular, research using the attention mechanism is being actively conducted. In video face recognition, attention represents where to focus, using the whole input or a specific region, or which frame to focus on when there are many frames. In this paper, we propose a novel attention-based deep learning method. The main novelties of our method are (1) the combination of two loss functions, namely a weighted Softmax loss and a Triplet loss, and (2) the feasibility of end-to-end learning that includes both the feature embedding network and the attention weight computation. The combined loss function and end-to-end learning have a positive effect on the attention weight computation through the feature embedding network. To demonstrate the effectiveness of the proposed method, extensive comparative experiments were carried out on the IJB-A dataset with its standard evaluation protocols. Our proposed method achieved better or comparable recognition rates compared to other state-of-the-art video FR methods.
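Combining a softmax cross-entropy with a triplet loss can be sketched numerically as below. The mixing weight and margin are assumed values for illustration; the paper's weighting scheme may differ.

```python
import numpy as np

def softmax_ce(logits, label):
    """Cross-entropy of one sample given its class logits."""
    e = np.exp(logits - logits.max())  # shift for numerical stability
    return -np.log(e[label] / e.sum())

def triplet_loss(anchor, pos, neg, margin=0.2):
    """Hinge loss pushing the anchor closer to pos than to neg."""
    d_ap = np.linalg.norm(anchor - pos)
    d_an = np.linalg.norm(anchor - neg)
    return max(0.0, d_ap - d_an + margin)

def combined_loss(logits, label, anchor, pos, neg, w=0.5):
    """Weighted sum of the classification and metric-learning objectives."""
    return w * softmax_ce(logits, label) + (1 - w) * triplet_loss(anchor, pos, neg)
```

Training the embedding network and the attention weights against this single scalar is what makes the pipeline end-to-end.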

Scene Text Recognition Performance Improvement through an Add-on of an OCR based Classifier (OCR 엔진 기반 분류기 애드온 결합을 통한 이미지 내부 텍스트 인식 성능 향상)

  • Chae, Ho-Yeol;Seok, Ho-Sik
    • Journal of IKEEE / v.24 no.4 / pp.1086-1092 / 2020
  • An autonomous agent for the real world should be able to recognize text in scenes. With the advancement of deep learning, various DNN models have been utilized for transformation, feature extraction, and prediction. However, existing state-of-the-art STR (Scene Text Recognition) engines do not achieve the performance required for real-world applications. In this paper, we introduce a performance-improvement method through an add-on composed of an OCR (Optical Character Recognition) engine and a classifier for STR engines. On instances from the IC13 and IC15 datasets that an STR engine failed to recognize, our method recognizes 10.92% of the unrecognized characters.
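One plausible shape for such an add-on is a routing function: when the STR engine's confidence is low, a classifier decides whether to fall back to the OCR engine. All callables, the confidence output, and the threshold below are illustrative assumptions, not the paper's design.

```python
def recognize_with_addon(image, str_engine, classifier, ocr_engine,
                         threshold=0.9):
    """Hypothetical STR + OCR add-on routing (sketch).

    str_engine(image) -> (text, confidence); classifier(image) -> bool
    predicting whether the OCR engine is likely to succeed on this
    instance; ocr_engine(image) -> text. The STR result is kept unless
    confidence is low and the classifier votes for the OCR fallback.
    """
    text, conf = str_engine(image)
    if conf >= threshold:
        return text
    if classifier(image):          # OCR predicted to do better here
        return ocr_engine(image)
    return text
```

The classifier is what keeps the add-on from degrading instances the STR engine already handles well.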