• Title/Summary/Keyword: Context Recognition


Vowel Context Effect on the Perception of Stop Consonants in Malayalam and Its Role in Determining Syllable Frequency

  • Mohan, Dhanya;Maruthy, Sandeep
    • Korean Journal of Audiology / v.25 no.3 / pp.124-130 / 2021
  • Background and Objectives: The study investigated vowel context effects on the perception of stop consonants in Malayalam. It also probed the role of vowel context effects in determining the frequency of occurrence of various consonant-vowel (CV) syllables in Malayalam. Subjects and Methods: The study used a cross-sectional pre-experimental post-test-only research design with 30 individuals with normal hearing who were native speakers of Malayalam. The stimuli included three stop consonants, each spoken in three different vowel contexts. The resultant nine syllables were presented in their original form and in five gating conditions. Participants' consonant recognition in the different vowel contexts was assessed. The frequency of occurrence of the nine target syllables in the spoken corpus of Malayalam was also systematically derived. Results: The consonant recognition score was better in the /u/ vowel context than in the /i/ and /a/ contexts. The frequency of occurrence of the target syllables derived from the spoken corpus of Malayalam showed that the three stop consonants occurred more frequently with the vowel /a/ than with /u/ and /i/. Conclusions: The findings show a definite vowel context effect on the perception of Malayalam stop consonants, and this context effect differs from that reported for other languages. Stop consonants are perceived better in the context of /u/ than in the /a/ and /i/ contexts. Furthermore, vowel context effects do not appear to determine the frequency of occurrence of different CV syllables in Malayalam.

Speaker and Context Independent Emotion Recognition System using Gaussian Mixture Model (GMM을 이용한 화자 및 문장 독립적 감정 인식 시스템 구현)

  • 강면구;김원구
    • Proceedings of the IEEK Conference / 2003.07e / pp.2463-2466 / 2003
  • This paper studied pattern recognition algorithms and feature parameters for emotion recognition. KNN was used as the pattern matching technique for comparison, and VQ and GMM were used for speaker- and context-independent recognition. The speech parameters used as features are pitch, energy, MFCC, and their first and second derivatives. Experimental results showed that the emotion recognizer using MFCC and its derivatives as features performed better than the one using the pitch and energy parameters. Among the pattern recognition algorithms, the GMM-based emotion recognizer was superior to the KNN- and VQ-based recognizers.

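The GMM scoring step described in the abstract above can be sketched in a few lines; the one-component mixtures and 2-D "MFCC-like" frames below are toy assumptions for illustration, not the paper's trained models or features.

```python
import math

def gmm_log_likelihood(frames, weights, means, variances):
    """Total log-likelihood of feature frames under a diagonal-covariance GMM."""
    total = 0.0
    for x in frames:
        comp_logs = []
        for w, mu, var in zip(weights, means, variances):
            ll = math.log(w)
            for xi, mi, vi in zip(x, mu, var):
                ll += -0.5 * (math.log(2 * math.pi * vi) + (xi - mi) ** 2 / vi)
            comp_logs.append(ll)
        m = max(comp_logs)  # log-sum-exp over components for numerical stability
        total += m + math.log(sum(math.exp(c - m) for c in comp_logs))
    return total

def classify_emotion(frames, models):
    """Pick the emotion whose GMM assigns the frames the highest likelihood."""
    return max(models, key=lambda e: gmm_log_likelihood(frames, *models[e]))

# Toy 2-D frames and two hypothetical one-component GMMs (weights, means, variances).
models = {
    "neutral": ([1.0], [[0.0, 0.0]], [[1.0, 1.0]]),
    "angry":   ([1.0], [[3.0, 3.0]], [[1.0, 1.0]]),
}
frames = [[2.8, 3.1], [3.2, 2.9]]
print(classify_emotion(frames, models))
```

Speaker and context independence comes from pooling training data across speakers and sentences, so each model captures only emotion-dependent spectral structure.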

Context Recognition Using Environmental Sound for Client Monitoring System (피보호자 모니터링 시스템을 위한 환경음 기반 상황 인식)

  • Ji, Seung-Eun;Jo, Jun-Yeong;Lee, Chung-Keun;Oh, Siwon;Kim, Wooil
    • Journal of the Korea Institute of Information and Communication Engineering / v.19 no.2 / pp.343-350 / 2015
  • This paper presents a context recognition method using environmental sound signals, applied to a mobile-based client monitoring system. Seven acoustic contexts are defined and the corresponding environmental sound signals are collected for the experiments. To evaluate context recognition performance, MFCC and LPCC are employed for feature extraction, and statistical pattern recognition methods using GMM and HMM as acoustic models are compared. The experimental results show that LPCC and HMM are more effective at improving context recognition accuracy than MFCC and GMM, respectively. The recognition system using LPCC and HMM achieves 96.03% recognition accuracy. These results demonstrate that LPCC is effective for representing environmental sounds, which contain a wider variety of frequency components than human speech, and that HMM is more effective than GMM at modeling time-varying environmental sounds.
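The HMM side of this comparison rests on the forward algorithm, which scores a whole observation sequence rather than each frame independently. A minimal sketch with discrete (vector-quantized) observations follows; the two-state context models and the symbol sequence are invented for illustration and are not the paper's trained models.

```python
import math

def forward_log_likelihood(obs, start, trans, emit):
    """Log-likelihood of a discrete observation sequence under an HMM,
    computed with the forward algorithm."""
    n = len(start)
    alpha = [start[i] * emit[i][obs[0]] for i in range(n)]
    for o in obs[1:]:
        alpha = [sum(alpha[i] * trans[i][j] for i in range(n)) * emit[j][o]
                 for j in range(n)]
    return math.log(sum(alpha))

# Two hypothetical 2-state, 2-symbol context models: (start, transition, emission).
models = {
    "doorbell":      ([1.0, 0.0], [[0.5, 0.5], [0.0, 1.0]], [[0.9, 0.1], [0.1, 0.9]]),
    "running_water": ([1.0, 0.0], [[0.5, 0.5], [0.0, 1.0]], [[0.1, 0.9], [0.9, 0.1]]),
}
obs = [0, 0, 1, 1]  # a quantized acoustic symbol stream
best = max(models, key=lambda c: forward_log_likelihood(obs, *models[c]))
print(best)
```

The left-to-right transition structure is what lets the HMM capture the temporal evolution of a sound, which a GMM (a single bag of frames) cannot.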

Probabilistic Graph Based Object Category Recognition Using the Context of Object-Action Interaction (물체-행동 컨텍스트를 이용하는 확률 그래프 기반 물체 범주 인식)

  • Yoon, Sung-baek;Bae, Se-ho;Park, Han-je;Yi, June-ho
    • The Journal of Korean Institute of Communications and Information Sciences / v.40 no.11 / pp.2284-2290 / 2015
  • The use of human actions as context for object class recognition is quite effective in enhancing recognition performance despite the large variation in the appearance of objects. We propose an efficient method that integrates human action information into object class recognition using a Bayesian approach based on a simple probabilistic graph model. The experiment shows that by using human actions as context information we can improve the performance of object class recognition by 8% to 28%.
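Under a conditional-independence assumption between appearance and action given the object class, this kind of Bayesian fusion reduces to multiplying likelihoods and normalizing. The sketch below uses a hypothetical cup-versus-spray-can example with invented numbers; the paper's actual graph model may factor differently.

```python
def posterior(priors, appearance_lik, action_lik):
    """Naive-Bayes style fusion: P(object | appearance, action) is proportional
    to P(appearance | object) * P(action | object) * P(object), assuming the
    two cues are conditionally independent given the object class."""
    scores = {c: priors[c] * appearance_lik[c] * action_lik[c] for c in priors}
    z = sum(scores.values())
    return {c: s / z for c, s in scores.items()}

# Hypothetical numbers: appearance alone slightly favors "spray_can",
# but the observed "drinking" action strongly favors "cup".
priors = {"cup": 0.5, "spray_can": 0.5}
appearance = {"cup": 0.4, "spray_can": 0.6}
action = {"cup": 0.9, "spray_can": 0.1}
post = posterior(priors, appearance, action)
print(max(post, key=post.get))
```

This illustrates the abstract's point: action context can overturn an appearance-based decision when visually similar classes afford different actions.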

Emotion Recognition Based on Facial Expression by using Context-Sensitive Bayesian Classifier (상황에 민감한 베이지안 분류기를 이용한 얼굴 표정 기반의 감정 인식)

  • Kim, Jin-Ok
    • The KIPS Transactions: Part B / v.13B no.7 s.110 / pp.653-662 / 2006
  • In ubiquitous computing, which aims to build computing environments that provide proper services according to a user's context, emotion recognition based on facial expression is an essential means of HCI for making human-machine interaction more efficient and for achieving user context-awareness. This paper addresses the problem of recognizing basic emotions in context-sensitive facial expressions through a new Bayesian classifier. The task of recognizing emotions from facial expressions consists of two steps: the facial feature extraction step is based on a color-histogram method, and the classification step employs a new Bayesian learning algorithm for efficient training and testing. A new context-sensitive Bayesian learning algorithm, EADF (Extended Assumed-Density Filtering), is proposed; it recognizes emotions more accurately by using different classifier complexities for different contexts. Experimental results show an expression classification accuracy of over 91% on the test database and an error rate of 10.6% when facial expression is modeled as hidden context.

Acoustic and Pronunciation Model Adaptation Based on Context dependency for Korean-English Speech Recognition (한국인의 영어 인식을 위한 문맥 종속성 기반 음향모델/발음모델 적응)

  • Oh, Yoo-Rhee;Kim, Hong-Kook;Lee, Yeon-Woo;Lee, Seong-Ro
    • MALSORI / v.68 / pp.33-47 / 2008
  • In this paper, we propose a hybrid acoustic and pronunciation model adaptation method based on context dependency for Korean-English speech recognition. The proposed method is performed as follows. First, in order to derive pronunciation variant rules, an n-best phoneme sequence is obtained by phone recognition. Second, we classify each rule as context-independent (CI) or context-dependent (CD). To this end, it is assumed that the different phoneme structures of Korean and English give rise to CI pronunciation variability, while coarticulation effects are related to CD pronunciation variability. Finally, we perform acoustic model adaptation and pronunciation model adaptation for CI and CD pronunciation variability, respectively. The Korean-English speech recognition experiments show that the average word error rate (WER) is decreased by 36.0% compared to the baseline without any adaptation. In addition, the proposed method achieves a lower average WER than either the acoustic model adaptation or the pronunciation model adaptation alone.


Performance Improvement of Speech Recognition Using Context and Usage Pattern Information (문맥 및 사용 패턴 정보를 이용한 음성인식의 성능 개선)

  • Song, Won-Moon;Kim, Myung-Won
    • The KIPS Transactions: Part B / v.13B no.5 s.108 / pp.553-560 / 2006
  • Speech recognition has recently been investigated to produce more reliable results in noisy environments, either by integrating diverse sources of information at the result-derivation level or by post-processing the initial recognition results. In this paper we propose a method that uses the user's usage patterns and context information in speech command recognition for personal mobile devices to improve recognition accuracy in a noisy environment. Sequential usage (or speech) patterns prior to the current spoken command are used to adjust the base recognition results. For the context information, we use the relevance between the device's currently active function and the spoken command. Our experimental results show that the proposed method achieves an error correction rate of about 50% over the base recognition system, demonstrating its feasibility.
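Adjusting base recognition results with usage-pattern and context scores can be sketched as a log-linear re-ranking of the N-best list. The interpolation weights, priors, and command names below are all assumptions for illustration; the paper does not specify its exact combination formula.

```python
import math

def rerank(hypotheses, usage_prior, context_prior, lam_u=0.5, lam_c=0.5):
    """Re-score an N-best list of (command, acoustic probability) pairs with
    usage-pattern and device-context priors via a log-linear combination.
    The weights lam_u and lam_c are assumed, not taken from the paper."""
    def score(h):
        cmd, acoustic_p = h
        return (math.log(acoustic_p)
                + lam_u * math.log(usage_prior.get(cmd, 1e-6))
                + lam_c * math.log(context_prior.get(cmd, 1e-6)))
    return sorted(hypotheses, key=score, reverse=True)

# Hypothetical case: acoustics slightly favor "play", but the user is on a
# payment screen and has recently used "pay" often.
nbest = [("play", 0.50), ("pay", 0.45)]
usage = {"pay": 0.3, "play": 0.01}
context = {"pay": 0.5, "play": 0.01}
print(rerank(nbest, usage, context)[0][0])
```

In noisy conditions the acoustic scores of competing hypotheses bunch together, which is exactly when the extra priors can flip the ranking toward the intended command.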

Effects of Association and Imagery on Word Recognition (단어재인에 미치는 연상과 심상성의 영향)

  • Kim, Min-Jung;Lee, Seung-Bok;Jung, Bum-Suk
    • Korean Journal of Cognitive Science / v.20 no.3 / pp.243-274 / 2009
  • Association, word frequency, and imagery have been considered the main factors affecting word recognition. The present study aimed to examine the imagery effect and its interaction with the association effect while controlling for the frequency effect. To explain the imagery effect, we compared two theories: the dual-coding theory and the context-availability model. A lexical decision task using a priming paradigm was administered, with the duration of the prime words manipulated as 20 ms, 50 ms, and 450 ms in experiments 1, 2, and 3, respectively. The association and imagery of the prime words were manipulated as the main factors in each of the three experiments. In experiment 1, a prime duration of 20 ms was used, which is expected not to activate the semantic context enough to affect word recognition. As a result, only the imagery effect was statistically significant. In experiment 2, the prime duration was 50 ms, which we expected to activate the semantic context without perceptual awareness. The result showed both the association and imagery effects, and the interaction between the two was also significant. In experiment 3, to activate the semantic context with perceptual awareness, the prime words were presented for 450 ms. Only the association effect was statistically significant in this condition. The results of the three experiments suggest that imagery exerts its influence at the early stages of word recognition, while the association effect appears later. These results imply that the two theories are not contrary to each other: the dual-coding theory concerns the imagery effect, which operates at the early stage of word recognition, while the context-availability model concerns the semantic context effect, which operates at the later stage. To explain the word recognition process more completely, an integrated model needs to be developed that considers not only the three main effects but also the stages along the time course of the process.


Speech Recognition Using MSVQ/TDRNN (MSVQ/TDRNN을 이용한 음성인식)

  • Kim, Sung-Suk
    • The Journal of the Acoustical Society of Korea / v.33 no.4 / pp.268-272 / 2014
  • This paper presents a method for speech recognition using multi-section vector quantization (MSVQ) and a time-delay recurrent neural network (TDRNN). The MSVQ generates the codebook from normalized uniform sections of the voice signal, and the TDRNN performs speech recognition using the MSVQ codebook. The TDRNN is a time-delay recurrent neural network classifier with two different representations of dynamic context: the time-delayed input nodes represent local dynamic context, while the recursive nodes represent long-term dynamic context of the voice signal. Cepstral PLP coefficients were used as speech features. In the speech recognition experiments, the MSVQ/TDRNN speech recognizer achieves a 97.9% word recognition rate in speaker-independent recognition.
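The "normalized uniform sections" idea behind MSVQ — splitting an utterance into equal-length sections for time normalization and quantizing frames against a codebook — can be sketched as below. This is a reading of the abstract, not the paper's exact algorithm, and the one-dimensional frames and codewords are illustrative.

```python
def uniform_sections(frames, n_sections):
    """Split a frame sequence into n_sections time-normalized, near-equal parts."""
    k, r = divmod(len(frames), n_sections)
    out, i = [], 0
    for s in range(n_sections):
        j = i + k + (1 if s < r else 0)
        out.append(frames[i:j])
        i = j
    return out

def quantize(frame, codebook):
    """Index of the nearest codeword by squared Euclidean distance."""
    return min(range(len(codebook)),
               key=lambda i: sum((a - b) ** 2 for a, b in zip(frame, codebook[i])))

sections = uniform_sections([[float(t)] for t in range(10)], 3)
codebook = [[0.0], [5.0], [9.0]]  # illustrative codewords
codes = [[quantize(f, codebook) for f in sec] for sec in sections]
print([len(s) for s in sections], codes)
```

The resulting per-section code sequences give the fixed-length, time-normalized representation that a classifier such as the TDRNN can consume.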

A Study on the Context-dependent Speaker Recognition Adopting the Method of Weighting the Frame-based Likelihood Using SNR (SNR을 이용한 프레임별 유사도 가중방법을 적용한 문맥종속 화자인식에 관한 연구)

  • Choi, Hong-Sub
    • MALSORI / no.61 / pp.113-123 / 2007
  • Environmental differences between the training and testing modes are generally considered the critical factor behind performance degradation in speaker recognition systems. In particular, speaker recognition systems typically train the speaker model on speech that is as clean as possible, but this does not hold in the real testing phase because of environmental and channel noise. In this paper, a new method of weighting the frame-based likelihood according to frame SNR is proposed to cope with this problem, exploiting the strong correlation between speech SNR and speaker discrimination rate. To verify its usefulness, the proposed method is applied to a context-dependent speaker identification system. Experimental results with the cellular phone speech DB designed by ETRI for Korean speaker recognition show that the proposed method is effective, increasing identification accuracy by up to 11%.
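The frame-weighting idea can be sketched as follows. The linear SNR-to-weight mapping and its floor/ceiling values are assumptions, since the abstract does not specify the weighting function; only the principle — noisy frames contribute less to the utterance score — comes from the source.

```python
def snr_weight(snr_db, floor_db=0.0, ceil_db=30.0):
    """Map a frame SNR in dB to a [0, 1] weight.
    The linear ramp between floor_db and ceil_db is an assumption."""
    return min(max((snr_db - floor_db) / (ceil_db - floor_db), 0.0), 1.0)

def weighted_score(frame_loglikes, frame_snrs_db):
    """Utterance-level score in which low-SNR (noisy) frames contribute less."""
    weights = [snr_weight(s) for s in frame_snrs_db]
    den = sum(weights) or 1.0
    return sum(w * ll for w, ll in zip(weights, frame_loglikes)) / den

# A clean frame (30 dB) dominates; a 0 dB frame is ignored entirely.
print(weighted_score([-1.0, -10.0], [30.0, 0.0]))
```

Normalizing by the weight sum keeps scores comparable across utterances with different proportions of noisy frames.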
