• Title/Summary/Keyword: speech recognizer

Search Result 185, Processing Time 0.024 seconds

Automatic Detection of Mispronunciation Using Phoneme Recognition For Foreign Language Instruction (음성인식기를 이용한 한국인의 외국어 발화오류 자동 검출)

  • Kwon Chul-Hong;Kang Hyo-Won;Lee Sang-Pil
    • MALSORI
    • /
    • no.48
    • /
    • pp.127-139
    • /
    • 2003
  • An automatic pronunciation correction system provides learners with correction guidelines for each mispronunciation. In this paper we propose an HMM based speech recognizer which automatically classifies pronunciation errors when Korean speak Japanese. For this purpose we also develop phoneme recognizers for Korean and Japanese. Experimental results show that the machine scores of the proposed recognizer correlate with expert ratings well.

  • PDF

Adoption of Support Vector Machine and Independent Component Analysis for Implementation of Speech Recognizer (음성인식기 구현을 위한 SVM과 독립성분분석 기법의 적용)

  • 박정원;김평환;김창근;허강인
    • Proceedings of the IEEK Conference
    • /
    • 2003.07e
    • /
    • pp.2164-2167
    • /
    • 2003
  • In this paper we propose effective speech recognizer through recognition experiments for three feature parameters(PCA, ICA and MFCC) using SVM(Support Vector Machine) classifier In general, SVM is classification method which classify two class set by finding voluntary nonlinear boundary in vector space and possesses high classification performance under few training data number. In this paper we compare recognition result for each feature parameter and propose ICA feature as the most effective parameter

  • PDF

A Study on the Robust Bimodal Speech-recognition System in Noisy Environments (잡음 환경에 강인한 이중모드 음성인식 시스템에 관한 연구)

  • 이철우;고인선;계영철
    • The Journal of the Acoustical Society of Korea
    • /
    • v.22 no.1
    • /
    • pp.28-34
    • /
    • 2003
  • Recent researches have been focusing on jointly using lip motions (i.e. visual speech) and speech for reliable speech recognitions in noisy environments. This paper also deals with the method of combining the result of the visual speech recognizer and that of the conventional speech recognizer through putting weights on each result: the paper proposes the method of determining proper weights for each result and, in particular, the weights are autonomously determined, depending on the amounts of noise in the speech and the image quality. Simulation results show that combining the audio and visual recognition by the proposed method provides the recognition performance of 84% even in severely noisy environments. It is also shown that in the presence of blur in images, the newly proposed weighting method, which takes the blur into account as well, yields better performance than the other methods.

Implementation of Speech Recognizer using Relevance Vector Machine (RVM을 이용한 음성인식기의 구현)

  • Kim, Chang-Keun;Koh, Si-Young;Hur, Kang-In;Lee, Kwang-Seok
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.11 no.8
    • /
    • pp.1596-1603
    • /
    • 2007
  • In this paper, we experimented by three kind of method for feature parameter, training method and recognition algorithm of most suitable for speech recognition system and considered. We decided speech recognition system of most suitable through two kind of experiment after we make speech recognizer. First, we did an experiment about three kind of feature parameter to evaluate recognition performance of it in speech recognizer using existent MFCC and MFCC new feature parameter that change characteristic space using PCA and ICA. Second, we experimented recognition performance or HMM, SVM and RVM by studying data number. By an experiment until now, feature parameter by ICA showed performance improvement of average 1.5% than MFCC by high linear discrimination from characteristic space. RVM showed performance improvement of maximum 3.25% than HMM in an experiment by decrease of studying data. As such result, effective method for speech recognition system to propose in this paper derives feature parameters using ICA and un recognition using RVM.

Voice Command Web Browser Using Variable Vocabulary Word Recognizer (가변어휘 단어 인식기를 사용한 음성 명령 웹 브라우저)

  • 이항섭
    • The Journal of the Acoustical Society of Korea
    • /
    • v.18 no.2
    • /
    • pp.48-52
    • /
    • 1999
  • In this paper, we describe a Voice Command Web Browser using a variable vocabulary word recognizer that can do Internet surfing with Korean speech recognition on the Web. The feature of this browser is that it can handle the links and menus of the web browser by speech. Therefore, we can use speech interface together with mouse for web browsing. To recognize the recognition candidates dynamically changing according to Web pages, we use the variable vocabulary word recognizer. The recognizer was trained using POW (Phonetically Optimized Words) 3,848 words. So that it can recognize new words which did not exist in training data. The preliminary test results showed that the performance of speaker-independent and vocabulary-independent recognition is 93.8% for 32 Korean words. The Voice Command Web Browser was developed on windows 95/NT using Netscape Navigator and reflected usability test results in order to offer easy interface to users unfamiliar with speech interface. In on-line experiment of speaker-independent and environment-independent situation, Voice Command Web Browser showed recognition accuracy of 90%.

  • PDF

Machine scoring method for speech recognizer detection mispronunciation of foreign language (외국어 발화오류 검출 음성인식기를 위한 스코어링 기법)

  • Kang, Hyo-Won;Bae, Min-Young;Lee, Jae-Kang;Kwon, Chul-Hong
    • Proceedings of the KSPS conference
    • /
    • 2004.05a
    • /
    • pp.239-242
    • /
    • 2004
  • An automatic pronunciation correction system provides users with correction guidelines for each pronunciation error. For this purpose, we propose a speech recognition system which automatically classifies pronunciation errors when Koreans speak a foreign language. In this paper, we also propose machine scoring methods for automatic assessment of pronunciation quality by the speech recognizer. Scores obtained from an expert human listener are used as the reference to evaluate the different machine scores and to provide targets when training some of algorithms. We use a log-likelihood score and a normalized log-likelihood score as machine scoring methods. Experimental results show that the normalized log-likelihood score had higher correlation with human scores than that obtained using the log-likelihood score.

  • PDF

The Comparison of Speech Feature Parameters for Emotion Recognition (감정 인식을 위한 음성의 특징 파라메터 비교)

  • 김원구
    • Proceedings of the Korean Institute of Intelligent Systems Conference
    • /
    • 2004.04a
    • /
    • pp.470-473
    • /
    • 2004
  • In this paper, the comparison of speech feature parameters for emotion recognition is studied for emotion recognition using speech signal. For this purpose, a corpus of emotional speech data recorded and classified according to the emotion using the subjective evaluation were used to make statical feature vectors such as average, standard deviation and maximum value of pitch and energy. MFCC parameters and their derivatives with or without cepstral mean subfraction are also used to evaluate the performance of the conventional pattern matching algorithms. Pitch and energy Parameters were used as a Prosodic information and MFCC Parameters were used as phonetic information. In this paper, In the Experiments, the vector quantization based emotion recognition system is used for speaker and context independent emotion recognition. Experimental results showed that vector quantization based emotion recognizer using MFCC parameters showed better performance than that using the Pitch and energy parameters. The vector quantization based emotion recognizer achieved recognition rates of 73.3% for the speaker and context independent classification.

  • PDF

Performance comparison of Text-Independent Speaker Recognizer Using VQ and GMM (VQ와 GMM을 이용한 문맥독립 화자인식기의 성능 비교)

  • Kim, Seong-Jong;Chung, Hoon;Chung, Ik-Joo
    • Speech Sciences
    • /
    • v.7 no.2
    • /
    • pp.235-244
    • /
    • 2000
  • This paper was focused on realizing the text-independent speaker recognizer using the VQ and GMM algorithm and studying the characteristics of the speaker recognizers that adopt these two algorithms. Because it was difficult ascertain the effect two algorithms have on the speaker recognizer theoretically, we performed the recognition experiments using various parameters and, as the result of the experiments, we could show that GMM algorithm had better recognition performance than VQ algorithm as following. The GMM showed better performance with small training data, and it also showed just a little difference of recognition rate as the kind of feature vectors and the length of input data vary. The GMM showed good recognition performance than the VQ on the whole.

  • PDF

Speech Recognition in the Car Noise Environment (자동차 소음 환경에서 음성 인식)

  • 김완구;차일환;윤대희
    • Journal of the Korean Institute of Telematics and Electronics B
    • /
    • v.30B no.2
    • /
    • pp.51-58
    • /
    • 1993
  • This paper describes the development of a speaker-dependent isolated word recognizer as applied to voice dialing in a car noise environment. for this purpose, several methods to improve performance under such condition are evaluated using database collected in a small car moving at 100km/h The main features of the recognizer are as follow: The endpoint detection error can be reduced by using the magnitude of the signal which is inverse filtered by the AR model of the background noise, and it can be compensated by using variants of the DTW algorithm. To remove the noise, an autocorrelation subtraction method is used with the constraint that residual energy obtainable by linear predictive analysis should be positive. By using the noise rubust distance measure, distortion of the feature vector is minimized. The speech recognizer is implemented using the Motorola DSP56001(24-bit general purpose digital signal processor). The recognition database is composed of 50 Korean names spoken by 3 male speakers. The recognition error rate of the system is reduced to 4.3% using a single reference pattern for each word and 1.5% using 2 reference patterns for each word.

  • PDF

Speaker and Context Independent Emotion Recognition System using Gaussian Mixture Model (GMM을 이용한 화자 및 문장 독립적 감정 인식 시스템 구현)

  • 강면구;김원구
    • Proceedings of the IEEK Conference
    • /
    • 2003.07e
    • /
    • pp.2463-2466
    • /
    • 2003
  • This paper studied the pattern recognition algorithm and feature parameters for emotion recognition. In this paper, KNN algorithm was used as the pattern matching technique for comparison, and also VQ and GMM were used lot speaker and context independent recognition. The speech parameters used as the feature are pitch, energy, MFCC and their first and second derivatives. Experimental results showed that emotion recognizer using MFCC and their derivatives as a feature showed better performance than that using the Pitch and energy Parameters. For pattern recognition algorithm, GMM based emotion recognizer was superior to KNN and VQ based recognizer

  • PDF