• Title/Summary/Keyword: isolated word recognition

Search Result 134, Processing Time 0.034 seconds

Exclusion of Non-similar Candidates using Positional Accuracy based on Levenstein Distance from N-best Recognition Results of Isolated Word Recognition (레벤스타인 거리에 기초한 위치 정확도를 이용한 고립 단어 인식 결과의 비유사 후보 단어 제외)

  • Yun, Young-Sun;Kang, Jeom-Ja
    • Phonetics and Speech Sciences
    • /
    • v.1 no.3
    • /
    • pp.109-115
    • /
    • 2009
  • Many isolated word recognition systems may generate non-similar words for recognition candidates because they use only acoustic information. In this paper, we investigate several techniques which can exclude non-similar words from N-best candidate words by applying Levenstein distance measure. At first, word distance method based on phone and syllable distances are considered. These methods use just Levenstein distance on phones or double Levenstein distance algorithm on syllables of candidates. Next, word similarity approaches are presented that they use characters' position information of word candidates. Each character's position is labeled to inserted, deleted, and correct position after alignment between source and target string. The word similarities are obtained from characters' positional probabilities which mean the frequency ratio of the same characters' observations on the position. From experimental results, we can find that the proposed methods are effective for removing non-similar words without loss of system performance from the N-best recognition candidates of the systems.

  • PDF

Various Approaches to Improve Exclusion Performance of Non-similar Candidates from N-best Recognition Results on Isolated Word Recognition (고립 단어 인식 결과의 비유사 후보 단어 제외 성능을 개선하기 위한 다양한 접근 방법 연구)

  • Yun, Young-Sun
    • Phonetics and Speech Sciences
    • /
    • v.2 no.4
    • /
    • pp.153-161
    • /
    • 2010
  • Many isolated word recognition systems may generate non-similar words for recognition candidates because they use only acoustic information. The previous study [1,2] investigated several techniques which can exclude non-similar words from N-best candidate words by applying Levenstein distance measure. This paper discusses the various improving techniques of removing non-similar recognition results. The mentioned methods include comparison penalties or weights, phone accuracy based on confusion information, weights candidates by ranking order and partial comparisons. Through experimental results, it is found that some proposed method keeps more accurate recognition results than the previous method's results.

  • PDF

A Modified Viterbi Algorithm for Word Boundary Detection Error Compensation (단어 경계 검출 오류 보정을 위한 수정된 비터비 알고리즘)

  • Chung, Hoon;Chung, Ik-Joo
    • The Journal of the Acoustical Society of Korea
    • /
    • v.26 no.1E
    • /
    • pp.21-26
    • /
    • 2007
  • In this paper, we propose a modified Viterbi algorithm to compensate for endpoint detection error during the decoding phase of an isolated word recognition task. Since the conventional Viterbi algorithm explores only the search space whose boundaries are fixed to the endpoints of the segmented utterance by the endpoint detector, the recognition performance is highly dependent on the accuracy level of endpoint detection. Inaccurately segmented word boundaries lead directly to recognition error. In order to relax the degradation of recognition accuracy due to endpoint detection error, we describe an unconstrained search of word boundaries and present an algorithm to explore the search space with efficiency. The proposed algorithm was evaluated by performing a variety of simulated endpoint detection error cases on an isolated word recognition task. The proposed algorithm reduced the Word Error Rate (WER) considerably, from 84.4% to 10.6%, while consuming only a little more computation power.

A Study on the recognition of local name using Spatio-Temporal method (Spatio-temporal방법을 이용한 지역명 인식에 관한 연구)

  • 지원우
    • Proceedings of the Acoustical Society of Korea Conference
    • /
    • 1993.06a
    • /
    • pp.121-124
    • /
    • 1993
  • This paper is a study on the word recognition using neural network. A limited vocabulary, speaker independent, isolated word recognition system has been built. This system recognizes isolated word without performing segmentation, phoneme identification, or dynamic time wrapping. It needs a static pattern approach to recognize a spatio-temporal pattern. The preprocessing only includes preceding and tailing silence removal, and word length determination. A LPC analysis is performed on each of 24 equally spaced frames. The PARCOR coefficients plus 3 other features from each frame is extracted. In order to simplify a structure of neural network, we composed binary code form to decrease output nodes.

  • PDF

Isolated Word Recognition Using Segment Probability Model (분할확률 모델을 이용한 한국어 고립단어 인식)

  • 김진영;성경모
    • Journal of the Korean Institute of Telematics and Electronics
    • /
    • v.25 no.12
    • /
    • pp.1541-1547
    • /
    • 1988
  • In this paper, a new model for isolated word recognition called segment probability model is proposed. The proposed model is composed of two procedures of segmentation and modelling each segment. Therefore the spoken word is devided into arbitrary segments and observation probability in each segments is obtained using vector quantization. The proposed model is compared with pattern matching method and hidden Markov model by recognition experiment. The experimental results show that the proposed model is better than exsisting methods in terms of recognition rate and caculation amounts.

  • PDF

Speaker Adaptation in HMM-based Korean Isoklated Word Recognition (한국어 격리단어 인식 시스템에서 HMM 파라미터의 화자 적응)

  • 오광철;이황수;은종관
    • The Transactions of the Korean Institute of Electrical Engineers
    • /
    • v.40 no.4
    • /
    • pp.351-359
    • /
    • 1991
  • This paper describes performances of speaker adaptation using a probabilistic spectral mapping matrix in hidden-Markov model(HMM) -based Korean isolated word recognition. Speaker adaptation based on probabilistic spectral mapping uses a well-trained prototype HMM's and is carried out by Viterbi, dynamic time warping, and forward-backward algorithms. Among these algorithms, the best performance is obtained by using the Viterbi approach together with codebook adaptation whose improvement for isolated word recognition accuracy is 42.6-68.8 %. Also, the selection of the initial values of the matrix and the normalization in computing the matrix affects the recognition accuracy.

Isolated Words Recognition using K-means iteration without Initialization (초기화하지 않은 K-means iteration을 이용한 고립단어 인식)

  • Kim, Jin-Young;Sung, Keong-Mo
    • Proceedings of the KIEE Conference
    • /
    • 1988.07a
    • /
    • pp.7-9
    • /
    • 1988
  • K-means iteration method is generally used for creating the templates in speaker-independent isolated-word recognition system. In this paper the initialization method of initial centers is proposed. The concepts are sorting and trace segmentation. All the tokens are sorted and segmented by trace segmentation so that initial centers are decided. The performance of this method is evaluated by isolated-word recognition of Korean digits. The highest recognition rate is 97.6%.

  • PDF

Removal of Heterogeneous Candidates Using Positional Accuracy Based on Levenshtein Distance on Isolated n-best Recognition (레벤스타인 거리 기반의 위치 정확도를 이용하여 다중 음성 인식 결과에서 관련성이 적은 후보 제거)

  • Yun, Young-Sun
    • The Journal of the Acoustical Society of Korea
    • /
    • v.30 no.8
    • /
    • pp.428-435
    • /
    • 2011
  • Many isolated word recognition systems may generate irrelevant words for recognition results because they use only acoustic information or small amount of language information. In this paper, I propose word similarity that is used for selecting (or removing) less common words from candidates by applying Levenshtein distance. Word similarity is obtained by using positional accuracy that reflects the frequency information along to character's alignment information. This paper also discusses various improving techniques of selection of disparate words. The methods include different loss values, phone accuracy based on confusion information, weights of candidates by ranking order and partial comparisons. Through experiments, I found that the proposed methods are effective for removing heterogeneous words without loss of performance.

Digital Isolated Word Recognition System based on MFCC and DTW Algorithm (MFCC와 DTW에 알고리즘을 기반으로 한 디지털 고립단어 인식 시스템)

  • Zang, Xian;Chong, Kil-To
    • Proceedings of the KIEE Conference
    • /
    • 2008.10b
    • /
    • pp.290-291
    • /
    • 2008
  • The most popular speech feature used in speech recognition today is the Mel-Frequency Cepstral Coefficients (MFCC) algorithm, which could reflect the perception characteristics of the human ear more accurately than other parameters. This paper adopts MFCC and its first order difference, which could reflect the dynamic character of speech signal, as synthetical parametric representation. Furthermore, we quote Dynamic Time Warping (DTW) algorithm to search match paths in the pattern recognition process. We use the software "GoldWave" to record English digitals in the lab environments and the simulation results indicate the algorithm has higher recognition accuracy than others using LPCC, etc. as character parameters in the experiment for Digital Isolated Word Recognition (DIWR) system.

  • PDF

Implementation of the Auditory Sense for the Smart Robot: Speaker/Speech Recognition (로봇 시스템에의 적용을 위한 음성 및 화자인식 알고리즘)

  • Jo, Hyun;Kim, Gyeong-Ho;Park, Young-Jin
    • Proceedings of the Korean Society for Noise and Vibration Engineering Conference
    • /
    • 2007.05a
    • /
    • pp.1074-1079
    • /
    • 2007
  • We will introduce speech/speaker recognition algorithm for the isolated word. In general case of speaker verification, Gaussian Mixture Model (GMM) is used to model the feature vectors of reference speech signals. On the other hand, Dynamic Time Warping (DTW) based template matching technique was proposed for the isolated word recognition in several years ago. We combine these two different concepts in a single method and then implement in a real time speaker/speech recognition system. Using our proposed method, it is guaranteed that a small number of reference speeches (5 or 6 times training) are enough to make reference model to satisfy 90% of recognition performance.

  • PDF