• Title/Summary/Keyword: Confusability reduction

Search Result 3, Processing Time 0.021 seconds

Optimizing Multiple Pronunciation Dictionary Based on a Confusability Measure for Non-native Speech Recognition (타언어권 화자 음성 인식을 위한 혼잡도에 기반한 다중발음사전의 최적화 기법)

  • Kim, Min-A;Oh, Yoo-Rhee;Kim, Hong-Kook;Lee, Yeon-Woo;Cho, Sung-Eui;Lee, Seong-Ro
    • MALSORI
    • /
    • no.65
    • /
    • pp.93-103
    • /
    • 2008
  • In this paper, we propose a method for optimizing a multiple pronunciation dictionary used for modeling pronunciation variations of non-native speech. The proposed method removes some confusable pronunciation variants in the dictionary, resulting in a reduced dictionary size and less decoding time for automatic speech recognition (ASR). To this end, a confusability measure is first defined based on the Levenshtein distance between two different pronunciation variants. Then, the number of phonemes for each pronunciation variant is incorporated into the confusability measure to compensate for ASR errors due to words of a shorter length. We investigate the effect of the proposed method on ASR performance, where Korean is selected as the target language and Korean utterances spoken by Chinese native speakers are considered as non-native speech. It is shown from the experiments that an ASR system using the multiple pronunciation dictionary optimized by the proposed method can provide a relative average word error rate reduction of 6.25%, with 11.67% less ASR decoding time, as compared with that using a multiple pronunciation dictionary without the optimization.

  • PDF

Building a Morpheme-Based Pronunciation Lexicon for Korean Large Vocabulary Continuous Speech Recognition (한국어 대어휘 연속음성 인식용 발음사전 자동 생성 및 최적화)

  • Lee Kyong-Nim;Chung Minhwa
    • MALSORI
    • /
    • v.55
    • /
    • pp.103-118
    • /
    • 2005
  • In this paper, we describe a morpheme-based pronunciation lexicon useful for Korean LVCSR. The phonemic-context-dependent multiple pronunciation lexicon improves the recognition accuracy when cross-morpheme pronunciation variations are distinguished from within-morpheme pronunciation variations. Since adding all possible pronunciation variants to the lexicon increases the lexicon size and confusability between lexical entries, we have developed a lexicon pruning scheme for optimal selection of pronunciation variants to improve the performance of Korean LVCSR. By building a proposed pronunciation lexicon, an absolute reduction of $0.56\%$ in WER from the baseline performance of $27.39\%$ WER is achieved by cross-morpheme pronunciation variations model with a phonemic-context-dependent multiple pronunciation lexicon. On the best performance, an additional reduction of the lexicon size by $5.36\%$ is achieved from the same lexical entries.

  • PDF

Performance Improvement of Korean Connected Digit Recognition Using Various Discriminant Analyses (다양한 변별분석을 통한 한국어 연결숫자 인식 성능향상에 관한 연구)

  • Song Hwa Jeon;Kim Hyung Soon
    • MALSORI
    • /
    • no.44
    • /
    • pp.105-113
    • /
    • 2002
  • In Korean, each digit is monosyllable and some pairs are known to have high confusability, causing performance degradation of connected digit recognition systems. To improve the performance, in this paper, we employ various discriminant analyses (DA) including Linear DA (LDA), Weighted Pairwise Scatter LDA WPS-LDA), Heteroscedastic Discriminant Analysis (HDA), and Maximum Likelihood Linear Transformation (MLLT). We also examine several combinations of various DA for additional performance improvement. Experimental results show that applying any DA mentioned above improves the string accuracy, but the amount of improvement of each DA method varies according to the model complexity or number of mixtures per state. Especially, more than 20% of string error reduction is achieved by applying MLLT after WPS-LDA, compared with the baseline system, when class level of DA is defined as a tied state and 1 mixture per state is used.

  • PDF