• Title/Summary/Keyword: Pronunciation lexicon

Building a Morpheme-Based Pronunciation Lexicon for Korean Large Vocabulary Continuous Speech Recognition (한국어 대어휘 연속음성 인식용 발음사전 자동 생성 및 최적화)

  • Lee Kyong-Nim;Chung Minhwa
    • MALSORI / v.55 / pp.103-118 / 2005
  • In this paper, we describe a morpheme-based pronunciation lexicon useful for Korean LVCSR. The phonemic-context-dependent multiple pronunciation lexicon improves recognition accuracy when cross-morpheme pronunciation variations are distinguished from within-morpheme pronunciation variations. Since adding all possible pronunciation variants to the lexicon increases both the lexicon size and the confusability between lexical entries, we have developed a lexicon pruning scheme that selects pronunciation variants to improve the performance of Korean LVCSR. With the proposed pronunciation lexicon, the cross-morpheme pronunciation variation model with a phonemic-context-dependent multiple pronunciation lexicon achieves an absolute WER reduction of 0.56% from the baseline WER of 27.39%. At the best-performing setting, the lexicon size is additionally reduced by 5.36% for the same lexical entries.

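The abstract above reports an absolute WER reduction; the matching relative reduction follows from simple arithmetic. The Python sketch below works it out from the two figures quoted in the abstract. The function name is illustrative and does not come from the paper.

```python
def relative_reduction(baseline_wer: float, absolute_reduction: float) -> float:
    """Return the relative WER reduction in percent."""
    return 100.0 * absolute_reduction / baseline_wer


baseline = 27.39  # baseline WER in percent, as quoted in the abstract
absolute = 0.56   # absolute WER reduction in percentage points, as quoted

new_wer = baseline - absolute
print(f"WER after: {new_wer:.2f}%")                                         # 26.83%
print(f"relative reduction: {relative_reduction(baseline, absolute):.2f}%")  # 2.04%
```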

Generating Pronunciation Lexicon for Continuous Speech Recognition Based on Observation Frequencies of Phonetic Rules (음소변동규칙의 발견빈도에 기반한 음성인식 발음사전 구성)

  • Na, Min-Soo;Chung, Min-Hwa
    • MALSORI / no.64 / pp.137-153 / 2007
  • The pronunciation lexicon of a continuous speech recognition system should contain enough pronunciation variations to build a search space large enough to contain the correct path, whereas the size of the lexicon needs to be constrained for effective decoding and lower perplexity. This paper describes a procedure for selecting the pronunciation variations to include in the lexicon based on the frequencies of the corresponding phonetic rules observed in the training corpus. The likelihood of a phonetic rule's application is estimated from the observation frequency of the rule and is used to control the construction of the pronunciation lexicon. Experiments with various pronunciation lexica show that the proposed method helps improve speech recognition performance.

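The selection procedure described in the abstract above can be pictured with a minimal Python sketch. It assumes that a rule's likelihood is the relative frequency with which the rule fired when its context occurred, and that a variant is kept when every rule that produced it meets a threshold. The rule names, counts, pronunciations, and threshold are hypothetical and do not come from the paper.

```python
from collections import Counter


def rule_likelihoods(applied, applicable):
    """Estimate P(rule applies | its context occurs) as a relative frequency."""
    return {rule: applied[rule] / total
            for rule, total in applicable.items() if total > 0}


def select_variants(variants, likelihoods, threshold):
    """Keep variants whose generating rules all meet the threshold.

    `variants` maps a pronunciation string to the tuple of rule names that
    produced it. The canonical form has an empty tuple and is always kept.
    """
    return [pron for pron, rules in variants.items()
            if all(likelihoods.get(r, 0.0) >= threshold for r in rules)]


# Hypothetical corpus counts: how often each rule's context occurred,
# and how often the rule was actually observed to apply.
applicable = Counter({"R1": 200, "R2": 150, "R3": 40})
applied = Counter({"R1": 180, "R2": 60, "R3": 2})

likelihoods = rule_likelihoods(applied, applicable)

variants = {
    "canonical": (),           # no rule applied
    "variant_a": ("R1",),      # produced by a frequent rule
    "variant_b": ("R3",),      # produced by a rare rule
    "variant_c": ("R1", "R2"),
}
selected = select_variants(variants, likelihoods, threshold=0.3)
print(selected)
```

Raising the threshold shrinks the lexicon, and lowering it admits more variants, which is the trade-off the abstract describes.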

Pronunciation Lexicon Optimization with Applying Variant Selection Criteria (발음 변이의 발음사전 포함 결정 조건을 통한 발음사전 최적화)

  • Jeon, Je-Hun;Chung, Min-Hwa
    • Proceedings of the KSPS conference / 2006.11a / pp.24-27 / 2006
  • This paper describes how a domain-dependent pronunciation lexicon is generated and optimized for Korean large vocabulary continuous speech recognition (LVCSR). At the lexicon level, pronunciation variations are usually modeled by adding pronunciation variants to the lexicon. We propose two criteria for selecting appropriate pronunciation variants for the lexicon: (i) likelihood and (ii) frequency. Our experiment is conducted in three steps. First, the variants are generated with knowledge-based rules. Second, we generate domain-dependent lexica that include various numbers of pronunciation variants based on the proposed criteria. Finally, the WER and RTF are examined for each lexicon. In the experiment, a WER reduction of 0.72% is obtained by introducing the variant pruning criteria. Furthermore, the RTF does not deteriorate even though the average number of variants is higher than in the lexica it is compared with.

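The abstract above compares lexica by real-time factor (RTF) and by average number of variants. As a rough illustration, the Python sketch below computes both, taking RTF in its usual sense of decoding time divided by audio duration. The lexicon contents and timings are hypothetical, not the paper's data.

```python
def real_time_factor(decode_seconds: float, audio_seconds: float) -> float:
    """RTF = time spent decoding / duration of the decoded audio."""
    return decode_seconds / audio_seconds


def average_variants(lexicon: dict) -> float:
    """Mean number of pronunciation variants per lexical entry."""
    return sum(len(prons) for prons in lexicon.values()) / len(lexicon)


# Hypothetical lexicon: entry -> list of pronunciation variants.
toy_lexicon = {
    "entry1": ["p1", "p2"],
    "entry2": ["p3"],
}

print(average_variants(toy_lexicon))   # variants per entry
print(real_time_factor(30.0, 60.0))    # below 1.0 means faster than real time
```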

Modeling Cross-morpheme Pronunciation Variations for Korean Large Vocabulary Continuous Speech Recognition (한국어 연속음성인식 시스템 구현을 위한 형태소 단위의 발음 변화 모델링)

  • Chung Minhwa;Lee Kyong-Nim
    • MALSORI / no.49 / pp.107-121 / 2004
  • In this paper, we describe a cross-morpheme pronunciation variation model that is especially useful for constructing a morpheme-based pronunciation lexicon to improve the performance of Korean LVCSR. Many pronunciation variations occur at morpheme boundaries in continuous speech. Since phonemic context, together with morphological category and morpheme boundary information, affects Korean pronunciation variations, we have distinguished the phonological rules that apply to phonemes within a morpheme from those that apply across morpheme boundaries. The results of 33K-morpheme Korean CSR experiments show that an absolute WER reduction of 1.45% from the baseline WER of 18.42% was achieved by modeling the proposed pronunciation variations with a context-dependent multiple pronunciation lexicon.

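The distinction between within-morpheme and cross-morpheme rules in the abstract above can be illustrated with a toy Python sketch. It assumes each rule rewrites the left phone of an adjacent pair and carries a scope tag. The rule format, the phone symbols, and the decision to mark the example rule as cross-morpheme only are illustrative assumptions, not the authors' rule set.

```python
def apply_rules(morphemes, rules):
    """Rewrite the left phone of each adjacent pair when a rule with a
    matching scope ("within" or "cross") fires. Single pass, no feeding."""
    # Flatten the phones, remembering which morpheme each one belongs to.
    phones = [(p, i) for i, morpheme in enumerate(morphemes) for p in morpheme]
    out = [p for p, _ in phones]
    for k in range(len(phones) - 1):
        (left, mi), (right, mj) = phones[k], phones[k + 1]
        scope = "within" if mi == mj else "cross"
        for r_left, r_right, new_left, r_scope in rules:
            if (left, right, scope) == (r_left, r_right, r_scope):
                out[k] = new_left
                break
    return out


# Toy rule: "k" becomes "ng" before "m", here tagged as applying only
# across a morpheme boundary (the scope tag is made up for illustration).
rules = [("k", "m", "ng", "cross")]

print(apply_rules([["k", "u", "k"], ["m", "u", "l"]], rules))  # boundary between k and m
print(apply_rules([["k", "u", "k", "m", "u", "l"]], rules))    # no boundary, rule is blocked
```

Tagging each rule with a scope lets a single rule table generate different variants for the same phone sequence depending on where the morpheme boundaries fall.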

Automatic Generation of Pronunciation Variants for Korean Continuous Speech Recognition (한국어 연속음성 인식을 위한 발음열 자동 생성)

  • 이경님;전재훈;정민화
    • The Journal of the Acoustical Society of Korea / v.20 no.2 / pp.35-43 / 2001
  • Many speech recognition systems use a pronunciation lexicon with multiple possible phonetic transcriptions for each word. The pronunciation lexicon is often created manually. This process requires a lot of time and effort, and it is very difficult to keep the lexicon consistent. To handle these problems, we present a model based on morphophonological analysis for automatically generating Korean pronunciation variants. By analyzing phonological variations frequently found in spoken Korean, we have derived about 700 phonemic contexts that trigger the multilevel application of the corresponding phonological process, which consists of phonemic and allophonic rules. In generating pronunciation variants, morphological analysis is performed first to handle variations of phonological words. According to the morphological category, a set of tables reflecting phonemic context is looked up to generate pronunciation variants. Our experiments show that the proposed model produces mostly correct pronunciation variants of phonological words. We then evaluated the usefulness of the pronunciation lexicon and the training phonetic transcriptions produced by the proposed system.


Modeling Cross-morpheme Pronunciation Variation for Korean LVCSR (한국어 연속음성인식을 위한 형태소 경계에서의 발음 변화 현상 모델링)

  • Lee Kyong-Nim;Chung Minhwa
    • Proceedings of the KSPS conference / 2003.05a / pp.75-78 / 2003
  • In this paper, we describe a cross-morpheme pronunciation variation model that is especially useful for constructing a morpheme-based pronunciation lexicon for Korean LVCSR. Many pronunciation variations occur at morpheme boundaries in continuous speech. Since phonemic context, together with morphological category and morpheme boundary information, affects Korean pronunciation variations, we have distinguished pronunciation variation rules according to where they apply: within a morpheme, across a morpheme boundary in a compound noun, across a morpheme boundary in an eojeol, and across an eojeol boundary. In a 33K-morpheme Korean CSR experiment, an absolute WER improvement of 1.16% from the baseline WER of 23.17% is achieved by modeling cross-morpheme pronunciation variations with a context-dependent multiple pronunciation lexicon.


Computerized Sound Dictionary of Korean and English

  • Kim, Jong-Mi
    • Speech Sciences / v.8 no.1 / pp.33-52 / 2001
  • A bilingual sound dictionary in Korean and English has been created as a broad sound reference for cross-linguistic, dialectal, native language (L1)-transferred, biological, and allophonic variations. The paper demonstrates that the pronunciation dictionary of the lexicon is inadequate for sound reference due to the preponderance of unmarked sounds. The audio registry consists of a three-way comparison of 1) English speech from native English speakers, 2) Korean speech from Korean speakers, and 3) English speech from Korean speakers. Several sub-dictionaries have been created as foundation research for independent development: 1) a pronunciation dictionary of the Korean lexicon in a keyboard-compatible phonetic transcription, 2) a sound dictionary of L1-interfered language, and 3) an audible dictionary of Korean sounds. The dictionary was designed to facilitate the exchange of the speech signal and its corresponding text data on various media, particularly on CD-ROM. The methodology and findings of the construction are discussed.


Stochastic Pronunciation Lexicon Modeling for Large Vocabulary Continuous Speech Recognition (확률 발음사전을 이용한 대어휘 연속음성인식)

  • Yun, Seong-Jin;Choi, Hwan-Jin;Oh, Yung-Hwan
    • The Journal of the Acoustical Society of Korea / v.16 no.2 / pp.49-57 / 1997
  • In this paper, we propose a stochastic pronunciation lexicon model for large vocabulary continuous speech recognition systems. The stochastic lexicon can be regarded as an HMM. This HMM is a stochastic finite-state automaton consisting of a Markov chain of subword states, where each subword state in the baseform has a probability distribution over subword units. In this method, an acoustic representation of a word can be derived automatically from sample sentence utterances and subword unit models. Additionally, the stochastic lexicon is further optimized for the subword model and the recognizer. In experiments on 3000-word continuous speech recognition, the proposed method reduces the word error rate by 23.6% and the sentence error rate by 10% compared to methods based on standard phonetic representations of words.

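The idea in the abstract above, a word baseform as a chain of subword states that each carry a distribution over subword units, can be sketched in Python under strong simplifying assumptions: a strictly left-to-right chain, exactly one unit emitted per state, and no skips or self-loops. The function name, the state distributions, and the unit symbols are hypothetical and are not taken from the paper.

```python
import math


def pronunciation_log_prob(states, units):
    """Log-probability that a chain of subword states emits `units`.

    `states` is a list of dicts, one per state, mapping a subword unit to
    its emission probability. One unit is emitted per state, in order.
    """
    if len(states) != len(units):
        return float("-inf")
    total = 0.0
    for distribution, unit in zip(states, units):
        p = distribution.get(unit, 0.0)
        if p == 0.0:
            return float("-inf")
        total += math.log(p)
    return total


# Hypothetical two-state baseform for one word.
baseform = [
    {"t": 0.9, "d": 0.1},
    {"ah": 0.7, "ax": 0.3},
]

log_p = pronunciation_log_prob(baseform, ["t", "ax"])
print(math.exp(log_p))  # product of the two emission probabilities
```

A deterministic lexicon is the special case in which every state puts probability 1 on a single unit.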

Improvements of an English Pronunciation Dictionary Generator Using DP-based Lexicon Pre-processing and Context-dependent Grapheme-to-phoneme MLP (DP 알고리즘에 의한 발음사전 전처리와 문맥종속 자소별 MLP를 이용한 영어 발음사전 생성기의 개선)

  • 김회린;문광식;이영직;정재호
    • The Journal of the Acoustical Society of Korea / v.18 no.5 / pp.21-27 / 1999
  • In this paper, we propose an improved MLP-based English pronunciation dictionary generator for use in a variable vocabulary word recognizer. The variable vocabulary word recognizer can process any words specified in a Korean word lexicon that is determined dynamically according to the current recognition task. To extend the system to tasks involving English words, it is necessary to build a pronunciation dictionary generator that can process words not included in a predefined lexicon, such as proper nouns. To build the English pronunciation dictionary generator, we use a context-dependent grapheme-to-phoneme multi-layer perceptron (MLP) architecture for each grapheme. To train each MLP, grapheme-to-phoneme training data must be obtained from a general pronunciation dictionary. To automate this process, we use a dynamic programming (DP) algorithm with some distance metrics. For training and testing the grapheme-to-phoneme MLPs, we use a general English pronunciation dictionary with about 110 thousand words. With 26 MLPs, each having 30 to 50 hidden nodes, and the exception grapheme lexicon, we obtained a word accuracy of 72.8% on the 110 thousand words, superior to the 24.0% word accuracy of a rule-based method.

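The DP-based pre-processing mentioned in the abstract above aligns the letters of a dictionary entry with its phonemes so that per-grapheme training pairs can be extracted. The Python sketch below is a generic edit-distance style alignment, not the authors' algorithm. The cost function, the gap symbol `-`, and the table of plausible letter-phoneme pairs are assumptions made for illustration.

```python
def align(graphemes, phonemes, sub_cost, gap=1.0):
    """Align graphemes to phonemes by dynamic programming.

    Returns a list of (grapheme, phoneme) pairs, where "-" marks a gap
    (a silent letter or a phoneme with no letter of its own).
    """
    n, m = len(graphemes), len(phonemes)
    cost = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = cost[i - 1][0] + gap
    for j in range(1, m + 1):
        cost[0][j] = cost[0][j - 1] + gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost[i][j] = min(
                cost[i - 1][j - 1] + sub_cost(graphemes[i - 1], phonemes[j - 1]),
                cost[i - 1][j] + gap,   # grapheme maps to nothing
                cost[i][j - 1] + gap,   # phoneme has no grapheme
            )
    # Trace back from the bottom-right corner to recover the pairs.
    pairs, i, j = [], n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and cost[i][j] == (
            cost[i - 1][j - 1] + sub_cost(graphemes[i - 1], phonemes[j - 1])
        ):
            pairs.append((graphemes[i - 1], phonemes[j - 1]))
            i, j = i - 1, j - 1
        elif i > 0 and (j == 0 or cost[i][j] == cost[i - 1][j] + gap):
            pairs.append((graphemes[i - 1], "-"))
            i -= 1
        else:
            pairs.append(("-", phonemes[j - 1]))
            j -= 1
    pairs.reverse()
    return pairs


# Hypothetical distance metric: zero cost for plausible letter-phoneme pairs.
PLAUSIBLE = {("p", "f"), ("h", "f"), ("o", "ow"), ("n", "n")}


def toy_cost(grapheme, phoneme):
    return 0.0 if (grapheme, phoneme) in PLAUSIBLE else 1.0


alignment = align(list("phone"), ["f", "ow", "n"], toy_cost)
print(alignment)
```

Each aligned pair, together with its neighboring letters as context, could then serve as one training example for the MLP assigned to that grapheme.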

Automatic Conversion of English Pronunciation Using Sequence-to-Sequence Model (Sequence-to-Sequence Model을 이용한 영어 발음 기호 자동 변환)

  • Lee, Kong Joo;Choi, Yong Seok
    • KIPS Transactions on Software and Data Engineering / v.6 no.5 / pp.267-278 / 2017
  • Because the same letter can be pronounced differently depending on word context, one must consult a lexicon to pronounce a word correctly. Both the phonetic alphabet a lexicon adopts and the pronunciation it gives for the same word can differ from lexicon to lexicon. In this paper, we use a sequence-to-sequence model, widely used in deep learning research, to convert automatically from one pronunciation to another. Twelve seq2seq models are implemented based on pronunciation training data collected from 4 different lexicons. The exact accuracy of the models ranges from 74.5% to 89.6%. This study has two aims: to understand the properties of the phonetic alphabets and pronunciations used in various lexicons, and to understand the characteristics of seq2seq models through error analysis.
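One reading consistent with the figures in the abstract above is that each of the 12 models converts between one ordered pair of the 4 lexicons (4 × 3 = 12). The abstract does not state this explicitly, so the Python sketch below treats it as an assumption. The lexicon names are placeholders, and the accuracy helper assumes that "exact accuracy" means whole-sequence exact match.

```python
from itertools import permutations

# Placeholder names for the four source lexicons.
lexicons = ["lex_A", "lex_B", "lex_C", "lex_D"]

# One conversion model per ordered (source, target) pair.
model_pairs = list(permutations(lexicons, 2))
print(len(model_pairs))  # 12


def exact_accuracy(predicted, reference):
    """Fraction of items whose predicted pronunciation matches exactly."""
    matches = sum(p == r for p, r in zip(predicted, reference))
    return matches / len(reference)


# Hypothetical predictions for four words.
predicted = ["k ae t", "d ao g", "b er d", "f ih sh"]
reference = ["k ae t", "d ao g", "b er d", "f ih s"]
print(exact_accuracy(predicted, reference))
```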