• Title/Abstract/Keywords: pronunciation model

66 search results (processing time: 0.021 s)

Acoustic and Pronunciation Model Adaptation Based on Context Dependency for Korean-English Speech Recognition

  • 오유리;김홍국;이연우;이성로
    • 대한음성학회지:말소리 / Vol. 68 / pp. 33-47 / 2008
  • In this paper, we propose a hybrid acoustic and pronunciation model adaptation method based on context dependency for Korean-English speech recognition. The proposed method proceeds as follows. First, to derive pronunciation variant rules, an n-best phoneme sequence is obtained by phone recognition. Second, each rule is decomposed into a context-independent (CI) or a context-dependent (CD) one. To this end, it is assumed that differences in phoneme structure between Korean and English give rise to CI pronunciation variability, while coarticulation effects are related to CD pronunciation variability. Finally, we perform acoustic model adaptation for CI pronunciation variability and pronunciation model adaptation for CD pronunciation variability. Korean-English speech recognition experiments show that the average word error rate (WER) is reduced by 36.0% compared to a baseline without any adaptation. In addition, the proposed method achieves a lower average WER than either acoustic model adaptation or pronunciation model adaptation alone.
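The abstract reports its results as word error rate (WER). As a reminder of what that metric measures, here is a minimal sketch of the standard definition (word-level edit distance divided by the number of reference words). The function name and the example sentences are illustrative assumptions and are not taken from the paper.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] is the edit distance between the reference words seen so far
    # (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# Illustrative sentences (not from the paper): one substitution and one deletion
# against a six-word reference.
example_wer = word_error_rate("the cat sat on the mat", "the cat sit on mat")
```

Because insertions are counted, WER computed this way can exceed 1.0 when the hypothesis is much longer than the reference.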


Optimized Chinese Pronunciation Prediction by Component-Based Statistical Machine Translation

  • Zhu, Shunle
    • Journal of Information Processing Systems / Vol. 17, No. 1 / pp. 203-212 / 2021
  • To eliminate ambiguities in the existing methods for simplifying Chinese pronunciation learning, we propose a model that automatically predicts the pronunciation of Chinese characters. The proposed model relies on a statistical machine translation (SMT) framework. In particular, we treat the components of Chinese characters as the basic unit and cast pronunciation prediction as a machine translation task, with the component sequence as the source sentence and the pronunciation (pinyin) as the target sentence. In addition to traditional features such as bidirectional word translation and the n-gram language model, we also implement a component similarity feature to cope with typos in practical use. We incorporate these features into a log-linear model. The experimental results show that our approach significantly outperforms other baseline models.
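The abstract says the features are combined in a log-linear model. A minimal sketch of how such a combination typically ranks candidates is given below. The feature names, feature values, weights, and candidate pinyin strings are all hypothetical placeholders chosen for illustration; the paper's actual features and tuning are not reproduced here.

```python
import math
from typing import Dict, List, Tuple


def log_linear_score(features: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted sum of feature values: score(c) = sum_k w_k * h_k(c)."""
    return sum(weights[name] * value for name, value in features.items())


def rank_candidates(
    candidates: Dict[str, Dict[str, float]], weights: Dict[str, float]
) -> List[Tuple[str, float]]:
    """Return candidates sorted by normalised probability exp(score) / Z."""
    scores = {cand: log_linear_score(feats, weights) for cand, feats in candidates.items()}
    max_score = max(scores.values())  # subtract the maximum for numerical stability
    normaliser = sum(math.exp(s - max_score) for s in scores.values())
    probs = {cand: math.exp(s - max_score) / normaliser for cand, s in scores.items()}
    return sorted(probs.items(), key=lambda item: item[1], reverse=True)


# Hypothetical weights and feature values (assumptions for illustration only).
weights = {"fwd_translation": 1.0, "bwd_translation": 0.5, "lm": 0.8, "similarity": 0.3}
candidates = {
    "ma": {"fwd_translation": -0.5, "bwd_translation": -0.7, "lm": -1.0, "similarity": 0.9},
    "mu": {"fwd_translation": -1.6, "bwd_translation": -1.2, "lm": -1.4, "similarity": 0.2},
}
ranking = rank_candidates(candidates, weights)
```

For picking the best candidate only the unnormalised scores matter; the normalisation is included so that the outputs can be read as probabilities.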

A Study on Human Evaluators Using the Evaluation Model of English Pronunciation

  • 윤규철
    • 말소리와 음성과학 / Vol. 5, No. 4 / pp. 109-119 / 2013
  • The purpose of this paper is to show the tendencies of evaluators in the pronunciation evaluation of English utterances. The tendencies were visualized using the evaluation model of English pronunciation proposed in [1]. One hundred fifty female university students and four evaluators participated in the study. The students read eight English sentences aloud, and the evaluators rated their English pronunciation by their own criteria. The models built from these pronunciation evaluations proved efficient in showing each evaluator's tendency in terms of fundamental frequency, intensity, segmental durations, and segmental spectra, as compared to those of the five native speakers of English chosen for building the models. However, human evaluators were not always consistent in their evaluation and sometimes gave conflicting scores to the same students.

Building a Morpheme-Based Pronunciation Lexicon for Korean Large Vocabulary Continuous Speech Recognition

  • 이경님;정민화
    • 대한음성학회지:말소리 / Vol. 55 / pp. 103-118 / 2005
  • In this paper, we describe a morpheme-based pronunciation lexicon useful for Korean LVCSR. The phonemic-context-dependent multiple pronunciation lexicon improves recognition accuracy when cross-morpheme pronunciation variations are distinguished from within-morpheme pronunciation variations. Since adding all possible pronunciation variants to the lexicon increases both the lexicon size and the confusability between lexical entries, we have developed a lexicon pruning scheme that selects pronunciation variants optimally to improve the performance of Korean LVCSR. With the proposed pronunciation lexicon, the cross-morpheme pronunciation variation model with a phonemic-context-dependent multiple pronunciation lexicon achieves an absolute WER reduction of 0.56% from the baseline WER of 27.39%. At the best performance, the lexicon size is additionally reduced by 5.36% for the same lexical entries.
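The abstract quotes an absolute WER reduction, whereas some other entries in this listing quote relative reductions. The short sketch below converts between the two using only the figures stated in the abstract (0.56 points from a 27.39% baseline); the function name is an illustrative assumption.

```python
def relative_reduction(baseline_wer: float, absolute_reduction: float) -> float:
    """Relative reduction = absolute reduction / baseline (both in percentage points)."""
    if baseline_wer <= 0:
        raise ValueError("baseline WER must be positive")
    return absolute_reduction / baseline_wer


baseline = 27.39  # baseline WER in percent, as stated in the abstract
absolute = 0.56   # absolute WER reduction in percentage points, as stated

new_wer = baseline - absolute                      # resulting WER in percent
relative = relative_reduction(baseline, absolute)  # fraction of the baseline removed

print(f"new WER: {new_wer:.2f}%  relative reduction: {relative:.1%}")
```

Under this arithmetic the 0.56-point absolute gain corresponds to a relative reduction of roughly 2% of the baseline error.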


Performance of speech recognition unit considering morphological pronunciation variation

  • 방정욱;김상훈;권오욱
    • 말소리와 음성과학 / Vol. 10, No. 4 / pp. 111-119 / 2018
  • This paper proposes a method to improve speech recognition performance by extracting various pronunciations of pseudo-morpheme units from an eojeol-unit corpus and generating a new recognition unit that takes pronunciation variations into account. In the proposed method, we first align the pronunciations of the eojeol units and the pseudo-morpheme units, and then expand the pronunciation dictionary by extracting new pronunciations of the pseudo-morpheme units from the pronunciations of the eojeol units. Then, we propose a new pronunciation-dependent recognition unit by tagging the obtained phoneme symbols according to the pseudo-morpheme units. The proposed units and their extended pronunciations are incorporated into the lexicon and language model of the speech recognizer. Experiments for performance evaluation are performed using a Korean speech recognizer with a trigram language model trained on a 100-million pseudo-morpheme corpus and an acoustic model trained on 445 hours of multi-genre broadcast speech data. The proposed method is shown to yield a relative word error rate reduction of 13.8% on the news-genre evaluation data and 4.5% on the total evaluation data.

The Comparisons of Pronunciation Teaching in Lingua Franca Core and IMO Maritime English Model Course 3.17 for Global Communication at Sea

  • Choi, Seung-Hee;Park, Jin-Soo
    • 한국항해항만학회지 / Vol. 40, No. 5 / pp. 279-284 / 2016
  • As the International Maritime Organization (IMO) model course for Maritime English has recently been revised and updated, the requirements arising from current changes to both the 2010 STCW Manila Amendments and English education have been actively reviewed. In order to provide practical guidelines for language teaching, a wide range of new pedagogical approaches and their theoretical backgrounds are also suggested. However, considering the current spread of Business English as a Lingua Franca (BELF) and its critical importance in maritime communication, the pedagogical approaches need to be re-evaluated, specifically in terms of teaching pronunciation, in order to emphasize clear and effective communication among international interlocutors. Therefore, the core pedagogical elements of pronunciation should be clearly set and provided with consideration for the Lingua Franca Core (LFC), which places importance on mutual intelligibility rather than on following the rules of native speakers. This paper first introduces the current trends of BELF in the maritime industry. It then outlines the importance of LFC in maritime communication and discusses its key features in terms of the effectiveness and clarity of international maritime communications. Finally, it closely compares LFC with the pronunciation guidelines suggested by the IMO Maritime English model course 3.17 and suggests pedagogical implications for future pronunciation teaching in the cross-cultural global maritime industry.

ENGLISH RESTRUCTURING AND A USE OF MUSIC IN TEACHING ENGLISH PRONUNCIATION

  • Kim, Key-Seop
    • 대한음성학회:학술대회논문집 / 대한음성학회 July 2000 Conference Proceedings / pp. 117-134 / 2000
  • Kim, Key-Seop (2000). English Restructuring and A Use of Music in Teaching English Pronunciation. JSEP 2000. This study has two aims: one is to clarify how English is restructured in utterances, and the other is to relate this restructuring to teaching English pronunciation for listening and speaking through music and song, by suggesting a model syllabus for a 10-15 minute pronunciation segment in every class period. Generally, English utterances are restructured by stress-timed rhythm, irrespective of syntactic boundaries. The rhythmic units are thus arranged in isochronous groups, which are formed by attaching clitic(s) to a host or head, often leftwards and sometimes rightwards. This results in linking, contraction, reduction, sound change, and rhythm adjustment in utterances, just as in music and song. With a focus on English restructuring, a model syllabus for an English pronunciation class is proposed for use in every period of a lesson or unit. It tries to relate the focused pronunciation factor(s) to the integrated ones, using teaching techniques and music.


Study on Efficient Generation of Dictionary for Korean Vocabulary Recognition

  • 이상복;최대림;김종교
    • 대한음성학회:학술대회논문집 / 대한음성학회 November 2002 Conference Proceedings / pp. 41-44 / 2002
  • This paper concerns improving the speech recognition rate by using an enhanced pronunciation dictionary. Modern large vocabulary continuous speech recognition systems have pronunciation dictionaries. A pronunciation dictionary provides pronunciation information for each word in the vocabulary in phonemic units, which are modeled in detail by the acoustic models. However, most speech recognition systems based on hidden Markov models disregard actual pronunciation variations. When pronunciation variations are not represented in the speech recognition system, the phonetic transcriptions in the dictionary do not match the actual occurrences in the database. In this paper, we propose adding the unvoiced-semivowel rule, one of the allophone rules, to the pronunciation dictionary. Experimental results show that the speech recognition system achieves higher performance with the proposed dictionary than with existing pronunciation dictionaries.


Modeling Cross-morpheme Pronunciation Variations for Korean Large Vocabulary Continuous Speech Recognition

  • 정민화;이경님
    • 대한음성학회지:말소리 / No. 49 / pp. 107-121 / 2004
  • In this paper, we describe a cross-morpheme pronunciation variation model that is especially useful for constructing a morpheme-based pronunciation lexicon to improve the performance of Korean LVCSR. Many pronunciation variations occur at morpheme boundaries in continuous speech. Since phonemic context, together with morphological category and morpheme boundary information, affects Korean pronunciation variations, we have distinguished the phonological rules that apply to phonemes within a morpheme from those that apply across morpheme boundaries. The results of 33K-morpheme Korean CSR experiments show that an absolute WER reduction of 1.45% from the baseline WER of 18.42% was achieved by modeling the proposed pronunciation variations with a context-dependent multiple pronunciation lexicon.
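To make the idea of a boundary-conditioned rule concrete, the toy sketch below applies a single rule only where two morphemes meet and collects the resulting variants as multiple-pronunciation lexicon entries. It is not the paper's rule set. The rule (a morpheme-final obstruent becomes its nasal counterpart before a nasal, as in 국+물 pronounced [궁물]), the ad hoc romanized phoneme symbols, and all function and table names are assumptions made for illustration.

```python
from typing import List, Tuple

# Hypothetical toy rule table: morpheme-final obstruent -> nasal counterpart.
NASALIZED = {"k": "ng", "t": "n", "p": "m"}
NASALS = {"n", "m"}


def apply_boundary_nasalization(morphemes: List[List[str]]) -> List[List[str]]:
    """Apply the toy rule at morpheme boundaries only; phonemes inside a morpheme are untouched."""
    result = [list(m) for m in morphemes]  # copy so the input is not modified
    for left, right in zip(result, result[1:]):
        if left and right and left[-1] in NASALIZED and right[0] in NASALS:
            left[-1] = NASALIZED[left[-1]]
    return result


def lexicon_entries(morphemes: List[List[str]]) -> List[List[Tuple[str, ...]]]:
    """For each morpheme, list the canonical pronunciation plus any boundary-induced variant."""
    varied = apply_boundary_nasalization(morphemes)
    entries = []
    for canonical, variant in zip(morphemes, varied):
        pronunciations = [tuple(canonical)]
        if variant != canonical:
            pronunciations.append(tuple(variant))
        entries.append(pronunciations)
    return entries


# 국 + 물, written with ad hoc romanized phoneme symbols.
soup = [["k", "u", "k"], ["m", "u", "l"]]
soup_entries = lexicon_entries(soup)
```

In this toy setup the first morpheme ends up with two pronunciations (canonical and nasalized) while the second keeps one, which is how a multiple pronunciation lexicon would record a cross-morpheme variant.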


Modeling Cross-morpheme Pronunciation Variation for Korean LVCSR

  • 이경님;정민화
    • 대한음성학회:학술대회논문집 / 대한음성학회 May 2003 Conference Proceedings / pp. 75-78 / 2003
  • In this paper, we describe a cross-morpheme pronunciation variation model that is especially useful for constructing a morpheme-based pronunciation lexicon for Korean LVCSR. Many pronunciation variations occur at morpheme boundaries in continuous speech. Since phonemic context, together with morphological category and morpheme boundary information, affects Korean pronunciation variations, we have distinguished pronunciation variation rules according to where they apply: within a morpheme, across a morpheme boundary in a compound noun, across a morpheme boundary in an eojeol, and across an eojeol boundary. In a 33K-morpheme Korean CSR experiment, an absolute WER improvement of 1.16% from the baseline WER of 23.17% is achieved by modeling cross-morpheme pronunciation variations with a context-dependent multiple pronunciation lexicon.
