• Title/Summary/Keyword: 연속음성인식

Search Result 259, Processing Time 0.02 seconds

Comparison of the recognition performance of Korean connected digit telephone speech depending on channel compensation methods and feature parameters (채널보상기법 및 특징파라미터에 따른 한국어 연속숫자음 전화음성의 인식성능 비교)

  • Jung Sung Yun;Kim Min Sung;Son Jong Mok;Bae Keun Sung;Kim Sang Hun
    • Proceedings of the KSPS conference
    • /
    • 2002.11a
    • /
    • pp.201-204
    • /
    • 2002
  • As a preliminary study for improving recognition performance of the connected digit telephone speech, we investigate feature parameters as well as channel compensation methods of telephone speech. The CMN and RTCN are examined for telephone channel compensation, and the MFCC, DWFBA, SSC and their delta-features are examined as feature parameters. Recognition experiments with database we collected show that in feature level DWFBA is better than MFCC and for channel compensation RTCN is better than CMN. The DWFBA+Delta_ Mel-SSC feature shows the highest recognition rate.

  • PDF

Building a Morpheme-Based Pronunciation Lexicon for Korean Large Vocabulary Continuous Speech Recognition (한국어 대어휘 연속음성 인식용 발음사전 자동 생성 및 최적화)

  • Lee Kyong-Nim;Chung Minhwa
    • MALSORI
    • /
    • v.55
    • /
    • pp.103-118
    • /
    • 2005
  • In this paper, we describe a morpheme-based pronunciation lexicon useful for Korean LVCSR. The phonemic-context-dependent multiple pronunciation lexicon improves the recognition accuracy when cross-morpheme pronunciation variations are distinguished from within-morpheme pronunciation variations. Since adding all possible pronunciation variants to the lexicon increases the lexicon size and confusability between lexical entries, we have developed a lexicon pruning scheme for optimal selection of pronunciation variants to improve the performance of Korean LVCSR. By building a proposed pronunciation lexicon, an absolute reduction of $0.56\%$ in WER from the baseline performance of $27.39\%$ WER is achieved by cross-morpheme pronunciation variations model with a phonemic-context-dependent multiple pronunciation lexicon. On the best performance, an additional reduction of the lexicon size by $5.36\%$ is achieved from the same lexical entries.

  • PDF

Analysis of Error Patterns in ]Korean Connected Digit Telephone Speech Recognition (한국어 연속 숫자음 전화 음성 인식에서의 오인식 유형 분석)

  • Kim Min Sung;Jung Sung Yun;Son Jong Mok;Bae Keun Sung;Kim Sang Hun
    • MALSORI
    • /
    • no.46
    • /
    • pp.77-86
    • /
    • 2003
  • Channel distortion and coarticulation effect in the Korean connected digit telephone speech make it difficult to achieve high performance of connected digit recognition in the telephone environment. In this paper, as a basic research to improve the recognition performance of Korean connected digit telephone speech, recognition error patterns are investigated and analyzed. Korean connected digit telephone speech database released by SiTEC and HTK system are used for recognition experiments. Both DWFBA and MRTCN methods are used for feature extraction and channel compensation, respectively. Experimental results are discussed with our findings.

  • PDF

A Study on Recognition of Korean Postpositions and Suffixes in Continuous Speech (한국어 연속음성에서의 조사 및 어미 인식에 관한 연구)

  • Song, Min-Suck;Lee, Ki-Young
    • Speech Sciences
    • /
    • v.6
    • /
    • pp.181-195
    • /
    • 1999
  • This study proposes a method of recognizing postpositions and suffixes in Korean spoken language, using prosodic information. We detect grammatical boundaries automatically at first, by using prosodic information of the accentual phrase, and then we recognize grammatical function words by backward-tracking from the boundaries. The experiment employs 300 sentential speech data of 10 men's and 5 women's voice spoken in standard Korean, in which 1080 accentual phrases and 11 postpositions and suffixes are included. The result shows the recognition rate of postpositions in two cases. In one case in which only correctly detected boundaries are included, the recognition rate is 97.5%, and in the other case in which all detected boundaries are included, the recognition rate is 74.8%.

  • PDF

Rapid Speaker Adaptation for Continuous Speech Recognition Using Merging Eigenvoices (Eigenvoice 병합을 이용한 연속 음성 인식 시스템의 고속 화자 적응)

  • Choi, Dong-Jin;Oh, Yung-Hwan
    • MALSORI
    • /
    • no.53
    • /
    • pp.143-156
    • /
    • 2005
  • Speaker adaptation in eigenvoice space is a popular method for rapid speaker adaptation. To improve the performance of the method, the number of speaker dependent models should be increased and eigenvoices should be re-estimated. However, principal component analysis takes much time to find eigenvoices, especially in a continuous speech recognition system. This paper describes a method to reduce computation time to estimate eigenvoices only for supplementary speaker dependent models and to merge them with the used eigenvoices. Experiment results show that the computation time is reduced by 73.7% while the performance is almost the same in case that the number of speaker dependent models is the same as used ones.

  • PDF

Performance Comparison of Korean Connected Digit Telephone Speech Recognition According to Aurora Feature Extraction (Aurora 특징파라미터 추출기법에 따른 한국어 연속숫자음 전화음성의 인식 성능 비교)

  • Kim Min Sung;Jung Sung Yun;Son Jong Mok;Bae Keun Sung;Kim Sang Hun
    • Proceedings of the KSPS conference
    • /
    • 2003.10a
    • /
    • pp.145-148
    • /
    • 2003
  • To improve the recognition performance of Korean connected digit telephone speech, in this paper, both Aurora feature extraction method that employs noise reduction 2-state Wiener filter and DWFBA method are investigated and used. CMN and MRTCN are applied to static features for channel compensation. Telephone digit speech database released by SITEC is used for recognition experiments with HTK system. Experimental results has shown that Aurora feature is slightly better than MFCC and DWFBA without channel compensation. And when channel compensation is included, Aurora feature is slightly better than DWFBA with MRTCN.

  • PDF

A Corpus Selection Based Approach to Language Modeling for Large Vocabulary Continuous Speech Recognition (대용량 연속 음성 인식 시스템에서의 코퍼스 선별 방법에 의한 언어모델 설계)

  • Oh, Yoo-Rhee;Yoon, Jae-Sam;kim, Hong-Kook
    • Proceedings of the KSPS conference
    • /
    • 2005.11a
    • /
    • pp.103-106
    • /
    • 2005
  • In this paper, we propose a language modeling approach to improve the performance of a large vocabulary continuous speech recognition system. The proposed approach is based on the active learning framework that helps to select a text corpus from a plenty amount of text data required for language modeling. The perplexity is used as a measure for the corpus selection in the active learning. From the recognition experiments on the task of continuous Korean speech, the speech recognition system employing the language model by the proposed language modeling approach reduces the word error rate by about 6.6 % with less computational complexity than that using a language model constructed with randomly selected texts.

  • PDF

Language Model Adaptation for Conversational Speech Recognition (대화체 연속음성 인식을 위한 언어모델 적응)

  • Park Young-Hee;Chung Minhwa
    • Proceedings of the KSPS conference
    • /
    • 2003.05a
    • /
    • pp.83-86
    • /
    • 2003
  • This paper presents our style-based language model adaptation for Korean conversational speech recognition. Korean conversational speech is observed various characteristics of content and style such as filled pauses, word omission, and contraction as compared with the written text corpora. For style-based language model adaptation, we report two approaches. Our approaches focus on improving the estimation of domain-dependent n-gram models by relevance weighting out-of-domain text data, where style is represented by n-gram based tf*idf similarity. In addition to relevance weighting, we use disfluencies as predictor to the neighboring words. The best result reduces 6.5% word error rate absolutely and shows that n-gram based relevance weighting reflects style difference greatly and disfluencies are good predictor.

  • PDF

Statistical Analysis of Korean Phonological Variations Using a Grapheme-to-phoneme System (발음열 자동 생성기를 이용한 한국어 음운 변화 현상의 통계적 분석)

  • 이경님;정민화
    • The Journal of the Acoustical Society of Korea
    • /
    • v.21 no.7
    • /
    • pp.656-664
    • /
    • 2002
  • We present a statistical analysis of Korean phonological variations using a Grapheme-to-Phoneme (GPT) system. The GTP system used for experiments generates pronunciation variants by applying rules modeling obligatory and optional phonemic changes and allophonic changes. These rules are derived form morphophonological analysis and government standard pronunciation rules. The GTP system is optimized for continuous speech recognition by generating phonetic transcriptions for training and constructing a pronunciation dictionary for recognition. In this paper, we describe Korean phonological variations by analyzing the statistics of phonemic change rule applications for the 60,000 sentences in the Samsung PBS Speech DB. Our results show that the most frequently happening obligatory phonemic variations are in the order of liaison, tensification, aspirationalization, and nasalization of obstruent, and that the most frequently happening optional phonemic variations are in the order of initial consonant h-deletion, insertion of final consonant with the same place of articulation as the next consonants, and deletion of final consonant with the same place of articulation as the next consonant's, These statistics can be used for improving the performance of speech recognition systems.

QoS Management Scheme for LTE-Advanced Networks (LTE-Advanced 망을 위한 QoS 관리 방안)

  • Lee, Moon-Ho
    • Proceedings of the Korean Society of Computer Information Conference
    • /
    • 2015.01a
    • /
    • pp.209-212
    • /
    • 2015
  • LTE-Advanced 망을 형성하는 리소스 및 망의 형상(대역, 에러율 등)들은 가변적으로 변하기 때문에 기존 음성 서비스에 적용된 절차적이고, 인위적이고, 정적인 제어방식으로는 제어가 불가능한 것으로 인식되고 있다. 본 연구에서는 정책 제어를 기반으로 서비스 연속성을 효과적으로 지원하기 위한 QoS 관리 체계를 제안하고자 한다. 제안되는 QoS 관리 체계에서는 가입자 단말기는 자신의 현재 상태 및 주변 기지국 정보를 수집하고, 기지국은 내부 및 인접한 기지국 모니터링으로 정보를 수집한다. 또한 이와 같은 관련 제어 데이터를 공유하고 이를 종합 분석하여 서비스 연속성을 자체적으로 제어할 수 있게 한다.

  • PDF