• 제목/요약/키워드: Automatic Speech Analysis

검색결과 74건 처리시간 0.03초

음성인식기를 이용한 발음오류 자동분류 결과 분석 (Performance Analysis of Automatic Mispronunciation Detection Using Speech Recognizer)

  • 강효원;이상필;배민영;이재강;권철홍
    • 대한음성학회:학술대회논문집
    • /
    • 대한음성학회 2003년도 10월 학술대회지
    • /
    • pp.29-32
    • /
    • 2003
  • This paper proposes an automatic pronunciation correction system which provides users with correction guidelines for each pronunciation error. For this purpose, we develop an HMM speech recognizer which automatically classifies pronunciation errors when Korean speaks foreign language. And, we collect speech database of native and nonnative speakers using phonetically balanced word lists. We perform analysis of mispronunciation types from the experiment of automatic mispronunciation detection using speech recognizer.

  • PDF

한국어 자동 발음열 생성을 위한 예외발음사전 구축 (Building an Exceptional Pronunciation Dictionary For Korean Automatic Pronunciation Generator)

  • 김선희
    • 음성과학
    • /
    • 제10권4호
    • /
    • pp.167-177
    • /
    • 2003
  • This paper presents a method of building an exceptional pronunciation dictionary for Korean automatic pronunciation generator. An automatic pronunciation generator is an essential element of speech recognition system and a TTS (Text-To-Speech) system. It is composed of a part of regular rules and an exceptional pronunciation dictionary. The exceptional pronunciation dictionary is created by extracting the words which have exceptional pronunciations from text corpus based on the characteristics of the words of exceptional pronunciation through phonological research and text analysis. Thus, the method contributes to improve performance of Korean automatic pronunciation generator as well as the performance of speech recognition system and TTS system.

  • PDF

한국어 자동 발음열 생성 시스템을 위한 예외 발음 연구 (A Study on Exceptional Pronunciations For Automatic Korean Pronunciation Generator)

  • 김선희
    • 대한음성학회지:말소리
    • /
    • 제48호
    • /
    • pp.57-67
    • /
    • 2003
  • This paper presents a systematic description of exceptional pronunciations for automatic Korean pronunciation generation. An automatic pronunciation generator in Korean is an essential part of a Korean speech recognition system and a TTS (Text-To-Speech) system. It is composed of a set of regular rules and an exceptional pronunciation dictionary. The exceptional pronunciation dictionary is created by extracting the words that have exceptional pronunciations, based on the characteristics of the words of exceptional pronunciation through phonological research and the systematic analysis of the entries of Korean dictionaries. Thus, the method contributes to improve performance of automatic pronunciation generator in Korean as well as the performance of speech recognition system and TTS system in Korean.

  • PDF

CDMA이동통신환경에서의 음성인식을 위한 왜곡음성신호 거부방법 (Distorted Speech Rejection For Automatic Speech Recognition under CDMA Wireless Communication)

  • 김남수;장준혁
    • 한국음향학회지
    • /
    • 제23권8호
    • /
    • pp.597-601
    • /
    • 2004
  • 본 논문에서는 CDMA이동통신 환경에서의 음성인식을 위한 왜곡음성신호의 전처리-지부방법을 소개한다. 먼저, CDMA이동통신 채널에서의 왜곡된 음성신호를 분석하고 분석된 매커니즘을 바탕으로 채널에 의해 왜곡된 음성신호를 음성의 준주기성을 바탕으로 하여 거부하는 알고리즘을 제안한다. 실험을 통해 제안된 전처리-거부방법이 적은 계산량을 가지고 음성인식에 적용되어 효과적으로 CDMA에 환경에서 채널왜곡된 음성신호를 거부-할 수 있음을 알 수 있었다.

Speech Rhythm Metrics for Automatic Scoring of English Speech by Korean EFL Learners

  • 장태엽
    • 대한음성학회지:말소리
    • /
    • 제66호
    • /
    • pp.41-59
    • /
    • 2008
  • Knowledge in linguistic rhythm of the target language plays a major role in foreign language proficiency. This study attempts to discover valid rhythm features that can be utilized in automatic assessment of non-native English pronunciation. Eight previously proposed and two novel rhythm metrics are investigated with 360 English read speech tokens obtained from 27 Korean learners and 9 native speakers. It is found that some of the speech-rate normalized interval measures and above-word level metrics are effective enough to be further applied for automatic scoring as they are significantly correlated with speakers' proficiency levels. It is also shown that metrics need to be dynamically selected depending upon the structure of target sentences. Results from a preliminary auto-scoring experiment through a Multi Regression analysis suggest that appropriate control of unexpected input utterances is also desirable for better performance.

  • PDF

자동 음성 인식기를 위한 단채널 음질 향상 알고리즘의 성능 분석 (Performance Analysis of a Class of Single Channel Speech Enhancement Algorithms for Automatic Speech Recognition)

  • 송명석;이창헌;이석필;강홍구
    • The Journal of the Acoustical Society of Korea
    • /
    • 제29권2E호
    • /
    • pp.86-99
    • /
    • 2010
  • This paper analyzes the performance of various single channel speech enhancement algorithms when they are applied to automatic speech recognition (ASR) systems as a preprocessor. The functional modules of speech enhancement systems are first divided into four major modules such as a gain estimator, a noise power spectrum estimator, a priori signal to noise ratio (SNR) estimator, and a speech absence probability (SAP) estimator. We investigate the relationship between speech recognition accuracy and the roles of each module. Simulation results show that the Wiener filter outperforms other gain functions such as minimum mean square error-short time spectral amplitude (MMSE-STSA) and minimum mean square error-log spectral amplitude (MMSE-LSA) estimators when a perfect noise estimator is applied. When the performance of the noise estimator degrades, however, MMSE methods including the decision directed module to estimate a priori SNR and the SAP estimation module helps to improve the performance of the enhancement algorithm for speech recognition systems.

독일어 감정음성에서 추출한 포먼트의 분석 및 감정인식 시스템과 음성인식 시스템에 대한 음향적 의미 (An Analysis of Formants Extracted from Emotional Speech and Acoustical Implications for the Emotion Recognition System and Speech Recognition System)

  • 이서배
    • 말소리와 음성과학
    • /
    • 제3권1호
    • /
    • pp.45-50
    • /
    • 2011
  • Formant structure of speech associated with five different emotions (anger, fear, happiness, neutral, sadness) was analysed. Acoustic separability of vowels (or emotions) associated with a specific emotion (or vowel) was estimated using F-ratio. According to the results, neutral showed the highest separability of vowels followed by anger, happiness, fear, and sadness in descending order. Vowel /A/ showed the highest separability of emotions followed by /U/, /O/, /I/ and /E/ in descending order. The acoustic results were interpreted and explained in the context of previous articulatory and perceptual studies. Suggestions for the performance improvement of an automatic emotion recognition system and automatic speech recognition system were made.

  • PDF

조음자질을 이용한 한국인 학습자의 영어 발화 자동 발음 평가 (Automatic pronunciation assessment of English produced by Korean learners using articulatory features)

  • 류혁수;정민화
    • 말소리와 음성과학
    • /
    • 제8권4호
    • /
    • pp.103-113
    • /
    • 2016
  • This paper aims to propose articulatory features as novel predictors for automatic pronunciation assessment of English produced by Korean learners. Based on the distinctive feature theory, where phonemes are represented as a set of articulatory/phonetic properties, we propose articulatory Goodness-Of-Pronunciation(aGOP) features in terms of the corresponding articulatory attributes, such as nasal, sonorant, anterior, etc. An English speech corpus spoken by Korean learners is used in the assessment modeling. In our system, learners' speech is forced aligned and recognized by using the acoustic and pronunciation models derived from the WSJ corpus (native North American speech) and the CMU pronouncing dictionary, respectively. In order to compute aGOP features, articulatory models are trained for the corresponding articulatory attributes. In addition to the proposed features, various features which are divided into four categories such as RATE, SEGMENT, SILENCE, and GOP are applied as a baseline. In order to enhance the assessment modeling performance and investigate the weights of the salient features, relevant features are extracted by using Best Subset Selection(BSS). The results show that the proposed model using aGOP features outperform the baseline. In addition, analysis of relevant features extracted by BSS reveals that the selected aGOP features represent the salient variations of Korean learners of English. The results are expected to be effective for automatic pronunciation error detection, as well.

발음열 자동 변환을 이용한 한국어 음운 변화 규칙의 통계적 분석 (Statistical Analysis of Korean Phonological Rules Using a Automatic Phonetic Transcription)

  • 이경님;정민화
    • 대한음성학회:학술대회논문집
    • /
    • 대한음성학회 2002년도 11월 학술대회지
    • /
    • pp.81-85
    • /
    • 2002
  • We present a statistical analysis of Korean phonological variations using automatic generation of phonetic transcription. We have constructed the automatic generation system of Korean pronunciation variants by applying rules modeling obligatory and optional phonemic changes and allophonic changes. These rules are derived from knowledge-based morphophonological analysis and government standard pronunciation rules. This system is optimized for continuous speech recognition by generating phonetic transcriptions for training and constructing a pronunciation dictionary for recognition. In this paper, we describe Korean phonological variations by analyzing the statistics of phonemic change rule applications for the 60,000 sentences in the Samsung PBS(Phonetic Balanced Sentence) Speech DB. Our results show that the most frequently happening obligatory phonemic variations are in the order of liaison, tensification, aspirationalization, and nasalization of obstruent, and that the most frequently happening optional phonemic variations are in the order of initial consonant h-deletion, insertion of final consonant with the same place of articulation as the next consonants, and deletion of final consonant with the same place of articulation as the next consonants. These statistics can be used for improving the performance of speech recognition systems.

  • PDF

자동차 환경내의 음성인식 자동 평가 플랫폼 연구 (A Study of Automatic Evaluation Platform for Speech Recognition Engine in the Vehicle Environment)

  • 이성재;강선미
    • 한국통신학회논문지
    • /
    • 제37권7C호
    • /
    • pp.538-543
    • /
    • 2012
  • 주행 중 차량내의 음성인터페이스 에서 음성인식기의 성능은 가장 중요한 부분이다. 본 논문은 차량내 음성인식기의 성능 평가를 자동화하기 위한 플랫폼의 개발에 대한 것이다. 개발된 플랫폼은 주 프로그램, 중계 프로그램 데이터베이스 관리, 통계산출 모듈로 구성된다. 성능 평가에 있어 실제 차량의 주행 조건을 고려한 시뮬레이션 환경이 구축되었고, 미리 녹음된 주행 노이즈와 발화자의 목소리를 마이크를 통해 입력하여 실험하였다. 실험 결과 제안하는 플랫폼에서 얻어진 음성인식 결과의 유효성이 입증되었다. 제안한 플랫폼으로 사용자는 음성인식의 자동화와 인식결과의 효율적인 관리 및 통계산출을 함으로서 차량 음성인식기의 평가를 효과적으로 진행할 수 있다.