Integrated Search | Korea Science

Modified Phonetic Decision Tree For Continuous Speech Recognition

Kim, Sung-Ill;Kitazoe, Tetsuro;Chung, Hyun-Yeol
- The Journal of the Acoustical Society of Korea
- /
- Vol. 17, No. 4E
- /
- pp.11-16
- /
- 1998
Context-dependent subword units are often employed for large-vocabulary speech recognition using HMMs. However, context-dependent phone models yield a system with too many parameters to train. The mismatch between too many parameters and too little training data is a crucial problem in the design of a statistical speech recognizer. Furthermore, when building large-vocabulary speech recognition systems, the unseen-triphone problem is unavoidable. In this paper, we propose a modified phonetic decision tree algorithm for the automatic prediction of unseen triphones and show its advantages in addressing these problems through two experiments in Japanese contexts. The baseline experimental results show that the modified tree-based clustering algorithm is effective for clustering and reducing the number of states without any degradation in performance. The task experimental results show that the proposed algorithm also provides automatic prediction of unseen triphones.
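
The idea of predicting an unseen triphone with a phonetic decision tree can be sketched as follows. This is a minimal illustration of the standard tree lookup, not the paper's modified algorithm; the phone classes, tree shape, and state names (`NASALS`, `VOWELS`, `TREES`, `a_s2_*`) are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Set, Union


@dataclass
class Leaf:
    tied_state: str  # name of the shared (clustered) HMM state


@dataclass
class Question:
    side: str          # which context to test: "left" or "right"
    phones: Set[str]   # phone class the question asks about
    yes: "Node"
    no: "Node"


Node = Union[Leaf, Question]


def tied_state_for(triphone: str, trees: Dict[str, Node]) -> str:
    """Map a triphone written as 'left-center+right' to a tied state."""
    left, rest = triphone.split("-")
    center, right = rest.split("+")
    context = {"left": left, "right": right}
    node = trees[center]
    # Answer phonetic questions until a leaf (tied state) is reached.
    while isinstance(node, Question):
        node = node.yes if context[node.side] in node.phones else node.no
    return node.tied_state


# Hypothetical phone classes and a hypothetical tree for the center phone "a".
NASALS = {"m", "n", "N"}
VOWELS = {"a", "i", "u", "e", "o"}
TREES: Dict[str, Node] = {
    "a": Question("left", NASALS,
                  Leaf("a_s2_1"),
                  Question("right", VOWELS, Leaf("a_s2_2"), Leaf("a_s2_3"))),
}

# A triphone absent from the training data still reaches a leaf.
print(tied_state_for("k-a+t", TREES))
```

Because every triphone can answer the questions, a context that never occurred in training is still assigned an existing tied state.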

Variation of the Verification Error Rate of Automatic Speaker Recognition System With Voice Conditions

홍수기
- Malsori (Journal of the Korean Society of Phonetic Sciences and Speech Technology)
- /
- No. 43
- /
- pp.45-55
- /
- 2002
Forensic applications require automatic speaker recognition that remains highly reliable regardless of voice conditions. Audio recordings in real cases vary in voice conditions such as duration, the time interval between recordings, read versus conversational speech, and transmission channel. This study investigated how the verification error rate of an automatic speaker recognition system varies with these voice conditions. The results indicate that, to decrease both the false rejection rate and the false acceptance rate, varied voices should be used for training, and the training speech should be longer than the test speech.
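
The two error rates named above can be computed from verification scores as in the following minimal sketch. The function name, the score values, and the "accept when score is at or above the threshold" convention are assumptions for illustration, not details from the study.

```python
def error_rates(genuine_scores, impostor_scores, threshold):
    """Return (false rejection rate, false acceptance rate).

    A trial is accepted when its score is at or above the threshold.
    """
    false_rejects = sum(score < threshold for score in genuine_scores)
    false_accepts = sum(score >= threshold for score in impostor_scores)
    return (false_rejects / len(genuine_scores),
            false_accepts / len(impostor_scores))


# Hypothetical scores: higher means "more likely the claimed speaker".
genuine = [0.9, 0.8, 0.4, 0.7]
impostor = [0.1, 0.5, 0.3, 0.6]
frr, far = error_rates(genuine, impostor, threshold=0.5)
print(frr, far)
```

Raising the threshold lowers the false acceptance rate at the cost of a higher false rejection rate, which is why both rates are reported together.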

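Pronunciation Dictionary for English Pronunciation Tutoring System

김효숙;김선주
- Proceedings of the Conference of the Korean Society of Phonetic Sciences and Speech Technology
- /
- Proceedings of the May 2003 Conference of the Korean Society of Phonetic Sciences and Speech Technology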
- /
- pp.168-171
- /
- 2003
This study concerns modeling the pronunciation dictionaries needed for word recognition at the PLU (phoneme-like unit) level. Recognizing nonnative speakers' pronunciation enables the automatic diagnosis and error detection that are the core of an English pronunciation tutoring system. Such a system needs two pronunciation dictionaries: one representing standard English pronunciation and the other representing Korean speakers' English pronunciation. The two dictionaries are integrated to generate pronunciation networks for variants.
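
One way to picture the integration of the two dictionaries is the sketch below, which merges them into a single table of pronunciation variants tagged by origin. The function name, the data layout, and the example entries are hypothetical; the study's actual network construction is not described in the abstract.

```python
def merge_dictionaries(standard, nonnative):
    """Merge two lexicons into word -> {pronunciation: source label}.

    Each pronunciation is a tuple of phone symbols. If a variant appears
    in both lexicons, the "standard" label is kept.
    """
    merged = {}
    for source, lexicon in (("standard", standard), ("nonnative", nonnative)):
        for word, pronunciations in lexicon.items():
            for pron in pronunciations:
                phones = tuple(pron.split())
                merged.setdefault(word, {}).setdefault(phones, source)
    return merged


# Hypothetical entries for illustration only.
standard = {"right": ["R AY T"]}
nonnative = {"right": ["R AY T", "L AY T"]}
variants = merge_dictionaries(standard, nonnative)

# If the recognizer picks a variant labeled "nonnative", an error is flagged.
print(variants["right"][("L", "AY", "T")])
```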

Personal Factors Affecting Korean Speakers' English Pronunciation

전은
- Malsori (Journal of the Korean Society of Phonetic Sciences and Speech Technology)
- /
- No. 57
- /
- pp.1-14
- /
- 2006
This study examines personal factors that affect Korean speakers' English pronunciation. The factors examined are personality type, cognitive system, motivational orientation type, interest in English, how often the speakers listen to tapes, and academic achievement. Data were collected through the MBTI (Myers-Briggs Type Indicator) test, the Group Embedded Figures Test, and a questionnaire. The participants were 65 college students. Statistical analysis of the results showed that Korean students' personality type and cognitive system are not related to their pronunciation, whereas motivational orientation type, how often they listen to tapes, academic achievement, and interest in English study are correlated with it.

Utterance Verification Using Anti-models Based on Neighborhood Information

윤영선
- Malsori (Journal of the Korean Society of Phonetic Sciences and Speech Technology)
- /
- No. 67
- /
- pp.79-102
- /
- 2008
In this paper, we investigate the relation between the Bayes factor and likelihood ratio test (LRT) approaches, and we apply the neighborhood information of the Bayes factor to building an alternate hypothesis model for the LRT system. To take the neighborhood approaches into account, we consider a distance measure between models and the algorithms to be applied. We also evaluate several methods for improving utterance verification performance using neighborhood information. Among these methods, the system that adopts anti-models built by collecting mixtures of neighborhood models obtains a maximum error rate reduction of 17% compared to the baseline, linear and weighted combination of neighborhood models.
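
The likelihood ratio test with an anti-model assembled from neighboring models can be sketched as below. This is a toy, one-dimensional illustration under stated assumptions: single Gaussians stand in for the target and neighbor models, the anti-model is an equal-weight mixture of the neighbors, and all parameter values and the threshold are hypothetical.

```python
import math


def log_gauss(x, mean, var):
    """Log density of a one-dimensional Gaussian."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)


def log_mixture(x, components):
    """Log density of a mixture given as (weight, mean, var) triples."""
    logs = [math.log(w) + log_gauss(x, m, v) for w, m, v in components]
    top = max(logs)  # log-sum-exp for numerical stability
    return top + math.log(sum(math.exp(value - top) for value in logs))


def log_likelihood_ratio(frames, target, anti_model):
    """Average per-frame log p(x | target) - log p(x | anti-model)."""
    total = sum(log_gauss(x, *target) - log_mixture(x, anti_model)
                for x in frames)
    return total / len(frames)


def accept(frames, target, anti_model, threshold=0.0):
    """Accept the hypothesized unit when the ratio clears the threshold."""
    return log_likelihood_ratio(frames, target, anti_model) >= threshold


# Hypothetical target model (mean, var) and two neighboring models.
target = (0.0, 1.0)
neighbors = [(3.0, 1.0), (-3.0, 1.0)]
anti_model = [(1.0 / len(neighbors), m, v) for m, v in neighbors]

print(accept([0.1, -0.2, 0.0], target, anti_model))  # frames near the target
print(accept([2.9, 3.1, 3.0], target, anti_model))   # frames near a neighbor
```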

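Auto-Scrolling Prompter System using Speech Recognition Technology

김길연;김진우
- Proceedings of the Conference of the Korean Society of Phonetic Sciences and Speech Technology
- /
- Proceedings of the Spring 2006 Conference of the Korean Society of Phonetic Sciences and Speech Technology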
- /
- pp.95-98
- /
- 2006
Prompter software is used behind the camera to scroll the script for a TV narrator. So far it has been operated manually by an assistant, who scrolls the caption to follow the narrator's speech. This project investigated automating this procedure with speech recognition technology. The developed auto-scrolling software was tested both offline and online and performed well enough to replace existing prompter software. This paper describes the whole development process and the issues that need attention.

GMM based Speaker Identification using Pitch Information

박태선;한민수
- Malsori (Journal of the Korean Society of Phonetic Sciences and Speech Technology)
- /
- No. 47
- /
- pp.121-129
- /
- 2003
This paper describes the use of pitch information for speaker identification. The recognition system is GMM-based and uses a 4-connected Korean digit speech database. The mean of the pitch period in voiced sections of speech is shown to be useful for discriminating between speakers. Using this feature with a Gaussian mixture model in the speaker identification system gave a marked improvement, up to 6% over the baseline Gaussian mixture model.
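
The pitch feature described above, the mean pitch period over voiced sections, can be computed as in this minimal sketch. It assumes an F0 track in Hz where 0 marks unvoiced frames; the helper `closest_speaker` and the enrolled values are hypothetical and only illustrate why the feature discriminates between speakers. How the paper combines the feature with the GMM is not stated in the abstract.

```python
def mean_pitch_period(f0_track):
    """Mean pitch period in seconds over voiced frames (F0 > 0 Hz)."""
    periods = [1.0 / f0 for f0 in f0_track if f0 > 0]
    if not periods:
        raise ValueError("no voiced frames in the F0 track")
    return sum(periods) / len(periods)


def closest_speaker(f0_track, enrolled):
    """Pick the enrolled speaker whose mean pitch period is nearest."""
    observed = mean_pitch_period(f0_track)
    return min(enrolled, key=lambda name: abs(enrolled[name] - observed))


# Hypothetical enrolled mean pitch periods in seconds.
enrolled = {"speaker_A": 0.008, "speaker_B": 0.004}
track = [0, 100, 200, 0]  # Hz; zeros are unvoiced frames
print(mean_pitch_period(track))
print(closest_speaker(track, enrolled))
```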

The Design of Keyword Spotting System based on Auditory Phonetical Knowledge-Based Phonetic Value Classification

김학진;김순협
- The KIPS Transactions: Part B
- /
- Vol. 10B, No. 2
- /
- pp.169-178
- /
- 2003
This study addresses two points: the classification of phone-like units (PLUs), which are the foundation of Korean large-vocabulary speech recognition, and the effectiveness of Chiljongseong (7 final consonants) and Paljongseong (8 final consonants) of the Korean language. PLUs classify phonemes phonetically according to the place and manner of articulation, and about 50 PLUs are used in Korean speech recognition. In this study, auditory phonetic knowledge was applied to the classification of PLUs to present 45 PLUs. The vowels 'ㅔ, ㅐ' were classified as the PLU (ee); 'ㅒ, ㅖ' as [ye]; and 'ㅚ, ㅙ, ㅞ' as [we]. Secondly, the Chiljongseong system of the Draft for Unified Spelling System, which is currently in use, and the Paljongseonggajokyong of the Korean script Haerye were illustrated. Whether the phonetic values of 'ㄷ' and 'ㅅ' among the phonemes used as final consonants in the Korean language are the same has long been debated in academia. In this study, the transition stages of Korean consonants were investigated, Chiljongseong and Paljongseonggajokyong were applied to speech recognition, and their effectiveness was verified. The experiments were divided into isolated word recognition and continuous speech recognition. PBW452 was used to test isolated word recognition: about 50 men and women, divided into 5 groups, vocalized 50 words each. For the continuous speech recognition experiment, intended for the implemented stock exchange system, a sentence corpus of 71 stock exchange sentences and a speech corpus vocalizing those sentences were collected and used; 5 men and women each vocalized a sentence twice.
In the experiments, when Paljongseonggajokyong was used for the final consonants, recognition performance rose by an average of about 1.45%, and when PLUs with both Paljongseonggajokyong and auditory phonetics applied were used, the recognition rate increased by an average of 1.5% to 2.02%. In the continuous speech recognition experiment, recognition performance rose by an average of about 1% to 2% compared with using the existing 49 or 56 PLUs.
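
The vowel merging described in the abstract amounts to a many-to-one mapping from vowel letters to unit labels, sketched below. The labels follow the abstract; the function name and the pass-through behavior for all other symbols are assumptions for illustration.

```python
# Vowel letters merged into shared phone-like unit labels, per the abstract.
VOWEL_TO_PLU = {
    "ㅔ": "ee", "ㅐ": "ee",
    "ㅒ": "ye", "ㅖ": "ye",
    "ㅚ": "we", "ㅙ": "we", "ㅞ": "we",
}


def merge_vowels(jamo_sequence):
    """Replace merged vowels with their shared label; keep other symbols."""
    return [VOWEL_TO_PLU.get(jamo, jamo) for jamo in jamo_sequence]


# Seven vowel letters collapse into three units.
print(len(VOWEL_TO_PLU), len(set(VOWEL_TO_PLU.values())))
print(merge_vowels(["ㄱ", "ㅐ"]), merge_vowels(["ㄱ", "ㅔ"]))
```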
https://doi.org/10.3745/KIPSTB.2003.10B.2.169

A Study on the Multilingual Speech Recognition for On-line International Game

김석동;강흥순;우인성;신좌철;윤춘덕
- Journal of Korea Game Society
- /
- Vol. 8, No. 4
- /
- pp.107-114
- /
- 2008
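Recently, demand has been growing for speech recognition in games that targets multiple languages, along with the need to develop multilingual systems that represent speech in the different languages of many countries with a single speech model. Accordingly, research is needed on advancing multilingual speech recognition systems that can represent speech composed of various languages with a single speech model. As a basic study toward building an integrated multilingual speech model, this paper investigated a system that recognizes Korean and English speech using the International Phonetic Alphabet (IPA). The experiments focused on finding an IPA model that satisfies both Korean and English phonemes and achieved recognition rates of 90.62% for Korean speech and 91.71% for English speech.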

Modality Classification for an Example-Based Dialogue System

김민정;홍금원;송영인;이연수;이도길;임해창
- Malsori (Journal of the Korean Society of Phonetic Sciences and Speech Technology)
- /
- Vol. 68
- /
- pp.75-93
- /
- 2008
An example-based dialogue system tries to utilize many pairs which are stored in a dialogue database. The most important task in an example-based dialogue system is finding the utterance most similar to the user's input utterance. Modality, which conveys the speaker's involvement in the propositional content of a given utterance, is one of the core sentence features. For example, the sentence "I want to go to school." has a modality of hope. In this paper, we propose a modality classification system that predicts sentence modality in order to improve the performance of example-based dialogue systems. We also define a modality tag set for a dialogue system and validate this tag set using a rule-based modality classification system. Experimental results show that our modality tag set and modality classification system improve the performance of an example-based dialogue system.
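
A rule-based modality classifier of the kind mentioned above might look like the following minimal sketch. Only the "hope" tag and its example sentence come from the abstract; the other tags, the patterns, and the default tag are hypothetical and do not reproduce the paper's tag set.

```python
import re

# Ordered rules: the first matching pattern decides the tag.
RULES = [
    (re.compile(r"\b(want to|hope to|wish to|would like to)\b", re.I), "hope"),
    (re.compile(r"\b(must|have to|need to)\b", re.I), "obligation"),
    (re.compile(r"\b(can|could|able to)\b", re.I), "ability"),
]
DEFAULT_TAG = "statement"


def classify_modality(sentence):
    """Return the modality tag of the first rule that matches."""
    for pattern, tag in RULES:
        if pattern.search(sentence):
            return tag
    return DEFAULT_TAG


print(classify_modality("I want to go to school."))
```

In an example-based system, the predicted tag could serve as one more feature when ranking stored utterances by similarity to the input.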

