• Title/Summary/Keyword: phonetic data


Development of English Speech Recognizer for Pronunciation Evaluation (발성 평가를 위한 영어 음성인식기의 개발)

  • Park Jeon Gue;Lee June-Jo;Kim Young-Chang;Hur Yongsoo;Rhee Seok-Chae;Lee Jong-Hyun
    • Proceedings of the KSPS conference / 2003.10a / pp.37-40 / 2003
  • This paper presents preliminary results on automatic pronunciation scoring for non-native English speakers and describes the development of an English speech recognizer for educational and evaluation purposes. The proposed recognizer, which features two refined acoustic model sets, implements noise-robust data compensation, phonetic alignment, highly reliable rejection, keyword and phrase detection, an easy-to-use language modeling toolkit, and more. It achieves an average correlation of 0.725 between human raters' scores and machine scores, using the YOUTH speech database for training and K-SEC for testing.

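The "average correlation between human raters and machine scores" reported above can be illustrated with a minimal sketch. It assumes the figure is a Pearson correlation computed per rater and then averaged, which the abstract does not state; the function names and data layout are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length score lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 items")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant scores")
    return cov / math.sqrt(var_x * var_y)


def average_rater_correlation(machine_scores, scores_by_rater):
    """Mean of the per-rater correlations with the machine scores.

    Assumption: one list of scores per human rater, aligned with
    machine_scores utterance by utterance.
    """
    correlations = [pearson(machine_scores, rater) for rater in scores_by_rater]
    return sum(correlations) / len(correlations)
```

Averaging per-rater correlations is only one possible reading; correlating the machine score with the mean human score would be another.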

A Fundamental Phonetic Investigation of Korean Vowels (한국어 모음의 음성학적 기반연구)

  • Moon, Seung-Jae
    • Proceedings of the KSPS conference / 2007.05a / pp.203-206 / 2007
  • The purpose of this study was to investigate and quantitatively describe the acoustic characteristics of current Korean monophthongs. Recordings were made of 33 men and 27 women producing the vowels /i, e, $\varepsilon$, a, (symbol not displayable), O, u, (symbol not displayable)/ in the carrier phrase "This character is _." A listening test was conducted in which 19 participants judged each vowel. F1, F2, and F3 were measured from the vowels that more than 17 listeners judged to be the intended vowel. Analysis of the formant data yields some interesting results, including clear confirmation of a 7-vowel system in current Korean.


Text-driven Speech Animation with Emotion Control

  • Chae, Wonseok;Kim, Yejin
    • KSII Transactions on Internet and Information Systems (TIIS) / v.14 no.8 / pp.3473-3487 / 2020
  • In this paper, we present a new approach to creating speech animation with emotional expressions using a small set of example models. To generate realistic facial animation, two types of example models, key visemes and key expressions, are used for lip-synchronization and facial expressions, respectively. The key visemes represent the lip shapes of phonemes such as vowels and consonants, while the key expressions represent the basic emotions of a face. Our approach uses a text-to-speech (TTS) system to create a phonetic transcript for the speech animation. Based on this phonetic transcript, a speech animation sequence is synthesized by interpolating the corresponding sequence of key visemes. Using an input parameter vector, the key expressions are blended by scattered data interpolation. During synthesis, an importance-based scheme is introduced to combine lip-synchronization and facial expressions into one animation sequence in real time (over 120Hz). The proposed approach can be applied to diverse types of digital content and applications that use facial animation with high accuracy (over 90%) in speech recognition.
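The idea of synthesizing animation by interpolating a sequence of key visemes can be sketched as follows. This is a minimal illustration that assumes plain linear interpolation between timed key shapes; the abstract does not say which interpolation the authors use for visemes, and the track layout and function names are hypothetical.

```python
from bisect import bisect_right


def lerp(shape_a, shape_b, weight):
    """Blend two shape vectors; weight 0 gives shape_a, weight 1 gives shape_b."""
    return [(1 - weight) * a + weight * b for a, b in zip(shape_a, shape_b)]


def shape_at(track, t):
    """Return the interpolated lip shape at time t.

    track: list of (time_in_seconds, shape_vector) pairs with strictly
    increasing times, e.g. one key viseme per phoneme in the transcript.
    """
    times = [time for time, _ in track]
    if t <= times[0]:
        return list(track[0][1])
    if t >= times[-1]:
        return list(track[-1][1])
    i = bisect_right(times, t)          # first key strictly after t
    t0, shape0 = track[i - 1]
    t1, shape1 = track[i]
    return lerp(shape0, shape1, (t - t0) / (t1 - t0))
```

Sampling `shape_at` once per rendered frame yields the lip-synchronization sequence; blending in key expressions would be a separate step.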

Modified Phonetic Decision Tree For Continuous Speech Recognition

  • Kim, Sung-Ill;Kitazoe, Tetsuro;Chung, Hyun-Yeol
    • The Journal of the Acoustical Society of Korea / v.17 no.4E / pp.11-16 / 1998
  • For large-vocabulary speech recognition using HMMs, context-dependent subword units have often been employed. However, context-dependent phone models result in a system with too many parameters to train. The problem of too many parameters and too little training data is crucial in the design of a statistical speech recognizer. Furthermore, when building large-vocabulary speech recognition systems, the unseen triphone problem is unavoidable. In this paper, we propose a modified phonetic decision tree algorithm for the automatic prediction of unseen triphones, and we show its advantages in addressing these problems through two experiments in Japanese contexts. The baseline experimental results show that the modified tree-based clustering algorithm is effective for clustering and for reducing the number of states without any degradation in performance. The task experimental results show that the proposed algorithm also has the advantage of providing automatic prediction of unseen triphones.

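How a phonetic decision tree predicts a model for an unseen triphone can be sketched as a lookup: the triphone's left and right contexts answer yes/no questions about phone classes until a leaf (a tied state) is reached. The sketch below shows the generic technique only, not the modified algorithm of the paper; the `left-center+right` naming, the phone classes, and the state names are all hypothetical.

```python
from dataclasses import dataclass
from typing import Set, Union


@dataclass
class Leaf:
    tied_state: str                 # name of the shared (tied) HMM state


@dataclass
class Question:
    side: str                       # "left" or "right" context
    phone_class: Set[str]           # phones for which the answer is "yes"
    yes: "Node"
    no: "Node"


Node = Union[Leaf, Question]


def parse_triphone(name):
    """Split 'left-center+right' into its three phones."""
    left, rest = name.split("-", 1)
    center, right = rest.split("+", 1)
    return left, center, right


def tied_state_for(tree, triphone):
    """Walk the tree for one center phone and return the tied state name."""
    left, _, right = parse_triphone(triphone)
    node = tree
    while isinstance(node, Question):
        context = left if node.side == "left" else right
        node = node.yes if context in node.phone_class else node.no
    return node.tied_state


# Hypothetical tree for one state of the center phone "a".
NASALS = {"m", "n", "N"}
VOWELS = {"a", "i", "u", "e", "o"}
TREE_A = Question("left", NASALS,
                  yes=Leaf("a_state_1"),
                  no=Question("right", VOWELS,
                              yes=Leaf("a_state_2"),
                              no=Leaf("a_state_3")))
```

Because the questions test phone classes rather than specific triphones, a context that never occurred in training (for example `tied_state_for(TREE_A, "m-a+k")`) still reaches a leaf.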

Experimental Phonetic Study of the Syllable Duration of Korean with Respect to the Positional Effect

  • Lee Hyunbok;Seong Cheol-jae
    • MALSORI / no.31_32 / pp.195-205 / 1996
  • The aim of this paper is to describe the prosodic structure of Korean in relation to how syllable duration varies with position. The study analyzes and describes the correlation between syllable lengthening and the syllable's position in the utterance, at initial and final positions. Using the syllable [na] at the final and initial positions of a prosodic phrase in the Korean version of 'The North Wind and the Sun', it was found that the ratio of phrase-final to phrase-initial syllable lengthening was approximately 1.8:1 for the 4 subjects taking part in the test. For the nonsense data, the ratio was approximately 1.6:1 for 2 out of 3 subjects. These results might indicate that Korean tends to have a high rate of final lengthening. We can therefore tentatively classify it as a stress-timed language. Still, further studies are needed before we can be certain about the classification of languages along this dichotomy.


Implementation of an Effective Rule Base System for the Change of Korean Vocal Sound (한국어 음운 변동 처리를 위한 효율적인 Rule Base System의 구성)

  • 이규영;이상범
    • Journal of the Korean Institute of Telematics and Electronics B / v.28B no.12 / pp.9-18 / 1991
  • In this paper, a rule-based method for handling Korean phonological (sound) change is proposed. The method can be used to bridge the gap between the written form (Hangul) and the spoken form of Korean in Korean speech processing research. The sound-change rules are reorganized for the rule base around final consonants, following the standard pronunciation rules. Implementing the sound changes this way simplifies the proposed rule base system. It is also useful for creating a database of phonetic values for syllable-unit Korean speech processing.

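A rule keyed on final consonants can be illustrated with a single, well-known Korean sound change: obstruent nasalization, where a final ㄱ, ㄷ, or ㅂ becomes ㅇ, ㄴ, or ㅁ before an initial ㄴ or ㅁ (국물 is pronounced 궁물). The sketch below is not the rule base from the paper; it covers this one rule only, and the function names are hypothetical. It relies on the standard Unicode arithmetic for precomposed Hangul syllables.

```python
HANGUL_BASE = 0xAC00       # code point of the first precomposed syllable
HANGUL_LAST = 0xD7A3       # code point of the last precomposed syllable
NUM_MEDIALS = 21
NUM_FINALS = 28            # index 0 means "no final consonant"

# Jamo indices in Unicode order.
FINAL_G, FINAL_N, FINAL_D, FINAL_M, FINAL_B, FINAL_NG = 1, 4, 7, 16, 17, 21
INITIAL_N, INITIAL_M = 2, 6

# Rule table: final obstruent -> nasal at the same place of articulation.
NASALIZED_FINAL = {FINAL_G: FINAL_NG, FINAL_D: FINAL_N, FINAL_B: FINAL_M}


def is_syllable(ch):
    return HANGUL_BASE <= ord(ch) <= HANGUL_LAST


def decompose(ch):
    """Return (initial, medial, final) jamo indices of a Hangul syllable."""
    index = ord(ch) - HANGUL_BASE
    initial, rest = divmod(index, NUM_MEDIALS * NUM_FINALS)
    medial, final = divmod(rest, NUM_FINALS)
    return initial, medial, final


def compose(initial, medial, final):
    return chr(HANGUL_BASE + (initial * NUM_MEDIALS + medial) * NUM_FINALS + final)


def apply_nasalization(text):
    """Rewrite spelling to pronunciation for the nasalization rule only."""
    out = list(text)
    for i in range(len(out) - 1):
        cur, nxt = out[i], out[i + 1]
        if not (is_syllable(cur) and is_syllable(nxt)):
            continue
        initial, medial, final = decompose(cur)
        next_initial = decompose(nxt)[0]
        if final in NASALIZED_FINAL and next_initial in (INITIAL_N, INITIAL_M):
            out[i] = compose(initial, medial, NASALIZED_FINAL[final])
    return "".join(out)
```

A fuller rule base would add further tables of the same shape (final consonant, following initial, replacement), which is one way to keep such a system simple.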

Study of Boundary Tone in Mandarin Chinese (표준 중국어의 경계억양에 관한 연구)

  • Sohn Nam-Ho
    • Proceedings of the KSPS conference / 2003.05a / pp.43-47 / 2003
  • This paper is a phonetic study of $F_{0}$ range and boundary tone in Mandarin Chinese. Production data from 6 Chinese speakers show declination, pitch resetting, and tonal variation of the boundary tone. In declarative sentences, $F_{0}$ declines gradually over the utterance, but a mid-sentence boundary prevents the $F_{0}$ of the following syllable from declining because of pitch resetting. The $F_{0}$ range of a syllable is expanded before mid-sentence and sentence-final boundaries. In interrogative sentences, $F_{0}$ ascends gradually over the utterance, and a mid-sentence boundary makes the $F_{0}$ of the following syllable rise further. The $F_{0}$ range of the sentence-final syllable is expanded, and the $F_{0}$ contour shows a rising curve.


Meta-data Standardization of Speech Database (음성 DB의 메타데이타 표준화)

  • Kim Sanghun
    • Proceedings of the KSPS conference / 2003.10a / pp.61-64 / 2003
  • In this paper, we introduce a new method for describing the annotation information of a speech database. As a structured description method, an XML-based description, as standardized by the W3C, is applied to represent the metadata of the speech database. The description will be revised continuously through the speech technology standard forum during this year.

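What an XML-based metadata description of a speech database might look like can be sketched with Python's standard `xml.etree.ElementTree` module. The abstract does not give the schema, so every element and attribute name below (`speechDatabase`, `samplingRateHz`, `speakers`, `speaker`) is a hypothetical placeholder, not the standardized vocabulary.

```python
import xml.etree.ElementTree as ET


def build_metadata(db_name, speakers, sampling_rate_hz):
    """Serialize basic speech-database metadata to an XML string.

    speakers: list of dicts with "id" and "gender" keys (assumed layout).
    """
    root = ET.Element("speechDatabase", {"name": db_name})
    ET.SubElement(root, "samplingRateHz").text = str(sampling_rate_hz)
    speakers_el = ET.SubElement(root, "speakers")
    for speaker in speakers:
        ET.SubElement(speakers_el, "speaker",
                      {"id": speaker["id"], "gender": speaker["gender"]})
    return ET.tostring(root, encoding="unicode")
```

Keeping the metadata in a structured form like this lets tools read it back with `ET.fromstring` rather than parsing free-text annotation files.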

Proposed Methodology for Building Korean Machine Translation Data sets Considering Phonetic Features (단어의 음성학적 특징을 이용한 한국어 기계 번역 데이터 세트 구축 방안)

  • Zhang Qinghao;Yang Hongjian;Serin Kim;Hyuk-Chul Kwon
    • Annual Conference on Human and Language Technology / 2022.10a / pp.592-595 / 2022
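  • Sino-Korean words and loanwords make up a very large share of the Korean vocabulary: about 53% of everyday words and about 92% of technical terms. Sino-Korean words and loanwords are words used in Korea that came in under the influence of China or other countries. When the Hangul spelling and the original-language spelling of such words are pronounced, the pronunciations turn out to be quite similar. For example, the Sino-Korean word 도서관 (图书馆), pronounced in Chinese, is thu.ʂu.kwan', which closely resembles the Korean pronunciation of the word. In this paper, we aim to build Korean-Chinese and Korean-English word-level machine translation data sets that take into account five phonetic features: Source Length, Source IPA Length, Target Length, Target IPA Length, and IPA Distance.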

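The five features named above can be sketched for a single word pair. The abstract does not define them, so this sketch assumes that the length features count characters and that IPA Distance is a Levenshtein edit distance between the two IPA strings; the IPA transcriptions are taken as inputs rather than generated, and the function names are hypothetical.

```python
def levenshtein(a, b):
    """Edit distance (insertions, deletions, substitutions) between two strings."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, 1):
        current = [i]
        for j, char_b in enumerate(b, 1):
            current.append(min(previous[j] + 1,                       # deletion
                               current[j - 1] + 1,                    # insertion
                               previous[j - 1] + (char_a != char_b)))  # substitution
        previous = current
    return previous[-1]


def phonetic_features(source, source_ipa, target, target_ipa):
    """Five per-pair features, under the assumptions stated in the lead-in."""
    return {
        "source_length": len(source),
        "source_ipa_length": len(source_ipa),
        "target_length": len(target),
        "target_ipa_length": len(target_ipa),
        "ipa_distance": levenshtein(source_ipa, target_ipa),
    }
```

A small IPA distance relative to the IPA lengths would mark pairs such as 도서관 and 图书馆 as phonetically close, which is the kind of signal the abstract describes.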

The Validation of Speech Recognition Performance according to Microphones (마이크로폰의 종류에 따른 음성인식성능의 검토)

  • Kim Yoen-Whon;Lee Kwang-Hyun;Jung Young-Jo;Kim Bong-Wan;Lee Yong-Ju
    • Proceedings of the KSPS conference / 2003.05a / pp.183-186 / 2003
  • Speech recognition performance depends on various factors. One of them is the characteristics of the microphone used when the speech data are collected. In the present experiment, test speech databases are therefore created using different types of microphones. Acoustic models are then built from these databases, and each model is evaluated on the data to determine how recognition performance depends on the microphone.
