• Title/Summary/Keyword: Phoneme

Development of Automatic Lip-sync MAYA Plug-in for 3D Characters (3D 캐릭터에서의 자동 립싱크 MAYA 플러그인 개발)

  • Lee, Sang-Woo;Shin, Sung-Wook;Chung, Sung-Taek
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.18 no.3 / pp.127-134 / 2018
  • In this paper, we developed an Auto Lip-Sync Maya plug-in that extracts Korean phonemes from voice data and Korean text and produces high-quality 3D lip-sync animation from the separated phonemes. In the developed system, phonemes are classified into the 8 vowels and 13 consonants used in Korean, with reference to the 49 phonemes provided by the Microsoft Speech API (SAPI) engine. Although vowels and consonants are pronounced with a variety of mouth shapes, the same viseme can be applied to phonemes whose mouth shapes are identical. On this basis, we developed a Python-based Auto Lip-Sync Maya plug-in that generates the lip-sync animation automatically in a single step.
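
The idea of sharing one viseme among several phonemes can be illustrated with a minimal sketch. The phoneme labels, viseme names, and the `to_keyframes` helper below are hypothetical and chosen only for illustration; they are not the table or the code used in the plug-in, and no Maya or SAPI calls are involved.

```python
# Hypothetical phoneme-to-viseme table: the labels and groupings are
# illustrative assumptions, not the mapping used in the paper.
PHONEME_TO_VISEME = {
    "m": "closed", "b": "closed", "p": "closed",  # bilabials share one lip shape
    "a": "open",
    "o": "round", "u": "round",
    "i": "spread", "e": "spread",
}


def to_keyframes(timed_phonemes, default="rest"):
    """Turn (phoneme, start_time) pairs into (start_time, viseme) keyframes.

    A new keyframe is emitted only when the viseme changes, so consecutive
    phonemes that share a mouth shape collapse into a single key.
    """
    keyframes = []
    for phoneme, start in timed_phonemes:
        viseme = PHONEME_TO_VISEME.get(phoneme, default)
        if not keyframes or keyframes[-1][1] != viseme:
            keyframes.append((start, viseme))
    return keyframes


timed = [("m", 0.0), ("a", 0.1), ("m", 0.2), ("b", 0.3), ("o", 0.4)]
print(to_keyframes(timed))
```

In this sketch the consecutive "m" and "b" produce a single "closed" keyframe, which is the kind of reduction a shared viseme allows.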

Frequency Related Information and Syllable Structure Constraints on Sino-Korean (한국 한자음의 빈도 관련 정보 및 음절 구조 제약)

  • Shin, Ji-Young
    • Phonetics and Speech Sciences / v.1 no.2 / pp.129-140 / 2009
  • The purpose of the present study is to investigate frequency-related information and syllable structure constraints in Sino-Korean. Previous studies of Sino-Korean have mostly investigated historical sound change and reviewed archaic features of Chinese preserved in Sino-Korean. Few studies, however, have examined the sounds of contemporary Sino-Korean in terms of syllable structure constraints. For the present study, the sounds of 7,742 Chinese characters used in Sino-Korean (7,795 syllables) were investigated, and syllable matrices were built from the resulting frequency information. As a result, 483 syllable types were observed, and the most frequent syllables were as follows: /ku/ (103) > /ki/ (100) > /ju/ (87) > /pi/ (86). Only 16 of the 19 consonants are used in Sino-Korean. /t*/ and /p*/ never occur in Sino-Korean, and /kʰ, s*, k*/ occur only a few times (3, 2, and 1 times respectively). /k/ (17.5%) shows the highest frequency, followed by /n, ŋ, l, tɕ, m/. Among the 20 vowel types, /a/ shows the highest frequency, followed by /o, u, i, jʌ, ʌ/. Based on the syllable matrices, gaps were identified and classified as accidental or systematic. Constraints between onset and nucleus, nucleus and coda, and onset and coda, as well as other syllable structure constraints of Sino-Korean, are listed.
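
The counting behind a syllable matrix can be sketched as follows. The readings are toy values invented for illustration, not the study's data, and `onset_nucleus_gaps` is a hypothetical helper name. The sketch only lists unattested onset-nucleus combinations. Deciding whether a gap is accidental or systematic requires phonological analysis that code of this kind does not perform.

```python
from collections import Counter
from itertools import product

# Toy readings as (onset, nucleus, coda); "" marks an empty slot.
# Illustrative values only, not the study's data.
readings = [
    ("k", "u", ""), ("k", "i", ""), ("k", "u", ""),
    ("p", "i", ""), ("", "a", "n"),
]

# Frequency of each syllable type and of each (non-empty) onset.
syllable_counts = Counter(readings)
onset_counts = Counter(onset for onset, _, _ in readings if onset)


def onset_nucleus_gaps(readings):
    """Return onset-nucleus pairs that are possible in principle but unattested."""
    onsets = sorted({o for o, _, _ in readings})
    nuclei = sorted({n for _, n, _ in readings})
    attested = {(o, n) for o, n, _ in readings}
    return [pair for pair in product(onsets, nuclei) if pair not in attested]


print(len(syllable_counts), syllable_counts.most_common(1))
print(onset_counts.most_common())
print(onset_nucleus_gaps(readings))
```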

Phoneme Extraction from Freely Hand Written Han Gul (자유 필기체 한글에서의 자모 추출)

  • Oh, Weon-Geun;Shin, Young-Geon;Ahn, Young-Kyung
    • Annual Conference on Human and Language Technology / 1989.10a / pp.142-147 / 1989
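  • Unlike printed characters, handwritten characters undergo complex deformations, which makes them difficult to recognize. For this reason, handwriting recognition generally targets characters whose deformation has been reduced by placing constraints on the writing itself. Such characters can be recognized relatively easily, provided the imposed conditions are reliably met. Freely handwritten characters, by contrast, vary far more than constrained ones, so their recognition requires much further research. In this study, we used a method based on two parameter spaces to extract the phonemes (jamo) of freely handwritten Hangul. The characters in an image are basically composed of five elements ($\mid,\;\setminus,\;/,\;-,\;o$), and defining an element requires at least four parameters: its position [x, y], size [l], and type [T]. The line and circle components extracted from the input image are accumulated in two 3-D parameter spaces, [x, y, l] and [x, y, T], and elements are formed through a merge-and-split process in those spaces. The extracted elements, taken as combinations defined by their orientation and mutual positions in the parameter space, are recognized by comparison with previously described jamo models. The method is independent of character size; depending on the analysis method, it can handle distorted elements such as broken or superfluous ones; and by splitting the four-dimensional parameter space into two three-dimensional spaces, it saves processing time and memory.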

The Phonemic Characteristics of Disfluencies in Children and Adults Who Stutter (말더듬 아동과 성인에게서 나타난 비유창성의 음운특성)

  • Han, Jin-Soon;Lee, Eun-Ju;Sim, Hyun-Sub
    • Speech Sciences / v.12 no.3 / pp.59-77 / 2005
  • The aim of the present study is to investigate how phonemic characteristics influence the disfluencies of children and adults who stutter. The participants were 10 children (9 boys and 1 girl) and 10 male adults. After the participants read aloud the Paradise-Fluency Assessment (Sim, Shin & Lee, 2004) passages, each production was divided into syllables and words, and the frequencies and ratios of disfluencies were analyzed according to the specified phonemic features. In terms of the frequency of disfluency, the participants stuttered more on words beginning with a consonant than on words beginning with a vowel. However, when the ratio relative to each phoneme's number of occurrences was considered, they showed more disfluencies on words beginning with a vowel. The phonemic features associated with disfluencies showed different tendencies depending on whether the disfluencies were high in frequency or high in ratio. It was difficult to identify exact relationships among the order of sound acquisition, phonemic complexity, and the disfluencies. To study the exact influence of phonemic features on disfluencies, it is important to consider the frequency of stuttering itself together with the ratio of disfluencies, which takes into account the opportunities for a specific sound to occur. To compare the results of different studies with similar purposes, it seems important to consider the tasks and methodologies in depth.

Perceptual Characteristics of Korean Consonants Distorted by the Frequency Band Limitation (주파수 대역 제한에 의한 한국어 자음의 지각 특성 분석)

  • Kim, YeonWhoa;Choi, DaeLim;Lee, Sook-Hyang;Lee, YongJu
    • Phonetics and Speech Sciences / v.6 no.1 / pp.95-101 / 2014
  • This paper investigated the effects of frequency band limitation on the perceptual characteristics of Korean consonants. Monosyllabic speech (144 syllables of CV type, 56 syllables of VC type, 8 syllables of V type) produced by two announcers was low- and high-pass filtered with cutoff frequencies ranging from 300 to 5000 Hz. Six listeners with normal hearing performed perception tests for each filter type and cutoff frequency. We report phoneme recognition rates and types of perception error for band-limited Korean consonants in order to examine how frequency distortion during speech transmission affects listeners' perception. The results showed that recognition rates varied with the following factors: position in the syllable, manner of articulation, place of articulation, and phonation type. Consonants in final position were more robust to frequency band limitation than those in initial position. Fricatives and affricates were more robust than stops. Fortis consonants were less robust than their lenis or aspirated counterparts. Types of perception error also varied with factors such as the consonant's place of articulation: bilabial stops were perceived as alveolar stops, while alveolar and velar stops showed changes in phonation type without any change in place of articulation.

Sensitive Period of Auditory Perception and Linguistic Discrimination

  • Cha, Kyung-Whan;Jo, Hannah
    • Phonetics and Speech Sciences / v.6 no.1 / pp.59-67 / 2014
  • The purpose of this study is to scientifically examine Kuhl's (2011) critical period graph, originally from Johnson and Newport (1989), from the perspective of auditory perception and linguistic discrimination. This study uses two types of experiments (auditory perception and linguistic phoneme discrimination) with five age groups (5 years, 6-8 years, 9-13 years, 15-17 years, and 20-26 years) of Korean learners of English. Auditory perception is examined via ultrasonic sounds that are commonly used in the medical field. In addition, each group is measured on its ability to discriminate minimal pairs in Chinese. Since almost all Korean students already have some exposure to English, the researchers selected phonemes from Chinese, a foreign language to which none of the subject groups had been exposed. The results are almost completely in accordance with Kuhl's critical period graph for auditory perception and linguistic discrimination; a sensitive age is found at 8. The results show that the auditory capability of kindergarten children is significantly better than that of the other students, as measured by their ability to perceive ultrasonic sounds and to distinguish ten minimal pairs in Chinese. This finding strongly implies that human auditory ability is a key factor in the sensitive period of language acquisition.

A Study on the Phoneme Based Analysis of Korean Initial Plosives Using Statistical Method and Perception Tests (통계적 방법과 인지실험을 통한 한국어 초성파열음의 음소단위 분석에 관한 연구)

  • Jo Cheol-Woo;Lee Woo-Sun;Lee Cyu-Ho;Kim Jong-Ahn;Lim Gwang-Il;Lee Tae-Won
    • The Journal of the Acoustical Society of Korea / v.8 no.5 / pp.78-85 / 1989
  • This paper describes statistical methods and perception tests for extracting the parameters to be used in the synthesis-by-rule of Korean plosives. A formant synthesizer is chosen for synthesizing the phonemes. The speech materials for the analysis consist of 72 CV monosyllables from a single male speaker. The analysis focuses mainly on the variation of parameters in the time and frequency domains; perception tests are then carried out to estimate the effects of variations in the formant transitions.

Analysis of Unaspirated sound for Korean (한국어의 경음에 대한 분석)

  • Lim Soo-Ho;Kim Joo-Gon;Kim Bum-Guk;Jung Ho-Youl;Chung Hyun-Yeol
    • Proceedings of the Acoustical Society of Korea Conference / spring / pp.41-44 / 2004
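  • In this paper, we examine the phonological and acoustic characteristics of fortis (tense) consonants, which appear only in Korean, carry out speech recognition experiments on that basis, and analyze the results. For the experiments, the input speech was labeled with 48 phoneme-like units (PLU; Phoneme Likely Unit), and phoneme and word recognition experiments were performed for each phoneme group while increasing the LPC (Linear Predictive Coding) order. In the phoneme recognition experiment, the fortis group showed the lowest recognition rate, indicating that fortis consonants need further analysis. The recognition rates for fortis consonants and for all phonemes were best at an LPC order of 23, at $34.11\%$ and $46.1\%$ respectively, and in the word recognition experiment the rates were also highest at LPC orders 23 and 25, at $81.68\%$ and $81.87\%$. These results show that Korean fortis consonants are closely related to the recognition performance of the overall system.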

Performance Evaluation of Speech Recognition Using the Reconstructed Feature Parameter with Voiced-Unvoiced Measure (유ㆍ무성음 척도를 포함한 재구성 특징 파라미터의 음성 인식 성능평가)

  • 이광석;한학용;고시영;허강인
    • Journal of the Korea Institute of Information and Communication Engineering / v.7 no.2 / pp.177-182 / 2003
  • In this study, we investigate robust recognition of confusable words at the syllable and phoneme level using a feature parameter that includes voiced-unvoiced measures. To this end, we propose measures that represent the degree of voicing using HPS (Harmonic Product Spectrum) information, which is used in pitch detection. The proposed measures are the sharpness, peak count, and peak height of the HPS. We reconstructed the feature parameter to include these measures, performed speech recognition experiments, and compared the results with those of typical feature parameters on a database of confusable CVC-type syllables.
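
For readers unfamiliar with the HPS, the following is a minimal sketch of the standard harmonic product spectrum as used in pitch detection: the magnitude spectrum is multiplied by copies of itself compressed by integer factors, so the bin of the fundamental is reinforced by its harmonics. The function names, the naive DFT, and the synthetic test frame are assumptions made for illustration. The paper's sharpness, peak count, and height measures are not reproduced here.

```python
import cmath
import math


def magnitude_spectrum(frame):
    """Magnitudes of DFT bins 0..N/2 (naive O(N^2) DFT, adequate for short frames)."""
    n = len(frame)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame)))
        for k in range(n // 2 + 1)
    ]


def harmonic_product_spectrum(frame, harmonics=3):
    """HPS[k] = |X[k]| * |X[2k]| * ... * |X[harmonics * k]|."""
    mags = magnitude_spectrum(frame)
    limit = (len(mags) - 1) // harmonics  # highest k whose last harmonic is in range
    hps = []
    for k in range(limit + 1):
        value = 1.0
        for r in range(1, harmonics + 1):
            value *= mags[r * k]
        hps.append(value)
    return hps


def hps_peak_bin(frame, harmonics=3):
    """Bin index (excluding DC) where the HPS is largest."""
    hps = harmonic_product_spectrum(frame, harmonics)
    return max(range(1, len(hps)), key=hps.__getitem__)


# Synthetic "voiced" frame: fundamental at bin 8 plus three weaker harmonics.
N = 256
frame = [
    sum(math.sin(2 * math.pi * 8 * h * t / N) / h for h in range(1, 5))
    for t in range(N)
]
print(hps_peak_bin(frame))
```

A strongly periodic frame gives a single dominant HPS peak at the fundamental, whereas a noise-like frame tends to give a flatter HPS, which is presumably what makes HPS-derived quantities usable as voicing cues.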

A Recognition Time Reduction Algorithm for Large-Vocabulary Speech Recognition (대용량 음성인식을 위한 인식기간 감축 알고리즘)

  • Koo, Jun-Mo;Un, Chong-Kwan
    • The Journal of the Acoustical Society of Korea / v.10 no.3 / pp.31-36 / 1991
  • We propose an efficient pre-classification algorithm that extracts candidate words to reduce recognition time in a large-vocabulary recognition system, and we also propose spectral and temporal smoothing of the observation probability to improve its classification performance. The proposed algorithm computes a coarse likelihood score for each word in the lexicon using the observation probabilities of speech spectra and the duration information of recognition units. With the proposed approach, we could reduce the amount of computation by 74% with only slight degradation of recognition accuracy in a 1160-word recognition system based on phoneme-level HMMs. We also observed that the proposed coarse likelihood score is a good estimator of the likelihood score computed by the Viterbi algorithm.
