• Title/Summary/Keyword: Phoneme

Implementation of Serious Games with Language-Based Cognitive Enhancement for BIF Children (경계선지적기능 아동을 위한 언어기반 인지강화 기능성 게임 구현)

  • Ryu, Su-Rin;Park, Hyunju;Chung, Dong Gyu;Baik, Kyoungsun;Yun, Hongoak
    • Journal of Digital Contents Society / v.19 no.6 / pp.1051-1060 / 2018
  • This study proposes serious games for language-based cognitive enhancement for children with borderline intellectual functioning (BIF). The program consists of 4 cognitive areas (perception, attention, working memory, knowledge inference) across 4 language dimensions (phoneme, word, sentence, discourse). Sixteen games spanning 4 areas and 2 dimensions, each with 3 difficulty levels, were implemented on a mobile platform and pilot-tested with children, including BIF children. The pilot results supported the validity and effectiveness of the games: children's game performance correlated with their IQ scores (overall and by sub-area), revealing significant differences between the groups, and Stroop scores before and after training hinted at an increase in the children's cognitive control.

Natural 3D Lip-Synch Animation Based on Korean Phonemic Data (한국어 음소를 이용한 자연스러운 3D 립싱크 애니메이션)

  • Jung, Il-Hong;Kim, Eun-Ji
    • Journal of Digital Contents Society / v.9 no.2 / pp.331-339 / 2008
  • This paper presents a highly efficient and accurate system for producing animation key data for 3D lip-synch animation. The system automatically extracts Korean phonemes from sound and text data and then computes animation key data from the segmented phonemes. This key data can drive the 3D lip-synch animation system developed herein as well as commercial 3D facial animation systems. Conventional 3D lip-synch animation systems segment the sound data into phonemes based on the English phonemic system and produce lip-synch animation key data from those segments. A drawback of this approach is that it produces unnatural animation for Korean content; it also requires supplementary manual work. In this paper, we propose a 3D lip-synch animation system that automatically segments the sound and text data into phonemes based on the Korean phonemic system and produces natural lip-synch animation from the segmented phonemes.
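
    The phonemes-to-key-data step described above can be sketched roughly as follows. The viseme table, parameter names, and timing scheme here are illustrative assumptions, not the authors' actual data.

    ```python
    # Sketch: map segmented phonemes (with timings) to lip-synch keyframes.
    # The viseme table below is a hypothetical simplification for illustration.

    VISEMES = {
        "a": {"jaw_open": 0.9, "lip_round": 0.0},  # open, unrounded
        "o": {"jaw_open": 0.5, "lip_round": 0.8},  # mid-open, rounded
        "u": {"jaw_open": 0.3, "lip_round": 1.0},  # near-closed, rounded
        "m": {"jaw_open": 0.0, "lip_round": 0.2},  # lips closed
    }

    def keyframes(segments):
        """segments: list of (phoneme, start_sec, end_sec) from the segmenter."""
        keys = []
        for phoneme, start, end in segments:
            shape = VISEMES.get(phoneme, {"jaw_open": 0.2, "lip_round": 0.0})
            keys.append({"time": start, **shape})                           # onset
            keys.append({"time": end, "jaw_open": 0.0, "lip_round": 0.0})  # release
        return keys

    frames = keyframes([("m", 0.00, 0.08), ("o", 0.08, 0.25)])
    ```

    A real system would interpolate between keys rather than snap to a neutral pose at every phoneme boundary; the point is only that once the segmentation is automatic, the key data follows directly from it.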

Time-Synchronization Method for Dubbing Signal Using SOLA (SOLA를 이용한 더빙 신호의 시간축 동기화)

  • 이기승;지철근;차일환;윤대희
    • Journal of Broadcast Engineering / v.1 no.2 / pp.85-95 / 1996
  • The purpose of this paper is to propose a dubbed-signal time-synchronization technique based on the SOLA (Synchronized Overlap-Add) method, which has been widely used to modify the time scale of speech signals. In broadcast audio recording environments, the high degree of background noise makes dubbing necessary. Since the time difference between the original and the dubbed signal ranges up to about 200 milliseconds, processing is required to synchronize the dubbed signal with the corresponding image. The proposed method finds the starting point of the dubbed signal using the short-time energy of the two signals. Thereafter, LPC cepstrum analysis and DTW (Dynamic Time Warping) are applied to synchronize the phoneme positions of the two signals. After determining the matched point by the minimum mean square error between the original and dubbed LPC cepstra, the SOLA method is applied to the dubbed signal to maintain the consistency of the corresponding phase. The effectiveness of the proposed method is verified by comparing the waveforms and spectrograms of the original and the time-synchronized dubbed signal.
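
    The first step the abstract describes, locating the start of the dubbed take from the short-time energy of the two signals, can be sketched as below. The frame sizes and the cross-correlation search are illustrative choices, and the LPC-cepstrum/DTW and SOLA stages are not reproduced here.

    ```python
    import numpy as np

    def short_time_energy(x, win=256, hop=128):
        """Frame-wise energy, used to coarsely locate where the dubbed take starts."""
        n = 1 + (len(x) - win) // hop
        return np.array([np.sum(x[i * hop : i * hop + win] ** 2) for i in range(n)])

    def coarse_offset(original, dubbed, win=256, hop=128):
        """Lag (in samples) that best aligns the two short-time energy contours."""
        e1 = short_time_energy(original, win, hop)
        e2 = short_time_energy(dubbed, win, hop)
        m = min(len(e1), len(e2))
        e1, e2 = e1[:m] - e1[:m].mean(), e2[:m] - e2[:m].mean()
        corr = np.correlate(e1, e2, mode="full")
        return (int(np.argmax(corr)) - (m - 1)) * hop
    ```

    The returned lag only gives the coarse starting-point alignment; per-phoneme alignment would then be refined with cepstral features and DTW as in the paper.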

Application and Technology of Voice Synthesis Engine for Music Production (음악제작을 위한 음성합성엔진의 활용과 기술)

  • Park, Byung-Kyu
    • Journal of Digital Contents Society / v.11 no.2 / pp.235-242 / 2010
  • Unlike past instruments that synthesized sounds and tones, a voice synthesis engine for music production has reached the level of creating music as if actual artists were singing. It uses samples of human voices, naturally connected across phonemes within the frequency range. The voice synthesis engine is not limited to music production: it is changing the cultural paradigm through new kinds of secondary creation, including character music concerts, media productions, albums, and mobile services. Current voice synthesis engine technology lets users input pitch, lyrics, and musical expression parameters through a score editor; the engine then mixes and connects voice samples drawn from a database to sing. The new music types derived from this development of computer music have had a large cultural impact. Accordingly, this paper examines specific case studies and the underlying synthesis technologies so that users can understand the voice synthesis engine more easily, which should contribute to more varied music production.
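
    As a rough illustration of the workflow the abstract describes, score-editor input of pitch, lyrics, and expression followed by selection of voice samples from a database, here is a toy model. All names, the data layout, and the nearest-pitch selection rule are hypothetical.

    ```python
    from dataclasses import dataclass

    @dataclass
    class Note:
        pitch_hz: float   # pitch entered in the score editor
        phonemes: list    # the lyric syllable, segmented into phonemes
        dynamics: float   # a simple expression parameter, 0..1

    def pick_samples(score, sample_db):
        """For each phoneme of each note, pick the database sample whose
        recorded pitch is closest to the note's pitch."""
        picks = []
        for note in score:
            for ph in note.phonemes:
                candidates = [k for k in sample_db if k[0] == ph]
                best = min(candidates, key=lambda k: abs(k[1] - note.pitch_hz))
                picks.append((best, note.dynamics))
        return picks

    db = {("a", 220.0): "sample_a_220", ("a", 440.0): "sample_a_440"}
    chosen = pick_samples([Note(430.0, ["a"], 0.8)], db)
    ```

    A production engine additionally pitch-shifts and cross-fades the chosen samples; this sketch stops at the selection step.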

The Acquisition Process of Vowel System in Korean (한국어 모음 체계 습득 과정)

  • 안미리;김응모;김태경
    • Korean Journal of Cognitive Science / v.15 no.1 / pp.1-11 / 2004
  • The aim of this study is to reveal the order and age of mastery of phonemic contrasts in the vowel sounds of Korean. For this purpose, we observed the correspondences between the sounds produced by children aged 12-35 months and the target sounds produced by adults. The provisional order and age of contrast acquisition shown by the results are as follows. First, the differential production of vowels by features relating to the body of the tongue precedes differential production by the lip-rounding feature. Second, among the tongue-body features, the contrast between the low vowels and the others is accomplished first, and the contrasts between high and low vowels and between front and back vowels are established around the age of 24 months. Third, for the lip-rounding feature, the contrast between rounded and unrounded vowels is not accomplished until 36 months. Finally, we observed that, prior to completing the differential production of phonemes, children use a specific phoneme excessively. This passing phase can be interpreted as over-application of a distinctive feature in the course of acquiring it.

Word Recognition Using K-L Dynamic Coefficients (K-L 동적 계수를 이용한 단어 인식)

  • 김주곤
    • Proceedings of the Acoustical Society of Korea Conference / 1998.06c / pp.103-106 / 1998
  • In this paper, to improve the recognition accuracy of a speech recognition system, we propose a method of building phoneme models with K-L (Karhunen-Loève) coefficients as dynamic features, and examine its effectiveness through phoneme, word, and connected-digit recognition experiments. The speech data were 445 words recorded by 한국전자통신연구소 and 4-connected-digit strings recorded by 국어정보공학연구소. To verify the effectiveness of the K-L dynamic features, we extracted mel-cepstra as static features and K-L coefficients and regression coefficients as dynamic features, and then performed phoneme, word, and digit recognition experiments. As basic recognition units, 48 phoneme-like units (PLUs) served as the phoneme models; for word and digit recognition, the One-Pass Dynamic Programming (OPDP) method with syntactic control by finite state automata (FSA) was used. In phoneme recognition, the mel-cepstrum gave 39.8% while the K-L dynamic coefficients gave 52.4%, an improvement of 12.6%. The mel-cepstrum combined with regression coefficients gave 60.1%, and the K-L coefficients combined with regression coefficients likewise gave a high rate of 60.4%. Extending the experiments to word recognition, the conventional mel-cepstrum gave 65.5% and the K-L coefficients 75.8%, a 10.3% improvement; the mel-cepstrum combined with regression coefficients gave 91.2%, and the K-L coefficients combined with regression coefficients 91.4%. Applied to 4-connected digits as well, the mel-cepstrum gave 67.5% and the K-L coefficients 75.3%, a 7.8% improvement, and the K-L coefficients combined with regression coefficients also showed a comparatively high rate, confirming the effectiveness of the K-L coefficients for digit strings as well.
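
    The K-L coefficients above are, in essence, projections of feature windows onto the eigenvectors of their covariance matrix. A minimal numpy sketch follows; the window and dimension sizes are illustrative, not the paper's configuration.

    ```python
    import numpy as np

    def kl_basis(windows):
        """K-L (Karhunen-Loeve) basis: eigenvectors of the covariance matrix of
        stacked feature windows, principal axes first."""
        X = np.asarray(windows, dtype=float)    # shape (N, window_len * dim)
        X = X - X.mean(axis=0)
        cov = X.T @ X / len(X)
        eigvals, eigvecs = np.linalg.eigh(cov)  # eigh returns ascending order
        return eigvecs[:, ::-1]

    def kl_coefficients(window, basis, k=4):
        """Dynamic features: projection of one window onto the first k K-L axes."""
        return np.asarray(window, dtype=float).ravel() @ basis[:, :k]
    ```

    Keeping only the leading axes compresses the trajectory of cepstral frames into a few decorrelated coefficients, which is what makes them usable as compact dynamic features.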

The Study on Korean Prosody Generation using Artificial Neural Networks (인공 신경망의 한국어 운율 발생에 관한 연구)

  • Min Kyung-Joong;Lim Un-Cheon
    • Proceedings of the Acoustical Society of Korea Conference / spring / pp.337-340 / 2004
  • Exactly reproduced prosody is one of the key factors that affect the naturalness of speech synthesized by a TTS system. In general, prosody rules have been gathered either from linguistic knowledge or by analyzing the prosodic information of natural speech, but such rules cannot be perfect and some of them may be incorrect. So we propose artificial neural networks (ANNs) that can be trained to learn the prosody of natural speech and generate it. In the learning phase, the ANNs learn the pitch and energy contour of the center phoneme: a string of phonemes in a sentence is applied to the ANNs, the output pattern is compared with the target pattern, and the weights are adjusted to minimize the mean square error between them. In the test phase, the estimation rates were computed. We found that the ANNs could generate the prosody of a sentence.
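
    The learning phase described above can be sketched as a small network trained by least-mean-square updates. The network sizes, phoneme inventory, and learning rate below are illustrative assumptions, not the authors' configuration.

    ```python
    import numpy as np

    rng = np.random.default_rng(0)

    # Illustrative sizes: a 3-phoneme context window, one-hot over a 10-phoneme
    # inventory, predicting (pitch, energy) of the center phoneme.
    N_PHONES, CONTEXT, HIDDEN = 10, 3, 16
    W1 = rng.normal(0.0, 0.1, (N_PHONES * CONTEXT, HIDDEN))
    W2 = rng.normal(0.0, 0.1, (HIDDEN, 2))

    def forward(x):
        h = np.tanh(x @ W1)
        return h, h @ W2      # (hidden activations, [pitch, energy] estimate)

    def train_step(x, target, lr=0.05):
        """One least-mean-square weight update, as in the learning phase."""
        global W1, W2
        h, y = forward(x)
        err = y - target
        grad_W2 = np.outer(h, err)
        grad_W1 = np.outer(x, (err @ W2.T) * (1.0 - h ** 2))
        W2 -= lr * grad_W2
        W1 -= lr * grad_W1
        return float(np.mean(err ** 2))
    ```

    Repeated updates drive the mean square error between the output and target contours down, which is exactly the stopping criterion the abstract describes.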

A Study on Recognition Units for Korean Speech Recognition (한국어 분절음 인식을 위한 인식 단위에 대한 연구)

  • ;;Michael W. Macon
    • The Journal of the Acoustical Society of Korea / v.19 no.6 / pp.47-52 / 2000
  • When building a large-vocabulary speech recognition system, it is better to use the segment than the syllable or the word as the recognition unit. In this paper, we study proper recognition units for Korean speech recognition. For the experiments, we used the speech toolkit of OGI in the U.S.A. The results show that the recognition rate when the diphthong is treated as a single unit is superior to that when the diphthong is split into two units, i.e. a glide plus a vowel. Also, the recognition rate when the biphone is used as the recognition unit is better than that when the mono-phoneme is used.
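
    For concreteness, a biphone inventory of the kind compared above can be generated from a phoneme sequence like this; the padding symbol and the unit naming are conventional choices, not the paper's exact definitions.

    ```python
    def biphones(phonemes, pad="sil"):
        """Expand a phoneme sequence into left-context biphone units,
        padding with a silence symbol at both ends."""
        seq = [pad] + list(phonemes) + [pad]
        return [f"{seq[i]}-{seq[i + 1]}" for i in range(len(seq) - 1)]

    units = biphones(["k", "a", "m"])
    ```

    Treating a diphthong as a single symbol in the input sequence (rather than glide plus vowel) directly changes which biphone units are produced, which is the comparison the experiments make.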

Speech Recognition of the Korean Vowel 'ㅗ' Based on Time Domain Waveform Patterns (시간 영역 파형 패턴에 기반한 한국어 모음 'ㅗ'의 음성 인식)

  • Lee, Jae Won
    • KIISE Transactions on Computing Practices / v.22 no.11 / pp.583-590 / 2016
  • Recently, rapidly increasing interest in IoT in almost all areas of daily life has led to wide acceptance of speech recognition as a means of HCI, and the demand for speech recognition systems in mobile environments is growing rapidly. Server-based speech recognition systems are typically fast and show high recognition rates; however, an internet connection is necessary, and complicated server-side computation is required since a voice is recognized in units of words stored in server databases. In this paper, we present a novel method for recognizing the Korean vowel 'ㅗ' as part of a phoneme-based Korean speech recognition system. The proposed method analyzes waveform patterns in the time domain instead of the frequency domain, with a consequent reduction in computational cost. Elementary algorithms for detecting typical waveform patterns of 'ㅗ' are presented and combined to make the final decision. The experimental results show that the proposed method achieves 89.9% recognition accuracy.
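
    The paper's concrete pattern detectors are not reproduced here; the sketch below only illustrates the general idea of deciding from time-domain measurements of a frame (zero-crossing rate and waveform periodicity). The thresholds are invented for illustration.

    ```python
    import numpy as np

    def zero_crossing_rate(x):
        """Fraction of successive samples that change sign."""
        return float(np.mean(np.abs(np.diff(np.signbit(x).astype(int)))))

    def dominant_period(x, sr, min_lag=20):
        """Period (seconds) of the strongest autocorrelation peak past min_lag."""
        ac = np.correlate(x, x, mode="full")[len(x) - 1:]
        return (int(np.argmax(ac[min_lag:])) + min_lag) / sr

    def looks_like_o(frame, sr):
        """Toy time-domain decision: a low zero-crossing rate plus a dominant
        period in a typical F0 range. Thresholds are illustrative only."""
        return (zero_crossing_rate(frame) < 0.1
                and 1 / 400 < dominant_period(frame, sr) < 1 / 70)
    ```

    Everything here is computed directly on samples, with no FFT, which is the computational advantage the abstract claims for time-domain analysis.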

Research on the Bottom Boundary Line on the Southeast Area of the Chungcheongdo Dialect in Yeongdong (영동지역어내의 충청방언 남동부 하한선 연구)

  • Seong, Hee-Jae
    • Lingua Humanitatis / v.8 / pp.265-289 / 2006
  • The geographical characteristics of Yeongdong (永同), the southernmost part of Chungcheongbuk-do province, have attracted attention in academic circles as one of the dialect-contact regions, since it adjoins the Gyeongsang and Jeolla dialects. Unlike the local language of Mooju (Jeolla dialect), adjacent to the southwest, the local language of Yeongdong is quite different from that of Kimcheon (Gyeongsang dialect). More specifically, it is noteworthy that the boundary line of the Gyeongsang dialect is found within this region and differs from the administrative division. In other words, the local language of Yeongdong is divided into the Chungcheong dialect and the Gyeongsang dialect, and each dialect region still carries characteristics of the other region's dialect. For example, the phonological structure of the Yeongdong Chungcheong dialect has the unique characteristics of a fudged dialect, seemingly influenced by the Gyeongsang dialect. The present study defines the bottom boundary line of the southeast area of the Chungcheong dialect by identifying the boundary between the Gyeongsang and Chungcheong dialects, and clarifies the specific sound system generated by the contact of these two dialects. For this, the author collected and analyzed data on the local language in Yeongdong and adjacent areas. It was found that Cheongwha-ri, Deokjin-ri, and Sanjeo-ri in Yeongsan-myeon, and Mugeunjeom, Sangga-ri, and Jungga-ri in Yeongdong-eup, among the regions belonging to the Chungcheong dialect within the local language of Yeongdong, show characteristics of the Gyeongsang dialect. Accordingly, the western edge of these villages becomes the southeast boundary line of the Chungcheong dialect. The unique phonological characteristics of the Yeongdong Chungcheong dialect affected by the Gyeongsang dialect include rhythm, y-deletion, nasal phoneme deletion, and w-deletion, which are thought to be fudged dialectal phenomena appearing only in this region. The research result is expected to help reveal various aspects of dialect contact, clarify the phonological features of the local language in Yeongdong, and thereby contribute to an exact division of the Chungcheong dialect.
