• Title/Summary/Keyword: phonemic

Search results: 95

Automatic segmentation for continuous spoken Korean language recognition based on phonemic TDNN (음소단위 TDNN에 기반한 한국어 연속 음성 인식을 위한 데이타 자동분할)

  • Baac, Coo-Phong; Lee, Geun-Bae; Lee, Jong-Hyeok
    • Annual Conference on Human and Language Technology / 1995.10a / pp.30-34 / 1995
  • In continuous speech recognition using neural networks, training has predominantly been carried out on manually segmented speech data. However, preparing segmented speech data demands a great deal of time, effort, and expertise, and it is itself one of the factors that makes changing or extending the recognition domain difficult. Consequently, neural-network training methods are emerging that avoid pre-segmented speech data as much as possible without degrading performance. This paper introduces an iterative procedure in which a trained recognizer automatically segments Korean speech data, and the segmented data are then used to retrain the recognizer. A TDNN is used as the recognizer, and the recognition unit is the phoneme. Training is controlled using a cross-validation technique.

  • PDF
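
The iterative procedure described in this abstract amounts to a bootstrap loop: train on a rough segmentation, let the trained recognizer re-segment the data, retrain, and let cross-validation decide when to stop. The sketch below only illustrates that control flow; the recognizer, segmenter, and scoring functions are hypothetical stand-ins for the paper's phoneme-level TDNN, not its implementation.

```python
# Minimal sketch of the train / re-segment / retrain loop, with hypothetical
# stand-ins for the TDNN recognizer and the segmentation and scoring steps.
import random

def train_recognizer(segmented_data):
    """Hypothetical training step; returns a 'model' (here, just the data)."""
    return {"train_set": segmented_data}

def resegment(model, utterances):
    """Hypothetical automatic segmentation: split each utterance evenly
    into as many regions as it has phoneme labels."""
    segmented = []
    for samples, phonemes in utterances:
        step = max(1, len(samples) // len(phonemes))
        bounds = [i * step for i in range(len(phonemes))] + [len(samples)]
        segmented.append([(samples[bounds[i]:bounds[i + 1]], p)
                          for i, p in enumerate(phonemes)])
    return segmented

def cv_score(model, held_out):
    """Hypothetical cross-validation score in [0, 1]."""
    return random.random()

def bootstrap_train(utterances, held_out, max_iters=10):
    segmentation = resegment(None, utterances)       # initial rough segmentation
    best, best_model = -1.0, None
    for _ in range(max_iters):
        model = train_recognizer(segmentation)
        score = cv_score(model, held_out)
        if score <= best:                            # cross-validation controls stopping
            break
        best, best_model = score, model
        segmentation = resegment(model, utterances)  # re-segment with the new model
    return best_model
```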

A Study on English Reduced Vowels Produced by Korean Learners and Native Speakers of English (한국인 영어학습자와 영어원어민이 발화한 영어 약화모음에 관한 연구)

  • Shin, Seung-Hoon; Yoon, Nam-Hee; Yoon, Kyu-Chul
    • Phonetics and Speech Sciences / v.3 no.4 / pp.45-53 / 2011
  • Flemming and Johnson (2007) claim that there is a fundamental distinction between the mid central vowel [ə] and the high central vowel [ɨ], in that [ə] occurs in unstressed word-final position while [ɨ] appears elsewhere. Unlike their English counterparts, Korean [ə] and [ɨ] are full vowels and contrast phonemically. The purpose of this paper is to explore the acoustic quality of the two English reduced vowels produced by Korean learners and native speakers of English in terms of their first two formant frequencies. Sixteen Korean learners of English and six native speakers of English produced four types of English words and two types of Korean words with different phonological and morphological patterns. The results show that Korean learners of English produced the two reduced vowels of English and their Korean counterparts differently in Korean and English words.

  • PDF
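
The formant measurements underlying this kind of study can be illustrated with a short script. The sketch below assumes the Praat-based `parselmouth` Python package, a hypothetical WAV file, and a hand-marked vowel midpoint; the authors' actual measurement procedure is not specified in the abstract.

```python
# Measure F1 and F2 of a vowel at a given time point using Praat's Burg
# formant tracker via parselmouth; file name and time are placeholders.
import parselmouth

def vowel_formants(wav_path: str, vowel_midpoint_s: float):
    sound = parselmouth.Sound(wav_path)
    formant = sound.to_formant_burg()                        # Burg-method formant tracking
    f1 = formant.get_value_at_time(1, vowel_midpoint_s)      # first formant (Hz)
    f2 = formant.get_value_at_time(2, vowel_midpoint_s)      # second formant (Hz)
    return f1, f2

# Example: F1/F2 at the midpoint of a word-final schwa (times are made up).
# print(vowel_formants("sofa_token01.wav", 0.42))
```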

Parallel sound change between segmental and suprasegmental properties: An individual level observation

  • Lee, Hyunjung
    • Phonetics and Speech Sciences / v.8 no.4 / pp.23-29 / 2016
  • The present study tested whether individual speakers showing substantial sound change in segments (i.e., vowels and fricatives) also had innovative changing patterns in suprasegmental properties (i.e., lexical pitch accents) in Kyungsang Korean. An acoustic analysis at the group level first confirmed group-level differences in distinguishing /ɨ-ʌ/ and /s-s'/, both of which differ in phonemic distinctiveness from Seoul Korean. Younger speakers showed more innovative segmental change than older speakers, and within the younger generation, female speakers produced more innovative phonetic variants than male speakers. Regarding the individual-level observation within the younger group, the younger speakers with a large acoustic distinction in vowels and fricatives also showed acoustically less distinct accent patterns, indicating that the innovative sound change pattern is consistent across segmental and suprasegmental properties. The group and individual observations suggest that linguistic innovators introduce new phonetic variants with a consistent degree of change across segmental and suprasegmental properties.
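
One simple way to quantify how clearly an individual speaker separates a vowel pair such as /ɨ-ʌ/ is the Euclidean distance between the two categories' mean positions in F1-F2 space. The sketch below uses that measure with made-up token values; it is not necessarily the metric used in the paper.

```python
# Per-speaker acoustic separation of two vowel categories in F1-F2 space.
import numpy as np

def vowel_separation(f1f2_vowel_a: np.ndarray, f1f2_vowel_b: np.ndarray) -> float:
    """Each argument is an (n_tokens, 2) array of [F1, F2] values in Hz."""
    return float(np.linalg.norm(f1f2_vowel_a.mean(axis=0) - f1f2_vowel_b.mean(axis=0)))

# Hypothetical tokens from one speaker:
eu = np.array([[420.0, 1500.0], [435.0, 1520.0], [410.0, 1480.0]])   # /ɨ/
eo = np.array([[580.0, 1050.0], [600.0, 1100.0], [590.0, 1080.0]])   # /ʌ/
print(vowel_separation(eu, eo))   # larger value = more distinct production
```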

A Speech Representation and Recognition Method using Sign Patterns (부호패턴에 의한 음성표현과 인식방법)

  • Kim Young Hwa; Kim Un Il; Lee Hee Jeong; Park Byung Chul
    • The Journal of the Acoustical Society of Korea / v.8 no.5 / pp.86-94 / 1989
  • In this paper, a method using the sign pattern (+, -) of mel-cepstrum coefficients as a new speech representation is proposed. Relatively stable patterns can be obtained for speech signals with strong stationarity, such as vowels and nasals, and the phonemic differences arising from speaker individuality can be absorbed without affecting the characteristics of the phoneme. We show that both the phoneme recognition procedure and the training procedure for phoneme models can be reduced by representing Korean phonemes with such sign patterns.

  • PDF
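
The core idea, keeping only the signs of the cepstral coefficients, can be sketched in a few lines. The example below assumes `librosa` and a 12-coefficient MFCC analysis as stand-ins for the paper's mel-cepstrum computation.

```python
# Frame-by-frame sign pattern of mel-cepstral coefficients.
import librosa
import numpy as np

def sign_pattern(wav_path: str, n_coeffs: int = 12) -> np.ndarray:
    y, sr = librosa.load(wav_path, sr=None)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_coeffs)   # shape (n_coeffs, n_frames)
    return (mfcc > 0).astype(np.int8)    # 1 for '+', 0 for '-' per coefficient and frame

# Stable vowels and nasals should yield runs of frames with identical sign columns,
# which is what makes the pattern usable as a compact phoneme representation.
```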

Modeling Cross-morpheme Pronunciation Variations for Korean Large Vocabulary Continuous Speech Recognition (한국어 연속음성인식 시스템 구현을 위한 형태소 단위의 발음 변화 모델링)

  • Chung Minhwa; Lee Kyong-Nim
    • MALSORI / no.49 / pp.107-121 / 2004
  • In this paper, we describe a cross-morpheme pronunciation variation model which is especially useful for constructing a morpheme-based pronunciation lexicon to improve the performance of a Korean LVCSR system. Many pronunciation variations occur at morpheme boundaries in continuous speech. Since phonemic context, together with morphological category and morpheme boundary information, affects Korean pronunciation variations, we distinguish phonological rules that apply to phonemes within a morpheme from those that apply across morpheme boundaries. The results of 33K-morpheme Korean CSR experiments show that an absolute reduction of 1.45% in WER from the baseline of 18.42% WER was achieved by modeling the proposed pronunciation variations with a context-dependent multiple-pronunciation lexicon.

  • PDF
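
A pronunciation lexicon of this kind can be built by applying phonological rules at morpheme boundaries to generate variants. The toy sketch below uses a single illustrative rule (obstruent nasalization before a nasal, as in 국+물 pronounced [궁물]) and romanized phoneme symbols; it is not the paper's rule inventory.

```python
# Generate cross-morpheme pronunciation variants for a lexicon entry using
# one toy phonological rule: a final obstruent nasalizes before a nasal.
NASALIZATION = {"k": "ng", "p": "m", "t": "n"}

def cross_morpheme_variants(morph1: list[str], morph2: list[str]) -> list[list[str]]:
    """Return the canonical concatenation plus any boundary variant."""
    variants = [morph1 + morph2]
    if morph2 and morph2[0] in ("n", "m") and morph1 and morph1[-1] in NASALIZATION:
        variants.append(morph1[:-1] + [NASALIZATION[morph1[-1]]] + morph2)
    return variants

# 국 + 물 ("guk" + "mul"): canonical /k u k  m u l/ and nasalized /k u ng  m u l/
print(cross_morpheme_variants(["k", "u", "k"], ["m", "u", "l"]))
```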

Comparing the Intelligibility of Spastic and Flaccid Types (경직형과 이완형 마비말장애의 명료도 비교)

  • Kim Soo-Jin
    • MALSORI / no.48 / pp.1-17 / 2003
  • Among the types of dysarthria, the spastic and flaccid types are the most prominent manifestations. The objectives of the present research are (1) to discover the phonetic contrasts that differentiate spastic from flaccid dysarthria, and (2) to analyze and compare the degree to which each phonetic contrast predicts intelligibility in spastic and flaccid dysarthria. The 'phonemic contrast word intelligibility pairs' for dysarthric speakers were tested and proved useful for clinical assessment of and research on dysarthria. In the spastic group, the initial fricative vs. affricate and front vs. back vowel contrasts were transmitted relatively less effectively than in the flaccid group. In the flaccid group, the initial glottal vs. null contrast was transmitted less effectively than in the spastic group. The overall intelligibility of spastic dysarthria was predicted by multiple regression analysis with 88% accuracy from three phonetic contrasts (initial fricative vs. affricate; front vs. back vowels; initial consonant correlates), and the intelligibility of flaccid dysarthria was predicted from two phonetic contrasts (initial nasal vs. stop; front vs. back vowels) with 60% accuracy.

  • PDF
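
The prediction step reported above is an ordinary multiple regression of intelligibility on contrast-transmission scores. The sketch below shows that setup with `scikit-learn` and fabricated placeholder numbers; the predictor names follow the abstract, but none of the values are from the study.

```python
# Multiple regression of overall intelligibility on phonetic-contrast scores.
import numpy as np
from sklearn.linear_model import LinearRegression

# Rows: speakers. Columns: transmission scores for three contrasts
# (fricative vs. affricate, front vs. back vowel, initial consonant correlate).
X = np.array([[0.62, 0.71, 0.80],
              [0.45, 0.58, 0.66],
              [0.78, 0.83, 0.90],
              [0.50, 0.64, 0.72],
              [0.70, 0.69, 0.81],
              [0.55, 0.75, 0.68]])
y = np.array([0.68, 0.49, 0.85, 0.58, 0.74, 0.61])   # overall word intelligibility

model = LinearRegression().fit(X, y)
print(model.coef_, model.intercept_)
print("R^2 =", model.score(X, y))   # variance explained by the contrast scores
```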

Duration of bodies and rhymes in Korean and English syllables (한국어와 영어 음절의 지속시간에 대한 비교연구 -음절체와 각운을 중심으로-)

  • Paik Euna; Noh Dongwoo; Jeong Okran; Kang Sookyoon
    • Proceedings of the KSPS conference / 2003.10a / pp.169-172 / 2003
  • The purpose of this study was to provide preliminary data on the acoustic differences of one-syllable words spoken by speakers with different language backgrounds. Twenty native speakers of Korean and English were asked to read 7 one-syllable words written in their native language. The phonetic and phonemic characteristics of the 7 words were similar across the two languages. The ratios of the durations of the body (onset + nucleus) and the rhyme (nucleus + coda) to the duration of each syllable were calculated using CSL (Computerized Speech Laboratory). The results correspond to the body-coda structure of the Korean syllable, which is supported by recent experimental psychological studies. More acoustic studies on Korean syllable structure are required to establish a clinical foundation for phonological awareness and reading intervention programs.

  • PDF
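
The reported measure is simple arithmetic once segment boundaries are marked: body and rhyme durations divided by the syllable duration. The sketch below assumes boundaries in seconds for one hypothetical CVC token; the study itself used CSL for the measurements.

```python
# Body/syllable and rhyme/syllable duration ratios from marked segment boundaries.
def body_rhyme_ratios(onset: tuple[float, float],
                      nucleus: tuple[float, float],
                      coda: tuple[float, float]) -> tuple[float, float]:
    syllable = coda[1] - onset[0]
    body = nucleus[1] - onset[0]     # onset + nucleus
    rhyme = coda[1] - nucleus[0]     # nucleus + coda
    return body / syllable, rhyme / syllable

# Hypothetical boundaries (in seconds) for one CVC token:
print(body_rhyme_ratios((0.00, 0.08), (0.08, 0.22), (0.22, 0.30)))
```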

A Harmony in Language and Music (언어와 음악의 상관관계 고찰을 위한 연구)

  • 이재강
    • Lingua Humanitatis / v.2 no.1 / pp.287-301 / 2002
  • Whether in music or in language, sound plays its role by occupying multiple fixed spaces in one's consciousness. Music space differs from auditory space, whose aim is to perceive the positions and identities of outer things. While auditory space is based on interest in outer things, music space is based on indifference to them. We discuss the notion of space because it is where symbols reside. Categorical perception, as in phonemic restoration, describes a listener's ability to use his own intelligence to recognize and fill in missing parts; musical perception, by contrast, can be explained as a positive regression that avoids colloquial logic and the danger of segmentation in the course of an infant's auditory experience and phonation acquisition. On the question of whether listening to language sounds differs from listening to other sounds, one view holds that the auditory mechanism processes language sounds in the same way as other types of sound, but other theories claim that the brain processes the former differently from the latter. The function of music has not been clarified as fully as that of language; music carries many more meanings in comparison with language.

  • PDF

An Implementation Method for The Phonemic and Syllabic Character Attributes of Hangul Character (한글 문자의 음소 및 음절 문자 특성의 구현 방안)

  • Byun, Jeong-Yong; Kang, Jin-Gon
    • Annual Conference on Human and Language Technology / 1994.11a / pp.288-294 / 1994
  • According to the Hunminjeongeum Haerye, the Hangul writing system has the properties of both a phonemic and a syllabic script. We analyze the various problems that have arisen in implementing these properties in computer systems, and then seek solutions by developing computing techniques that do not constrain the characteristics of Hangul. This paper evaluates existing code systems against the phonemic and syllabic character properties of Hangul set out in the Hunminjeongeum Haerye, and proposes an implementation method for them. We have also developed a Hangul input/output tool, '셔블', that reflects these properties and attempted to verify it.

  • PDF
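
The phonemic-versus-syllabic duality of Hangul shows up directly in modern encodings: every precomposed syllable in the Unicode block U+AC00-U+D7A3 decomposes arithmetically into an initial, a medial, and an optional final jamo. The sketch below uses the standard Unicode formula, not the encoding scheme evaluated or proposed in the paper.

```python
# Decompose a precomposed Hangul syllable into its constituent jamo (phonemes)
# using the standard Unicode arithmetic: index = (initial*21 + medial)*28 + final.
CHOSEONG = "ㄱㄲㄴㄷㄸㄹㅁㅂㅃㅅㅆㅇㅈㅉㅊㅋㅌㅍㅎ"
JUNGSEONG = "ㅏㅐㅑㅒㅓㅔㅕㅖㅗㅘㅙㅚㅛㅜㅝㅞㅟㅠㅡㅢㅣ"
JONGSEONG = [""] + list("ㄱㄲㄳㄴㄵㄶㄷㄹㄺㄻㄼㄽㄾㄿㅀㅁㅂㅄㅅㅆㅇㅈㅊㅋㅌㅍㅎ")

def decompose(syllable: str) -> tuple[str, str, str]:
    index = ord(syllable) - 0xAC00
    assert 0 <= index <= 11171, "not a precomposed Hangul syllable"
    return (CHOSEONG[index // (21 * 28)],
            JUNGSEONG[(index % (21 * 28)) // 28],
            JONGSEONG[index % 28])

print(decompose("한"))   # ('ㅎ', 'ㅏ', 'ㄴ')
```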

SPEECH TRAINING TOOLS BASED ON VOWEL SWITCH/VOLUME CONTROL AND ITS VISUALIZATION

  • Ueda, Yuichi; Sakata, Tadashi
    • Proceedings of the Korean Society of Broadcast Engineers Conference / 2009.01a / pp.441-445 / 2009
  • We have developed a real-time software tool to extract a speech feature vector whose time sequence consists of three groups of components: phonetic/acoustic features such as formant frequencies, phonemic features output by neural networks, and distances to Japanese phonemes. Among these features, since the phoneme distances for the five Japanese vowels can express vowel articulation, we have designed a switch, a volume control, and a color representation that are operated by pronouncing vowel sounds. As examples of this vowel interface, we have developed speech training tools that display an image character or a rolling color ball and control a cursor's movement for aurally or vocally handicapped children. In this paper, we introduce the functions and principles of these systems.

  • PDF
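
A vowel-operated switch of the kind described can be approximated by comparing an incoming frame's features against per-vowel references and firing when the target vowel is close enough. The sketch below does this with (F1, F2) pairs, rough textbook reference values, and an arbitrary threshold; the authors' system uses phoneme distances from their own feature extractor, so this is only an analogy.

```python
# A simple "vowel switch": classify an (F1, F2) frame by nearest reference vowel
# and turn the switch on when the target vowel is matched closely enough.
import math

REFERENCE_F1F2 = {        # very rough average formant values in Hz (illustrative only)
    "a": (730, 1090), "i": (270, 2290), "u": (300, 870),
    "e": (530, 1840), "o": (570, 840),
}

def closest_vowel(f1: float, f2: float) -> tuple[str, float]:
    return min(((v, math.dist((f1, f2), ref)) for v, ref in REFERENCE_F1F2.items()),
               key=lambda item: item[1])

def vowel_switch(f1: float, f2: float, target: str = "a", threshold: float = 250.0) -> bool:
    vowel, distance = closest_vowel(f1, f2)
    return vowel == target and distance < threshold

print(vowel_switch(710, 1100))   # True: the frame is closest to /a/, so the switch fires
```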