• Title/Summary/Keyword: Phoneme

The Lexical Access of Regular and Irregular Korean Verbs in the Mental Lexicon (한국어 규칙 동사와 불규칙 동사의 심성 어휘집 접근 과정)

  • Park, Hee-Jin;Koo, Min-Mo;Nam, Ki-Chun
    • Korean Journal of Cognitive Science / v.23 no.1 / pp.1-23 / 2012
  • This study investigated how inflected Korean verbs are accessed in the mental lexicon. Korean verbal inflections fall into two main classes, regular and irregular, which can be further divided into three regular types and two irregular types. A masked priming lexical decision task was used, and priming effects were compared across the five types of verbal inflection: (1) no-change regular (regular verbs with no orthographic or phonological changes), (2) phonological-change regular (regular verbs with phonological changes to the stem only), (3) orthographic-change regular (regular verbs that undergo only orthographic changes), (4) stem-change irregular (irregular verbs in which a phoneme of the stem is omitted or alternates with another phoneme), and (5) ending-change irregular (irregular verbs whose endings change through phoneme substitution). The first three types are regarded as regular verbal inflections, whereas the latter two are regarded as irregular verbal inflections. The infinitive forms of the verbs were presented as targets, preceded by primes in one of three conditions: a regular verbal inflection, an irregular verbal inflection, or a control condition with morphologically and semantically unrelated primes. In addition, stimulus onset asynchrony (SOA) was manipulated (43 ms, 72 ms, 230 ms) to examine the time course of morphological decomposition in word recognition. The results revealed significant priming effects at all three SOAs across conditions, but no significant differences in priming effects between the regular and irregular inflection conditions. This may suggest that Korean verb processing does not use different processing routes for regular and irregular inflections, which may also indicate that morphological information is processed early for Korean verbs.

Knowledge based Text to Facial Sequence Image System for Interaction of Lecturer and Learner in Cyber Universities (가상대학에서 교수자와 학습자간 상호작용을 위한 지식기반형 문자-얼굴동영상 변환 시스템)

  • Kim, Hyoung-Geun;Park, Chul-Ha
    • The KIPS Transactions:PartB / v.15B no.3 / pp.179-188 / 2008
  • This paper studies a knowledge-based text-to-facial-sequence-image system for interaction between lecturers and learners in cyber universities. The system synthesizes a facial image sequence whose lip movements are synchronized with the text, based on the grammatical characteristics of Hangul. To implement the system, the paper proposes a method for transforming text into phoneme codes, deformation rules for the mouth shape that depend on the phoneme codes, and a method for synthesizing the facial image sequence using those deformation rules. In the proposed method, all Hangul syllables are represented by 10 principal mouth shapes and 78 compound mouth shapes, derived from the pronunciation characteristics of the basic consonants and vowels and from the articulation rules, respectively. To synthesize the facial image sequence in real time on a PC, the 88 mouth shapes are stored in a database and retrieved, rather than synthesized for each frame. To verify the validity of the proposed method, various facial image sequences were synthesized from text, and a system that runs on a PC was implemented using the proposed method.
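
The abstract above does not specify its phoneme code, so the following is only a rough sketch of one plausible first step in a text-to-phoneme transformation for Hangul: splitting each precomposed syllable into initial, medial, and final jamo indices using the arithmetic layout of the Unicode Hangul Syllables block. The function names `decompose_syllable` and `text_to_jamo_codes` are hypothetical, and no pronunciation or articulation rules are applied.

```python
# Sketch only: Unicode-based jamo decomposition, not the paper's phoneme code.
HANGUL_BASE = 0xAC00   # code point of the first precomposed syllable, "가"
NUM_INITIALS = 19      # initial consonants
NUM_MEDIALS = 21       # vowels
NUM_FINALS = 28        # 27 final consonants plus "no final"
NUM_SYLLABLES = NUM_INITIALS * NUM_MEDIALS * NUM_FINALS  # 11172


def decompose_syllable(ch):
    """Return (initial, medial, final) jamo indices for one Hangul syllable."""
    offset = ord(ch) - HANGUL_BASE
    if not 0 <= offset < NUM_SYLLABLES:
        raise ValueError(f"not a precomposed Hangul syllable: {ch!r}")
    initial, rest = divmod(offset, NUM_MEDIALS * NUM_FINALS)
    medial, final = divmod(rest, NUM_FINALS)
    return initial, medial, final


def text_to_jamo_codes(text):
    """Decompose every Hangul syllable in text; other characters are skipped."""
    codes = []
    for ch in text:
        if 0 <= ord(ch) - HANGUL_BASE < NUM_SYLLABLES:
            codes.append(decompose_syllable(ch))
    return codes
```

For example, `decompose_syllable("한")` yields `(18, 0, 4)`, the indices of ㅎ, ㅏ, and ㄴ. A lookup from such index triples to mouth shapes would be a separate table, which the abstract describes only in outline.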

Conformer with lexicon transducer for Korean end-to-end speech recognition (Lexicon transducer를 적용한 conformer 기반 한국어 end-to-end 음성인식)

  • Son, Hyunsoo;Park, Hosung;Kim, Gyujin;Cho, Eunsoo;Kim, Ji-Hwan
    • The Journal of the Acoustical Society of Korea / v.40 no.5 / pp.530-536 / 2021
  • Recently, owing to advances in deep learning, end-to-end speech recognition, which directly maps speech signals to graphemes, has shown good performance. Among end-to-end models, the conformer shows the best performance. However, end-to-end models focus only on the probability of which grapheme will appear at each time step. Decoding uses greedy search or beam search, which is easily affected by the final probabilities output by the model. In addition, end-to-end models cannot use external pronunciation and language information because of structural limitations. Therefore, this paper proposes a conformer with a lexicon transducer. A phoneme-based model with the lexicon transducer is compared with a grapheme-based model with beam search. The test set consists of words that do not appear in the training data. The grapheme-based conformer with beam search shows a character error rate (CER) of 3.8 %, and the phoneme-based conformer with the lexicon transducer shows a CER of 3.4 %.
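
The abstract reports CER without stating how it was computed. As a hedged illustration, the sketch below uses the common definition: the Levenshtein edit distance between reference and hypothesis character sequences, divided by the reference length. The function names `edit_distance` and `cer` are hypothetical, and any text normalization the authors may have applied (for example, whitespace handling) is not modeled.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (unit costs)."""
    prev = list(range(len(hyp) + 1))  # distances from an empty reference
    for i, r in enumerate(ref, start=1):
        curr = [i]  # deleting the first i reference items
        for j, h in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if r == h else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1]


def cer(reference, hypothesis):
    """Character error rate: edit distance divided by reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)
```

Under this definition, one substituted character in a four-character reference gives a CER of 0.25, so a CER of 3.4 % corresponds to roughly one character error per 29 reference characters.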

Phonological Status of Korean /w/: Based on the Perception Test

  • Kang, Hyun-Sook
    • Phonetics and Speech Sciences / v.4 no.3 / pp.13-23 / 2012
  • The sound /w/ has traditionally been regarded as an independent segment in Korean regardless of the phonological contexts in which it occurs. There have been, however, some questions regarding whether it is an independent phoneme in the /CwV/ context (cf. Kang 2006). The present pilot study examined how Korean /w/ is realized in the $/S^*wV/$ context by means of perception tests. Our assumption was that if Korean /w/ is a part of the preceding complex consonant, like $/C^w/$, it should be more or less uniformly articulated and perceived as such. If /w/ is an independent segment, it will be realized with speaker variability. Experiments I and II examined the rates at which the spliced original stimuli $/S^*-V/$ and $/S^{w*}-^wV/$ and the cross-spliced stimuli $/S^{w*}-V/$ and $/S^*-^wV/$ were identified as "labialized". The results showed that the round qualities of /w/ are perceived at significantly different temporal points, with speaker and context variability. We therefore conclude that /w/ in the $/S^*wV/$ context is an independent segment, not a part of the preceding segment. A full-scale production study should be performed in the future to verify the conclusion suggested in this paper.

Alveolar Fricative Sound Errors by the Type of Morpheme in the Spontaneous Speech of 3- and 4-Year-Old Children (자발화에 나타난 형태소 유형에 따른 3-4세 아동의 치경마찰음 오류)

  • Kim, Soo-Jin;Kim, Jung-Mee;Yoon, Mi-Sun;Chang, Moon-Soo;Cha, Jae-Eun
    • Phonetics and Speech Sciences / v.4 no.3 / pp.129-136 / 2012
  • Korean alveolar fricatives are late-developing speech sounds. Most previous research on phonemes used individual words or pseudowords to elicit sounds, but word-level phonological analysis does not always reflect a child's practical articulation ability. Also, there has been limited research on articulation development that looks at speech production by grammatical morpheme, despite its importance in the Korean language. Therefore, this research examines the articulation development and phonological patterns of the /s/ phoneme by morphological type in children's spontaneous conversational speech. The subjects were twenty-two typically developing 3- and 4-year-old Korean children. All children showed normal levels in three screening tests: hearing, vocabulary, and articulation. Spontaneous conversational samples were recorded at the children's homes. The results are as follows. Error rates decreased with increasing age in all morphological contexts. Also, error percentages within an age group were significantly lower in lexical morphemes than in grammatical morphemes. Stopping of fricative sounds was the main error pattern in all morphological contexts and decreased as age increased. This research shows that articulation performance can differ significantly by morphological context. The present study provides data that can be used to identify difficult contexts for the articulatory evaluation and therapy of alveolar fricative sounds.

Recurrent Neural Network with Backpropagation Through Time Learning Algorithm for Arabic Phoneme Recognition

  • Ismail, Saliza;Ahmad, Abdul Manan
    • Institute of Control, Robotics and Systems: Conference Proceedings (제어로봇시스템학회:학술대회논문집) / 2004.08a / pp.1033-1036 / 2004
  • Speech recognition and understanding have been studied for many years. In this paper, we propose a new type of recurrent neural network architecture for speech recognition, in which each output unit is connected to itself and is also fully connected to the other output units and all hidden units [1]. We also propose a learning algorithm for the recurrent neural network, Backpropagation Through Time (BPTT), which is well suited to recurrent networks. The aim of the study is to observe the differences among the letters of the Arabic alphabet, from "alif" to "ya". The purpose of this research is to improve people's knowledge and understanding of Arabic letters and words by using a Recurrent Neural Network (RNN) and the BPTT learning algorithm. The network is trained on 4 speakers (a mixture of male and female) recorded in a quiet environment. Neural networks are well known for their ability to classify nonlinear problems. Today, much research applies neural networks to speech recognition [2], including for Arabic. The Arabic language offers a number of challenges for speech recognition [3]. Even though positive results have been obtained from continued study, research on minimizing the error rate still attracts much attention. This research uses a Recurrent Neural Network, one neural network technique, to observe the differences among the letters from "alif" to "ya".

Duration of the Japanese 'sokuon' and 'haneruon' in Korean and Japanese pronunciation (촉음과 발음에 관한 한국인과 일본인의 지속시간 연구)

  • Lee Jae Kang
    • Proceedings of the Acoustical Society of Korea Conference / autumn / pp.325-328 / 1999
  • The aim of this paper is to measure the duration of the Japanese 'sokuon' [t/k] and 'haneruon' [m/n] as pronounced by Korean and Japanese speakers. The study shows that gemination of the Japanese 'sokuon' lasts 1.5 times as long as a single consonant in Korean pronunciation, whereas it lasts 2 times as long in Japanese pronunciation. The difference between Korean and Japanese speakers seems to show the difficulty of perceiving and learning a foreign rhythmic pattern that does not exist in the learner's language. Gemination of the [s] phoneme lasts 2 times as long as a single consonant in both Korean and Japanese pronunciation. On average, the duration of the Japanese 'sokuon' [t/k/s] is 1.7 times that of a single consonant in Korean pronunciation, whereas it is 2 times that of a single consonant in Japanese pronunciation. The pronunciation of the Japanese 'haneruon' by either Korean or Japanese speakers produces a similar result: 1) gemination lasts longer than a single consonant; 2) the duration of the single [m] is longer than that of the single [n]; 3) gemination of [n] is 3 times as long as a single [n], whereas gemination of [m] is 2 times as long as a single [m].

An acoustic study of word-timing with references to Korean (한국어 분류에 관한 음향음성학적 연구)

  • 김대원
    • Proceedings of the Acoustical Society of Korea Conference / 1994.06c / pp.323-327 / 1994
  • There have been three contrastive claims about the classification of Korean. To answer the classification question, the timing variables that determine the durations of the syllable, word, and foot were investigated with various words, either in isolation or in sentence contexts, using Soundcoup/16 on a Macintosh PC. A total of 284 utterances, obtained from six Korean speakers, were used. It was found 1) that the durational pattern for words tended to be maintained in utterances, regardless of position, subject, and dialect; 2) that syllable duration was determined both by the types of phonemes and by the number of phonemes, word duration both by syllable complexity and by the number of syllables, and foot duration by word complexity; 3) that there was a constractive relationship between foot length in syllables and foot duration; and 4) that foot duration generally varied with word complexity if the same word did not occur in both the first foot and the second foot. On this basis, it was concluded that Korean is a word-timed language in which, all else being equal (including tempo, emphasis, etc.), the inherent durational pattern for words tends to be maintained in utterances. The main differences between stress timing, syllable timing, and word timing were also discussed.

Implementation of Automatic Phoneme Labelling System Using Context-dependent Demi-phone Unit and Performance Evaluation (문맥종속 반음소단위에 의한 자동 음운 레이블링 시스템의 구현 및 성능평가)

  • Park Soon-Cheol;Kim Tae-Hwan;Kim Bong-Wan;Lee Yong-Ju
    • Proceedings of the Acoustical Society of Korea Conference / autumn / pp.65-70 / 1999
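  • Databases labeled at the phoneme level are very important for speech research. However, because manual phoneme segmentation and labeling require a great deal of time and effort, automatic phoneme segmentation and labeling systems have been widely studied. The authors previously proposed an automatic phoneme segmentation and labeling system whose segmentation unit is a context-dependent demi-phone model that combines the advantages of monophones and triphones [1]. In this paper, the demi-phone unit is refined to improve the performance of that system. The previously proposed demi-phone unit split each phoneme at its midpoint into left and right demi-phones. In this paper, when a phoneme is 120 ms or longer, the first and last 60 ms of the phoneme are taken as the front and rear demi-phones, so that the characteristics of the phoneme's transition regions are well represented, and the remaining stable region is modeled separately. To evaluate the proposed demi-phone unit, the labeling system was trained on data from 30 male speakers uttering the 452 PBW words and tested on data from 4 male speakers not used in training. In the experiments, compared with the existing demi-phone unit, performance improved by 1.65 % to 69.09 % at 10 ms and by 1.02 % to 85.32 % at 20 ms.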
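
The unit definition in this abstract (a midpoint split for short phonemes; fixed 60 ms edge units plus a separate stable region for phonemes of 120 ms or more) can be sketched as follows. This is an illustrative reading of the abstract, not the authors' implementation. The function name `split_demiphones`, the segment labels, and the choice to omit an empty stable region when a phoneme is exactly 120 ms long are all assumptions.

```python
def split_demiphones(start_ms, end_ms, edge_ms=60.0, long_ms=120.0):
    """Split one phoneme interval into demi-phone segments.

    Returns a list of (label, segment_start_ms, segment_end_ms) tuples.
    The defaults follow the 60 ms / 120 ms values given in the abstract.
    """
    if end_ms <= start_ms:
        raise ValueError("end_ms must be greater than start_ms")
    duration = end_ms - start_ms

    # Short phoneme: split at the midpoint into left and right demi-phones.
    if duration < long_ms:
        mid = start_ms + duration / 2.0
        return [("left", start_ms, mid), ("right", mid, end_ms)]

    # Long phoneme: fixed-length edges capture the transition regions.
    segments = [("left", start_ms, start_ms + edge_ms)]
    if duration > 2 * edge_ms:
        # Assumption: the stable region is emitted only when non-empty.
        segments.append(("stable", start_ms + edge_ms, end_ms - edge_ms))
    segments.append(("right", end_ms - edge_ms, end_ms))
    return segments
```

With these defaults, a 100 ms phoneme is split into two 50 ms halves, while a 200 ms phoneme yields a 60 ms left unit, an 80 ms stable unit, and a 60 ms right unit.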

The Automated Threshold Decision Algorithm for Node Split of Phonetic Decision Tree (음소 결정트리의 노드 분할을 위한 임계치 자동 결정 알고리즘)

  • Kim, Beom-Seung;Kim, Soon-Hyob
    • The Journal of the Acoustical Society of Korea / v.31 no.3 / pp.170-178 / 2012
  • In this paper, a phonetic decision tree of triphone units was built for phoneme-based speech recognition of the 640 stations operated by Korail. The clustering rate was determined by Pearson and regression analysis in order to decide the threshold used in node splitting. Using the determined clustering rate, thresholds are decided automatically according to the average clustering rate. In recognition experiments to verify the proposed method, performance improved by 1.4 to 2.3 % (absolute) over the baseline system.