• Title/Abstract/Keywords: syllables

Research on Subword Tokenization of Korean Neural Machine Translation and Proposal for Tokenization Method to Separate Jongsung from Syllables

  • 어수경;박찬준;문현석;임희석
    • 한국융합학회논문지 / Vol. 12, No. 3 / pp.1-7 / 2021
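  • Neural machine translation (NMT) uses only a limited number of words for translation, so words that are not registered in the vocabulary may appear in the input. Subword tokenization was devised to mitigate this out-of-vocabulary (OOV) problem: it splits a sentence into subword units smaller than words and composes words from those units. This paper covers the common subword tokenization algorithms and, to build a vocabulary that handles the unbounded conjugation of Korean predicates well, proposes a new method that separates the final consonant (jongsung) from Korean syllables before subword tokenization is learned. In experiments, the proposed method achieved higher performance than existing subword tokenization methods.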
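The abstract gives no implementation details, so the following is only a minimal sketch of the kind of preprocessing it describes: detaching the final consonant (jongsung) from each precomposed Hangul syllable before a subword learner is trained. The function name `split_jongsung` and the choice to emit the final as a conjoining jamo are assumptions made for illustration, not the authors' method. The arithmetic relies on the Unicode layout of Hangul syllables, in which each leading-consonant and vowel pair is followed by its 27 variants with a final consonant.

```python
HANGUL_BASE = 0xAC00   # first precomposed Hangul syllable
HANGUL_LAST = 0xD7A3   # last precomposed Hangul syllable
NUM_FINALS = 28        # 27 final consonants plus the "no final" slot
FINAL_BASE = 0x11A7    # conjoining final jamo = FINAL_BASE + final index


def split_jongsung(text: str) -> str:
    """Return text with each syllable's final consonant detached.

    Hypothetical helper: a syllable with a final becomes the matching
    open syllable followed by the final as a separate jamo character.
    Everything else is passed through unchanged.
    """
    out = []
    for ch in text:
        code = ord(ch)
        if HANGUL_BASE <= code <= HANGUL_LAST:
            final_index = (code - HANGUL_BASE) % NUM_FINALS
            if final_index:
                out.append(chr(code - final_index))           # open syllable
                out.append(chr(FINAL_BASE + final_index))     # detached final
                continue
        out.append(ch)
    return "".join(out)
```

Because the detached finals are standard conjoining jamo, applying Unicode NFC normalization to the output should recompose the original syllables, which makes the step reversible after translation.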

Distinguishing features and variability of intonation patterns in Korean phonological phrases: The effects of syllable count and segmental content

  • 오재혁
    • 말소리와 음성과학 / Vol. 14, No. 3 / pp.27-40 / 2022
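  • As part of an effort to identify the distinguishing features and variation conditions of intonation patterns in Korean phonological phrases, this study examined how two phonological conditions, syllable count and segment type, affect phonological-phrase intonation. Taking four syllables as the reference, LHLH can be set as the basic pattern of phonological-phrase intonation, with syllable count and segment type acting as the conditions that produce variation. Syllable count changes the intonation from a curve to a straight line, and the threshold is three syllables or fewer. Segments affect the pitch band and pitch movement: the first segment affects the pitch band in which the phrase intonation is formed, and the following segments affect pitch movement. If the first segment carries [+aspirated], [+tense], [+continuant], the intonation is formed in the high band; otherwise it is formed in the low band. In intonation realized in the high band, a second or later segment that carries [-aspirated], [-tense], [-continuant] lowers the pitch to the bottom of the low band, whereas in intonation realized in the low band, a segment that carries [+aspirated], [+tense], [+continuant] blocks the second fall of LHLH.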

Comparison of McGurk Effect across Three Consonant-Vowel Combinations in Kannada

  • Devaraju, Dhatri S;U, Ajith Kumar;Maruthy, Santosh
    • Journal of Audiology & Otology / Vol. 23, No. 1 / pp.39-48 / 2019
  • Background and Objectives: The influence of visual stimulus on the auditory component in the perception of auditory-visual (AV) consonant-vowel syllables has been demonstrated in different languages. Inherent properties of unimodal stimuli are known to modulate AV integration. The present study investigated how the amount of McGurk effect (an outcome of AV integration) varies across three different consonant combinations in the Kannada language. The importance of unimodal syllable identification for the amount of McGurk effect was also examined. Subjects and Methods: Twenty-eight individuals performed an AV identification task with ba/ga, pa/ka and ma/ṇa consonant combinations in AV congruent, AV incongruent (McGurk combination), audio alone and visual alone conditions. Cluster analysis was performed on the identification scores for the incongruent stimuli to classify the individuals into two groups, one with high and the other with low McGurk scores. The differences in the audio alone and visual alone scores between these groups were compared. Results: The results showed significantly higher McGurk scores for ma/ṇa compared to ba/ga and pa/ka combinations in both high and low McGurk score groups. No significant difference was noted between ba/ga and pa/ka combinations in either group. Identification of /ṇa/ presented in the visual alone condition correlated negatively with the higher McGurk scores. Conclusions: The results suggest that the final percept following AV integration is not exclusively explained by the unimodal identification of the syllables; other factors may also contribute to inferences about the final percept.

Definition and Function of Two Song Types of the Bush Warbler (Cettia diphone borealis)

  • Shi-Ryong Park;Eui-Dong Han;Ha-Cheol Sung
    • Animal cells and systems / Vol. 3, No. 2 / pp.149-151 / 1999
  • It has been suggested that the bush warbler (Cettia diphone borealis) uses different song types in various situations. We analyzed song features and conducted playback experiments to reveal the function of the bush warbler's songs. Two song types were identified. The short song type has a shorter duration than the normal song type and consists of only one or two syllables. Because of its short syllable and the low amplitude of the whistle portion, we were able to discriminate the short song type (S song type) from the normal song type (N song type). In the playback experiments, bush warblers sang the short song type at high rates for the first three minutes after playback. After 6 minutes of playback, males changed to singing normal songs. These results suggest that the short song of the bush warbler may function to threaten or drive off intruding males.

Some Prosodic Aspects of Read Speech and Dialogue in Korean

  • 박지혜
    • 대한음성학회지:말소리 / No. 43 / pp.11-23 / 2002
  • In this paper, speech style is divided into two types: read speech and dialogue. In the experiment, the same sentences are used for read speech and dialogue to control for discrepancies arising from different sentences. The number of APs is smaller in read speech than in dialogue, whereas the number of IPs is larger. The number of syllables that make up an AP varies more in dialogue. The intonational patterns of the first AP in an IP also differ: in dialogue, there is a pattern with many high tones, LHH. The F0 range is wider in dialogue than in read speech.

Phonetic Tied-Mixture Syllable Model for Efficient Decoding in Korean ASR

  • 김봉완;이용주
    • 대한음성학회지:말소리 / No. 50 / pp.139-150 / 2004
  • A Phonetic Tied-Mixture (PTM) model has been proposed as a way to decode efficiently in large-vocabulary continuous speech recognition (LVCSR) systems. The PTM model has been reported to perform better in decoding than triphones by sharing a set of mixture components among states at the same topological location [5]. In this paper we propose a Phonetic Tied-Mixture Syllable (PTMS) model, which extends the PTM technique to syllables. The proposed PTMS model shows a 13% improvement in decoding speed over PTM. Despite the difference in context-dependent modeling (PTM: cross-word context-dependent modeling; PTMS: word-internal left-phone-dependent modeling), the proposed model shows less than 1% degradation in word accuracy relative to PTM at the same beam width. With a different beam width, it achieves better word accuracy than PTM at the same or higher speed.

Thai Phoneme Segmentation using Dual-Band Energy Contour

  • Ratsameewichai, S.;Theera-Umpon, N.;Vilasdechanon, J.;Uatrongjit, S.;Likit-Anurucks, K.
    • 대한전자공학회:학술대회논문집 / 대한전자공학회 2002 ITC-CSCC -1 / pp.110-112 / 2002
  • In this paper, a new technique for phoneme segmentation of isolated Thai speech is proposed. Based on features of Thai speech, the isolated speech is first divided into low- and high-frequency components by wavelet decomposition. The energy contour of each decomposed signal is then computed and used to locate phoneme boundaries. To verify the proposed scheme, experiments were performed on 1,000 syllables of data recorded from 10 speakers. The accuracy rates are 96.0, 89.9, 92.7 and 98.9% for initial consonant, vowel, final consonant and silence, respectively.
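The two steps named in the abstract, a wavelet split into low and high bands followed by an energy contour per band, can be sketched as below. This is an illustration under stated assumptions, not the authors' implementation: the abstract does not specify the wavelet, the decomposition depth, or the frame length, so a single-level Haar transform and non-overlapping frames are used here, and the names `haar_split`, `energy_contour`, and `dual_band_energy` are hypothetical. The boundary-location rule is not described in the abstract and is left out.

```python
import math


def haar_split(signal):
    """One level of Haar wavelet decomposition: (low band, high band)."""
    n = len(signal) - len(signal) % 2      # drop a trailing odd sample
    s = math.sqrt(2.0)
    low = [(signal[i] + signal[i + 1]) / s for i in range(0, n, 2)]
    high = [(signal[i] - signal[i + 1]) / s for i in range(0, n, 2)]
    return low, high


def energy_contour(band, frame_len):
    """Sum of squared samples in consecutive non-overlapping frames."""
    return [
        sum(x * x for x in band[i:i + frame_len])
        for i in range(0, len(band) - frame_len + 1, frame_len)
    ]


def dual_band_energy(signal, frame_len=4):
    """Energy contours of the low and high bands (frame_len is assumed)."""
    low, high = haar_split(signal)
    return energy_contour(low, frame_len), energy_contour(high, frame_len)
```

A slowly varying signal puts nearly all of its energy in the low-band contour and a rapidly alternating one in the high-band contour, which is the kind of contrast a boundary detector could exploit.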

Analysis of Korean Phonemes Using Multi-Dimensional Scaling Method

  • 권영욱;정현열
    • 전자공학회논문지B / Vol. 29B, No. 11 / pp.22-30 / 1992
  • Using the Multi-Dimensional Scaling (MDS) method, this paper analyzes differences in the acoustic properties of Korean phonemes, projected as distances on a plane. The phonemes were extracted from monosyllables that occur frequently in daily conversation. From the distances between vowels, we found that the distances between /ə/ and /w/, /o/ and /u/, and /ɛ/ and /e/ were too short for the vowels to be separated automatically. From the analysis of consonants, we found short distances between 1) phonemes within each phoneme group, 2) nasals and the liquid /r/ in word-initial position, and 3) the nasals /n, m/ and the liquid /l/ in word-final position. However, word-initial nasals, liquids and plosives were far enough from their word-final counterparts to be separated in automatic recognition.

Study on the pronunciation correction in English Learning

  • 김재민;백승권;한민수
    • 한국음향학회:학술대회논문집 / 한국음향학회 2000 Summer Conference Proceedings, Vol. 19, No. 1 / pp.119-122 / 2000
  • In this paper, we implement an elementary system for correcting the accent, pronunciation, and intonation of English spoken by non-native speakers. For accent evaluation, energy and pitch information are used to find stressed syllables, and the segment information of the input patterns is then extracted with a dynamic time warping method to identify and evaluate accent position. For pronunciation evaluation, we use the segment information obtained with the same algorithm and calculate a spectral distance measure for each phoneme between the input and the reference. For intonation evaluation, we propose nine slope patterns to estimate the pitch contour, and then grade test sentences by the accumulated error obtained from the distance measure and the estimated slope. Our results show that 98 percent of accent evaluations and 71 percent of pronunciation evaluations agree with perceptual measures. In the intonation evaluation, the system ranks four sentences with different intonation patterns in an order similar to that of perceptual evaluation.
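Dynamic time warping, which the abstract uses to align input speech with a reference, can be sketched in its textbook form as follows. This is a minimal illustration, not the system described: it compares one-dimensional sequences with an absolute-difference local cost, whereas the paper's features, local distance, and path constraints are not given in the abstract. The name `dtw_distance` is hypothetical.

```python
def dtw_distance(a, b):
    """Accumulated cost of the best monotonic alignment of a and b.

    Assumes 1-D numeric sequences and an absolute-difference local cost.
    """
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = best accumulated cost aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances, b[j-1] is reused
                cost[i][j - 1],      # b advances, a[i-1] is reused
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]
```

Two contours that differ only in timing, such as `[1, 2, 3]` and `[1, 2, 2, 3]`, align at zero cost, which is why the method suits comparing a learner's utterance with a reference spoken at a different rate.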
