• Title/Summary/Keyword: pronunciation variation


Visualization of Korean Speech Based on the Distance of Acoustic Features (음성특징의 거리에 기반한 한국어 발음의 시각화)

  • Pok, Gou-Chol
    • The Journal of Korea Institute of Information, Electronics, and Communication Technology / v.13 no.3 / pp.197-205 / 2020
  • Korean has the property that the pronunciation of individual phonemes, such as vowels and consonants, is fixed and the pronunciation associated with a written symbol does not change, which makes the language relatively approachable for foreign learners. However, when words, phrases, or sentences are pronounced, the pronunciation changes in varied and complex ways at syllable boundaries, and the correspondence between notation and pronunciation no longer holds. Consequently, foreign learners find standard Korean pronunciation very difficult to master. Despite these difficulties, systematic analysis of pronunciation errors in Korean words is believed to be possible because, unlike in other languages including English, the relationship between Korean notation and pronunciation can be described by a set of firm rules without exceptions. In this paper, we propose a visualization framework that displays the differences between standard and erroneous pronunciations as quantitative measures on the computer screen. Previous studies show only color representations and 3D graphics of speech properties, or animated views of the changing shapes of the lips and oral cavity. Moreover, the features used in those analyses are only point data, such as the average over a speech segment. In this study, we propose a method that directly uses the time-series data rather than summarized or distorted data. The method is realized with a deep learning-based technique that combines a self-organizing map, a variational autoencoder, and a Markov model, and it achieves a substantial performance improvement over the method using point-based data.

Etymology of Kimchi: Philological Approach and Historical Perspective ('김치'의 어원 연구)

  • Paek, Doo-Hyeon
    • Journal of the Korean Society of Food Culture / v.34 no.2 / pp.112-128 / 2019
  • The history of modern Korean 'kimchi' can be traced through the history of the wordforms 'dihi' (디히), 'dimchʌi' (딤ᄎᆡ), and 'thimchʌi' (팀ᄎᆡ) in ancient Korean texts. As native Korean words, the 'dihi' word line ('dihi', 'dii', 'jihi', and 'ji') constitutes an old substratum. This word line coexisted with the 'dimchʌi' word line (dimchʌi, jimchʌi, and kim∫chi) from the Hanja '沈菜'. 'Ji', the last variant of 'dihi', is still used today as the sole form in several Korean dialects. In standard Korean, however, it serves only as a suffix forming the derived names of various kimchi types. 'Dimchʌi' is believed to have appeared around the 6th-7th centuries, when Silla began to master Chinese characters. Hence, 'dimchʌi' reflects either the Archaic Chinese (上古音) or the Old Chinese (中古音) pronunciation of the Hanja '沈菜'. With the palatalization of the alveolar plosive [t], 'dimchʌi' changed to 'jimchʌi'. The Yangban intellectuals' rejection of the palatalization of the velar plosive [k] led to the hypercorrection of 'jimchʌi' into 'kimchʌi'. It is precisely the hypercorrect 'kimchʌe' that gave rise to the wordform 'kim∫chi', which eventually became the standard and predominant form in today's Korean. As for 'thimchʌe', it reflects the Middle Chinese (Yuan Dynasty) pronunciation of the Hanja '沈菜' and was used mainly in writing by Yangban intellectuals.

A Comparative Study on the Public Speech Spectrum between ROK and USA Politicians (한국과 미국 정치인 대중연설 음성의 스펙트럼 비교 연구)

  • Chung, Eun-Ee;Lee, Sang-Ho
    • Journal of Digital Contents Society / v.17 no.3 / pp.143-155 / 2016
  • This study focuses on the importance of politicians' voices in delivering a message. Different vocal factors may play different roles in delivering a message and affect recipients' responsiveness, understanding, and so on. For this reason, an analytical study of voices in delivering diverse messages is a meaningful attempt. We took an interest in politicians' voices because the voice should be very important to politicians, who frequently deliver messages through speeches to the nation and others. This study aimed to investigate the voices of politicians who represent their nations. We selected politicians representing the ROK (Republic of Korea; South Korea) and the USA (United States of America), chose representative speeches to the nation, made a comparative analysis of their voices in those speeches, and drew implications. We analyzed a total of eight voices (four ROK politicians and four US ones, male and female) to characterize them and to suggest guidelines for a voice that delivers messages more clearly. We analyzed the politicians' voices on the basis of vocal properties such as pitch, accuracy of pronunciation, resonance, and intonation variation, and found that the ROK politicians were somewhat poorer at using their voices than the US ones. In particular, they were remarkably poorer at accurate pronunciation, which has a significant impact on message delivery.

An Acoustical Comparison of English Tense and Lax Vowels Produced by Korean and American Males (한국인남성과 미국인남성이 발음한 영어 긴장.이완모음의 음향적 비교)

  • Yang, Byung-Gon
    • Speech Sciences / v.15 no.4 / pp.19-27 / 2008
  • Several studies on the pronunciation of English vowels point out that Korean learners have difficulty distinguishing English tense and lax vowel pairs. The acoustic comparisons in those studies are mostly based on formant measurements at a single time point of a given vowel section. However, English lax vowels usually show dynamic changes across their syllable peaks, and differences in the subjects' English proficiency account for various conflicting results. The purpose of this paper is to compare the temporal duration and dynamic formant tracks of English tense and lax vowel pairs produced by five Korean and five American males. The subjects were graduate students at an American state university. Results showed that both the Korean and American males produced the vowels with comparable durations. The duration of the front tense-lax vowel pair was longer than that of the back vowel pair. In the formant track comparisons, the American males produced the tense and lax pairs much more distinctly than the Korean males. The results suggest that the Korean males should pay attention to the F1 and F2 movements, i.e., the jaw and tongue movements, in order to match those of the American males. Further studies are recommended on the auditorily acceptable ranges of F2 variation for the lax vowels.

Pronunciation Variation Modeling for Korean Point-of-Interest Data Using Prosodic Information (운율 정보를 이용한 한국어 위치 정보 데이터의 발음 모델링)

  • Kim, Sun-Hee;Park, Jeon-Gue;Jeon, Je-Hun;Na, Min-Soo;Chung, Min-Hwa
    • Annual Conference on Human and Language Technology / 2006.10e / pp.51-56 / 2006
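  • Whereas most studies that apply prosodic information to speech recognition use acoustic prosodic information, this study shows that structural prosodic information, such as prosodic words and syllable counts, contributes to improving recognition accuracy. The goal of this paper is to evaluate the performance of a speech recognizer when pronunciation modeling uses two kinds of prosodic information, namely prosodic words and the number of syllables. We first generate all possible pronunciations of the point-of-interest data using prosodic words, then present a method that adjusts the number of pronunciation variants according to the number of syllables, and finally evaluate speech recognition performance with the pronunciation dictionary generated by the proposed method. The experiments showed improved performance over the baseline in every case where the pronunciation dictionary was built using prosodic words, with the WER reduced by up to 8.4% from the baseline WER of 4.63%. When the number of pronunciation variants was adjusted according to the syllable count of the point-of-interest data, the best recognition performance was obtained by limiting that number to three overall and to four for words of six or more syllables, which indicates that adjusting the number of pronunciation variants by syllable count is effective.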
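
The idea of capping the number of pronunciation variants by syllable count can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function names, the placeholder variant labels, and the assumption that variants arrive ordered from most to least preferred are all hypothetical. Only the two limits (three by default, four for entries of six or more syllables) are taken from the abstract.

```python
def count_syllables(word: str) -> int:
    """Count precomposed Hangul syllable blocks (U+AC00..U+D7A3) in a word."""
    return sum(1 for ch in word if "\uac00" <= ch <= "\ud7a3")


def max_variants(word: str, default_limit: int = 3,
                 long_limit: int = 4, long_threshold: int = 6) -> int:
    """Variant cap for a word: long entries are allowed one more variant."""
    if count_syllables(word) >= long_threshold:
        return long_limit
    return default_limit


def prune_lexicon(lexicon: dict) -> dict:
    """Keep only the first N variants per entry (assumes best-first order)."""
    return {word: variants[:max_variants(word)]
            for word, variants in lexicon.items()}


# Hypothetical entries; "v1".."v5" stand in for generated pronunciations.
toy_lexicon = {
    "서울역": ["v1", "v2", "v3", "v4", "v5"],          # 3 syllables
    "국립중앙박물관": ["v1", "v2", "v3", "v4", "v5"],  # 7 syllables
}
pruned = prune_lexicon(toy_lexicon)
```

How the variants are ranked before truncation is not stated in the abstract, so the best-first ordering here is purely an assumption of the sketch.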

A Comparative Study of Case Markers in Korean, Japanese and Ryukyuan Languages: Focusing on Nominative Case Markers and Accusative Case Markers (한(韓)·일(日)·유(琉) 격조사 비교연구 - 주격(主格)·목적격(目的格) 조사를 중심으로 -)

  • Li, Jia
    • Cross-Cultural Studies / v.46 / pp.355-377 / 2017
  • Compared with other Altaic languages, Japanese and Korean are much closer to each other in grammar, and also to Ryukyuan. According to the literature, Koreans were the first foreigners to record the Ryukyuan language in written form. In the passage "pronunciation interpreting the Ryukyuan Kingdom" from A Journey to the Eastern Countries (1512), Koreans perfectly preserved the pronunciation and meanings of Ryukyuan words and sentences in both Korean and Chinese, which makes it an extremely valuable source. Unfortunately, this prominent beginning was followed by a period of stagnation. To clarify the language family to which Korean belongs, it is necessary to compare Korean thoroughly with Japanese and Ryukyuan. Unlike the lexicon, grammar undergoes a slow and gradual process of variation. A comparative study of the three languages can therefore provide strong evidence for determining the language family of Korean. On this rationale, this paper compares grammatical elements of the three languages, focusing on the nominative and accusative case markers, and examines their development and functions diachronically. Based on examples from medieval data, it is found that the nominative and accusative case markers of the three languages differ from one another in form and origin. Although they show some similarities in function, it can be conjectured that there is no cognate among the three languages historically.

Robust Speech Recognition Algorithm of Voice Activated Powered Wheelchair for Severely Disabled Person (중증 장애우용 음성구동 휠체어를 위한 강인한 음성인식 알고리즘)

  • Suk, Soo-Young;Chung, Hyun-Yeol
    • The Journal of the Acoustical Society of Korea / v.26 no.6 / pp.250-258 / 2007
  • Current speech recognition technology has achieved high performance with the development of hardware devices; however, it is insufficient for some applications that require high reliability, such as voice control of powered wheelchairs for disabled persons. For a system that aims to operate powered wheelchairs safely by voice in a real environment, non-voice inputs such as the user's coughing, breathing, and spark-like mechanical noise should be rejected, and the wheelchair system needs to recognize speech commands affected by disability, which exhibit specific pronunciation speeds and frequencies. In this paper, we propose a non-voice rejection method that performs voice/non-voice classification using both YIN-based fundamental frequency (F0) extraction and reliability in preprocessing. We adopted a multi-template dictionary and acoustic-modeling-based speaker adaptation to cope with the pronunciation variation of inarticulately uttered speech. In recognition tests conducted with data collected in a real environment, the proposed YIN-based fundamental frequency extraction showed a recall-precision rate of 95.1%, better than the 62% of the cepstrum-based method. A recognition test of the new system, which applies the multi-template dictionary and MAP adaptation, also showed a much higher accuracy of 99.5%, compared with 78.6% for the baseline system.
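
To make the voice/non-voice idea more concrete, here is a simplified sketch of the core of YIN: the cumulative mean normalized difference function with an absolute threshold. It is not the authors' system, which also uses a reliability measure that the abstract does not specify. The function names, the 0.15 threshold, the lag range, and the synthetic test signals are all assumptions made for illustration.

```python
import math
import random


def cmnd(frame, max_lag):
    """Cumulative mean normalized difference function used by YIN."""
    window = len(frame) - max_lag  # samples compared at every lag
    diff = [0.0] * (max_lag + 1)
    for tau in range(1, max_lag + 1):
        diff[tau] = sum((frame[j] - frame[j + tau]) ** 2 for j in range(window))
    norm = [1.0] * (max_lag + 1)
    running = 0.0
    for tau in range(1, max_lag + 1):
        running += diff[tau]
        norm[tau] = diff[tau] * tau / running if running > 0 else 1.0
    return norm


def estimate_f0(frame, sample_rate, max_lag=200, threshold=0.15):
    """Return F0 in Hz, or None if no lag dips below the threshold.

    A None result is treated here as "non-voice" (an assumption of this sketch).
    """
    norm = cmnd(frame, max_lag)
    tau = 2
    while tau <= max_lag:
        if norm[tau] < threshold:
            # Walk down to the local minimum of the dip.
            while tau + 1 <= max_lag and norm[tau + 1] < norm[tau]:
                tau += 1
            return sample_rate / tau
        tau += 1
    return None


SAMPLE_RATE = 8000
# Synthetic 200 Hz tone (periodic, "voice-like") and white noise (aperiodic).
tone = [math.sin(2 * math.pi * 200 * n / SAMPLE_RATE) for n in range(400)]
rng = random.Random(0)
noise = [rng.uniform(-1.0, 1.0) for _ in range(400)]

tone_f0 = estimate_f0(tone, SAMPLE_RATE)
noise_f0 = estimate_f0(noise, SAMPLE_RATE)
```

A periodic frame yields a deep dip at its period and therefore an F0 estimate, while an aperiodic frame stays near 1 at every lag and is rejected. Real coughs and mechanical noise are less clear-cut than white noise, which is presumably why the paper pairs F0 extraction with an additional reliability check.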

A PHONEMIC ANALYSIS OF THE UNWRITTEN LANGUAGE OF THE PULANG TRIBE

  • Kang, Su-Hee
    • Proceedings of the KSPS conference / 2000.07a / pp.166-177 / 2000
  • The purpose of this study was to create a writing system for the non-literate Pulang tribe in Thailand, who immigrated from China. The illiterate Pulang hand down their traditions through primary oral culture, so those traditions cannot be reliably maintained and may disappear over time. The work is therefore expected to contribute to the campaign against illiteracy. The subjects of the study were a minority people living in northern Machan, Chiangrai, Thailand. The study covers not only linguistic analysis but also the eradication of illiteracy, and the research draws on linguistics, ethnolinguistics, and primary oral culture. Five Pulang people living in that area were chosen for creating the letters. Each word was listened to one by one and transcribed using the I.P.A., and this process was repeated several times; concrete objects and body parts were pointed to in front of the speakers, while other words were conveyed by gesture. For the final transcription, a number of people were lined up so that the sounds of words, phrases, and sentences could be heard. In the first stage, the segmentals of Pulang were analyzed: vocoids, contoids, and diphthongs were described with sample syllables and words. In the second stage, the suprasegmentals were studied through the intonation and juncture of words. Pairs of words were compared, and differences in meaning arising from their intonation and juncture were shown. At the end of this part, the juncture in words was described for each case of phonemic or morphophonemic representation. In the third stage, minimal pairs were analyzed for vowels and consonants, and free variation was described on the basis of words. In the last stage, syllable structure in open and closed syllables was studied, and each syllable structure was analyzed with samples.
There were thirty-two phonemes in apong Pulang, as follows: seven vocoids (a, i, e, o, u, æ, and ʌ); one diphthong (wu); and 24 contoids (b, c, d, f, g, h, j, k, k, l, m, n, ŋ, pʰ, p, p, r, s, s, sh, t, t, w, and y). The pronunciations of p, s, d, pʰ, j, and t are frequently used in speech and are unique in triphthongs. Moreover, most words used initial and final consonant clusters.

Performance Improvement of Connected Digit Recognition by Considering Phonemic Variations in Korean Digit and Speaking Styles (한국어 숫자음의 음운변화 및 화자 발성특성을 고려한 연결숫자 인식의 성능향상)

  • 송명규;김형순
    • The Journal of the Acoustical Society of Korea / v.21 no.4 / pp.401-406 / 2002
  • Each Korean digit consists of a single syllable, so recognizers, and even Korean speakers, often have difficulty recognizing it. When digit strings are pronounced, the original pronunciation of each digit changes considerably because of co-articulation. In addition, distortion caused by various channels and noises degrades the recognition performance for Korean connected digit strings. This paper presents techniques to improve that recognition performance, which include defining a set of PLUs that takes phonemic variations in Korean digits into account and constructing a recognizer that handles speakers' various speaking styles. In speaker-independent connected digit recognition experiments using telephone speech, the proposed techniques with 1 Gaussian/state gave a string accuracy of 83.2%, i.e., a 7.2% error rate reduction relative to the baseline system. With 11 Gaussians/state, we achieved the highest string accuracy of 91.8%, i.e., a 4.7% error rate reduction.
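
The abstract does not say whether the quoted error rate reductions are relative or absolute. The sketch below shows how the two readings are related, and what baseline string error a relative reading would imply. The helper names are hypothetical, and the derived baselines are arithmetic consequences of that assumption, not figures reported by the paper.

```python
def error_rate(accuracy_pct: float) -> float:
    """String error rate (%) corresponding to a string accuracy (%)."""
    return 100.0 - accuracy_pct


def relative_reduction(baseline_err: float, new_err: float) -> float:
    """Relative error rate reduction (%) of new_err with respect to baseline_err."""
    return 100.0 * (baseline_err - new_err) / baseline_err


def implied_baseline_error(new_err: float, rel_reduction_pct: float) -> float:
    """Baseline error (%) implied by a relative reduction (assumed reading)."""
    return new_err / (1.0 - rel_reduction_pct / 100.0)


# Accuracies and reductions as quoted in the abstract.
for accuracy, reduction in [(83.2, 7.2), (91.8, 4.7)]:
    new_err = error_rate(accuracy)
    baseline = implied_baseline_error(new_err, reduction)
    print(f"accuracy {accuracy}%: error {new_err:.1f}%, "
          f"implied baseline error {baseline:.2f}% (if relative)")
```

Under the absolute reading, the baseline error would simply be the new error plus the quoted reduction, so the two interpretations lead to noticeably different baselines.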

Variation Measurement and Analysis of Jitter and Shimmer Parameter Value by Hemodialysis in Diabetic and Hypertensive (당뇨 및 고혈압 환자에서 혈액투석에 따른 Jitter와 Shimmer 요소값 변화 측정 및 분석)

  • Kim, Bong-Hyun;Cho, Dong-Uk
    • The Journal of Korean Institute of Communications and Information Sciences / v.36 no.7B / pp.834-840 / 2011
  • Chronic diseases are drawing increasing attention as a threat to the healthy life of the elderly population in modern society. In particular, chronic diseases caused by diabetes and hypertension damage the kidneys. In such cases subjective symptoms are few, so once the condition worsens, serious treatments such as hemodialysis, artificial organs, or organ transplantation are required. Therefore, this paper studies the effect of hemodialysis on the regularity of the amplitude and the rate of vocal cord vibration in diabetic and hypertensive patients receiving hemodialysis. To this end, diabetic and hypertensive patients with no pronunciation problems were selected as subjects, and their voices were collected before and after hemodialysis. We then applied voice analysis of the regularity of the amplitude and the rate of vocal cord vibration. In conclusion, we found that the measured values for these voice parameters were relatively lower after hemodialysis than before.
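
The abstract does not give formulas for the two measures named in the title. As a rough illustration, the sketch below uses the common "local" definitions: jitter as the mean absolute difference between consecutive glottal periods divided by the mean period, and shimmer as the same ratio computed on cycle peak amplitudes. Whether the paper used these exact variants is an assumption, and the function names and sample values are hypothetical.

```python
def relative_perturbation(values):
    """Mean absolute difference of consecutive values over their mean, in percent."""
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    diffs = [abs(a - b) for a, b in zip(values, values[1:])]
    mean_diff = sum(diffs) / len(diffs)
    mean_value = sum(values) / len(values)
    return 100.0 * mean_diff / mean_value


def local_jitter(periods):
    """Cycle-to-cycle variation of the vibration period (percent)."""
    return relative_perturbation(periods)


def local_shimmer(amplitudes):
    """Cycle-to-cycle variation of the peak amplitude (percent)."""
    return relative_perturbation(amplitudes)


# Hypothetical per-cycle measurements: periods in seconds, amplitudes in arbitrary units.
periods = [0.0100, 0.0102, 0.0099, 0.0101]
amplitudes = [0.80, 0.78, 0.82, 0.79]

print(f"jitter  = {local_jitter(periods):.2f}%")
print(f"shimmer = {local_shimmer(amplitudes):.2f}%")
```

Lower values indicate more regular vibration, so comparing the two measures before and after a session amounts to running the same computation on two recordings of the same speaker.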