Search | Korea Science

A Study on the Spectrum Variation of Korean Speech (한국어 음성의 스펙트럼 변화에 관한 연구)

Lee Sou-Kil;Song Jeong-Young
- Journal of Internet Computing and Services
- /
- v.6 no.6
- /
- pp.179-186
- /
- 2005
We can extract and analyze the spectrum of speech by exploiting its frequency characteristics. In the speech spectrum, monophthongs are considered stable, but when consonants meet vowels within a syllable or word, the spectrum changes considerably. This is the biggest obstacle to phoneme-level speech recognition. In this study, using the Mel cepstrum and Mel bands, which account for frequency bands and auditory information, we analyze the spectrum of each consonant and vowel and the spectral changes that reflect auditory features, and build this analysis into a system. Finally, we present a basis for segmenting speech into phoneme units.
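As a rough illustration of the Mel bands mentioned in this abstract, the sketch below converts frequencies to the mel scale and places band edges evenly in mel. The abstract does not say which mel formula or band layout the authors used; the common 2595·log10(1 + f/700) form, the function names, and the parameters here are assumptions made only for illustration.

```python
import math

def hz_to_mel(f_hz):
    # Widely used mel formula (an assumption; the paper does not name one).
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)

def mel_to_hz(m):
    # Inverse of hz_to_mel.
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_band_edges(f_min, f_max, n_bands):
    # Hypothetical helper: n_bands overlapping bands need n_bands + 2 edges,
    # spaced uniformly on the mel axis and mapped back to Hz.
    m_lo, m_hi = hz_to_mel(f_min), hz_to_mel(f_max)
    step = (m_hi - m_lo) / (n_bands + 1)
    return [mel_to_hz(m_lo + i * step) for i in range(n_bands + 2)]

if __name__ == "__main__":
    for edge in mel_band_edges(0.0, 8000.0, 10):
        print(round(edge, 1))
```

Because the edges are uniform in mel, the bands are narrow at low frequencies and widen toward high frequencies, which is the auditory property such an analysis relies on.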

Feasibility of Revision Cochlear Implant Surgery for Better Speech Comprehension

Hwang, Kyurin;Lee, Jae Yong;Oh, Hyeon Seok;Lee, Byung Don;Jung, Jinsei;Choi, Jae Young
- Journal of Audiology & Otology
- /
- v.23 no.2
- /
- pp.112-117
- /
- 2019
Background and Objectives: The purpose of this study was to evaluate the efficacy of revision cochlear implant (CI) surgery for improving speech comprehension in patients with low satisfaction after their first CI surgery. Subjects and Methods: Eight patients who could not upgrade their speech processors because their CI model was too old and who wanted to replace the whole system were included. We compared speech comprehension before and after revision CI surgery. The Categories of Auditory Performance (CAP) score, vowel and consonant confusion tests, Ling 6 sounds, and word and sentence identification tests were administered. Results: The interval between surgeries ranged from eight to 19 years. The same manufacturer's latest product was used for revision surgery in six of the eight cases. Full insertion of the electrode was possible in most cases (seven of eight). The CAP score (p-value=0.01), vowel confusion test (p-value=0.041), one-syllable word identification test (p-value=0.026), two-syllable identification test (p-value=0.028), and sentence identification test (p-value=0.028) showed significant improvement. The consonant confusion test (p-value=0.063) and Ling 6 sound test (p-value=0.066) improved, but not significantly. Conclusions: Although our study design has some limitations, we could identify the effect of revision (upgrade) CI surgery indirectly. We therefore conclude that if a patient complains of low functional gain or low satisfaction after the first CI surgery, revision (device upgrade) CI surgery is meaningful even if there is no device failure.
https://doi.org/10.7874/jao.2018.00430

Implementation of Learning Puzzle Game by using Combination of Korean Alphabet (한글 자음과 모음결합을 이용한 학습용 퍼즐게임 구현)

Jo, Jae-Young;Kim, Yoon-Ho
- Journal of Digital Contents Society
- /
- v.7 no.4
- /
- pp.257-261
- /
- 2006
In this paper, a learning-oriented puzzle game based on combining the consonants and vowels of the Korean alphabet is implemented. First, the consonants and vowels are classified separately, and then a word is reconstructed in real time. The word combinator is implemented with an API-based edit window, and, for efficient retrieval, a method based on the initial consonant of each combined syllable is used. The implemented Korean puzzle game can be used to improve children's word-learning ability.
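The consonant-vowel combination and initial-consonant retrieval described in this abstract can be sketched with the standard Unicode arithmetic for precomposed Hangul syllables (0xAC00 + (initial × 21 + medial) × 28 + final). This is not the authors' implementation, which the abstract does not detail; the function names `compose` and `initials` are hypothetical.

```python
# Jamo tables in Unicode composition order: 19 initials, 21 medials, 28 finals.
CHOSEONG = "ㄱㄲㄴㄷㄸㄹㅁㅂㅃㅅㅆㅇㅈㅉㅊㅋㅌㅍㅎ"
JUNGSEONG = "ㅏㅐㅑㅒㅓㅔㅕㅖㅗㅘㅙㅚㅛㅜㅝㅞㅟㅠㅡㅢㅣ"
JONGSEONG = [""] + list("ㄱㄲㄳㄴㄵㄶㄷㄹㄺㄻㄼㄽㄾㄿㅀㅁㅂㅄㅅㅆㅇㅈㅊㅋㅌㅍㅎ")

HANGUL_BASE = 0xAC00  # code point of the first precomposed syllable

def compose(initial, medial, final=""):
    # Combine an initial consonant, a vowel, and an optional final consonant
    # into one precomposed Hangul syllable.
    i = CHOSEONG.index(initial)
    m = JUNGSEONG.index(medial)
    f = JONGSEONG.index(final)
    return chr(HANGUL_BASE + (i * 21 + m) * 28 + f)

def initials(word):
    # Initial-consonant key of a word, usable as a lookup key for retrieval.
    # Characters outside the precomposed syllable block are kept unchanged.
    key = []
    for ch in word:
        offset = ord(ch) - HANGUL_BASE
        if 0 <= offset < 19 * 21 * 28:
            key.append(CHOSEONG[offset // (21 * 28)])
        else:
            key.append(ch)
    return "".join(key)

if __name__ == "__main__":
    word = compose("ㅎ", "ㅏ", "ㄴ") + compose("ㄱ", "ㅡ", "ㄹ")
    print(word, initials(word))  # 한글 ㅎㄱ
```

Indexing a word list by its `initials` key lets a game look up candidate words from the first consonants alone, which is one plausible reading of the retrieval method the abstract mentions.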

Fiberscopic and Electromyographic Study on Laryngeal Adjustments for Syllable-final Applosives in Korean (한국어의 음절말 내파음의 후두조절 -화이비스코프 및 근전도에 의한 관찰-)

Park, Hea-Suk
- Journal of the Korean Society of Laryngology, Phoniatrics and Logopedics
- /
- v.16 no.1
- /
- pp.53-67
- /
- 2005
It is known that Korean stop consonants in syllable-initial position are of three types: lax, aspirated and forced (or unaspirated). In syllable-final position, however, these three types merge into a single type with the same place of articulation, although the original three-way distinction is preserved in the Korean orthographic (Hangul) system. Thus the syllable-final stops are phonetically realized as voiceless "applosives", which are characterized by the absence of oral release. The aim of the present study is to investigate, using a fiberscope, the laryngeal adjustments for these syllable-final stops in various phonological conditions, and further to investigate electromyographically the laryngeal adjustments for Korean stops in both syllable-initial and syllable-final positions in various phonological conditions. The results can be summarized as follows: 1. For syllable-initial stops, the glottal widths during the articulatory closure clearly differ among the three types of Korean stops, and the pattern of thyroarytenoid (VOC) activity appears to characterize the three types. 2. The basic laryngeal feature of the Korean syllable-final applosives is a small degree of glottal opening which begins at or slightly after the oral closure. 3. When a syllable-final stop is followed by the copula "ita", it is pronounced as the initial stop consonant of the following syllable containing the vowel [i], and the underlying three-way distinction preserved in the Korean orthographic (Hangul) system is manifested in the laryngeal adjustment. 4. When the final applosives are followed by initial stops and fricatives, the laryngeal feature of the final applosive appears to be assimilated to that of the following consonant, irrespective of any difference in place of articulation, as far as glottal abduction/adduction is concerned.
It is clearly demonstrated for syllable-initial stops that thyroarytenoid (VOC) activity is suppressed during production of the stop consonants in question; the suppression is slightest for the forced type, most marked for the aspirated type, and moderate for the lax type.

Recognition of Virtual Written Characters Based on Convolutional Neural Network

Leem, Seungmin;Kim, Sungyoung
- Journal of Platform Technology
- /
- v.6 no.1
- /
- pp.3-8
- /
- 2018
This paper proposes a technique, based on a convolutional neural network (CNN), for recognizing online handwritten cursive data obtained by tracing a user's motion trajectory in 3D space. Recognizing virtual characters that a user writes in 3D space is difficult because the input includes both character strokes and movement strokes. In this paper, building on the localization of letter strokes and movement strokes from our previous study, we use a labeling technique to divide each syllable into consonant and vowel units. The coordinate information of the separated consonants and vowels is converted into image data, and Korean handwriting recognition is performed using a convolutional neural network. After training the network on 1,680 syllables written by five writers, accuracy is measured on new writers who did not contribute to the training data. The accuracy of phoneme-based recognition with the convolutional neural network is 98.9%. The proposed method has the advantage of drastically reducing the amount of training data compared with syllable-based learning.

Speech Rate and the Acoustic Features of Korean Segments (발화속도와 한국어 분절음의 음향학적 특성)

이숙향;고현주
- The Journal of the Acoustical Society of Korea
- /
- v.23 no.2
- /
- pp.162-172
- /
- 2004
This study investigates the following three things through a production experiment and acoustic analysis: 1) the relationship between speech rate and segment duration in Korean, 2) the relationship between speech rate and the spectral characteristics of vowels, i.e., undershoot, and 3) the correlation between vowel duration and undershoot. The results showed that the faster the speech rate was, the shorter the duration of syllables and segments was. A few speakers were affected by speech rate in the durational ratios between closure and aspiration in a stop and between vowel and consonant in a syllable. Closure duration and vowel duration were more affected than aspiration and consonant duration, respectively. Speakers differed in the extent to which speech rate affected vowel undershoot, implying that they used different production mechanisms for the spectral characteristics of vowels: some speakers sped up the movement of their articulatory organs as speech rate increased, while others kept it constant regardless of changes in speech rate.

A Study of the Analyses of Pronunciation Errors and Teaching Method of Stop-liquid Sequences in English (영어 정지음-유음 연쇄체의 발음오류분석과 지도방안연구)

Kim, Ju-Hee;Park, Han-Sang
- Proceedings of the KSPS conference
- /
- 2007.05a
- /
- pp.99-101
- /
- 2007
This study analyzes Korean middle school students' pronunciation errors of stop-liquid sequences in English. The results showed two typical errors: the insertion of a vowel between a stop and a liquid and the substitution of a liquid with a flap or vice versa. Those pronunciation errors seem to occur since English and Korean have different syllable structures and different types of liquids. A teaching material, which emphasizes no vowel insertion for a proper pronunciation of the consonant clusters, was designed to reduce Korean students' pronunciation errors. Errors were reduced substantially after a 50-minute class with the newly designed material.

A Study on Phoneme Likely Units to Improve the Performance of Context-dependent Acoustic Models in Speech Recognition (음성인식에서 문맥의존 음향모델의 성능향상을 위한 유사음소단위에 관한 연구)

임영춘;오세진;김광동;노덕규;송민규;정현열
- The Journal of the Acoustical Society of Korea
- /
- v.22 no.5
- /
- pp.388-402
- /
- 2003
In this paper, we carried out word, 4-continuous-digit, continuous speech, and task-independent word recognition experiments to verify the effectiveness of re-defined phoneme-likely units (PLUs) for phonetic decision tree based HM-Net (Hidden Markov Network) context-dependent (CD) acoustic modeling in Korean. In the 48 PLUs, the phonemes /ㅂ/, /ㄷ/, /ㄱ/ are separated into initial sound, medial vowel, and final consonant, and the consonants /ㄹ/, /ㅈ/, /ㅎ/ are also separated into initial sound and final consonant, according to their position in the syllable, word, and sentence, respectively. We therefore re-define 39 PLUs by unifying into one phoneme the separated initial sound, medial vowel, and final consonant of the 48 PLUs, in order to construct the CD acoustic models effectively. In the word recognition experiments with context-independent (CI) acoustic models, the 48 PLUs achieved on average 7.06% higher recognition accuracy than the 39 PLUs. But in the speaker-independent word recognition experiments with the CD acoustic models, the 39 PLUs achieved on average 0.61% better recognition accuracy than the 48 PLUs. In the 4-continuous-digit recognition experiments with liaison phenomena, the 39 PLUs also achieved on average 6.55% higher recognition accuracy. In the continuous speech recognition experiments, the 39 PLUs again achieved on average 15.08% better recognition accuracy than the 48 PLUs. Finally, although both the 48 and 39 PLUs showed lower recognition accuracy in the task-independent word recognition experiments because of unknown contextual factors, the 39 PLUs achieved on average 1.17% higher recognition accuracy than the 48 PLUs. Through the above experiments, we verified the effectiveness of the re-defined 39 PLUs compared with the 48 PLUs for constructing the CD acoustic models.

Analysis of Acoustic Characteristics of Vowel and Consonants Production Study on Speech Proficiency in Esophageal Speech (식도발성의 숙련 정도에 따른 모음의 음향학적 특징과 자음 산출에 대한 연구)

Choi, Seong-Hee;Choi, Hong-Shik;Kim, Han-Soo;Lim, Sung-Eun;Lee, Sung-Eun;Pyo, Hwa-Young
- Speech Sciences
- /
- v.10 no.3
- /
- pp.7-27
- /
- 2003
Esophageal speech uses esophageal air during phonation. Fluent esophageal speakers take in air frequently during oral communication, but unskilled esophageal speakers have difficulty swallowing large amounts of air. The purpose of this study was to investigate differences in the acoustic characteristics of vowels and in consonant production according to speech proficiency level in esophageal speech. The subjects were 13 normal male speakers and 13 male esophageal speakers (5 unskilled and 8 skilled), aged 50 to 70 years. The stimuli were the sustained vowel /a/ and 36 meaningless two-syllable words. The vowel used was /a/ and there were 18 consonants: /k, n, t, m, p, s, c, $c^{h}$, $k^{h}$, $t^{h}$, $p^{h}$, h, l, k', t', p', s', c'/. Fundamental frequency (Fx), jitter, shimmer, HNR, and MPT were measured by electroglottography using Lx Speech Studio (Laryngograph Ltd, London, UK). The 36 meaningless words produced by the esophageal speakers were presented to 3 speech-language pathologists, who phonetically transcribed their responses. The Fx, jitter, and HNR parameters differed significantly between skilled and unskilled esophageal speakers (P<.05). Considering manner of articulation, ANOVA showed significant differences between the two esophageal speech groups according to speech proficiency: in the unskilled group, glides had the highest number of confusions with other phoneme classes and affricates were the most intelligible, whereas in the skilled group fricatives had the highest number of confusions and nasals were the most intelligible. Regarding place of articulation, the glottal /h/ was the most confused consonant in both groups. Bilabials were the most intelligible in skilled esophageal speech, and velars were the most intelligible in unskilled esophageal speech.
Regarding syllable structure, 'CV+V' produced more confusions in the skilled esophageal group, whereas the unskilled esophageal group showed similar confusion in both structures. In unskilled esophageal speech, the significantly different Fx, jitter, and HNR acoustic parameters of the vowel and the highest confusions of liquid and nasal consonants could be attributed to unstable, improper contact of the neoglottis as the vibratory source, insufficiency in the phonatory air supply, and the higher motoric demand on the remaining articulators due to the morphological characteristics of the vocal tract after laryngectomy.
