• Title/Summary/Keyword: Syllable


A Study on Error Correction Using Phoneme Similarity in Post-Processing of Speech Recognition (음성인식 후처리에서 음소 유사율을 이용한 오류보정에 관한 연구)

  • Han, Dong-Jo;Choi, Ki-Ho
    • The Journal of The Korea Institute of Intelligent Transport Systems / v.6 no.3 / pp.77-86 / 2007
  • Recently, systems with speech recognition interfaces, such as telematics terminals, are being developed. However, speech recognition still produces many errors, so error correction is being actively studied. This paper proposes an error correction method for the post-processing stage of speech recognition based on the features of Korean phonemes. To support this algorithm, we use a phoneme similarity rate that reflects those features: the data are trained per mono-phoneme, and MFCC and LPC are used to extract features for each Korean phoneme. The phoneme similarity then uses the Bhattacharyya distance measure to obtain the similarity between one phoneme and another. Using this phoneme similarity, errors in an eojeol (Korean word phrase) that cannot be morphologically analyzed can be corrected; syllable recovery and morphological analysis are then performed again. Experimental results show improvements of 7.5% and 5.3% for MFCC and LPC, respectively.
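
The abstract names the Bhattacharyya distance but not its exact formulation; assuming each phoneme's MFCC/LPC features are summarized as a univariate Gaussian (an illustrative modeling choice, not the paper's), the distance and a derived similarity score can be sketched as:

```python
import math

def bhattacharyya_gaussian(mu1, var1, mu2, var2):
    """Bhattacharyya distance between two univariate Gaussians.

    D = 1/4 * ln(1/4 * (v1/v2 + v2/v1 + 2)) + 1/4 * (m1 - m2)^2 / (v1 + v2)
    """
    term1 = 0.25 * math.log(0.25 * (var1 / var2 + var2 / var1 + 2.0))
    term2 = 0.25 * (mu1 - mu2) ** 2 / (var1 + var2)
    return term1 + term2

def similarity(mu1, var1, mu2, var2):
    """Map distance to a (0, 1] similarity via the Bhattacharyya coefficient."""
    return math.exp(-bhattacharyya_gaussian(mu1, var1, mu2, var2))
```

Identical feature distributions give distance 0 (similarity 1), and the distance grows as the phoneme means move apart, which is the property an error-correction candidate ranking needs.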


Lip Shape Synthesis of the Korean Syllable for Human Interface (휴먼인터페이스를 위한 한글음절의 입모양합성)

  • 이용동;최창석;최갑석
    • The Journal of Korean Institute of Communications and Information Sciences / v.19 no.4 / pp.614-623 / 1994
  • Synthesizing speech and facial images is necessary for a human interface in which man and machine converse as naturally as humans do. The target of this paper is synthesizing the facial images. In synthesizing facial images, a three-dimensional (3-D) shape model of the face is used to realize variations in facial expression and lip shape. Various facial expressions and lip shapes harmonized with the syllables are synthesized by deforming the 3-D model on the basis of facial muscle actions. Combinations of the consonants and vowels make 14,364 syllables. The vowels determine most of the lip shapes, while the consonants determine only a part of them. To determine the lip shapes, this paper investigates all the syllables and classifies the lip shape patterns according to the vowels and consonants. As a result, the lip shapes are classified into 8 patterns for the vowels and 2 patterns for the consonants. The paper then determines synthesis rules for the classified lip shape patterns. This method allows natural facial images with various facial expressions and lip shape patterns to be obtained.
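
The classification can be pictured as a table lookup per CV syllable. The abstract gives only the pattern counts (8 vowel patterns, 2 consonant patterns), not their membership, so the groupings and labels below are placeholders:

```python
# Hypothetical pattern labels: V1..V8 and C1..C2 stand in for the paper's
# 8 vowel patterns and 2 consonant patterns, whose membership is not given.
VOWEL_PATTERN = {"a": "V1", "eo": "V2", "o": "V3", "u": "V4",
                 "eu": "V5", "i": "V6", "ae": "V7", "wi": "V8"}
CONSONANT_PATTERN = {"m": "C1", "b": "C1", "p": "C1"}  # bilabials close the lips

def lip_shape(onset, vowel):
    """Look up the (consonant, vowel) lip-shape pattern pair for a CV syllable."""
    return (CONSONANT_PATTERN.get(onset, "C2"), VOWEL_PATTERN[vowel])
```

A synthesis rule would then deform the 3-D face model according to the pattern pair, with the vowel pattern dominating the final lip shape as the abstract describes.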


Differences in High Pitch Accents between News Speech and Natural Speech (영어 뉴스와 자연발화에 나타나는 고성조 피치액센트의 차이점)

  • Choi, Yun-Hui;Lee, Joo-Kyeong
    • Speech Sciences / v.12 no.2 / pp.17-28 / 2005
  • This paper argues that news speech has an intonational pattern distinct from natural speech, reflecting its primary focus on providing new information. We conducted a phonetic experiment comparing the tonal contours of news speech and natural speech, examining the distributions of pitch accents and the overall pitch ranges. We used 70 Associated Press (AP) radio news utterances and 70 natural utterances extracted from TV dramas. Results show that news speech involves 3.38 H*'s (including L+H* and !H*) per intonational phrase (IP) or intermediate phrase (ip), whereas natural speech involves 1.8 on average. The most frequent number of IP/ip's per sentence is 3 in news speech, at 32.07% of sentences, but merely 1 in natural speech, at 41.42%. Next, declination tends to be suppressed in news speech, and the pitch range is much greater in news speech than in natural speech. Finally, a secondary-stress syllable is comparatively frequently given a pitch accent in news speech, a clear difference from natural speech. These results can be interpreted as indicating that news has the particular purpose of providing new information: every content word tends to be given an H* or a related pitch accent such as L+H* or !H* because news speech assumes that every word conveys new information. This in turn produces more IP/ip's per sentence due to a human physiological constraint; that is, more H*'s cause more respiratory breaks. Also, the greater pitch ranges and the pitch accents imposed on secondary stress may be attributed to exaggerating new information.


Overlapping of /o/ and /u/ in modern Seoul Korean: focusing on speech rate in read speech

  • Igeta, Takako;Hiroya, Sadao;Arai, Takayuki
    • Phonetics and Speech Sciences / v.9 no.1 / pp.1-7 / 2017
  • Previous studies have reported overlapping F1 and F2 distributions for the vowels /o/ and /u/ produced by young Korean speakers of the Seoul dialect. It has been suggested that this overlap arises from an ongoing sound change. However, few studies have examined whether speech rate influences the overlap of /o/ and /u/. In addition, previous studies have reported that the overlap of /o/ and /u/ in syllables produced by male speakers is smaller than in those produced by female speakers, but few reports have investigated the overlap of the two vowels in read speech produced by male speakers. In the current study, we examined whether speech rate affects the overlap of /o/ and /u/ in read speech by male and female speakers. Read speech produced by twelve young adult native speakers of the Seoul dialect was recorded at three speech rates. For female speakers, discriminant analysis showed that the discrimination rate became lower as the speech rate increased from slow to fast, indicating that speech rate is one of the factors affecting the overlap of /o/ and /u/. For male speakers, on the other hand, the discrimination rate was not correlated with speech rate, but the overlap was larger than that of female speakers in read speech. Moreover, read speech by male speakers was less clear than that by female speakers, suggesting that the overlap may be related to unclear speech due to sociolinguistic factors for male speakers.
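
The discrimination-rate measurement can be illustrated with a nearest-class-mean classifier over (F1, F2) pairs, a deliberately simplified stand-in for the paper's discriminant analysis; the formant values below are invented:

```python
# Illustrative (F1, F2) values in Hz; not the paper's data.
o_tokens = [(450, 800), (470, 850), (460, 820)]
u_tokens = [(350, 750), (340, 700), (420, 830)]  # last token drifts toward /o/

def mean(points):
    """Component-wise mean of a list of (F1, F2) tuples."""
    return tuple(sum(coord) / len(points) for coord in zip(*points))

def classify(token, means):
    """Assign a token to the class whose mean is nearest (squared Euclidean)."""
    return min(means, key=lambda label: sum((a - b) ** 2
                                            for a, b in zip(token, means[label])))

means = {"o": mean(o_tokens), "u": mean(u_tokens)}
correct = sum(classify(t, means) == "o" for t in o_tokens) \
        + sum(classify(t, means) == "u" for t in u_tokens)
rate = correct / (len(o_tokens) + len(u_tokens))  # discrimination rate
```

A vowel pair with heavily overlapping formant clouds yields a lower rate, which is how the paper quantifies the /o/-/u/ merger at each speech rate.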

Statistical Survey of Vocabulary in Korean Textbook for Elementary School 6th-Grade (초등학교 6학년 국어교과서의 어휘 통계조사)

  • Kim, Jong-Young;Kim, Cheol-Su
    • The Journal of the Korea Contents Association / v.12 no.5 / pp.515-524 / 2012
  • This paper surveys statistics such as the total number of syllables, the kinds of syllables, the frequency of syllables, the number of eojeols (word phrases unique to the Korean language), the kinds of eojeols, the average length of eojeols, the frequency of eojeols, and the parts of speech in four Korean textbooks for 6th-grade students (6-1 Korean Reading, 6-1 Korean Speaking Listening Writing, 6-2 Korean Reading, and 6-2 Korean Speaking Listening Writing). The results of the statistical survey are as follows: the number of Hangul syllables was 194,683; the kinds of syllables were 1,290; the average frequency of syllables was 150.9; the number of eojeols was 70,185; the kinds of eojeols were 22,647; the average frequency of eojeols was 3.1; and the average length of an eojeol was 2.8 syllables, with the longest consisting of 10 syllables. In terms of parts of speech, nouns were used more in the Korean Reading textbooks, and verbs more in the Korean Speaking Listening Writing textbooks.
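
These counts reduce to token frequency statistics over eojeols (whitespace-delimited) and syllables (individual Hangul characters), which can be sketched with `collections.Counter` on a tiny stand-in corpus:

```python
from collections import Counter

text = "나는 학교에 간다 나는 집에 간다"  # tiny stand-in corpus, not the textbook data

eojeols = text.split()                       # eojeols are space-delimited
syllables = [s for e in eojeols for s in e]  # each Hangul character is a syllable

eojeol_counts = Counter(eojeols)
syllable_counts = Counter(syllables)

total_syllables = len(syllables)                  # "number of syllables"
syllable_kinds = len(syllable_counts)             # "kinds of syllables"
avg_syll_freq = total_syllables / syllable_kinds  # "average frequency of syllables"
avg_eojeol_len = total_syllables / len(eojeols)   # average eojeol length
```

Running the same pipeline over the four textbooks would reproduce figures of the kind reported (e.g. 194,683 syllables of 1,290 kinds); the part-of-speech counts would additionally require a Korean morphological tagger.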

A Study on the Self-voice Suppression Algorithm in a ZigBee CROS Hearing Aid (지그비 크로스 보청기에서의 자기음성 억제 알고리즘 연구)

  • Im, Won-Jin;Goh, Young-Hwan;Jeon, Yu-Yong;Kil, Se-Kee;Yoon, Kwang-Sub;Lee, Sang-Min
    • Journal of IKEEE / v.13 no.3 / pp.62-71 / 2009
  • In this study, we developed a wireless CROS (contralateral routing of signal) hearing aid for unilaterally hearing-impaired people. A CROS hearing aid takes sound from the ear with poorer hearing and transmits it to the ear with better hearing. Generally, the wearer's own voice delivered through the receiver of a CROS hearing aid can be very loud, making it hard to perceive target speech. To compensate for this, a self-voice suppression algorithm was developed. We performed an SDT (speech discrimination test) to evaluate the self-voice suppression algorithm. One-syllable words were used as test speech and were recorded together with self-voice at a 1 m distance. As a result, the SDT score improved by about 11% when the self-voice suppression algorithm was applied, verifying that the algorithm helps speech perception when communicating with others.
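
The abstract does not detail the suppression algorithm itself; a minimal sketch, assuming only that self-voice frames are markedly louder than target speech and can be attenuated by an energy gate, might look like:

```python
# Hypothetical energy-gate suppression; the paper's actual method is not given.
def suppress_self_voice(frames, threshold, gain=0.2):
    """Attenuate frames whose mean-square energy exceeds `threshold`."""
    out = []
    for frame in frames:
        energy = sum(x * x for x in frame) / len(frame)
        factor = gain if energy > threshold else 1.0  # gate loud (self-voice) frames
        out.append([x * factor for x in frame])
    return out
```

A deployed algorithm would need smoothed gains and a more robust self-voice detector, but the gate captures the core idea of reducing the level of the wearer's own voice relative to target speech.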


Rule-based Speech Recognition Error Correction for Mobile Environment (모바일 환경을 고려한 규칙기반 음성인식 오류교정)

  • Kim, Jin-Hyung;Park, So-Young
    • Journal of the Korea Society of Computer and Information / v.17 no.10 / pp.25-33 / 2012
  • In this paper, we propose a rule-based model to correct errors in speech recognition results in the mobile device environment. The proposed model accounts for the mobile environment's limited resources, such as processing time and memory, as follows. To minimize error correction time, the model removes processing steps such as morphological analysis and the composition and decomposition of syllables. It also uses longest-match rule selection to generate one correction candidate per point where an error is assumed to occur. To conserve memory, the model uses neither an eojeol dictionary nor a morphological analyzer, and stores a combined rule list without any classification. For ease of modification and maintenance, the error correction rules are automatically extracted from a training corpus. Experimental results show that the proposed model improves precision by 5.27% and recall by 5.60% on an eojeol basis for speech recognition results.
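
The longest-match rule selection can be sketched as a single left-to-right scan that applies, at each position, the longest rule whose source string matches; the rules below are invented examples, not the paper's automatically extracted list:

```python
# Invented example rules (misrecognized form -> corrected form).
RULES = {"인터냇": "인터넷", "냇": "넷", "레포트": "리포트"}

def correct(text, rules):
    """At each position, apply the longest matching rule, else keep the char."""
    out, i = [], 0
    while i < len(text):
        match = max((src for src in rules if text.startswith(src, i)),
                    key=len, default=None)
        if match:
            out.append(rules[match])
            i += len(match)
        else:
            out.append(text[i])
            i += 1
    return "".join(out)
```

Because "인터냇" is longer than "냇", the scan corrects the whole word in one step rather than rewriting only its final syllable, which is exactly why longest-match selection yields one candidate per assumed error point.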

The Study on Intraoral Pressure, Closure Duration, and VOT During Phonation of Korean Bilabial Stop Consonants (한국어 양순 파열음 발음시 구강내압과 폐쇄기, VOT에 대한 연구)

  • Pyo Hwa Young;Choi Hong Shik
    • Proceedings of the KSPS conference / 1996.10a / pp.390-398 / 1996
  • An acoustic analysis study was performed on 20 normal subjects speaking nonsense syllables composed of the Korean bilabial stops /p, p*, pʰ/ and a preceding and/or following vowel /a/ (that is, [pa, p*a, pʰa, apa, ap*a, apʰa]) with an ultraminiature pressure sensor in their mouths. The speech materials were phonated twice, once with a moderate voice and once with a loud voice. The acoustic signal and intraoral pressure were recorded simultaneously on a computer. By these procedures, we measured the intraoral pressure, closure duration, and VOT of the Korean bilabial stops, and compared the values with one another according to the intensity of phonation and the position of the target consonant. Intraoral pressure was measured as the peak intraoral pressure value of its wave; closure duration as the time interval between the onset of intraoral pressure build-up and the burst marking the release of closure; and voice onset time (VOT) as the time interval between the burst and the onset of glottal vibration. The heavily aspirated bilabial stop /pʰ/ showed the highest intraoral pressure, the unaspirated /p*/ the second highest, and the slightly aspirated /p/ the lowest. The syllable-initial bilabial stops showed higher intraoral pressure than the word-initial stops, and the values for loudly phonated consonants were higher than for moderately phonated ones. The closure duration was longest for /p*/ and shortest for /p/, and it was longer in word-initial position and in the moderate voice. For VOT, the order from longest to shortest was /pʰ/, /p/, /p*/, and the value was shorter when the consonant was in intervocalic position and when it was phonated with a loud voice.
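
The two interval measures defined in the abstract amount to simple timestamp arithmetic over three events per token; the event times below are invented for illustration:

```python
# Illustrative event timestamps in seconds, not measured data.
def closure_duration(pressure_onset, burst):
    """Closure duration: intraoral pressure build-up onset to the release burst."""
    return burst - pressure_onset

def vot(burst, voicing_onset):
    """Voice onset time: release burst to the onset of glottal vibration."""
    return voicing_onset - burst

events = {"pressure_onset": 0.120, "burst": 0.195, "voicing_onset": 0.260}
cd = closure_duration(events["pressure_onset"], events["burst"])
v = vot(events["burst"], events["voicing_onset"])
```

In practice the three events would be located on the simultaneously recorded pressure and acoustic waveforms before the subtraction is applied.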


The identification of /I/ in Spanish and French

  • Jorge A. Gurlekian;Benoit Jacques;Miguelina Guirao
    • Proceedings of the KSPS conference / 1996.10a / pp.521-528 / 1996
  • This presentation explores the perceptual characteristics of the lateral sound /l/ in CV syllables. In initial position we found that /l/ has well-marked formant transitions. Several questions then arise: 1) Are these formant structures dependent on the following vowel? 2) Do the formant transitions provide an additional cue for identification? Considering that the French vowel system presents a greater variety of vowels than Spanish, several experiments were designed to verify to what extent a more extensive range of vowel timbres contributes to the perception of /l/. Natural emissions of /l/ produced in Argentine Spanish and Canadian French CV syllables were recorded, where V was successively /i, e, a, o, u/ for Spanish and /i, e, ɛ, a, ɑ, o, u, y, ø/ for French. For each item, the segment C was maintained and V was replaced, by cutting and splicing, with each of the remaining vowels without transitions. Results of the identification tests for Spanish show that natural /l/ segments with a low F1 and high upper formants F3 and F4 can be clearly identified in the /i, e, u/ vowel contexts without transitions. For French subjects, the combination of /l/ with a vowel without transitions yielded correct identifications for its own original vowel context in /e, ɛ, y, ø/. For both languages, in all these combinations, F1 values remained rather steady along the syllable. In the case of /o, u/, the F2 difference very likely led to a variety of perceptions of the original /l/; for example, in /lu/, French subjects reported some identifications of /l/ as a vowel, mainly /y/. Our observations reinforce the importance of F1 as a relevant cue for /l/, and the incidence of the relative distance between the formant frequencies of both components.


Egyptian learners' learnability of Korean phonemes (이집트 한국어 학습자들의 한국어 음소 학습용이성)

  • Benjamin, Sarah;Lee, Ho-Young;Hwang, Hyosung
    • Phonetics and Speech Sciences / v.11 no.4 / pp.19-33 / 2019
  • This paper examines the perception of Korean phonemes by Egyptian learners of Korean and presents a learnability gradient of Korean consonants and vowels derived through High Variability Phonetic Training (HVPT). Fifty Egyptian learners of Korean (27 low-proficiency and 23 high-proficiency learners) participated in 10 sessions of HVPT for Korean vowels and word-initial and word-final consonants. Participants were tested on their ability to identify Korean vowels, word-initial consonants, and syllable codas before and after the training. The results showed that both the low- and high-proficiency groups benefited from the training, with low-proficiency learners showing a higher improvement rate than high-proficiency learners. Based on the HVPT results, a learnability gradient was established to give insights into priorities in teaching Korean sounds to Egyptian learners.