Search | Korea Science

A Study on Word Juncture Modeling for Continuous Speech Recognition of Korean Language (한국어 연속음성 인식을 위한 단어 결합 모델링에 관한 연구)

Choi, In-Jeong;Un, Chong-Kwan
- The Journal of the Acoustical Society of Korea
- Vol. 13, No. 5
- pp. 24-31
- 1994
In this paper, we study continuous speech recognition of Korean using acoustic models of word-juncture coarticulation. To alleviate the performance degradation caused by coarticulation, we use context-dependent units that model inter-word transitions in addition to intra-word transitions. In all cases, the initial phone of each word has to be specified for each possible final phone of the previous word, and similarly the final phone of each word for each possible initial phone of the following word. To improve the robustness of the HMM parameters, the covariance matrix is smoothed. We also use position-dependent units to improve the discriminative power between units. Simulation results show that when the improved models of word-juncture coarticulation are used, the recognition performance is considerably improved compared to the baseline system using only intra-word units.
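As a rough illustration of the bookkeeping this abstract describes (one variant of each word-initial phone per possible preceding final phone, and one variant of each word-final phone per possible following initial phone), the sketch below enumerates such boundary units for a toy lexicon. The function name `boundary_units`, the toy lexicon, and the `left-phone+right` naming convention are assumptions made for illustration; the abstract does not state how the authors name or construct their units, and silence contexts are ignored here.

```python
def boundary_units(lexicon):
    """Enumerate cross-word context-dependent units for each word.

    lexicon maps a word to its phone sequence (a list of strings).
    Returns a dict mapping each word to the set of 'left-phone+right'
    unit names needed at its word boundaries. (Hypothetical helper.)
    """
    # Every phone that can end a word, and every phone that can start one.
    finals = sorted({phones[-1] for phones in lexicon.values()})
    initials = sorted({phones[0] for phones in lexicon.values()})

    units = {}
    for word, phones in lexicon.items():
        if len(phones) == 1:
            # A one-phone word has cross-word context on both sides.
            variants = {f"{l}-{phones[0]}+{r}" for l in finals for r in initials}
        else:
            # Word-initial phone: left context comes from the previous word.
            head = {f"{l}-{phones[0]}+{phones[1]}" for l in finals}
            # Word-final phone: right context comes from the following word.
            tail = {f"{phones[-2]}-{phones[-1]}+{r}" for r in initials}
            variants = head | tail
        units[word] = variants
    return units


# Toy lexicon (assumed for illustration only).
toy_lexicon = {"na": ["n", "a"], "mul": ["m", "u", "l"]}
toy_units = boundary_units(toy_lexicon)
```

The number of boundary variants grows with the number of distinct word-final and word-initial phones, which is one reason such systems need more training data per unit than intra-word-only models.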

SEGMENTAL COARTICULATION STUDY IN DISYLLABIC CONTEXT IN STANDARD CHINESE

Chen, Xiao-xia
- Proceedings of the KSPS conference
- KSPS (대한음성학회), October 1996 conference proceedings
- pp. 515-520
- 1996

A Study on Korean Allophone Recognition Using Hierarchical Time-Delay Neural Network (계층구조 시간지연 신경망을 이용한 한국어 변이음 인식에 관한 연구)

김수일;임해창
- Journal of the Korean Institute of Telematics and Electronics B
- Vol. 32B, No. 1
- pp. 171-179
- 1995
In many continuous speech recognition systems, the phoneme is used as the basic recognition unit. However, the coarticulation generated among neighboring phonemes makes it difficult to recognize phonemes consistently. This paper proposes the allophone as an alternative recognition unit. We have classified each phoneme into three different allophone groups by the location of the phoneme within a syllable. As the recognition algorithm, a time-delay neural network (TDNN) has been designed. To recognize all Korean allophones, TDNNs are constructed in modular fashion according to acoustic-phonetic features (e.g., voiced/unvoiced, the location of the phoneme within a word). Each TDNN is trained independently, and then they are integrated hierarchically into a whole speech recognition system. In this study, we have experimented on Korean plosives with a phoneme-based recognition system and an allophone-based recognition system. Experimental results show that allophone-based recognition is much less affected by coarticulation.

Is Voicing of English Voiced Stops Active?

Yun, Il-Sung
- Speech Sciences
- Vol. 10, No. 2
- pp. 207-221
- 2003
Phonetic voicing does not support the phonological distinction of voiced/voiceless in English stops. The present study is aimed at defining the nature of voicing of English voiced stops. A review of the literature reveals that the voicing is position-conditioned and its length is notably inconsistent relative to the closure duration. No consistent relationships are found between vocal fold adduction and glottal pulsing in initial position. Among other findings, stress reduced the voicing. The hypotheses for the experiments were: (1) active voicing: stress generates longer (stronger) voicing during the closure duration of a voiced stop; (2) passive voicing: stress induces shorter (weaker) voicing during the closure. Instead, the voiced stop becomes more voiced when the preceding vowel (syllable) is stressed. The literature review and the results of two experiments comparing English and Slovakian suggested that the voicing of English voiced stops is passive (i.e., a coarticulation of glottal pulsing for adjacent vowels-syllables) and should be distinguished from active voicing in some other languages.

An Acoustic Study of the Perceptual Significance of F2 Transition of /w/ in English and Korean

Kang, Hyun-Sook
- Speech Sciences
- Vol. 13, No. 4
- pp. 7-21
- 2006
The intent of the present study is to investigate the acoustic properties of Korean /w/ in various phonological contexts, compare them with those of English /w/, and attempt to explain why English /w/ is perceived differently by Korean speakers depending on the phonological context. Experiments 1 and 2 present acoustic measurements of F2 of Korean /w/ in various linguistic positions and show that, unlike English /w/, Korean /w/ shows quite strong coarticulation with the following vowel. Based on these experiments, Experiment 3 investigates why English /w/ is adapted differently into Korean. Specifically, it discusses why English /wain/ is adapted as /wain/ whereas English /twin/ is adapted into Korean as /tʰɨwin/ with an extra vowel. This study argues that the different perception of English /w/ by Korean and English speakers is due to the different F2 transitional patterns of /w/ in Korean and English in various phonological contexts. It also argues that the F2 transitional pattern is an important factor in the perception of /w/.

V-to-C Coarticulation Effects in Non-native Speakers of English and Russian: A Locus-equation Analysis

Oh, Eun-Jin
- MALSORI
- No. 63
- pp. 1-21
- 2007
Locus equation scatterplots for [bilabial stop + vowel] syllables were obtained from 16 non-native speakers of English and Russian. The results indicated that both Russian speakers of English and English speakers of Russian exhibited modifications towards respective L2 norms in slopes and y-intercepts. All non-native locus equations generated exhibited linearity. Accordingly, the basic results reported in [17] were reverified by securing a larger subject base. More experienced speakers displayed better approximations to L2 norms than less experienced speakers, indicating the necessity of perception- and articulation-related learning for allophonic variations due to adjacent phonetic environments.
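For readers unfamiliar with the method, a locus equation is conventionally a straight-line fit of F2 at vowel onset against F2 at the vowel midpoint across vowel contexts for a given consonant; the slope and y-intercept mentioned in the abstract are the parameters of that line. The sketch below is a minimal ordinary least-squares fit under that conventional definition. The function name `fit_locus_equation` and the sample frequencies are made up for illustration and are not taken from the paper.

```python
def fit_locus_equation(f2_vowel, f2_onset):
    """Fit F2_onset = slope * F2_vowel + intercept by least squares.

    f2_vowel: F2 values (Hz) measured at the vowel midpoint.
    f2_onset: F2 values (Hz) measured at vowel onset, same tokens.
    Returns (slope, intercept). (Hypothetical helper for illustration.)
    """
    n = len(f2_vowel)
    if n != len(f2_onset) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 tokens")

    mean_x = sum(f2_vowel) / n
    mean_y = sum(f2_onset) / n
    sxx = sum((x - mean_x) ** 2 for x in f2_vowel)
    if sxx == 0:
        raise ValueError("F2 vowel values must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(f2_vowel, f2_onset))

    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up measurements (Hz) for one speaker's [bilabial stop + vowel] tokens.
example_vowel_f2 = [800.0, 1200.0, 1800.0, 2300.0]
example_onset_f2 = [900.0, 1250.0, 1700.0, 2150.0]
example_slope, example_intercept = fit_locus_equation(example_vowel_f2, example_onset_f2)
```

A steeper slope is usually read as stronger vowel-to-consonant coarticulation, which is why shifts in slope and y-intercept can be compared against L2 norms.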

Classification of nasal places of articulation based on the spectra of adjacent vowels (모음 스펙트럼에 기반한 전후 비자음 조음위치 판별)

Jihyeon Yun;Cheoljae Seong
- Phonetics and Speech Sciences
- Vol. 15, No. 1
- pp. 25-34
- 2023
This study examined the utility of the acoustic features of vowels as cues for the place of articulation of Korean nasal consonants. In the acoustic analysis, spectral and temporal parameters were measured at the 25%, 50%, and 75% time points in the vowels neighboring nasal consonants in samples extracted from a spontaneous Korean speech corpus. Using these measurements, linear discriminant analyses were performed and classification accuracies for the nasal place of articulation were estimated. The analyses were applied separately for vowels following and preceding a nasal consonant to compare the effects of progressive and regressive coarticulation in terms of place of articulation. The classification accuracies ranged between approximately 50% and 60%, implying that acoustic measurements of vowel intervals alone are not sufficient to predict or classify the place of articulation of adjacent nasal consonants. However, given that these results were obtained for measurements at the temporal midpoint of vowels, where they are expected to be the least influenced by coarticulation, the present results also suggest the potential of utilizing acoustic measurements of vowels to improve the recognition accuracy of nasal place. Moreover, the classification accuracy for nasal place was higher for vowels preceding the nasal sounds, suggesting the possibility of higher anticipatory coarticulation reflecting the nasal place.
https://doi.org/10.13064/KSSS.2023.15.1.025

An Experimental Phonetic study of Perception of native Korean speakers on English and German /ʃ/ (한국인의 외국어 /ʃ/음에 대한 실험음성학적 연구)

Lee Sook-hyang;Kang Hyunsook
- MALSORI
- No. 40
- pp. 1-12
- 2000
This paper investigated how /ʃ/ in English and German is perceived and interpreted in loanwords in Korean. /ʃ/ in these languages does not show a one-to-one correspondence in Korean: /ʃ/ in the coda position in English and German is perceived as [swi] in Korean, while /ʃ/ in the onset position is perceived as [syu]. This paper examined the phonetic characteristics of /ʃ/ in English and German through acoustic analysis and attempted to determine which factor could explain this surface distribution of [swi] and [syu]: a phonological (onset vs. coda) or a phonetic (coarticulation) factor. Two acoustic features of /ʃ/ in English and German were examined: duration and energy peak frequency of the frication noise. German /ʃ/ perceived as [swi] in Korean showed higher energy peak frequency and longer duration than that perceived as [syu] in Korean. English /ʃ/ perceived as [swi] also showed longer duration than that perceived as [syu] in Korean, but energy peak frequency showed different behavior. English /ʃ/ showed coarticulation with the preceding vowel rather than being affected by its position in the syllable. This paper concludes that (1) the phonetic characteristics used when /ʃ/ in English and German is adopted into Korean are the duration and energy peak frequency of its frication noise, and (2) duration is used prior to energy peak frequency, which can be used as an enhancing feature.

Vowel Duration and the Feature of the Following Consonant

Yun, Il-Sung
- Phonetics and Speech Sciences
- Vol. 1, No. 1
- pp. 41-46
- 2009
Duration of the preceding vowel is known to vary as a function of the (phonological or phonetic) voicing feature of the following consonant. This study raises a question against this general belief. A spectrographic experiment using 14 Korean obstruents (three sets of stops: /p, p', $p^h$/, /t, t', $t^h$/, /k, k', $k^h$/; one set of affricates: /c, c', $c^h$/; one set of fricatives: /s, s'/) reveals that (1) phonetic voicing in the intervocalic lax consonants /p, t, k, c, s/ has nothing to do with the duration of the preceding vowel; (2) vowel length is significantly shorter before tense consonants than before their lax cognates while tense consonants are significantly longer than their lax cognates. Importantly, Korean obstruents are all phonologically voiceless. Therefore, the voicing feature is rejected as the cause of preconsonantal vowel shortening in Korean both phonetically and phonologically. It is suggested that the temporal phenomenon is basically a kind of physiologically-motivated coarticulation though it is restricted by the phonology of a given language. To meet this assumption, the feature voicing should be replaced with the feature tenseness as the cause, which will enable us to explain the temporal phenomenon on the same basis irrespective of language.

Speech Synthesis Based on CVC Speech Segments Extracted from Continuous Speech (연속 음성으로부터 추출한 CVC 음성세그먼트 기반의 음성합성)

김재홍;조관선;이철희
- The Journal of the Acoustical Society of Korea
- Vol. 18, No. 7
- pp. 10-16
- 1999
In this paper, we propose a concatenation-based speech synthesizer using CVC (consonant-vowel-consonant) speech segments extracted from an undesigned continuous speech corpus. Natural synthetic speech can be generated by proper modelling of coarticulation effects between phonemes and the use of natural prosodic variations. In general, the CVC synthesis unit shows smaller acoustic degradation of speech quality, since concatenation points are located in the consonant region, and it can properly model the coarticulation of vowels that are affected by surrounding consonants. In this paper, we analyze the characteristics and the number of required synthesis units of 4 types of speech synthesis methods that use CVC synthesis units. Furthermore, we compare the speech quality of the 4 types and propose a new synthesis method based on the most promising type in terms of speech quality and implementability. Then we implement the method using the speech corpus and synthesize various examples. The CVC speech segments that are not in the speech corpus are substituted by demonstrate speech segments. Experiments demonstrate that CVC speech segments extracted from a continuous speech corpus of about 100 Mbytes can produce high-quality synthetic speech.
