Search | Korea Science

Analysis of Korean Spontaneous Speech Characteristics for Spoken Dialogue Recognition (대화체 연속음성 인식을 위한 한국어 대화음성 특성 분석)

Park, Young-Hee;Chung, Min-Hwa
The Journal of the Acoustical Society of Korea, v.21 no.3, pp.330-338, 2002
Compared with read speech, spontaneous speech is ungrammatical and exhibits serious phonological variations, which make recognition extremely difficult. In this paper, we analyze transcriptions of real conversational speech and classify its characteristics from the standpoint of speech recognition. Reflecting these features, we build a baseline system for conversational speech recognition. The characteristics fall into three classes: long silences, disfluencies, and phonological variations; each class is further grouped by similar features. To deal with them, we first update the silence model and add a filled pause model and a garbage model; second, we add multiple phonetic transcriptions to the lexicon for the most frequent phonological variations. In our experiments, the baseline morpheme error rate (MER) is 31.65%; we obtain MER reductions of 2.08% for the silence and garbage models, 0.73% for the filled pause model, and 0.73% for phonological variations. The final MER for conversational speech recognition is 27.92%, which will serve as a baseline for further study.
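The abstract reports results as a morpheme error rate but does not spell out how it is scored. A minimal sketch, assuming the usual edit-distance definition (substitutions, deletions, and insertions divided by the number of reference morphemes), might look like the following. The function names and the romanized morpheme tokens are hypothetical illustrations, not taken from the paper.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two token sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        cur = [i]
        for j, h in enumerate(hyp, 1):
            cur.append(min(
                prev[j] + 1,             # deletion of a reference token
                cur[j - 1] + 1,          # insertion of a hypothesis token
                prev[j - 1] + (r != h),  # substitution (cost 0 on a match)
            ))
        prev = cur
    return prev[-1]


def morpheme_error_rate(ref, hyp):
    """Edit distance normalized by the reference length (assumed MER definition)."""
    if not ref:
        raise ValueError("reference must contain at least one morpheme")
    return edit_distance(ref, hyp) / len(ref)


# Hypothetical example: the recognizer inserts a filled pause ("eum")
# and drops the particle "e".
reference = ["na", "neun", "hakgyo", "e", "ga", "n", "da"]
hypothesis = ["na", "neun", "eum", "hakgyo", "ga", "n", "da"]
errors = edit_distance(reference, hypothesis)
mer = morpheme_error_rate(reference, hypothesis)
```

Scoring at the morpheme level rather than the word level is a common choice for agglutinative languages such as Korean, where a single word can carry several morphemes.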

On Improving the Effects of Varying the Window Length on Speech Energy Computation (음성 에너지계산에서 창함수-길이 변화영향의 개선에 관한 연구)

Bae, Myung-Jin;Ann, Sou-Guil
The Journal of the Acoustical Society of Korea, v.9 no.2, pp.34-41, 1990
The energy parameter is widely used in the pre-processing of speech signals because it represents phoneme characteristics well. However, the energy parameter is affected by the window length used during extraction. In this paper, we study the window length effects in detail and propose a new energy extraction algorithm that reduces them. The energy contours obtained with this algorithm represent the characteristics of speech phonemes well. The algorithm requires only one subtraction, one addition, and two comparison operations per speech sample.
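To illustrate the window length effect described above, here is a minimal sketch of conventional short-time energy with a rectangular window, updated recursively with one addition and one subtraction per sample. This is the standard textbook computation, not the algorithm proposed in the paper, whose details the abstract does not give. The function names and the toy signal are assumptions made for illustration.

```python
def short_time_energy(samples, window_length):
    """Sliding-window energy, updated recursively sample by sample."""
    energy = []
    running = 0
    for n, x in enumerate(samples):
        running += x * x                  # add the newest squared sample
        if n >= window_length:
            old = samples[n - window_length]
            running -= old * old          # drop the sample that left the window
        energy.append(running)
    return energy


def short_time_energy_direct(samples, window_length):
    """Reference version: re-sum the window at every sample."""
    return [
        sum(x * x for x in samples[max(0, n - window_length + 1): n + 1])
        for n in range(len(samples))
    ]


# Toy integer signal (hypothetical): a short burst surrounded by silence.
signal = [0, 0, 3, -4, 5, -3, 2, 0, 0, 0, 1, -1]
short_contour = short_time_energy(signal, 2)  # follows the burst closely
long_contour = short_time_energy(signal, 6)   # smoother, larger, and delayed
```

Comparing the two contours shows the dependence on window length: the longer window yields a larger, smoother peak that lags the burst, while the shorter window tracks it more tightly but fluctuates more.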

Perceptual Characteristics of Korean Vowels Distorted by the Frequency Band Limitation (주파수 대역 제한에 의한 한국어 모음의 지각 특성 분석)

Kim, YeonWhoa;Choi, DaeLim;Lee, Sook-Hyang;Lee, YongJu
Phonetics and Speech Sciences, v.6 no.1, pp.85-93, 2014
This paper investigated the effects of frequency band limitation on the perceptual characteristics of Korean vowels. Monosyllabic speech (144 syllables of CV type, 56 syllables of VC type, 8 syllables of V type) produced by two announcers was low- and high-pass filtered with cutoff frequencies ranging from 300 to 5000 Hz. Six listeners with normal hearing performed perception tests for each filter type and cutoff frequency. We report phoneme recognition rates and types of perception errors for band-limited Korean vowels to examine how frequency distortion during speech transmission affects listeners' perception.
https://doi.org/10.13064/KSSS.2014.6.1.085

Spectral Characteristics and Nasalance Scores of Hypernasality in Patient with Cleft Palate

Soh, Byung-Soo;Shin, Hyo-Keun;Kim, Hyun-Gi
Speech Sciences, v.12 no.1, pp.27-35, 2005
Differential instrumentation has been used to objectively measure speech problems in the diagnosis of individuals with cleft palate. The cepstrum method was used to study the vocal tract transfer function. Both the vocal tract transfer function and the source spectrum should be considered in the evaluation of nasal resonance. The aim of this study was to collect quantitative data on the acoustic instrumentation used for evaluating hypernasality. Normal subjects (9 male, 21 female; 37 male children, 20 female children) and individuals with VPI (13 male, 8 female; 16 male children, 9 female children) participated in this study. The vowel /i/ was selected to gauge the severity of hypernasality. Spectral and cepstral analyses using CSL were used to identify the acoustic characteristics. Cepstrum analysis showed significant differences in quefrency and amplitude. The quefrency of the normal groups was shorter than that of the VPI groups, while the amplitude of the normal groups was lower than that of the VPI groups. This may be significant for the evaluation of nasal resonance.
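For readers unfamiliar with the quefrency and amplitude measures mentioned above, the following is a minimal sketch of a real cepstrum, taken as the inverse DFT of the log magnitude spectrum, and of picking its dominant peak. It assumes a toy signal (an impulse plus a delayed copy) and a naive DFT for self-containment. It is not the CSL procedure used in the study, and all names are hypothetical.

```python
import cmath
import math


def dft(x, inverse=False):
    """Naive O(n^2) discrete Fourier transform (fine for short frames)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = sum(x[t] * cmath.exp(sign * 2j * math.pi * k * t / n)
                  for t in range(n))
        out.append(acc / n if inverse else acc)
    return out


def real_cepstrum(x):
    """Inverse DFT of the log magnitude spectrum."""
    log_mag = [math.log(abs(v) + 1e-12) for v in dft(x)]  # small floor avoids log(0)
    return [v.real for v in dft(log_mag, inverse=True)]


def peak_quefrency(cepstrum, min_q=1):
    """Return (quefrency index, amplitude) of the largest peak in the first half."""
    half = len(cepstrum) // 2
    q = max(range(min_q, half), key=lambda i: cepstrum[i])
    return q, cepstrum[q]


# Toy frame: a unit impulse plus a half-amplitude copy delayed by 8 samples.
frame = [0.0] * 64
frame[0] = 1.0
frame[8] = 0.5
quefrency, amplitude = peak_quefrency(real_cepstrum(frame))
```

In this toy case the peak sits at the delay of the repeated component (8 samples) with an amplitude near 0.25. In voiced speech the analogous peak reflects the pitch period, and dividing the index by the sampling rate converts it to seconds.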

Acoustic Variation in infant crying (아기 울음의 음향학적 특성)

Choi, Yoon-Mi;Kim, Sun-Jun;Joo, Chan-Uhng;Kim, Hyun-Gi
Proceedings of the KSPS conference, 2007.05a, pp.146-148, 2007
Studies of cry characteristics in newborn infants have aimed to determine whether cry analysis can be successful in the early detection of infants at risk for developmental difficulties. Crying presupposes functioning of the respiratory, laryngeal, and supralaryngeal muscles. The nervous system controls the capacity, stability, and coordination of the movements of these muscles. Hence, the cry provides information about how the nervous system is functioning. Three patients (Down syndrome, Cornelia de Lange syndrome, patent ductus arteriosus) were assessed with the Computerized Speech Lab (CSL). Tests were chosen to assess fundamental frequency (mean, maximum, and minimum values), melody contour, NHR, and energy. We compared the data from the patients with those from healthy volunteers. Variations in cry characteristics were documented in a number of medical abnormalities.

The Acoustic Characteristics in Women Diver's Soombijil Sound (해녀의 숨비질소리에 대한 음향특징)

Han, Ji-Yeon;Park, Hyun-Ja;Jeong, Ok-Ran
Proceedings of the KSPS conference, 2007.05a, pp.176-179, 2007
This study examined the acoustic characteristics of women divers' Soombijil sound. A total of 18 women divers participated in this study. Acoustic analysis was performed with Praat. Soombijil sounds were classified into three types according to pitch variations in the beginning, middle, and ending parts. Type I showed an increasing-decreasing-flat shape. Type II was identified by a flat-flat-increasing shape. Type III showed an increasing-decreasing-increasing shape. The mean duration of the Soombijil sound was 1.48 sec. The frequency range was 1591.54 to 4477.13 Hz. FFT analysis showed that frequencies were concentrated at 500 to 2000 Hz. Types I and II showed two peaks, at 500 Hz and at 1500 to 2000 Hz. Type III had one peak below 500 Hz.

Style-Specific Language Model Adaptation using TF*IDF Similarity for Korean Conversational Speech Recognition

Park, Young-Hee;Chung, Min-Hwa
The Journal of the Acoustical Society of Korea, v.23 no.2E, pp.51-55, 2004
In this paper, we propose a style-specific language model adaptation scheme using n-gram based tf*idf similarity for Korean spontaneous speech recognition. Korean spontaneous speech shows distinctive style-specific characteristics such as filled pauses, word omission, and contraction, which are related to function words and depend on preceding or following words. To reflect these style-specific characteristics and overcome the lack of data for training the language model, we estimate an in-domain dependent n-gram model by relevance weighting of out-of-domain text data according to their n-gram based tf*idf similarity, where the in-domain language model includes a disfluency model. Recognition results show that n-gram based tf*idf similarity weighting effectively reflects style differences.
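The relevance weighting idea above can be sketched as follows: represent each text as a vector of n-gram tf*idf values and weight each out-of-domain text by its cosine similarity to the in-domain text. This is a minimal illustration under stated assumptions (bigrams, a smoothed idf, cosine similarity, and toy English sentences). The paper's actual formulation is not given in the abstract, and all names are hypothetical.

```python
import math
from collections import Counter


def ngrams(tokens, n=2):
    """List of n-gram tuples from a token sequence."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def tfidf_vectors(docs, n=2):
    """Sparse n-gram tf*idf vectors (dicts) for a list of tokenized texts."""
    counts = [Counter(ngrams(d, n)) for d in docs]
    df = Counter()
    for c in counts:
        df.update(c.keys())               # document frequency per n-gram
    total = len(docs)
    return [
        {g: tf * (math.log((1 + total) / (1 + df[g])) + 1.0)  # smoothed idf
         for g, tf in c.items()}
        for c in counts
    ]


def cosine(u, v):
    """Cosine similarity of two sparse vectors."""
    dot = sum(w * v.get(g, 0.0) for g, w in u.items())
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0


# Toy data (hypothetical): one in-domain conversational text with filled
# pauses, and two out-of-domain texts to be weighted by relevance.
in_domain = "uh i want to uh book a room".split()
out_of_domain = [
    "i want to book a flight".split(),
    "the stock market fell sharply today".split(),
]
vectors = tfidf_vectors([in_domain] + out_of_domain)
weights = [cosine(vectors[0], v) for v in vectors[1:]]
```

The resulting weights could then scale the n-gram counts contributed by each out-of-domain text when estimating the adapted model, so that texts closer in style to the conversational data count for more.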

Articulation Characteristics of Preschool Children in the Bilingual Environment (학령전 이중언어 환경 아동의 조음특성)

Kwon, Mi-Ji;Park, Sang-Hee;Seok, Dong-Il
Speech Sciences, v.14 no.2, pp.73-87, 2007
The aim of this study was to examine the articulation characteristics of preschool children in bilingual and monolingual environments. Subjects included 23 children aged 4 to 6 years in a bilingual environment and 19 children in a monolingual environment. Their speech was evaluated for articulation correctness and intelligibility by the author and a speech therapist. The results were as follows. First, there were some significant differences between bilingual and monolingual children in the percentage of consonants correctly articulated, but there was no significant difference by language environment or age in the percentage of vowels correctly articulated. Second, there were some significant differences between the bilingual and monolingual children in the intelligibility of word articulation. There were also some significant differences between the two language groups in sentence intelligibility. There was a high positive correlation between word and sentence intelligibility.

Voice Recognition Performance Improvement using the Convergence of Voice signal Feature and Silence Feature Normalization in Cepstrum Feature Distribution (음성 신호 특징과 셉스트럽 특징 분포에서 묵음 특징 정규화를 융합한 음성 인식 성능 향상)

Hwang, Jae-Cheon
Journal of the Korea Convergence Society, v.8 no.5, pp.13-17, 2017
Existing methods for extracting speech features from speech signals suffer from recognition errors because the threshold separating speech from non-speech is not clear. In this article, we propose a modeling method for improving speech recognition performance that combines speech feature extraction with silence feature normalization of the non-speech portion. The proposed method minimizes the effect of noise, and the speech recognition model converges speech signal feature extraction for each speech frame with silence feature normalization. The method also reconstructs the original speech signal with an energy spectrum similar to entropy, so the speech is less affected by noise. Silence feature normalization improves performance in terms of signal-to-noise ratio. We fixed the reference value for speech and non-speech classification in the cepstrum. In the performance analysis, which compared the results of the proposed method with CHMM and HMM, the recognition rate improved by 2.7%p in the speech-dependent case and by 0.7%p in the speech-independent case.
https://doi.org/10.15207/JKCS.2017.8.5.013

A study of the understanding about speech therapy and the satisfaction about counseling for mothers who have children with disability (언어치료에 대한 장애아동 어머니의 이해도와 상담 만족도)

Park, Jin-Won
Journal of Korean Clinical Health Science, v.9 no.1, pp.1469-1477, 2021
Purpose: The purpose of this study is to investigate understanding of speech therapy and satisfaction with speech therapy counseling according to the characteristics of mothers who have children with disabilities, and to devise clinical instruction methods for providing effective speech therapy by identifying the correlation between the two variables. Methods: This study surveyed 78 mothers of children with disabilities who use university speech therapy labs. Seventeen questions measured the degree of understanding of speech therapy, and 24 questions measured the degree of satisfaction with speech therapy counseling. Results: First, mothers with a higher education level had a higher degree of understanding of language (p<0.01). Second, mothers with a higher education level had a lower degree of satisfaction with the counseling process (p<0.5). In terms of job status, mothers who have a job had a higher degree of satisfaction with counseling time (p<0.5). Third, regarding the relation between mothers' understanding of speech therapy and their satisfaction with counseling, mothers with a higher degree of understanding of language, speech therapy tools, and speech therapy areas had a higher degree of satisfaction with counseling. Conclusions: This study showed the need to understand the subjects' needs exactly and to communicate actively with mothers. In addition, concrete and varied methods should be devised to increase understanding of speech therapy and to increase satisfaction with counseling about the clinical practice environment and the language therapy process.
https://doi.org/10.15205/kschs.2021.6.30.1469
