Search | Korea Science

A User friendly Remote Speech Input Unit in Spontaneous Speech Translation System

Lee, Kwang-Seok;Kim, Heung-Jun;Song, Jin-Kook;Choo, Yeon-Gyu
- Proceedings of the Korean Institute of Information and Communication Sciences Conference
- 2008.05a
- pp.784-788
- 2008
In this research, we propose a remote speech input unit, a new user-friendly method of speech input for speech recognition systems. We define user friendliness in terms of hands-free operation and microphone independence in speech recognition applications. Our module adopts two algorithms: automatic speech detection and speech enhancement based on microphone-array beamforming. In the performance evaluation of speech detection, about 97% of the detected positions fall within 200 ms of the manually detected positions in a noisy environment with an SNR of 25 dB. The microphone-array speech enhancement, which uses the delay-and-sum beamforming algorithm, shows a maximum SNR gain of about 6 dB over a single microphone and an error reduction rate of more than 12% in speech recognition.
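The delay-and-sum idea named in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `delay_and_sum`, the integer sample delays, and the synthetic sine signal are all assumptions made for illustration. A real array would derive the delays from its geometry and the source direction, neither of which the abstract gives.

```python
import math

def delay_and_sum(channels, delays):
    """Align each channel by its integer sample delay, then average.

    channels: list of equal-length sample lists, one per microphone.
    delays:   arrival delay of each microphone in samples (hypothetical
              values here; normally computed from the array geometry).
    """
    n = len(channels[0])
    out = [0.0] * n
    for samples, d in zip(channels, delays):
        for i in range(n):
            j = i + d  # read ahead by d samples to undo the arrival delay
            if 0 <= j < n:
                out[i] += samples[j]
    return [v / len(channels) for v in out]

# Synthetic example: one sine source reaching four microphones at
# different (assumed) delays, with no noise added.
n = 64
source = [math.sin(2 * math.pi * 4 * i / n) for i in range(n)]
delays = [0, 2, 5, 7]
channels = [[source[i - d] if i - d >= 0 else 0.0 for i in range(n)]
            for d in delays]

aligned = delay_and_sum(channels, delays)
valid = n - max(delays)  # samples for which every channel contributes
```

After alignment the source adds coherently, so `aligned` matches `source` over the first `valid` samples. Noise that is uncorrelated across microphones would instead average down, which is where the SNR gain of such a beamformer comes from.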

A study on the voice onset times of the Seoul Corpus males in their twenties

Lee, Yuri;Yoon, Kyuchul
- Phonetics and Speech Sciences
- v.8 no.4
- pp.1-8
- 2016
The purpose of this work is to examine the voice onset times (VOTs) of the three types of plosives produced by the male speakers in their twenties in the Seoul Corpus. In addition, the factors known to affect VOT were analyzed, including place and manner of articulation, speaker, position in the word, type of following vowel, and speech rate calculated over three consecutive words. Many of the findings agreed with those of earlier studies on Korean and other languages, and new discoveries were made.
https://doi.org/10.13064/KSSS.2016.8.4.001

Study on the realization of pause groups and breath groups

Yoo, Doyoung;Shin, Jiyoung
- Phonetics and Speech Sciences
- v.12 no.1
- pp.19-31
- 2020
The purpose of this study is to observe how pause groups and breath groups are realized by adult speakers and to examine how gender, generation, and task affect this realization. For this purpose, we analyzed forty-eight male and female speakers, divided into two generations: young and old. Task and gender affected the realization of both pause groups and breath groups. Pause groups were longer in read speech than in spontaneous speech, and longer in female speech. Breath groups, on the other hand, were longer in spontaneous speech and in male speech. In spontaneous speech, which requires planning, speakers produced shorter pause groups. The shorter breath groups in read speech can be attributed to the short sentences of the reading material. The gender difference resulted from differences in pause patterns between genders. In the case of breath groups, male speakers produced longer pauses than female speakers did, which may be due to the difference in lung capacity between genders. Generation, on the other hand, affected neither the pause groups nor the breath groups. It influenced only the number of syllables and eojeols, which can be interpreted as a result of the difference in speech rate between generations.
https://doi.org/10.13064/KSSS.2020.12.1.019

Variance characteristics of speaking fundamental frequency and vocal intensity depending on utterance conditions

Lee, Moo-Kyung
- Phonetics and Speech Sciences
- v.4 no.1
- pp.111-118
- 2012
The purpose of this study was to characterize the variances of speaking fundamental frequency and vocal intensity depending on gender and three utterance conditions (spontaneous speech, reading, and counting). A total of 65 undergraduate students (32 male, 33 female) attending universities in Daegu, South Korea participated in this study. The subjects were all in their 20s. KayPENTAX's Visi-Pitch IV (Model 3950) was used to measure the variances of speaking fundamental frequency (SFF0) and vocal intensity (VI). The conclusions were as follows. First, neither males nor females showed a significant difference in SFF0 or vocal intensity among the three utterance conditions. Second, in a comparison of the variances of SFF0 between males and females, females showed significantly higher levels of four measured variances (SFF0 SD**, SFF0 range***, Min SFF0***, and Max SFF0***) than males in spontaneous speech. However, there was no significant difference between males and females in SFF0 range in reading, or in SFF0 SD and SFF0 range in counting. There was also no significant difference between males and females in the measured variances of vocal intensity under any utterance condition. Finally, in a comparison of the variances of SFF0 and vocal intensity among utterance conditions, all the measured variances of SFF0 in males decreased significantly from spontaneous speech to reading to counting (SFF0 SD: p<.001, SFF0 range: p<.05, Max SFF0: p<.05). Females, however, showed no significant difference in the measured variances of SFF0 across the three utterance conditions. Likewise, the measured variances of vocal intensity in females decreased significantly from spontaneous speech to reading to counting (VI SD: p<.001, VI range: p<.001, Min VI: p<.01, Max VI: p<.05), while males showed no significant difference in the measured variances of vocal intensity across the three utterance conditions. In sum, these findings suggest that the variances of SFF0 in males, and the variances of vocal intensity in females, are affected by the three utterance conditions.
https://doi.org/10.13064/KSSS.2012.4.1.111

A Comparative Study on the Speech Rate of Advanced Korean (L2) Learners and Korean Native Speakers in Conversational Speech

Hong, Minkyoung
- Journal of Korean language education
- v.29 no.3
- pp.345-363
- 2018
The purpose of this study is to compare the speech rates of advanced Korean (L2) learners and Korean native speakers in spontaneous utterances. Specifically, the study investigated how the two groups' speech patterns differ according to utterance length. Eight advanced Korean (L2) learners and eight Korean native speakers participated. The data were collected by recording their conversations, and physical measurements (speaking rate, articulatory rate, pauses, and several types of speech disfluency) were taken on 120 utterances extracted from 12 of the 16 participants. The findings show that advanced Korean learners' speech patterns are similar to those of native speakers in short utterances. In long utterances, however, the two groups show different patterns: while the articulatory rate of Korean native speakers increased, that of Korean learners decreased. This suggests that the frequency of speech disfluency factors might affect this result.

The Role of Silent Pauses in the Perception of Boundaries in Korean Read Speech

Jo, Hyeong-Sil
- Proceedings of the KSPS conference
- 2006.11a
- pp.117-119
- 2006
This paper discusses the importance of silent pauses in the perception of prosodic boundaries in Korean speech. It is suggested that in speech in general, and in particular in spontaneous speech, silent pauses are neither necessary nor sufficient for the perception of prosodic boundaries. In read speech, however, there is a high correlation between the presence of a pause and the perception of a boundary. An experiment was carried out to determine whether removing the silent boundary from an extract of speech had a significant effect on the perception of boundaries in Korean read speech. Results suggest that while the presence of a silent boundary slightly reinforces the perception of a prosodic boundary, subjects are in general capable of perceiving the boundary without the silent pause.

Measurements of Speaking Rate and Fluency in Stuttering Adults

Shin, Moon-Ja
- Speech Sciences
- v.7 no.3
- pp.273-284
- 2000
The purpose of this study was to investigate speech rate and fluency in stuttering adults. It was suggested that a measurement guideline of speech rate and fluency for collecting clinically meaningful data be used. Subjects included 10 adults who stutter (mean age=25;8). Syllables were used as the unit of measurement for analyzing the duration of speech. The mean rate was 241 SPM (syllables per minute) for reading, and 196 SPM for spontaneous speaking. Fluency was also measured in both cases. The correlation between rate of speech and fluency was high (r=0.92). A strong positive correlation was found between different investigators in measuring speech rates and fluencies.
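The syllables-per-minute (SPM) measure used in this abstract is simple arithmetic, sketched below. The function name `syllables_per_minute` and the syllable counts and durations are hypothetical, chosen only so that the results equal the reported mean rates. The abstract does not say whether pauses count toward the measured duration, so this sketch takes whatever duration it is given.

```python
def syllables_per_minute(syllable_count, duration_seconds):
    """Speech rate in syllables per minute (SPM) for one speech sample."""
    if duration_seconds <= 0:
        raise ValueError("duration_seconds must be positive")
    return syllable_count * 60.0 / duration_seconds

# Hypothetical samples: 482 syllables read aloud in 120 s,
# and 98 syllables of spontaneous speech in 30 s.
reading_rate = syllables_per_minute(482, 120.0)
spontaneous_rate = syllables_per_minute(98, 30.0)
```

With these made-up inputs, `reading_rate` is 241.0 SPM and `spontaneous_rate` is 196.0 SPM, the same values as the mean rates reported for reading and spontaneous speaking.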

AI-based language tutoring systems with end-to-end automatic speech recognition and proficiency evaluation

Byung Ok Kang;Hyung-Bae Jeon;Yun Kyung Lee
- ETRI Journal
- v.46 no.1
- pp.48-58
- 2024
This paper presents the development of language tutoring systems for non-native speakers by leveraging advanced end-to-end automatic speech recognition (ASR) and proficiency evaluation. Given the frequent errors in non-native speech, high-performance spontaneous speech recognition must be applied. Our systems accurately evaluate pronunciation and speaking fluency and provide feedback on errors by relying on precise transcriptions. End-to-end ASR is implemented and enhanced by using diverse non-native speaker speech data for model training. For performance enhancement, we combine semi-supervised and transfer learning techniques using labeled and unlabeled speech data. Automatic proficiency evaluation is performed by a model trained to maximize the statistical correlation between the fluency score manually determined by a human expert and a calculated fluency score. We developed an English tutoring system for Korean elementary students called EBS AI Peng-Talk and a Korean tutoring system for foreigners called KSI Korean AI Tutor. Both systems were deployed by South Korean government agencies.
https://doi.org/10.4218/etrij.2023-0322

Error Correction and Praat Script Tools for the Buckeye Corpus of Conversational Speech

Yoon, Kyu-Chul
- Phonetics and Speech Sciences
- v.4 no.1
- pp.29-47
- 2012
The purpose of this paper is to show how to convert the label files of the Buckeye Corpus of Conversational Speech [1] into Praat format and to introduce some of the Praat scripts that will enable linguists to study various aspects of spoken American English present in the corpus. During the conversion process, several types of errors were identified and corrected, either manually or automatically with scripts. The Praat script tools that have been developed can help extract large amounts of phonetic measures from the corpus, such as the VOT of plosives, the formants of vowels, word frequency information, and speech rates that span several consecutive words. The script tools can also extract additional information about the phonetic environment of the target words or allophones.
https://doi.org/10.13064/KSSS.2012.4.1.029

Breath and Memory in Speech based on Quantitative Analysis of Breath Groups and Pause Units in Korean

Shin, Jiyoung
- Korean Linguistics
- v.79
- pp.91-116
- 2018
This paper raises issues of breath and memory in speech based on a quantitative analysis of breath groups and pause units in Korean. As human beings, we face two kinds of limits on continuous speech: breath and memory. The prosodic structure and temporal structure of spontaneous speech data from six speakers were closely examined. One of the main findings of the present study is that the prosodic and temporal structures of Korean appear to reflect the breath and memory constraints on speech.
https://doi.org/10.20405/kl.2018.05.79.91

