• Title/Summary/Keyword: Speech analysis

Pattern Recognition Methods for Emotion Recognition with speech signal

  • Park Chang-Hyun;Sim Kwee-Bo
    • International Journal of Fuzzy Logic and Intelligent Systems / v.6 no.2 / pp.150-154 / 2006
  • In this paper, we apply several pattern recognition algorithms to a speech-based emotion recognition system and compare the results. First, emotional speech databases are required, and the speech features for emotion recognition are determined in the database analysis step. Second, recognition algorithms are applied to these speech features. The algorithms we try are an artificial neural network, Bayesian learning, Principal Component Analysis, and the LBG algorithm. The performance differences among these methods are then presented in the experimental results section.

Features Analysis of Speech Signal by Adaptive Dividing Method (음성신호 적응분할방법에 의한 특징분석)

  • Jang, S.K.;Choi, S.Y.;Kim, C.S.
    • Speech Sciences / v.5 no.1 / pp.63-80 / 1999
  • This paper presents an adaptive method of dividing a speech signal into the initial, medial, and final sounds of an utterance by evaluating the extreme values of the short-term energy and autocorrelation functions. The method was applied to speech signals composed of a consonant, a vowel, and a consonant; each signal was divided into an initial, a medial, and a final sound, and LPC-based feature analysis of the samples was carried out. Spectrum analysis of each period showed the spectral features of a consonant in the initial period, of a vowel in the medial period, and of both in the final sound. In addition, when all kinds of words were adaptively divided into three periods using the proposed method, the initial sounds of the same consonant and the medial sounds of the same vowel each had the same spectral characteristics, but the final sound showed different spectral characteristics even when it had the same consonant as the initial sound.
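
The two frame-level measures named in the abstract above can be sketched as follows. This is only an illustration of the textbook definitions of short-term energy and normalized autocorrelation; the frame length, hop size, lag, and toy signal are assumptions, and the sketch does not reproduce the authors' segmentation rule.

```python
import math

def frames(signal, frame_len, hop):
    """Split a list of samples into overlapping frames."""
    return [signal[i:i + frame_len]
            for i in range(0, len(signal) - frame_len + 1, hop)]

def short_term_energy(frame):
    """Sum of squared samples in one frame."""
    return sum(x * x for x in frame)

def normalized_autocorrelation(frame, lag):
    """Autocorrelation at `lag`, divided by the lag-0 value (frame energy)."""
    r0 = sum(x * x for x in frame)
    if r0 == 0.0:
        return 0.0  # silent frame: define the ratio as zero
    r = sum(frame[n] * frame[n + lag] for n in range(len(frame) - lag))
    return r / r0

# Hypothetical toy signal: 400 samples of silence, then a 100 Hz tone at 8 kHz.
fs = 8000
silence = [0.0] * 400
tone = [math.sin(2 * math.pi * 100 * n / fs) for n in range(800)]
signal = silence + tone

frame_list = frames(signal, frame_len=200, hop=100)
energies = [short_term_energy(f) for f in frame_list]
lag_one_period = fs // 100  # 80 samples, one period of the tone
autocorrs = [normalized_autocorrelation(f, lag_one_period) for f in frame_list]
```

Silent frames give zero energy and zero autocorrelation, while periodic (vowel-like) frames give high values of both; a segmentation scheme would look for where these curves rise, peak, and fall.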

Speech-Oriented Multimodal Usage Pattern Analysis for TV Guide Application Scenarios (TV 가이드 영역에서의 음성기반 멀티모달 사용 유형 분석)

  • Kim Ji-Young;Lee Kyong-Nim;Hong Ki-Hyung
    • MALSORI / no.58 / pp.101-117 / 2006
  • The development of efficient multimodal interfaces and fusion algorithms requires knowledge of usage patterns that show how people use multiple modalities. We analyzed multimodal usage patterns for TV-guide application scenarios (or tasks). To collect usage patterns, we implemented a multimodal usage pattern collection system with two input modalities: speech and touch-gesture. Fifty-four subjects participated in our study. Analysis of the collected usage patterns shows a positive correlation between the task type and multimodal usage patterns. In addition, we analyzed the timing between speech utterances and their corresponding touch-gestures, which shows the time interval in which a touch-gesture occurs relative to the duration of the speech utterance. We believe that, to develop efficient multimodal fusion algorithms for an application, a multimodal usage pattern analysis for that application, similar to our work for the TV guide application, has to be done in advance.

A Basic Study on the Development of a Grading Scale of Discourse Competence in Korean Speaking Assessment -Focusing on the Scale of 'REFUSAL' Task (한국어 말하기 평가에서 '담화 능력' 등급 기술을 위한 기초 연구 -'부탁'에 대한 '거절하기' 과제를 중심으로-)

  • Lee, Haeyong;Lee, Hyang
    • Journal of Korean language education / v.29 no.3 / pp.255-292 / 2018
  • Most grading scales of Korean language proficiency tests are based on existing grading scales that are not empirically verified. The purpose of this study is to develop an empirically verified scale descriptor. The 'Performance data-driven approach' suggested by Fulcher (1987) was used to develop a detailed description of the characteristics of each level of performance. This study focuses on the functional phase of speech sample analysis (coding data) to create explanatory categories of discourse skills into which individual observations of speech phenomena can be scored. The speech samples collected in this study demonstrated stages of speech that can serve as the foundation of a grading scale. The data used in the study were collected from 23 native speakers of Korean. Speech samples were recorded from simulated speaking tests using the 'REFUSAL' task and transcribed for analysis. The transcripts were analyzed using discourse analysis. The results showed that the 'REFUSAL' task needs to go through four functional phases in actual communication. Furthermore, this study found specific and detailed explanatory categories of discourse competence based on actual native speakers' speech data. Such findings are expected to contribute to the development of more valid and reliable speaking assessment.

An Analysis of Science-gifted Elementary Students' Perception of Speech and the Relationship between Their Voluntary Speech and Scientific Creativity (초등과학영재학생의 발표에 대한 인식 및 발표의 자발성과 과학창의성의 관계 분석)

  • Kim, Minju;Lim, Chaeseong
    • Journal of Korean Elementary Science Education / v.38 no.3 / pp.331-344 / 2019
  • This study aims to analyse science-gifted elementary students' perception of speech in general school class, school science class, and science-gifted class, and the relationship between their voluntary speech and scientific creativity. For this, 39 fifth-graders in the Science-Gifted Education Center at Seoul Metropolitan Office of Education in Korea were asked about their frequency of voluntary speech in each class situation, the reasons for such behavior, and their general opinions about speech. The researchers also collected the teachers' observations of students' speech in class. To obtain scores for students' scientific creativity, tasks on four different subjects were presented. The students' scientific creativity scores were used for correlation analysis with their frequency of speech. The main findings of this study are as follows: First, science-gifted elementary students tended to be passive in science-gifted class compared to general school and school science class. Second, the main reason for the low frequency of students' speech in school classes is that they do not have many opportunities to make presentations. Third, a survey of students' general thoughts on speech showed that more students wanted to make a speech voluntarily in class than did not. Fourth, the four different scientific creativity tasks had little correlation with one another. Fifth, the correlations between the frequency of voluntary speech and the scores of scientific creativity were mostly low, with significant results only for the plant task. Sixth, the correlations between the frequency of voluntary speech and the two components that make up scientific creativity, originality and usefulness, were also mostly low, but significant results for both were found in the plant task, with originality having a higher correlation than usefulness. Based on these results, this study discusses the meanings and implications of students' voluntary speech for elementary science education and creativity education.

Korean prosodic properties between read and spontaneous speech (한국어 낭독과 자유 발화의 운율적 특성)

  • Yu, Seungmi;Rhee, Seok-Chae
    • Phonetics and Speech Sciences / v.14 no.2 / pp.39-54 / 2022
  • This study aims to clarify the prosodic differences in speech types by examining the Korean read speech and spontaneous speech in the Korean part of the L2 Korean Speech Corpus (speech corpus for Korean as a foreign language). To this end, the articulation length, articulation speed, pause length and frequency, and the average fundamental frequency values of sentences were set as variables and analyzed via statistical methodologies (t-test, correlation analysis, and regression analysis). The results found that read speech and spontaneous speech were structurally different in the form of prosodic phrases constituting each sentence and that the prosodic elements differentiating each speech type were articulation length, pause length, and pause frequency. The statistical results show that the correlation between articulation speed and articulation length was highest in read speech, explaining that the longer a given sentence is, the faster the speaker speaks. In spontaneous speech, however, the relationship between the articulation length and the pause frequency in a sentence was high. Overall, spontaneous speech produces more pauses because short intonation phrases are continuously built to make a sentence, and as a result, the sentence gets lengthened.

Korean speakers hyperarticulate vowels in polite speech

  • Oh, Eunhae;Winter, Bodo;Idemaru, Kaori
    • Phonetics and Speech Sciences / v.13 no.3 / pp.15-20 / 2021
  • In line with recent attention to the multimodal expression of politeness, the present study examined the association between polite speech and acoustic features through the analysis of vowels produced in casual and polite speech contexts in Korean. Fourteen adult native speakers of Seoul Korean produced the utterances in two social conditions to elicit polite (professor) and casual (friend) speech. Vowel duration and the first (F1) and second formants (F2) of seven sentence- and phrase-initial monophthongs were measured. The results showed that polite speech shares acoustic similarities with vowel production in clear speech: speakers showed greater vowel space expansion in polite than casual speech in an effort to enhance perceptual intelligibility. Especially, female speakers hyperarticulated (front) vowels for polite speech, independent of speech rate. The implications for the acoustic encoding of social stance in polite speech are further discussed.

A Study on the Endpoint Detection by FIR Filtering (FIR filtering에 의한 끝점추출에 관한 연구)

  • Lee, Chang-Young
    • Speech Sciences / v.5 no.1 / pp.81-88 / 1999
  • This paper provides a method for speech detection. After first-order FIR filtering of the speech signals, we applied the conventional method of endpoint detection, which uses energy as the criterion for separating signals from background noise. With FIR filtering, only the Fourier components with large values of [amplitude × frequency] become significant in the energy profile. By applying this procedure to the 445-word database constructed from ETRI, we confirmed that low-amplitude noise and/or low-frequency noise are clearly separated from the speech signals, thereby enhancing the feasibility of ideal endpoint detection.
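
The procedure described in the abstract above can be sketched as follows. The abstract does not give the filter coefficients or the threshold rule, so this sketch assumes the first difference y[n] = x[n] - x[n-1] as the first-order FIR filter (its gain grows roughly in proportion to frequency at low frequencies, which matches the [amplitude × frequency] remark) and a hypothetical threshold of half the peak frame energy. The test signal is synthetic, not the ETRI database.

```python
import math

def first_difference(signal):
    """First-order FIR filter y[n] = x[n] - x[n-1] (assumed coefficients)."""
    return [signal[0]] + [signal[n] - signal[n - 1]
                          for n in range(1, len(signal))]

def frame_energies(signal, frame_len):
    """Energy (sum of squares) of consecutive non-overlapping frames."""
    return [sum(x * x for x in signal[i:i + frame_len])
            for i in range(0, len(signal) - frame_len + 1, frame_len)]

def detect_endpoints(energies, threshold):
    """Return (first, last) indices of frames above threshold, or None."""
    active = [i for i, e in enumerate(energies) if e > threshold]
    return (active[0], active[-1]) if active else None

# Synthetic input: a 50 Hz hum throughout, plus a 1 kHz burst standing in
# for speech during samples 800..1599 (frames 4..7 at 200 samples per frame).
fs = 8000
n_total = 2400
signal = []
for n in range(n_total):
    hum = 0.5 * math.sin(2 * math.pi * 50 * n / fs)
    burst = 0.5 * math.sin(2 * math.pi * 1000 * n / fs) if 800 <= n < 1600 else 0.0
    signal.append(hum + burst)

frame_len = 200
raw = frame_energies(signal, frame_len)
filtered = frame_energies(first_difference(signal), frame_len)
threshold = 0.5 * max(filtered)  # hypothetical relative threshold
endpoints = detect_endpoints(filtered, threshold)
```

In the raw energy profile the hum-only frames carry about half the energy of the burst frames, so an energy threshold is fragile; after the first difference the hum energy drops by roughly three orders of magnitude and the burst frames stand out clearly.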

A Corpus-based Lexical Analysis of the Speech Texts: A Collocational Approach

  • Kim, Nahk-Bohk
    • English Language & Literature Teaching / v.15 no.3 / pp.151-170 / 2009
  • Recently, speech texts have been increasingly used for English education because of their various advantages as language teaching and learning materials. The purpose of this paper is to analyze speech texts using a corpus-based lexical approach, to suggest some productive methods that use English speaking or writing as the main resource for the course, and to introduce actual classroom adaptations. First, this study shows that a speech corpus has some unique features, such as different selections of pronouns, nouns, and lexical chunks in comparison to a general corpus. Next, from a collocational perspective, the study demonstrates that the speech corpus consists of a wide variety of collocations and lexical chunks which a number of linguists describe (Lewis, 1997; McCarthy, 1990; Willis, 1990). In other words, the speech corpus suggests not only that speech texts have considerable lexical potential that could be exploited to facilitate chunk-learning, but also that learners are not very likely to unlock this potential autonomously. Based on this result, teachers can develop a learners' corpus and use it by chunking the speech text. This new approach of adapting speech samples as important materials for college students' speaking or writing ability should be implemented as shown in samplers. Finally, to foster learners' productive skills more communicatively, a few practical suggestions are made, such as chunking and windowing chunks of speech and presentation, and the pedagogical implications are discussed.

Speech Evaluation Based on the Type of Cleft Palate (구개열의 유형에 따른 발음 비교)

  • Kim, Seok-Kwun;Kim, Min-Su;Heo, Jung;Kwon, Yong-Seok;Lee, Keun-Cheol;Jeong, Boon-Seon;Lee, Min Hyuk
    • Archives of Craniofacial Surgery / v.9 no.2 / pp.72-76 / 2008
  • Purpose: The authors evaluated the results of palatoplasty by speech analysis in bilateral, unilateral complete, and unilateral incomplete and submucous cleft palate patients. Methods: Speech outcomes were studied in 15 bilateral, 28 unilateral complete, and 46 unilateral incomplete and submucous cleft palate patients who underwent push-back palatoplasties from January 1998 to July 2004. The patients were divided into two age groups, 3 to 6 years old and 7 to 10 years old, and compared with 20 normal children (the control group was divided into 10 children for each age group). A nasal emission test, a hypernasality test, and an articulation test were performed using a speech evaluation table composed of 39 different words. Results: In all speech evaluation tests, the bilateral cleft palate group had the worst scores. The 7 to 10-year-old groups scored better than the younger groups with the same type of cleft palate. Conclusion: Bilateral cleft palate patients have many more speech problems than other patients. In cleft palate patients, speech problems improved with age postoperatively, and speech therapy can improve the operative outcomes.