• Title/Abstract/Keyword: formant frequencies

Search results: 75 items

모의 지능로봇에서 음성신호에 의한 감정인식 (Speech Emotion Recognition by Speech Signals on a Simulated Intelligent Robot)

  • 장광동;권오욱
    • 대한음성학회:학술대회논문집 / 대한음성학회 2005년도 추계 학술대회 발표논문집 / pp. 163-166 / 2005
  • We propose a speech emotion recognition method for a natural human-robot interface. In the proposed method, emotion is classified into six classes: angry, bored, happy, neutral, sad, and surprised. Features for an input utterance are extracted from statistics of phonetic and prosodic information. Phonetic information includes log energy, shimmer, formant frequencies, and Teager energy; prosodic information includes pitch, jitter, duration, and rate of speech. Finally, a pattern classifier based on Gaussian support vector machines decides the emotion class of the utterance. We recorded speech commands and dialogs uttered 2 m away from the microphones in 5 different directions. Experimental results show that the proposed method yields 59% classification accuracy while human classifiers give about 50% accuracy, which confirms that the proposed method achieves performance comparable to a human.

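A minimal sketch of the classification stage described in the entry above, assuming utterance-level statistics of the phonetic and prosodic features have already been extracted elsewhere; the Gaussian (RBF) kernel SVM comes from scikit-learn, and the file names, array shapes, and label encoding are illustrative assumptions, not the paper's setup.

```python
# Hedged sketch: Gaussian-kernel SVM over utterance-level prosodic/phonetic
# feature statistics. Feature extraction is assumed to have been done
# elsewhere; the .npy files below are hypothetical placeholders.
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

EMOTIONS = ["angry", "bored", "happy", "neutral", "sad", "surprised"]

# X: one row per utterance; columns = statistics of pitch, jitter, shimmer,
# log energy, formant frequencies, Teager energy, duration, speech rate, ...
X = np.load("utterance_features.npy")   # shape (n_utterances, n_features)
y = np.load("utterance_labels.npy")     # integer indices into EMOTIONS

# RBF ("Gaussian") kernel SVM with feature standardization.
clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=10.0, gamma="scale"))
scores = cross_val_score(clf, X, y, cv=5)
print(f"mean cross-validated accuracy: {scores.mean():.2%}")
```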

음향 실험을 기초로 한 몽골어와 한국어의 단모음 대조분석 (Contrastive Analysis of Mongolian and Korean Monophthongs Based on Acoustic Experiment)

  • 이중진
    • 말소리와 음성과학 / Vol. 2, No. 2 / pp. 3-16 / 2010
  • This study aims at setting the hierarchy of difficulty of the 7 Korean monophthongs for Mongolian learners of Korean according to Prator's theory based on the Contrastive Analysis Hypothesis. In addition, it will be shown that the difficulties and errors of Mongolian learners of Korean as a second or foreign language proceed directly from this hierarchy of difficulty. This study began by examining the speech of 60 Mongolians producing the Mongolian monophthongs; the data were analyzed for the formant frequencies F1 and F2 of each vowel. Then, the 7 Korean monophthongs were compared with the resultant Mongolian formant values and assigned to one of 3 levels: 'same', 'similar', or 'different sound'. The findings in assessing the differences of the 8 nearest equivalents of Korean and Mongolian vowels are as follows: First, Korean /a/ and /ʌ/ turned out to be a 'same sound' as their counterparts, Mongolian /a/ and /ɔ/. Second, Korean /i/, /e/, /o/, /u/ turned out to be a 'similar sound' to their respective Mongolian counterparts /i/, /e/, /o/, /u/. Third, Korean /ɨ/, which is nearest to Mongolian /i/ in terms of phonetic features, seriously differs from it and is thus assigned to 'different sound'. And lastly, Mongolian /ʊ/ turned out to be a 'different sound' from its nearest counterpart, Korean /u/. Based on these findings the hierarchy of difficulty was constructed. Firstly, the 4 Korean monophthongs /a/, /ʌ/, /i/, /e/ would be Level 0 (Transfer); they would be transferred positively from their Mongolian counterparts when Mongolians learn Korean. Secondly, Korean /o/, /u/ would be Level 5 (Split); they would require the Mongolian learner to make a new distinction and cause interference in learning the Korean language, because Mongolian /o/ and /u/ each have 2 similar counterpart sounds, Korean /o, u/ and /u, o/ respectively. Thirdly, Korean /ɨ/, which is not in the Mongolian vowel system, will be Level 4 (Overdifferentiation); the new vowel /ɨ/, which bears little similarity to Mongolian /i/, must be learned entirely anew and will cause much difficulty for Mongolian learners in speaking and writing Korean. And lastly, Mongolian /ʊ/ will be Level 2 (Underdifferentiation); it is absent in the Korean language and doesn't cause interference in learning Korean as long as Mongolian learners avoid using it.

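The comparison in the entry above rests on measuring mean F1/F2 per vowel and judging how close the nearest Korean-Mongolian pairs are. The sketch below illustrates that kind of comparison; the Euclidean distance in the F1-F2 plane, the threshold values, and the formant numbers are invented for illustration and are not the study's actual criteria or measurements.

```python
# Hedged sketch: compare mean F1/F2 of paired vowels and bin the distance
# into 'same' / 'similar' / 'different'. Thresholds and values are placeholders.
import math

# Mean (F1, F2) in Hz per vowel; the numbers below are placeholders.
mongolian = {"a": (800, 1300), "i": (300, 2300)}
korean    = {"a": (810, 1320), "ɨ": (380, 1500)}

def similarity(v1, v2, same_th=60.0, similar_th=200.0):
    """Classify a vowel pair by Euclidean distance in the F1-F2 plane."""
    d = math.dist(v1, v2)
    if d <= same_th:
        return "same sound"
    if d <= similar_th:
        return "similar sound"
    return "different sound"

print(similarity(mongolian["a"], korean["a"]))   # -> same sound
print(similarity(mongolian["i"], korean["ɨ"]))   # -> different sound
```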

영어 약모음 /ə/ 교수에 있어서 명시적 Form-Focused Instruction의 효과 연구 (The Effectiveness of Explicit Form-Focused Instruction in Teaching the Schwa /ə/)

  • 이윤현
    • 한국콘텐츠학회논문지 / Vol. 20, No. 8 / pp. 101-113 / 2020
  • This study investigated how effective explicit form-focused instruction (FFI) is for teaching the English weak vowel /ə/ to EFL students in a classroom setting. Twenty-five high school girls participated, 13 in the experimental group and 12 in the control group. In addition, one American woman provided speech data as a reference point for comparison. For a month and a half, the participants in the experimental group repeated after the researcher's pronunciation and the pronunciation of an Internet text-to-speech program, and received individual feedback. Before and after the treatment, the participants read 14 test words of two or more syllables and sentences containing those words, and the readings were recorded as speech data. Paired-sample t-tests and the nonparametric Wilcoxon signed-rank test were used for data analysis. According to the results, the participants in the experimental group articulated the English weak vowel about 40% shorter in the post-test than in the pre-test. However, for the F1/F2 formants, which indicate tongue position in the vowel articulation space, the participants' F1/F2 distributions differed from this study's reference point of 539 Hz (F1) × 1797 Hz (F2). The results of this study showed that explicit form-focused instruction (FFI) providing repeated imitation and appropriate feedback is partly effective in teaching English pronunciation.
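
A minimal sketch of the pre/post comparison described in the entry above, using scipy's paired t-test and Wilcoxon signed-rank test; the per-word duration values are placeholders and do not reproduce the study's measurements.

```python
# Hedged sketch: paired-sample t-test and Wilcoxon signed-rank test on
# pre- vs. post-treatment schwa durations (one value per test word).
# The duration values below are placeholders, not the study's data.
import numpy as np
from scipy import stats

pre_ms  = np.array([112, 98, 120, 105, 110, 95, 118, 102, 99, 115, 108, 111, 97, 104], float)
post_ms = np.array([ 70, 64,  75,  66,  71, 60,  74,  68, 63,  72,  69,  70, 61,  67], float)

t_stat, t_p = stats.ttest_rel(pre_ms, post_ms)   # parametric, paired
w_stat, w_p = stats.wilcoxon(pre_ms, post_ms)    # nonparametric counterpart

shortening = 1.0 - post_ms.mean() / pre_ms.mean()
print(f"mean shortening: {shortening:.0%}, t-test p={t_p:.4f}, Wilcoxon p={w_p:.4f}")
```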

표준어와 경상 지역 방언의 한국어 모음 발음에 따른 영어 모음 발음의 영향에 대한 연구 (Influence of standard Korean and Gyeongsang regional dialect on the pronunciation of English vowels)

  • 장수연
    • 말소리와 음성과학 / Vol. 13, No. 4 / pp. 1-7 / 2021
  • The purpose of this paper is to study the influence of the Korean vowel pronunciation of standard Korean and the Gyeongsang regional dialect on the pronunciation of English vowels. The data were taken from the Korean-Spoken English Corpus (K-SEC). Words containing the seven Korean monophthongs and ten English monophthongs were selected and analyzed. The selected material was produced by adult male speakers of standard Korean and of the Gyeongsang regional dialect with no experience living abroad. The formant frequencies of the recorded corpus data were measured from spectrograms provided by the speech analysis program Praat, and the recordings were plotted and analyzed in formant vowel-space charts. According to the results, the Gyeongsang dialect speakers showed strong backness in producing both Korean and English vowels, whereas the standard Korean speakers showed comparatively strong frontness. In addition, the difference between standard Korean and the Gyeongsang dialect in pronouncing the Korean vowels /으/ and /어/ affected the articulation of the corresponding English vowels /ə/ and /ʊ/. A general characteristic of Korean speakers' vowel pronunciation, regardless of regional dialect, is that their articulation space is narrower than that of native English speakers. Accordingly, Koreans generally have difficulty distinguishing tense and lax vowels, whereas native English speakers articulate the vowels with a clear distinction.
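
Formant measurement of the kind described above (reading F1/F2 off Praat spectrograms) can also be scripted. The sketch below uses praat-parselmouth, a Python interface to Praat, rather than the Praat GUI used in the study; the file name, formant ceiling, and measurement time are assumptions for illustration.

```python
# Hedged sketch: measure F1/F2 at the temporal midpoint of a vowel token
# with parselmouth (Python interface to Praat). The WAV path is a placeholder.
import parselmouth

sound = parselmouth.Sound("vowel_token.wav")            # hypothetical recording of one vowel
formant = sound.to_formant_burg(maximum_formant=5000)   # Burg method, male-voice ceiling

t_mid = sound.duration / 2.0
f1 = formant.get_value_at_time(1, t_mid)
f2 = formant.get_value_at_time(2, t_mid)
print(f"F1 = {f1:.0f} Hz, F2 = {f2:.0f} Hz at t = {t_mid:.3f} s")
```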

Relationship between roar sound characteristics and body size of Steller sea lion

  • Park, Tae-Geon;Iida, Kohji;Mukai, Tohru
    • 수산해양기술연구 / Vol. 46, No. 4 / pp. 458-465 / 2010
  • Hundreds of Steller sea lions, Eumetopias jubatus, migrate from Sakhalin and the northern Kuril Islands to Hokkaido every winter. During this migration, they may use their roaring sounds to navigate and to maintain their groups. We recorded the roars of wild Steller sea lions that had landed on reefs on the west coast of Hokkaido, and those of captive sea lions, while making video recordings. A total of 300 roars of wild sea lions and 870 roars of captive sea lions were sampled. The fundamental frequency (F0), formant frequency (F1), pulse repetition rate (PRR), and duration of syllables (T) were analyzed using a sonagraph. F0, F1, and PRR of the roars emitted by captive sea lions increased in the order male, female, and juvenile. By contrast, the F1 of wild males was lower than that of females, while the F0 and PRR of wild males and females did not differ statistically. Moreover, the F0 and F1 frequencies for captive sea lions were higher than those of wild sea lions, while PRR in captive sea lions was lower than in wild sea lions. Since there was a linear relationship between body length and the F0 and F1 frequencies in captive sea lions, the body length distribution of wild sea lions could be estimated from the F0 and F1 frequency distribution using a regression equation. These results roughly agree with the body length distribution derived from photographic geometry. As the volume of the oral cavity and the length of the vocal cords are generally proportional to body length, sampled roars can provide useful information about a population, such as the body length distribution and sex ratio.
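
The body-length estimate mentioned above relies on a linear regression fitted on captive animals and then applied to wild animals. A hedged sketch of that idea follows; the F0 and body-length values are made-up calibration examples, not the study's measurements.

```python
# Hedged sketch: fit body length vs. F0 on captive animals, then predict
# body lengths of wild animals from their measured F0. All numbers are
# illustrative placeholders.
import numpy as np

# Calibration data from captive sea lions: F0 in Hz, body length in cm.
f0_captive  = np.array([150.0, 170.0, 210.0, 260.0, 300.0])
len_captive = np.array([290.0, 260.0, 220.0, 180.0, 150.0])

# Least-squares line: length = a * F0 + b
a, b = np.polyfit(f0_captive, len_captive, deg=1)

# Apply the regression to wild-animal F0 measurements.
f0_wild = np.array([140.0, 180.0, 230.0])
estimated_length = a * f0_wild + b
print(np.round(estimated_length, 1))
```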

The interlanguage Speech Intelligibility Benefit for Korean Learners of English: Production of English Front Vowels

  • Han, Jeong-Im;Choi, Tae-Hwan;Lim, In-Jae;Lee, Joo-Kyeong
    • 말소리와 음성과학 / Vol. 3, No. 2 / pp. 53-61 / 2011
  • The present work is a follow-up study to that of Han, Choi, Lim and Lee (2011), where an asymmetry was found in the source segments eliciting the interlanguage speech intelligibility benefit (ISIB): the vowels which did not match any vowel of the Korean language were likely to elicit more ISIB than matched vowels. In order to identify the source of the stronger ISIB in non-matched vowels, acoustic analyses of the stimuli were performed. Two pairs of English front vowels, [i] vs. [ɪ] and [ɛ] vs. [æ], were recorded by native English talkers and by two groups of Korean learners divided according to their English proficiency, and then vowel duration and the frequencies of the first two formants (F1, F2) were measured. The results demonstrated that the non-matched vowels [ɪ] and [æ] produced by Korean talkers showed acoustic characteristics that deviated more from those of the natives, with longer duration and with formant values closer to the matched vowels [i] and [ɛ] than those of the English natives. Combining the results of the acoustic measurements in the present study and those of word identification in Han et al. (2011), we suggest that the relatively better performance in word identification by Korean talkers/listeners than by native English talkers/listeners is associated with the shared interlanguage of Korean talkers and listeners.

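The acoustic comparison in the entry above amounts to asking whether a learner's [ɪ] token sits closer, in F1-F2 space, to native [ɪ] or to the matched vowel [i]. The sketch below illustrates that check; every formant value in it is invented for illustration.

```python
# Hedged sketch: is a learner's [ɪ] token closer in F1-F2 space to native
# [ɪ] or to the neighbouring matched vowel [i]? Values are placeholders.
import math

native_means = {"i": (300.0, 2400.0), "ɪ": (430.0, 2000.0)}   # (F1, F2) in Hz
learner_token = (340.0, 2300.0)                               # one learner [ɪ] token

nearest = min(native_means, key=lambda v: math.dist(native_means[v], learner_token))
print("learner [ɪ] is nearest to native", nearest)            # -> i (merged toward [i])
```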

A Study on Correcting Korean Pronunciation Error of Foreign Learners by Using Supporting Vector Machine Algorithm

  • Jang, Kyungnam;You, Kwang-Bock;Park, Hyungwoo
    • International Journal of Advanced Culture Technology / Vol. 8, No. 3 / pp. 316-324 / 2020
  • People learning a foreign language experience how difficult it is to pronounce a new language that differs from their native language. The goal of the many foreigners who want to learn Korean is to speak Korean as well as their native language so that they can communicate smoothly. However, the vocal habits of each learner's native language also appear in their Korean pronunciation, which prevents accurate transmission of information. In this paper, the pronunciation of Chinese learners was compared with that of Koreans. For the comparison, the fundamental frequency of the speech signal and its variation were examined and the spectrogram was analyzed. The formant frequencies, known as the resonant frequencies of the vocal tract, were calculated. Based on these characteristic parameters, a Support Vector Machine classifier was found to distinguish the pronunciation of Koreans from that of Chinese learners. In particular, the linguistic proposition that Chinese speakers have difficulty pronouncing the Korean /ㄹ/ was examined and scientifically supported.
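
A hedged sketch of one conventional way to obtain the parameters named above (fundamental frequency and formant frequencies) from a recording: pYIN for F0 and LPC root-solving for rough formant estimates, via librosa. The file name, LPC order, and frequency bounds are assumptions, not the paper's settings.

```python
# Hedged sketch: estimate F0 with pYIN and approximate formant frequencies
# from LPC roots. File name, LPC order, and bounds are illustrative.
import numpy as np
import librosa

y, sr = librosa.load("learner_vowel.wav", sr=16000)   # hypothetical recording

# Fundamental frequency track (voiced frames only).
f0, voiced, _ = librosa.pyin(y, fmin=60, fmax=400, sr=sr)
print("median F0 (Hz):", np.nanmedian(f0[voiced]))

# Rough formant estimates: LPC polynomial roots with positive imaginary
# part correspond to vocal-tract resonances.
a = librosa.lpc(y, order=12)
roots = [r for r in np.roots(a) if np.imag(r) > 0]
freqs = sorted(np.angle(roots) * sr / (2 * np.pi))
print("formant candidates (Hz):", [round(f) for f in freqs[:3]])
```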

Korean and English affricates in bilingual children

  • Yu, Hye Jeong
    • 말소리와 음성과학 / Vol. 9, No. 3 / pp. 1-6 / 2017
  • This study examined how early bilingual children produce sounds in their two languages articulated with the same manner of articulation but at different places of articulation. English affricates are palato-alveolar and Korean affricates are alveolar. This study analyzed the frequencies of center of gravity (COG), spectral peak (SP), and the second formant (F2) of word-initial affricates in English and Korean produced by twenty-four early Korean-English bilingual children (aged 4 to 7), and compared them with those of monolingual counterparts in the two languages. If early Korean-English bilingual children produce palato-alveolar affricates in English and alveolar affricates in Korean, they may produce Korean affricates with higher COGs, SPs, and F2s than English affricates. The early Korean-English bilingual children at the age of 4 produced English and Korean affricates with similar COGs, SPs, and F2s, and the COGs, SPs, and F2s of their Korean affricates were similar to those of the Korean monolingual counterparts. However, the early bilingual children at the age of 5 to 7 had lower COGs and SPs for English affricates with higher F2s compared to Korean affricates, and the COGs, SPs, and F2s of their English affricates were similar to those of the English monolingual counterparts.
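
The spectral measures named above (center of gravity and spectral peak) have standard definitions over the magnitude spectrum; a minimal sketch under those standard definitions follows. The input file and analysis window are placeholders, and the weighting may differ from the study's (Praat-style) measurement settings.

```python
# Hedged sketch: spectral center of gravity (amplitude-weighted mean
# frequency) and spectral peak of an affricate frication window.
# The input file is a hypothetical mono clip.
import numpy as np
from scipy.io import wavfile

sr, x = wavfile.read("affricate_frication.wav")
x = x.astype(float)

spectrum = np.abs(np.fft.rfft(x * np.hanning(len(x))))
freqs = np.fft.rfftfreq(len(x), d=1.0 / sr)

cog = np.sum(freqs * spectrum) / np.sum(spectrum)   # center of gravity
peak = freqs[np.argmax(spectrum)]                   # spectral peak
print(f"COG = {cog:.0f} Hz, spectral peak = {peak:.0f} Hz")
```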

음성합성시스템을 위한 음색제어규칙 연구 (A Study on Voice Color Control Rules for Speech Synthesis System)

  • 김진영;엄기완
    • 음성과학 / Vol. 2 / pp. 25-44 / 1997
  • Listening to the various speech synthesis systems developed and used in our country, we find that, although their quality has improved, they lack naturalness. Moreover, since the voice color of these systems is limited to that of a single recorded speech DB, another speech DB must be recorded to create a different voice color. 'Voice color' is an abstract concept that characterizes voice personality, so speech synthesis systems need a voice color control function to create various voices. The aim of this study is to examine several factors of voice color control rules for a text-to-speech system so that it can produce natural and varied voice types in synthetic speech. In order to derive such rules from natural speech, glottal source parameters and the frequency characteristics of the vocal tract were studied for several voice colors. In this paper, voice colors were catalogued as deep, sonorous, thick, soft, harsh, high-tone, shrill, and weak. The LF model was used as the voice source model, and the formant frequencies, bandwidths, and amplitudes were used for the frequency characteristics of the vocal tract. These acoustic parameters were tested through multiple regression analysis to obtain the general relation between the parameters and voice colors.

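The multiple-regression step described above relates a perceived voice color to acoustic parameters such as formant frequencies, bandwidths, amplitudes, and glottal-source parameters. A hedged sketch of that kind of fit is below; the parameter columns, their values, and the rating scale are assumptions for illustration.

```python
# Hedged sketch: ordinary least-squares multiple regression of a voice-color
# rating (e.g., perceived "sonorous" score) on acoustic parameters.
# All columns and values are illustrative placeholders.
import numpy as np
from sklearn.linear_model import LinearRegression

# Columns (placeholders): F1, F2, B1 (bandwidth), A1 (amplitude), LF open quotient.
X = np.array([
    [620, 1180, 80, -6.0, 0.55],
    [540, 1100, 70, -4.5, 0.62],
    [700, 1300, 95, -8.0, 0.48],
    [580, 1150, 75, -5.0, 0.60],
    [650, 1250, 85, -7.0, 0.52],
], dtype=float)
ratings = np.array([3.1, 4.2, 2.0, 3.9, 2.6])   # perceived score per voice (placeholder)

model = LinearRegression().fit(X, ratings)
print("coefficients:", np.round(model.coef_, 3))
print("R^2:", round(model.score(X, ratings), 3))
```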

신경망을 이용한 모음의 학습 및 인식 방법 (A Method of Learning and Recognition of Vowels by Using Neural Network)

  • 심재형;이종혁;윤태훈;김재창;이양성
    • 대한전자공학회논문지 / Vol. 27, No. 11 / pp. 144-151 / 1990
  • In this study, we supplement the input-pattern scheme that Ohotomo et al. used to train a backpropagation (BP) neural network for vowel learning and recognition: side values that take the bandwidths of the formant frequencies into account are added to the training input patterns in order to improve convergence speed and recognition rate. Computer simulation showed that the proposed method reduced the misrecognition rate by about 30% and increased the convergence speed by about 7%.

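A hedged sketch of the kind of network described in the entry above: a small backpropagation-trained MLP whose input pattern combines formant frequencies with bandwidth-derived side values. scikit-learn's MLPClassifier stands in for the original BP implementation, and the file names and feature layout are placeholders, not the paper's data.

```python
# Hedged sketch: backpropagation MLP for vowel recognition whose input
# pattern includes formant frequencies plus their bandwidths.
# MLPClassifier stands in for the original BP network; data are placeholders.
import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline

# Each row: [F1, F2, F3, B1, B2, B3] for one vowel token (hypothetical file).
X = np.load("vowel_formants_bandwidths.npy")   # shape (n_tokens, 6)
y = np.load("vowel_labels.npy")                # vowel class labels

clf = make_pipeline(
    StandardScaler(),
    MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000, random_state=0),
)
clf.fit(X, y)
print("training accuracy:", round(clf.score(X, y), 3))
```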