• Title/Summary/Keyword: 운율 특성 (prosodic characteristics)

Search results: 59

Automatic Recognition of Pitch Accent Using Distributed Time-Delay Recursive Neural Network (분산 시간지연 회귀신경망을 이용한 피치 악센트 자동 인식)

  • Kim, Sung-Suk
    • The Journal of the Acoustical Society of Korea / v.25 no.6 / pp.277-281 / 2006
  • This paper presents a method for the automatic recognition of pitch accents over syllables. The method we propose is based on the time-delay recursive neural network (TDRNN), a neural network classifier with two different representations of dynamic context: the delayed input nodes represent an explicit trajectory F0(t) over time, while the recursive nodes provide long-term context information that reflects the characteristics of pitch accentuation in spoken English. We apply the TDRNN to pitch accent recognition in two forms: in the normal TDRNN, all of the prosodic features (pitch, energy, duration) are used as one set in a single TDRNN, while in the distributed TDRNN, the network consists of several TDRNNs, each taking a single prosodic feature as input. The final output of the distributed TDRNN is a weighted sum of the outputs of the individual TDRNNs. We used the Boston Radio News Corpus (BRNC) for the experiments on speaker-independent pitch accent recognition. The experimental results show that the distributed TDRNN exhibits an average recognition accuracy of 83.64% over both pitch events and non-events.
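
The distributed scheme described in this abstract trains one network per prosodic feature and takes a weighted sum of their outputs. The sketch below illustrates only that combination step with random stand-in score matrices and assumed weights; it does not reproduce the TDRNN architecture or the weights used in the paper.

```python
import numpy as np

# Hypothetical per-syllable score matrices produced by three separate
# per-feature classifiers (stand-ins for the pitch, energy and duration TDRNNs).
# Shape: (n_syllables, n_classes) with classes = [pitch event, non-event].
rng = np.random.default_rng(0)
scores = {
    "pitch":    rng.random((5, 2)),
    "energy":   rng.random((5, 2)),
    "duration": rng.random((5, 2)),
}

# Combination weights are assumed values, not those estimated in the paper.
weights = {"pitch": 0.5, "energy": 0.3, "duration": 0.2}

def combine(scores, weights):
    """Weighted sum of the per-feature classifier outputs (distributed scheme)."""
    total = sum(weights.values())
    combined = sum(w / total * scores[name] for name, w in weights.items())
    return combined.argmax(axis=1)  # 0 = pitch event, 1 = non-event

print(combine(scores, weights))
```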

A Study on Implementation of Emotional Speech Synthesis System using Variable Prosody Model (가변 운율 모델링을 이용한 고음질 감정 음성합성기 구현에 관한 연구)

  • Min, So-Yeon;Na, Deok-Su
    • Journal of the Korea Academia-Industrial cooperation Society / v.14 no.8 / pp.3992-3998 / 2013
  • This paper concerns a method for adding an emotional speech corpus to a high-quality, large-corpus-based speech synthesizer and generating varied synthesized speech. We built the emotional speech corpus in a form that can be used by a waveform-concatenation speech synthesizer, and implemented a synthesizer that can generate varied emotional speech through the same unit-selection process as a normal speech synthesizer. A markup language is used for the emotional input text. Emotional speech is generated when the input text can be matched to an intonation phrase of the same length in the emotional speech corpus; otherwise, normal speech is generated. The break indices (BIs) of emotional speech are more irregular than those of normal speech, which makes it difficult to use the BIs generated by the synthesizer as they are. To solve this problem, we applied Variable Break [3] modeling. A Japanese speech synthesizer was used for the experiments. As a result, we obtained natural emotional synthesized speech using the break-prediction module of the normal speech synthesizer.
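
The fallback logic described in the abstract (use the emotional corpus only when the input text can be matched to an intonation phrase of the same length, otherwise synthesize normally) can be summarized as below. The corpus index, the syllable counting, and the return values are hypothetical stand-ins, not the paper's unit-selection or variable-break implementation.

```python
# Hypothetical emotional-corpus index keyed by intonation-phrase length in syllables.
# The real synthesizer matches against recorded intonation phrases; this is a toy stand-in.
emotional_corpus = {
    6: ["정말 고마워요"],
    9: ["오늘은 기분이 좋아요"],
}

def synthesize_phrase(phrase: str, emotion: str = "") -> str:
    """Use emotional units only when an equally long intonation phrase exists in the
    emotional corpus; otherwise fall back to the normal corpus, as the abstract describes."""
    length = len(phrase.replace(" ", ""))  # crude syllable count for Hangul text
    if emotion and length in emotional_corpus:
        return f"[emotional:{emotion}] {phrase}"
    return f"[normal] {phrase}"

print(synthesize_phrase("정말 고마워요", "happy"))             # length matches -> emotional units
print(synthesize_phrase("회의는 세 시에 시작합니다", "happy"))   # no match -> normal units
```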

Speech Animation Synthesis based on a Korean Co-articulation Model (한국어 동시조음 모델에 기반한 스피치 애니메이션 생성)

  • Jang, Minjung;Jung, Sunjin;Noh, Junyong
    • Journal of the Korea Computer Graphics Society / v.26 no.3 / pp.49-59 / 2020
  • In this paper, we propose a speech animation synthesis method specialized for Korean through a rule-based co-articulation model. Speech animation has been widely used in cultural industries such as movies, animations, and games that require natural and realistic motion. However, because techniques for audio-driven speech animation have mainly been developed for English, the animation results for domestic content are often visually very unnatural. For example, the dubbing of a voice actor is played with no mouth motion at all, or at best with an unsynchronized looping of simple mouth shapes. Although language-independent speech animation models exist, they are not specialized for Korean and have yet to reach the quality required for domestic content production. Therefore, we propose a natural speech animation synthesis method that reflects the linguistic characteristics of Korean, driven by input audio and text. Reflecting the fact that vowels largely determine the mouth shape in Korean, a co-articulation model that separates the lips and the tongue has been defined to solve the previous problems of lip distortion and the occasional omission of phoneme characteristics. Our model also reflects differences in prosodic features for improved dynamics in speech animation. Through user studies, we verify that the proposed model can synthesize natural speech animation.
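
As a rough illustration of the separation the abstract describes (vowels driving the lip shape, a separate channel for the tongue, with smoothing between neighbouring phonemes), the sketch below blends per-channel targets frame by frame. The viseme tables, the romanized phoneme labels, and the smoothing factor are all made up for illustration and are not the paper's model.

```python
# Hypothetical viseme tables: vowels drive the lip channel, consonants the tongue channel.
LIP_SHAPE = {"a": 0.9, "o": 0.6, "u": 0.4, "i": 0.2, "e": 0.3}   # mouth-opening weight
TONGUE_POS = {"t": 0.8, "n": 0.7, "k": 0.2, "m": 0.0, "s": 0.6}  # tongue-raising weight

def coarticulate(phonemes, alpha=0.3):
    """Blend each frame's lip/tongue targets with the previous frame (simple co-articulation).
    alpha is an assumed smoothing factor, not taken from the paper."""
    lips, tongue = 0.0, 0.0
    frames = []
    for p in phonemes:
        lip_target = LIP_SHAPE.get(p, lips)        # vowels update the lip target
        tongue_target = TONGUE_POS.get(p, tongue)  # consonants update the tongue target
        lips = (1 - alpha) * lip_target + alpha * lips
        tongue = (1 - alpha) * tongue_target + alpha * tongue
        frames.append((p, round(lips, 2), round(tongue, 2)))
    return frames

# Romanized stand-in for a Korean phoneme sequence.
for frame in coarticulate(["a", "n", "o", "k", "i"]):
    print(frame)
```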

Prosodic Properties in the Speech of Adults with Cerebral Palsy (뇌성마비 성인 발화의 운율특성)

  • Lee, Sook-Hyang;Ko, Hyun-Ju;Kim, Soo-Jin
    • MALSORI / no.64 / pp.39-51 / 2007
  • The purpose of this study is to investigate prosodic characteristics in the speech of adults with cerebral palsy through a comparison with the speech of normal speakers. Ten speakers with cerebral palsy (6 males, 4 females) and 6 normal speakers (3 males, 3 females) served as subjects. The results revealed that, compared to normal speakers, speakers with cerebral palsy showed a slower speech rate, a larger number of intonational phrases (IPs) and pauses, a larger number of accentual phrases (APs) per IP, a longer duration of pauses, and more gradual [L+H] slopes in APs. However, the two groups showed similar tone patterns in their APs. The results also showed mild to moderate correlations between speech intelligibility and the prosodic properties that differed significantly between the two groups, suggesting that these could be important prosodic factors for predicting speech intelligibility in the speech of adults with cerebral palsy.
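
The measures compared in this study (speech rate, pauses, IP/AP counts) and their correlation with intelligibility can be computed from time-aligned labels. The sketch below runs a Pearson correlation on made-up per-speaker values; the numbers are illustrative only, not the study's data.

```python
from scipy.stats import pearsonr

# Made-up per-speaker records: (syllable count, total duration in s,
# total pause duration in s, intelligibility score in %).
records = [
    (42, 14.0, 3.5, 62.0),
    (42, 11.5, 2.1, 78.0),
    (42, 10.2, 1.4, 85.0),
    (42,  9.6, 1.0, 91.0),
]

speech_rate = [syll / dur for syll, dur, _, _ in records]    # syllables per second
pause_ratio = [pause / dur for _, dur, pause, _ in records]  # proportion of pausing
intelligibility = [score for *_, score in records]

r_rate, p_rate = pearsonr(speech_rate, intelligibility)
r_pause, p_pause = pearsonr(pause_ratio, intelligibility)
print(f"speech rate vs intelligibility: r={r_rate:.2f} (p={p_rate:.3f})")
print(f"pause ratio vs intelligibility: r={r_pause:.2f} (p={p_pause:.3f})")
```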

Emotion Recognition using Prosodic Feature Vector and Gaussian Mixture Model (운율 특성 벡터와 가우시안 혼합 모델을 이용한 감정인식)

  • Kwak, Hyun-Suk;Kim, Soo-Hyun;Kwak, Yoon-Keun
    • Proceedings of the Korean Society for Noise and Vibration Engineering Conference / 2002.11b / pp.762-766 / 2002
  • This paper describes an emotion recognition algorithm using the HMM (Hidden Markov Model) method. The relation between mechanical systems and humans has so far been unilateral, which is why people are reluctant to become familiar with today's multi-service robots. If the capability of emotion recognition is granted to a robot system, the concept of the mechanical part will change considerably. Pitch and energy extracted from human speech are important factors for classifying emotions (neutral, happy, sad, angry, etc.) and are called prosodic features. HMM is a powerful and effective theory among several methods for constructing a statistical model from a characteristic vector made up of a mixture of prosodic features.
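
For the Gaussian mixture modeling named in the title, one common setup is to fit a separate mixture per emotion on prosodic feature vectors and classify by maximum log-likelihood. The sketch below does this with scikit-learn on random stand-in features; the feature set, component count, and data are assumptions, not the paper's configuration.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
emotions = ["neutral", "happy", "sad", "angry"]

# Stand-in prosodic feature vectors per emotion (e.g., [mean F0, F0 range, mean energy]);
# in practice these would be extracted from labelled speech.
train = {e: rng.normal(loc=i, scale=0.5, size=(50, 3)) for i, e in enumerate(emotions)}

# One GMM per emotion; the number of mixture components is an assumed value.
models = {e: GaussianMixture(n_components=2, random_state=0).fit(X) for e, X in train.items()}

def classify(x):
    """Return the emotion whose GMM gives the highest log-likelihood for feature vector x."""
    x = np.atleast_2d(x)
    return max(models, key=lambda e: models[e].score_samples(x)[0])

print(classify(rng.normal(loc=2, scale=0.5, size=3)))  # most likely "sad" in this toy setup
```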

Some Prosodic Characteristics in Apraxia - From a visual task point of view - (실행증 환자의 운율적 특성 연구 - 시각과제 중심으로 -)

  • Kim, Sujung
    • Proceedings of the KSPS conference / 2003.10a / pp.125-127 / 2003
  • The aim of this paper is to analyze prosodic characteristics in apraxia of speech and to establish fundamental sources for the diagnosis of motor speech disorders. The sentences consist of two types (declarative and interrogative) with one to three constituents. The stimuli were constructed to assess apraxic speech in articulation and humming tasks. Features of the speech patterns were examined, such as utterance duration and boundary tones. The results of the analysis are as follows: 1) in the interrogative sentences, rising boundary tones appeared only in the humming tasks; 2) the utterance duration was relatively shorter in the humming tasks than in speech with articulation.

A Study on the Prosodic Characteristics of the Korean Broadcast News Utterances (한국어 정규 뉴스 방송 문장의 운율 특성 연구)

  • In, Ji-Young;Seong, Cheol-Jae
    • Proceedings of the KSPS conference / 2007.05a / pp.197-200 / 2007
  • The purpose of this study is to analyze the prosodic characteristics of Korean news utterances. Prosodic phrases were described in terms of the K-ToBI labeling system. In addition, the change of intonation contour over the course of the sentences was discussed in terms of media type and gender. An analysis of the distribution of resets showed that 331 out of 729 resets were observed at the boundaries of intonation phrases, which suggests that resets occur at the speaker's own volition regardless of the prosodic units of intonation phrases. The declination of the intonation contour of radio news showed a gentler slope than that of TV news, because as a sentence gets longer, the declination of the intonation contour becomes more gradual.
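
The two quantities discussed above, the declination (slope) of the intonation contour and pitch resets at phrase boundaries, can both be estimated from an F0 track and candidate boundary times. The synthetic contour, the reset threshold, and the window size in the sketch below are assumptions for illustration, not the study's criteria.

```python
import numpy as np

# Synthetic F0 track (Hz) sampled every 10 ms: a gently declining contour with one reset.
t = np.arange(0, 3.0, 0.01)
f0 = 220 - 15 * t + np.where(t >= 1.5, 25, 0)   # pitch reset of +25 Hz at 1.5 s

# Declination slope (Hz/s) via least-squares regression over the whole sentence.
slope, intercept = np.polyfit(t, f0, 1)
print(f"declination slope: {slope:.1f} Hz/s")

def is_reset(f0, t, boundary, window=0.2, threshold=10.0):
    """A boundary counts as a reset if mean F0 jumps up by more than `threshold` Hz
    across the boundary (threshold and window are assumed values)."""
    before = f0[(t >= boundary - window) & (t < boundary)].mean()
    after = f0[(t >= boundary) & (t < boundary + window)].mean()
    return after - before > threshold

for boundary in (0.8, 1.5, 2.3):   # candidate intonation-phrase boundaries (s)
    print(boundary, "reset" if is_reset(f0, t, boundary) else "no reset")
```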

Emotion Recognition using Prosodic Feature Vector and Gaussian Mixture Model (운율 특성 벡터와 가우시안 혼합 모델을 이용한 감정인식)

  • Kwak, Hyun-Suk;Kim, Soo-Hyun;Kwak, Yoon-Keun
    • Proceedings of the Korean Society for Noise and Vibration Engineering Conference / 2002.11a / pp.375.2-375 / 2002
  • This paper describes an emotion recognition algorithm using the HMM (Hidden Markov Model) method. The relation between mechanical systems and humans has so far been unilateral, which is why people are reluctant to become familiar with multi-service robots. If the capability of emotion recognition is granted to a robot system, the concept of the mechanical part will change considerably. (omitted)

Prosodic Characteristics of Politeness in Korean (한국어에서의 공손함을 나타내는 운율적 특성에 관한 연구)

  • Ko, Hyun-ju;Kim, Sang-Hun;Kim, Jong-Jin
    • MALSORI / no.45 / pp.15-22 / 2003
  • This study is a preliminary study toward improving the naturalness of a dialog TTS system. As major characteristics of politeness in Korean, temporal features (total duration of utterances, speech rate, and duration of utterance-final syllables) and F0 features (mean F0, boundary tone pattern, F0 range) were examined through acoustic analysis of recordings of semantically neutral sentences spoken by ten professional voice actors under two utterance-type conditions: normal and polite. The results show that the temporal characteristics differed significantly according to utterance type, whereas the F0 characteristics did not.
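
The analysis summarized above amounts to paired comparisons of per-speaker temporal and F0 measures across the normal and polite conditions. A minimal sketch with made-up values and scipy's paired t-test, shown only to illustrate the kind of test involved:

```python
from scipy.stats import ttest_rel

# Made-up per-speaker measures for the same sentences in two utterance types.
# (Illustrative numbers only; not the study's data.)
normal_duration = [2.1, 2.3, 2.0, 2.4, 2.2]   # total utterance duration (s)
polite_duration = [2.6, 2.9, 2.5, 3.0, 2.7]

normal_mean_f0 = [182, 175, 190, 168, 185]    # mean F0 (Hz)
polite_mean_f0 = [184, 173, 191, 170, 183]

# Paired t-tests across speakers for each measure.
for name, a, b in [("duration", normal_duration, polite_duration),
                   ("mean F0", normal_mean_f0, polite_mean_f0)]:
    stat, p = ttest_rel(a, b)
    print(f"{name}: t={stat:.2f}, p={p:.3f}")
```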

The Prosodic Characteristics of Pre-school Age Children-Related Adults (학령전기아동 관련 성인의 운율 특성)

  • Kim, Jiwon;Seong, Cheoljae
    • Phonetics and Speech Sciences / v.6 no.3 / pp.23-32 / 2014
  • This study presents the prosodic characteristics of 'motherese' and 'teacherese' (the speech of child care teachers and kindergarten teachers). Twenty-one mothers and 24 teachers spoke to children aged 4;00-6;11 in a child care center or kindergarten. Speech and articulation rates, the number of accentual phrases (APs), the number of intonational phrases (IPs), pitch-related factors (f0, pitch range, f0 standard deviation), and intonation slope (mean absolute f0 slope, q-tone slope) were measured. The two groups produced two sentence types (an alternative question as the interrogative and a coordinated sentence as the declarative) in two situations (one with the children present, the other without children but speaking as if the children were in front of them). The results indicate that teachers show more noticeable prosodic characteristics than mothers do.
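
The study measures both speech rate and articulation rate; the conventional distinction is that speech rate divides the syllable count by the total duration including pauses, while articulation rate excludes pause time. A small worked example on made-up values:

```python
# Made-up utterance: 18 syllables, 6.0 s total, of which 1.5 s is pausing.
syllables = 18
total_duration = 6.0      # seconds, pauses included
pause_duration = 1.5      # seconds of silent pauses

speech_rate = syllables / total_duration                            # pauses included
articulation_rate = syllables / (total_duration - pause_duration)   # pauses excluded

print(f"speech rate:       {speech_rate:.2f} syll/s")
print(f"articulation rate: {articulation_rate:.2f} syll/s")
```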