• Title/Summary/Keyword: Utterance

Real-time Background Music System for Immersive Dialogue in Metaverse based on Dialogue Emotion (메타버스 대화의 몰입감 증진을 위한 대화 감정 기반 실시간 배경음악 시스템 구현)

  • Kirak Kim;Sangah Lee;Nahyeon Kim;Moonryul Jung
    • Journal of the Korea Computer Graphics Society / v.29 no.4 / pp.1-6 / 2023
  • To enhance immersion in metaverse environments, background music is often used. However, the background music is mostly pre-matched and repeated, which can distract users because it does not align well with rapidly changing, user-interactive content. We therefore implemented a system that provides a more immersive metaverse conversation experience by 1) developing a regression neural network that extracts emotions from an utterance using KEMDy20, a Korean multimodal emotion dataset; 2) selecting music that corresponds to the extracted emotions from the DEAM dataset, in which music is tagged with arousal-valence levels; and 3) combining these with a virtual space where users can hold real-time conversations with avatars.
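Step 2 of the abstract above (choosing music by arousal-valence level) can be illustrated with a minimal sketch. The abstract does not say how the match is made, so the nearest-neighbour rule below is an assumption, and the track names, tag values, and the `select_track` helper are all hypothetical.

```python
import math

# Hypothetical catalogue: track name -> (arousal, valence).
# The [-1, 1] value range is an assumption made for this sketch.
TRACKS = {
    "calm_piano": (-0.6, 0.4),
    "upbeat_pop": (0.7, 0.8),
    "tense_strings": (0.6, -0.7),
    "sad_cello": (-0.5, -0.6),
}


def select_track(arousal, valence, tracks=TRACKS):
    """Return the track whose (arousal, valence) tag is closest to the input.

    Closeness is plain Euclidean distance in the arousal-valence plane,
    which is an assumed matching rule, not one stated in the abstract.
    """
    target = (arousal, valence)
    return min(tracks, key=lambda name: math.dist(tracks[name], target))
```

In use, the emotion regressor's output for the latest utterance would be passed as `arousal` and `valence`, and the returned track would become the next background music candidate.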

The aesthetics of irony in repetition and the difference of Oh! Soojung (<오! 수정>의 아이러니 미학 - 반복과 차이의 구조를 중심으로)

  • Suh, MyungSoo
    • 기호학연구 / no.57 / pp.121-153 / 2018
  • In terms of the story told, Oh! Soojung (Virgin Stripped Bare by Her Bachelors) is a film of the ideology of masculinity. From the point of view of how the story is presented, however, Oh! Soojung is a film that aims to devalue this ideology. How is this possible? It follows the principle of irony, in which the speaker, by saying P, wants the listener to hear Q, which devalues and contradicts P. Our study attempts to explain the process of interpreting the irony in the film. The ideology of a film arises when its presupposed contents become the subject. For example, Cinderella, which tells the story of a girl who marries a prince, presupposes that the girl, Cinderella, is obedient. The subject of this story is the presupposition /girls who want to be happy must be obedient/, which represents the ideology of masculinity. Presupposed content thus imposes a collective and conservative value on the public, as its enunciator belongs to the collective voice. Since ironization occurs when the utterance itself is annulled, one must also deny or cancel the story told in Oh! Soojung: /Jeahun, who is rich, and Soojung, who is obedient and a virgin, have become lovers/. Since there is no semantic mark within the utterance, irony is a voice that comes from outside; this is how we understand irony in a purely pragmatic way. The outer voices are of two kinds: the way the story is built (questions of focalization, ocularization, and auricularization) and the way the story is presented (questions of order, frequency, or plot). Our study focuses on the question of frequency in Oh! Soojung, which has a repetition structure in which Jeahun's memory and Soojung's memory are represented one after the other. Since the memories of the two characters are not identical, the repetition is accompanied by differences.
The differences at first allow the public to build their own story from the diegesis of the film, then throw the audience into confusion, where we cannot be certain of what we see and know in the diegesis, and finally make their knowledge questionable. As for repetition, for it to be valid in terms of the informativeness of the utterance, it must deny the existence of the previous repetition. This is how repetition cancels itself and, consequently, the utterance. We see that the irony of Oh! Soojung arises from repetition with differences, which cancels the story of the film.

Speaker Adaptation using ICA-based Feature Transformation (ICA 기반의 특징변환을 이용한 화자적응)

  • Park ManSoo;Kim Hoi-Rin
    • MALSORI / no.43 / pp.127-136 / 2002
  • Speaker adaptation techniques are generally used to reduce speaker differences in speech recognition. In this work, we focus on features suited to linear regression-based speaker adaptation. These are obtained by a feature transformation based on independent component analysis (ICA), and the transformation matrix is learned from speaker-independent training data. When the amount of data is small, however, it is necessary to adjust the ICA-based transformation matrix estimated from a new speaker's utterance. To cope with this problem, we propose a smoothing method based on linear interpolation between the speaker-independent (SI) feature transformation matrix and the speaker-dependent (SD) feature transformation matrix. We observed that the proposed technique improves adaptation performance.

A Fundamental Phonetic Investigation of Korean Monophthongs (한국어 단모음의 음성학적 기반연구)

  • Moon, Seung-Jae
    • MALSORI / no.62 / pp.1-17 / 2007
  • The purpose of this study was to investigate and quantitatively describe the acoustic characteristics of current Korean monophthongs. Recordings were made of 33 men and 27 women producing the vowels /i, e, ɛ, a, ə, o, u, i/ in the carrier phrase "This character is ___." A listening test was conducted in which 19 participants judged each vowel. F1, F2, and F3 were measured from the vowels that more than 17 listeners judged to be the intended vowel. Analysis of the formant data shows some interesting results, including the undeniable confirmation of the 7-vowel system in modern Korean. It turns out that quite different-sounding Korean and English vowels happen to have very similar formant measurements. The difference between "citation-form reading" and "natural utterance reading" is also discussed.

Relation among Mother's Interaction Behavior, Mother's Language Input and Children's MLU: A Comparison between Multicultural- and Korean-Families (어머니의 상호작용행동 및 언어입력과 영·유아의 언어발달과의 관계: 다문화가정과 일반가정의 비교)

  • Park, Hye-Won;Lee, Kuk-Hee;Cho, Jeung-Ryeul
    • Korean Journal of Human Ecology / v.21 no.3 / pp.439-451 / 2012
  • Maternal interaction behavior, language input, and children's language in 34 multicultural families were compared with those in ordinary families. The mean lengths of utterance (MLUs) of multicultural mothers and their children were shorter than those of ordinary Korean mothers and children. Positive maternal interaction behaviors of multicultural mothers were significantly lower than those of ordinary mothers. Correlational analyses revealed positive correlations among maternal interaction behaviors, mothers' MLU, and children's MLU in multicultural families. However, there were no such correlations in ordinary families. The findings suggest that language education and support for multicultural mothers would be an effective policy for their children's language development.

Korean Nominal Particles Development in Korean-English Simultaneous Bilingual Children (혼자놀이에서 5-6세 '한국어-영어' 동시습득 이중언어아동의 한국어 조사(助詞) 습득분석)

  • Lee, Ha-Won;Choi, Kyoung-Sook
    • Korean Journal of Child Studies / v.29 no.6 / pp.147-161 / 2008
  • The present study compared the characteristics of Korean nominal particles (occurrence, errors, error patterns) in ten 5- to 6-year-old Korean-English simultaneous bilingual children with those in ten Korean monolingual children. Data were analyzed with the Mann-Whitney U test and Spearman rank correlation, and by qualitative analysis. The results were as follows: (1) bilingual children showed a significantly lower frequency of nominal particles, measured as the number of occurrences per utterance; (2) the error percentage for adverbial markers was significantly higher for bilingual children; (3) the error patterns of bilingual children showed a higher percentage of in-case substitution and double-use errors. These findings suggest that Korean-English simultaneous bilingual children develop Korean nominal particles differently from Korean monolingual children.

Speaker Adaptation Using ICA-Based Feature Transformation

  • Jung, Ho-Young;Park, Man-Soo;Kim, Hoi-Rin;Hahn, Min-Soo
    • ETRI Journal / v.24 no.6 / pp.469-472 / 2002
  • Speaker adaptation techniques are generally used to reduce speaker differences in speech recognition. In this work, we focus on the features fitted to a linear regression-based speaker adaptation. These are obtained by feature transformation based on independent component analysis (ICA), and the feature transformation matrices are estimated from the training data and adaptation data. Since the adaptation data is not sufficient to reliably estimate the ICA-based feature transformation matrix, it is necessary to adjust the ICA-based feature transformation matrix estimated from a new speaker utterance. To cope with this problem, we propose a smoothing method through a linear interpolation between the speaker-independent (SI) feature transformation matrix and the speaker-dependent (SD) feature transformation matrix. From our experiments, we observed that the proposed method is more effective in the mismatched case. In the mismatched case, the adaptation performance is improved because the smoothed feature transformation matrix makes speaker adaptation using noisy speech more robust.
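The smoothing step described above is a linear interpolation between two matrices, which can be sketched in a few lines. The function name `smooth_transform`, the weight name `lam`, and the convention that `lam` weights the SD matrix are assumptions for illustration; the abstract does not give the interpolation weight or how it is chosen.

```python
def smooth_transform(w_si, w_sd, lam):
    """Linearly interpolate between SI and SD feature transformation matrices.

    w_si, w_sd: equally sized matrices given as lists of rows.
    lam: interpolation weight in [0, 1]; 0 keeps the SI matrix,
         1 keeps the SD matrix (an assumed convention).
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    if len(w_si) != len(w_sd) or any(
        len(r_si) != len(r_sd) for r_si, r_sd in zip(w_si, w_sd)
    ):
        raise ValueError("matrices must have the same shape")
    # Element-wise weighted average of the two matrices.
    return [
        [(1.0 - lam) * si + lam * sd for si, sd in zip(r_si, r_sd)]
        for r_si, r_sd in zip(w_si, w_sd)
    ]
```

With little adaptation data, a small `lam` keeps the result close to the reliably estimated SI matrix; as more data from the new speaker arrives, a larger `lam` would let the SD estimate dominate.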

On a Study of Measurement Method of Utterance Velocity for the Reduction of Transmission Rate in CELP Vocoder. (LSP 파라미터를 이용한 발성측정법)

  • 장경아;배명진
    • Proceedings of the IEEK Conference / 2000.11d / pp.199-202 / 2000
  • Speaking rate varies with the situation and the habits of the speaker, and it has been studied extensively in speaker recognition. In speech recognition, speaking rate is an important consideration, and it is measured using large speech databases and complicated estimation procedures for accuracy. Conventional vocoders encode and transmit the speech signal without regard to speaking rate, so applying speaking rate to a vocoder requires a simpler algorithm with less computation than the conventional speaking-rate methods used in speech recognition. We propose a speaking-rate algorithm that uses a simple parameter, the Line Spectrum Pair (LSP). The proposed method measures speaking rate from the LSP information in speech. We measured the variation rate for utterances spoken at different speeds. As a result, the variation rate differs distinctly between fast and slow utterances, and it is 42.8% higher for fast utterances than for slow ones.

Implementation of Speaker Verification Security System Using DSP Processor(TMS320C32) (DSP Processor(TMS320C32)를 이용한 화자인증 보안시스템의 구현)

  • Haam, Young-Jun;Kwon, Hyuk-Jae;Choi, Soo-Young;Jeong, Ik-Joo
    • Journal of Industrial Technology / v.21 no.B / pp.107-116 / 2001
  • Speech carries various kinds of information when a person communicates with others: linguistic content, speaker information, emotion, health condition, the utterance environment, and so on. The technologies that process speech for use in real life are collectively called speech technology. Among them, speaker recognition makes use of the speaker information contained in speech. DTW (Dynamic Time Warping) is a speaker recognition technique that uses dynamic programming to measure the similarity between a reference speech pattern and an input speech signal. In this study, we implement DTW on a TMS320C32 DSP processor and build a security system with it.
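The DTW comparison described above can be sketched as a textbook dynamic-programming recurrence. This is a generic illustration, not the paper's DSP implementation: the scalar per-frame features, the absolute-difference local cost, and the `verify` threshold rule are all assumptions.

```python
def dtw_distance(ref, test):
    """Return the DTW alignment cost between two feature sequences.

    Each sequence is a list of scalar features (an assumption made to keep
    the sketch short; real systems typically use one feature vector per frame).
    """
    n, m = len(ref), len(test)
    inf = float("inf")
    # cost[i][j] = best cumulative cost of aligning ref[:i] with test[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(ref[i - 1] - test[j - 1])
            # Extend the cheapest of the three allowed predecessor paths.
            cost[i][j] = local + min(
                cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1]
            )
    return cost[n][m]


def verify(ref, test, threshold):
    """Accept the speaker when the DTW cost is within a hypothetical threshold."""
    return dtw_distance(ref, test) <= threshold
```

Because the warping path may repeat frames, an input that is a time-stretched copy of the reference aligns at zero cost, which is what makes DTW tolerant of differences in speaking speed.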

Prosodic Break Index Estimation using LDA and Tri-tone Model (LDA와 tri-tone 모델을 이용한 운율경계강도 예측)

  • 강평수;엄기완;김진영
    • The Journal of the Acoustical Society of Korea / v.18 no.7 / pp.17-22 / 1999
  • In this paper, we propose a new method that combines LDA and a tri-tone model to predict Korean prosodic break indices (PBI) for a given utterance. PBI can be used as an important cue to syntactic discontinuity in continuous speech recognition (CSR). The model consists of three steps. In the first step, PBI is predicted from syllable and pause duration information using linear discriminant analysis (LDA). In the second step, syllable tone information is used to estimate PBI: the syllable tones are coded with vector quantization (VQ), and PBI is estimated by the tri-tone model. In the last step, the two PBI predictors are integrated using a weight factor. The proposed method was tested on 200 literal-style spoken sentences, and the experimental results showed 72% accuracy.
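The last step above, integrating two predictors with a weight factor, might look like the following sketch. The abstract does not specify the form of the integration, so the convex combination of per-class scores, the `combine_pbi` name, and the example score values are assumptions.

```python
def combine_pbi(lda_scores, tritone_scores, weight):
    """Pick a break index from two predictors' per-class scores.

    lda_scores, tritone_scores: one score per candidate break index,
        in the same order (hypothetical inputs for this sketch).
    weight: share given to the LDA predictor, in [0, 1]; the tri-tone
        predictor receives 1 - weight.
    """
    if len(lda_scores) != len(tritone_scores):
        raise ValueError("both predictors must score the same classes")
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    combined = [
        weight * a + (1.0 - weight) * b
        for a, b in zip(lda_scores, tritone_scores)
    ]
    # Return the index of the class with the highest combined score.
    return max(range(len(combined)), key=combined.__getitem__)
```

In this form the weight would normally be tuned on held-out data, since it sets how much the duration-based predictor is trusted relative to the tone-based one.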
