• Title/Summary/Keyword: Utterance

Search Result 382, Processing Time 0.02 seconds

A New Test Data Generation for SPRT in Speaker Verification (화자 확인에서 SPRT를 위한 새로운 테스트 데이터 생성)

  • 서창우;이기용
    • The Journal of the Acoustical Society of Korea
    • /
    • v.22 no.1
    • /
    • pp.42-47
    • /
    • 2003
  • This paper proposes a method for generating new test data by shifting the start sample of the first frame, for use with the sequential probability ratio test (SPRT) in speaker verification. SPRT is an effective algorithm that can reduce the computational complexity of testing. However, its decision procedure assumes that the input samples are independent and identically distributed (i.i.d.) draws from a probability density function (pdf), and it is not well suited to short utterances. The proposed method can apply SPRT regardless of the utterance length of the test data, because it generates new test data through the sample shift of the start frame. In addition, the correlation in the data, which must be accounted for in the SPRT method, can be effectively removed by principal component analysis. Experimental results show that the proposed method slightly increases the computational cost because of the sample shift, but improves on the conventional method by more than 0.7% on average in equal error rate (EER).
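
The decision rule described in this abstract can be illustrated with a minimal sketch of Wald's SPRT plus a framing helper that shifts the start sample. This is not the authors' implementation: the one-dimensional Gaussian models, the error rates `alpha` and `beta`, and the helper names `sprt` and `shifted_frames` are all assumptions made for illustration, and the PCA decorrelation step is omitted.

```python
import math

def gaussian_logpdf(x, mean, var):
    """Log-density of a 1-D Gaussian (stand-in for a real speaker model)."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)

def sprt(samples, logpdf_target, logpdf_impostor, alpha=0.01, beta=0.01):
    """Wald's SPRT: accumulate the log-likelihood ratio sample by sample.

    alpha: tolerated false-accept rate, beta: tolerated false-reject rate
    (both hypothetical values). Returns (decision, samples_used).
    """
    upper = math.log((1.0 - beta) / alpha)   # accept threshold
    lower = math.log(beta / (1.0 - alpha))   # reject threshold
    llr = 0.0
    for n, x in enumerate(samples, start=1):
        llr += logpdf_target(x) - logpdf_impostor(x)
        if llr >= upper:
            return "accept", n
        if llr <= lower:
            return "reject", n
    return "undecided", len(samples)

def shifted_frames(signal, frame_len, hop, shift):
    """Cut frames starting at sample offset `shift`; different shifts give
    additional frame sequences from the same short utterance."""
    last_start = len(signal) - frame_len
    return [signal[i:i + frame_len] for i in range(shift, last_start + 1, hop)]

# Hypothetical models: claimed speaker ~ N(1, 1), impostor ~ N(0, 1).
target = lambda x: gaussian_logpdf(x, 1.0, 1.0)
impostor = lambda x: gaussian_logpdf(x, 0.0, 1.0)
decision, used = sprt([1.0] * 20, target, impostor)
```

With these toy models each sample equal to 1.0 adds 0.5 to the log-likelihood ratio, so the test stops well before the input is exhausted. An "undecided" outcome on a short input is the situation the sample shift is meant to address, since additional shifted frame sequences supply more samples.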

Generation of Zero Pronouns using Center Transition of Preceding Utterances (선행 발화의 중심 전이를 이용한 영형 생성)

  • Roh, Ji-Eun;Na, Seung-Hoon;Lee, Jong-Hyeok
    • Journal of KIISE:Software and Applications
    • /
    • v.32 no.10
    • /
    • pp.990-1002
    • /
    • 2005
  • To generate coherent texts, it is important to produce appropriate pronouns that refer to previously mentioned entities in a discourse. Specifically, we focus on pronominalization by zero pronouns, which occur frequently in Korean. This paper investigates zero pronouns in Korean based on cost-based centering theory, focusing especially on the center transitions of adjacent utterances. In previous centering work, only one type of nominal entity has been considered as the target of pronominalization, even though other entities are frequently pronominalized as zero pronouns. To resolve this problem and to explain the reference phenomena of real texts, four types of nominal entity (Npair, Ninter, Nintra, and Nnon) are defined from centering theory using the concepts of inter-, intra-, and pairwise salience. For each entity type, a case study of zero phenomena is performed by analyzing a corpus and building a pronominalization model. This study shows that the zero phenomena of entities neglected in previous centering work are explained by the center transition of the second previous utterance. We also show that, for Ninter, Nintra, and Nnon, the pronominalization accuracy achieved by a complex combination of several types of features is completely or nearly matched, across genres, by using the transition of the second previous utterance alone.

An Analysis of the Vowel Formants of the Young Males in the Buckeye Corpus (벅아이 코퍼스에서의 젊은 성인 남성의 모음 포먼트 분석)

  • Yoon, Kyu-Chul;Noh, Hye-Uk
    • Phonetics and Speech Sciences
    • /
    • v.4 no.2
    • /
    • pp.41-49
    • /
    • 2012
  • The purpose of this paper is to extract the vowel formants of ten young male speakers from the Buckeye Corpus of Conversational Speech [1] and to analyze them in comparison to earlier works in terms of various phonetic factors that are expected to affect the realization of the formant distribution. The first two formant frequency values were automatically extracted with a Praat script, along with such factors as the place of articulation, the content versus function word information, the syllabic stress information, the location in a word, the location in an utterance, the speech rate of three consecutive words, and the word frequency in the corpus. The results indicated that the formant patterns from the corpus were very different from those of earlier works although the overall pattern was similar, and that the factors were strongly responsible for the realization of the two formants.

A Literature study on the language disturbance (聲音의 生理 病理에 關한 文獻的 考察)

  • Lee, Won-Ju;Kim, Yeon-Jin;Rho, Sek-Seon
    • The Journal of Korean Medicine Ophthalmology and Otolaryngology and Dermatology
    • /
    • v.10 no.1
    • /
    • pp.159-184
    • /
    • 1997
  • The results of this literature study on language disturbance are as follows. 1. In The Yellow Emperor's Canon of Internal Medicine, utterance is closely related not only to the vocal organs (pharynx, larynx, epiglottis, lips, tongue, vocal cords, etc.) but also to the five viscera (especially the heart, lung, and kidney). This closely resembles the vocal mechanism described in medical science. 2. In medical science, language disturbance is classified into dysarthria and dysphasia. In Oriental medicine, it is described with terms such as coma-speechlessness, stiff tongue-speechlessness, and frightening-speechlessness. In particular, the Oriental medicine literature refers to non-utterance as aphasia. 3. Regarding the relationship between language disturbance and the five viscera, the heart, lung, and kidney are of primary importance. In differential diagnosis, language disturbance is divided into sthenia-syndrome and asthenia-syndrome. Sthenia-syndrome, classified into wind-cold, fire-evil, adverseness of vital energy, and stagnation of phlegm, is easy to cure. Asthenia-syndrome, classified into sexual desire, anxiety-meditation, and fear, is hard to cure. 4. The pathogenesis of dysphasia originates from two factors. The first is internal damage: consumption of body fluid caused by lung-dryness, and yin-deficiency of the lung and kidney. The second is disease caused by exogenous evils: sluggishness of lung-energy. 5. Among the acupuncture points frequently used for language disturbance, the order is LI-4(合谷), H-7(神門), K-1(湧泉), L-3(太衝), K-3(太谿), S-6(三陰交), H-5(通里), G-15(아門), C-23(廉泉), S-40(豊隆), K-6(照海), L-7(列缺), S-36(足三里), etc.

Pitch Patterns of Interrogative Sentences in relation to the Focus (초점과 관련된 의문문 억양 패턴 실험)

  • Kim, Mi-Ran;Shin, Dong-Hyun;Choe, Jae-Woong;Kim, Kee-Ho
    • Speech Sciences
    • /
    • v.7 no.4
    • /
    • pp.203-217
    • /
    • 2000
  • In spoken language, the characteristics of prosodic realization are related to the meaning of the utterance. The pitch pattern of an interrogative sentence, which differs from that of a declarative sentence, can be considered in this respect. If we consider question-answer pairs, we find that the most important variation comes from the intended meaning of asking. In this paper, we experiment with four kinds of interrogative sentences and show that the differences in their pitch patterns can be explained in relation to focus phenomena; that is, the differences in the boundary tones of interrogative sentences are due to differences in the prosodic domain of focus. To relate the explanation to focus phenomena, we divide focus into two categories: emphatic focus, which plays a role in delivering the speaker's intended meaning for the interpretation of the sentence, and informational focus, which delivers the central intended meaning of the utterance. The results can be summarized in three points. First, the high boundary tone delivers the meaning of asking. Second, the different boundary tones found in wh-questions and alternative questions are merely phonetic variations caused by focusing. Third, the high-rise boundary tone in echo questions is related to the meaning of surprise or incredulity, which is consistent with the existing view that the speaker's attitude of surprise can raise the pitch range. From these results we can distinguish between boundary type and phonetic variation, and we can also assign appropriate meanings to the different boundary tones in interrogative sentences, which have so far been regarded as merely a part of sentence type.

Cepstral Feature Normalization Methods Using Pole Filtering and Scale Normalization for Robust Speech Recognition (강인한 음성인식을 위한 극점 필터링 및 스케일 정규화를 이용한 켑스트럼 특징 정규화 방식)

  • Choi, Bo Kyeong;Ban, Sung Min;Kim, Hyung Soon
    • The Journal of the Acoustical Society of Korea
    • /
    • v.34 no.4
    • /
    • pp.316-320
    • /
    • 2015
  • In this paper, the pole filtering concept is applied to Mel-frequency cepstral coefficient (MFCC) feature vectors in the conventional cepstral mean normalization (CMN) and cepstral mean and variance normalization (CMVN) frameworks. Additionally, the performance of cepstral mean and scale normalization (CMSN), which uses scale normalization instead of variance normalization, is evaluated in speech recognition experiments in noisy environments. Because CMN and CMVN are usually performed on a per-utterance basis, reliable estimation of the mean and variance is not guaranteed for short utterances. Applying the pole filtering and scale normalization techniques to the feature normalization process alleviates this problem. Experimental results on the Aurora 2 database (DB) show that the feature normalization method combining pole filtering and scale normalization yields the best improvements.
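
For readers unfamiliar with the baseline, the following is a minimal sketch of conventional per-utterance CMN and CMVN only. It does not implement the pole filtering or the scale normalization (CMSN) evaluated in the paper, and the function name `cmvn`, its parameters, and the toy feature values are assumptions for illustration.

```python
import math

def cmvn(features, variance_norm=True, eps=1e-8):
    """Per-utterance cepstral mean (and optionally variance) normalization.

    features: list of frames, each a list of cepstral coefficients.
    variance_norm=False gives plain CMN; True gives CMVN.
    """
    n_frames = len(features)
    n_dims = len(features[0])
    # Mean of each coefficient over all frames of this utterance.
    means = [sum(f[d] for f in features) / n_frames for d in range(n_dims)]
    if variance_norm:
        # Population standard deviation per coefficient, floored at eps
        # so that a constant coefficient does not cause division by zero.
        stds = [
            max(math.sqrt(sum((f[d] - means[d]) ** 2 for f in features) / n_frames), eps)
            for d in range(n_dims)
        ]
    else:
        stds = [1.0] * n_dims
    return [[(f[d] - means[d]) / stds[d] for d in range(n_dims)] for f in features]

# Toy two-frame "utterance" with two coefficients per frame.
utterance = [[1.0, 10.0], [3.0, 30.0]]
cmn_out = cmvn(utterance, variance_norm=False)
cmvn_out = cmvn(utterance, variance_norm=True)
```

Because the statistics are computed from the frames of a single utterance, a two-frame input like the one above yields very coarse estimates, which is the short-utterance weakness the abstract refers to.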

Safety Robust Speaker Recognition Against Utterance Variations (발성변화에 강인한 화자 인식에 관한 연구)

  • Lee Ki-Yong
    • Journal of Internet Computing and Services
    • /
    • v.5 no.2
    • /
    • pp.69-73
    • /
    • 2004
  • A speaker model in a speaker recognition system should be trained from a large data set gathered in multiple sessions. A large data set requires a large amount of memory and computation, and it is also practically hard to make users utter the data in several sessions. Incremental adaptation methods have recently been proposed to address these problems. However, a data set gathered from multiple sessions is vulnerable to outliers arising from irregular utterance variations and the presence of noise, which result in an inaccurate speaker model. In this paper, we propose an incremental robust adaptation method that minimizes the influence of outliers on a Gaussian mixture model (GMM) based speaker model. The robust adaptation is obtained from an incremental version of M-estimation. The speaker model is initially trained from a small amount of data and is adapted recursively with the data available in each session. Experimental results on a data set gathered over seven months show that the proposed method is robust against outliers.
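
The idea of down-weighting outliers during recursive adaptation can be sketched for a single scalar mean. This is a simplified illustration, not the paper's GMM adaptation: the Huber weight function, the tuning constant `c=1.345`, the fixed `scale`, and the function names are all assumptions.

```python
def huber_weight(residual, scale, c=1.345):
    """Huber M-estimation weight: 1 for small residuals, shrinking for large ones."""
    r = abs(residual) / scale
    return 1.0 if r <= c else c / r

def robust_incremental_mean(mean, weight_sum, new_samples, scale=1.0, c=1.345):
    """Update a running mean with one session's samples, one at a time.

    Each sample contributes in proportion to its Huber weight, so a sample
    far from the current mean moves the estimate much less than it would
    in an ordinary running average.
    """
    for x in new_samples:
        w = huber_weight(x - mean, scale, c)
        weight_sum += w
        mean += w * (x - mean) / weight_sum
    return mean, weight_sum

# Hypothetical initial model trained on two clean samples with value 1.0,
# then adapted with a session containing one outlier (100.0).
mean, weight_sum = robust_incremental_mean(1.0, 2.0, [1.0, 100.0])
plain_mean = (1.0 + 1.0 + 1.0 + 100.0) / 4.0
```

In this toy case the ordinary mean is pulled to 25.75 by the outlier, while the weighted update stays close to the clean value. Only the running mean and the accumulated weight need to be stored between sessions, which reflects the memory argument made in the abstract.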

Abstract Art, the early phenomena of aesthetic discourse - In the case of Korean art in 1930s (추상, 그 미학적 담론의 초기 현상 -1930년대 한국의 경우)

  • Lee, Ihn-Bum
    • The Journal of Art Theory & Practice
    • /
    • no.3
    • /
    • pp.135-154
    • /
    • 2005
  • In the late 1930s, under Japanese imperialism, Korean abstract art took shape under the influence of Japan and Europe. Some argue that early Korean abstract art is colonial in that it derives from external influence, and also in that it is unrelated to the representation of Koreans' own life world. On the other hand, the early Korean abstract art of the 1930s is described as the prehistory of 'Korean Modernism in Art', which flourished in the 1970s following the 'Informal Art Movement' of the late 1950s. In this view, the abstract art of the 1930s was no more than a germ of 'Korean Modernism in Art': the period up to the 1950s is understood as dominated by representational art based on the Chosun Exhibition and the Korean National Exhibition, the period up to the 1970s as ruled by abstract art accepted as 'Korean Modernism in Art', and the period after the 1980s as that of Min-jung Art and Post-Modernism Art. However, the historical value of Korean abstract art in the 1930s cannot be overlooked, unless one tries to understand the development of 'Korean Modernism in Art' by focusing not on its own history but on the impact of Western and Japanese art. Paradoxically, early Korean abstract art was the strongest utterance of the late colonial period, even if it had little to do with the optical representation of Korean subjectivity. Therefore, the existing viewpoints on early Korean abstract art should be revised.

Korean Native Speakers' Auditory Cognitive Reactions to Chinese Korean-learners' Pronunciation: Centered on the utterance of consonants in the Korean Language (중국인 학습자의 한국어 발음에 대한 한국인 모어 화자의 청각 인지 반응 -중국인 학습자의 자음 발음을 중심으로-)

  • Kim, Ji-hyung
    • Journal of Korean language education
    • /
    • v.28 no.2
    • /
    • pp.37-60
    • /
    • 2017
  • This research focuses on the way Korean native speakers perceive Chinese Korean-learners' pronunciation. The objective of the study is to lay the cornerstone for establishing effective teaching-learning strategies for the education of the Korean phonetic system. The study presents the results of an experiment showing how native speakers of Korean identify Chinese Korean-learners' pronunciation of consonants. First, stimulus sounds were created from the original utterances of Chinese Korean-learners, and seven scripts were made with the Praat program. The subjects were then asked to choose what the phonetic materials sounded like. The results are reported as the ratio of the frequency of Korean native speakers' responses to each utterance to the total frequency. In addition, a paired t-test was conducted to explore any relationship with changes in proficiency in the Korean phonetic system, ranging from beginners to advanced learners. The outcome shows that the mistakes Chinese Korean-learners make in pronouncing Korean consonants are relatively well reflected in Korean native speakers' auditory cognitive reactions. Concretely, there is some difficulty in differentiating lax consonants from aspirates in the cases of plosives and affricates, but relatively little trouble with fortes. However, there are also slight differences by place of articulation in the details. To provide an effective teaching method for the Korean phonetic system, it is essential to understand learners' phonetic mistakes through precise analysis of data in terms of 'production.' A more meticulous observation of the 'phenomena' must also be made through verification from the viewpoint of 'reception,' as attempted in this study.
A more thorough diagnosis by applying methodology makes it possible to lay the foundation for developing effective teaching-learning strategies for the instruction of the Korean phonetic system. This study has its significance in making such attempts.

A Contrastive Study on '됐어' and 'X了': Focusing on the Functions as a Discourse Marker (한국어 '됐어'와 중국어 'X了(료)'의 대조 연구 -담화표지로서의 기능을 중심으로-)

  • Zhang, Ya Nan
    • Journal of Korean language education
    • /
    • v.28 no.4
    • /
    • pp.181-219
    • /
    • 2017
  • The purpose of this study is to review the functions of {됐어} and {X了} as discourse markers on different levels, and to examine their similarities and differences. {됐어} has not been widely recognized as a discourse marker in the fields of Korean linguistics and Korean language education. Therefore, to establish the identity of {됐어} as a discourse marker, the reasons why {됐어} can be regarded as a discourse marker are explained prior to the contrastive analysis. For the contrastive analysis, {됐어} and {X了} were analyzed on three main dimensions, namely the textual, the interpersonal, and the metalinguistic, in a corpus consisting of scripts of Korean and Chinese sitcoms. The results are as follows. In the textual domain, {됐어} and {X了} share the function of closing a topic, while {X了} can also indicate a new topic and transmit a topic. In the interpersonal domain, {됐어} and {X了} are both used to refuse a partner's proposal or request and to interrupt a partner's speech or action. Furthermore, in the interactional aspect, {됐어} and {X了} perform the functions of expressing a response to a preceding utterance and taking the turn of speaking. The difference between them in the interpersonal domain is that {X了} performs the function of correcting a speaker's utterance. In the metalinguistic domain, {됐어} and {X了} are alike in that they perform the functions of expressing the speaker's dissatisfaction, showing generosity, and making a compromise with the addressee. The distinguishing characteristic of {X了} in this domain is that it can express the attitude of consoling the hearer.