• Title/Abstract/Keywords: speeches

Search results: 93 items (processing time: 0.03 s)

로봇 시스템에의 적용을 위한 음성 및 화자인식 알고리즘 (Implementation of the Auditory Sense for the Smart Robot: Speaker/Speech Recognition)

  • 조현;김경호;박영진
    • 한국소음진동공학회: Conference Proceedings / Proceedings of the 한국소음진동공학회 2007 Spring Conference / pp.1074-1079 / 2007
  • We introduce a speech/speaker recognition algorithm for isolated words. In speaker verification, a Gaussian Mixture Model (GMM) is generally used to model the feature vectors of reference speech signals. Dynamic Time Warping (DTW) based template matching, on the other hand, was proposed for isolated word recognition several years ago. We combine these two concepts in a single method and implement it in a real-time speaker/speech recognition system. With the proposed method, a small number of reference utterances (5 or 6 training repetitions) is sufficient to build a reference model that reaches 90% recognition performance.

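The DTW template matching mentioned in the abstract above can be illustrated with a minimal sketch. It covers only the classic DTW step on one-dimensional toy sequences and does not reproduce the authors' combined GMM/DTW method; the function names `dtw_distance` and `recognize`, the absolute-difference local cost, and the toy templates are all assumptions made for illustration.

```python
import math


def dtw_distance(a, b):
    """Classic DTW cost between two 1-D sequences (absolute-difference local cost)."""
    n, m = len(a), len(b)
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Allowed moves: advance in a only, in b only, or in both
            cost[i][j] = local + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def recognize(query, templates):
    """Return the label of the reference template closest to the query under DTW."""
    return min(templates, key=lambda label: dtw_distance(query, templates[label]))


# Hypothetical reference templates for two isolated "words"
templates = {"up": [1, 2, 3], "down": [3, 2, 1]}
print(recognize([1, 2, 2, 3], templates))
```

In a real system each sequence element would be a feature vector per frame rather than a scalar, and the local cost would be a vector distance.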

담화상에 나타나는 목적격표지 {-를}의 음향적 특성 (The Acoustic Characteristics of the Korean Accusative Marker {-lil} in Discourse)

  • 김기호;김화영;김민정
    • 음성과학 / Vol. 6 / pp.55-82 / 1999
  • The purpose of this paper is to investigate the acoustic characteristics of the Korean accusative marker {-lil}, which functions as a discourse marker. In written texts or read speech, it is seldom omitted and certainly seems to serve a grammatical function. In ordinary discourse, however, speakers often omit it. That is, the environments in which speakers use {-lil} differ from those in which they do not. According to semantic interpretations, {-lil} functions as a pragmatic factor and adds to the meaning of the object in an utterance. By comparing the acoustic characteristics of utterances that contain {-lil} with those of utterances that do not, based in particular on Korean Intonational Phonology, we demonstrate that the marker clearly shows acoustic characteristics related to pragmatic factors that reflect the speaker's special intention.


Determining the Relative Differences of Emotional Speech Using Vocal Tract Ratio

  • Wang, Jianglin;Jo, Cheol-Woo
    • 음성과학 / Vol. 13, No. 1 / pp.109-116 / 2006
  • This study focuses on the differences in emotional speech across three vocal tract sections. The vocal tract area was computed from the area function of the emotional speech. The total vocal tract was divided into 3 sections (vocal fold section, middle section, and lip section) to obtain the differences in each section. The experimental data comprise speech in 6 emotions from 3 male and 3 female speakers. The 6 emotions are neutral, happiness, anger, sadness, fear, and boredom. The difference is measured as a ratio obtained by comparing each emotional speech sample with the normal speech. The experimental results show no remarkable difference at the lip section, whereas fear and sadness show a large change at the vocal fold section.


현대 한국어 파찰음의 조음점 전진 현상에 대한 연구 (The Study of Advanced Articulation of the Korean Affricates)

  • 국경아;강은지;김주원
    • 대한음성학회: Conference Proceedings / Proceedings of the 2007 대한음성학회 and 한국음성과학회 Joint Conference / pp.247-250 / 2007
  • Korean affricates were alveolar sounds in the 15th century. They have shifted to post-alveolar or alveo-palatal sounds since the 18th century, at least in Southern Korean. These days, advanced articulation of the affricates is observed, especially in the speech of the younger generation. The aim of this paper is to show, by means of their cut-off frequencies, how affricates pronounced at the alveo-palatal position differ from those pronounced at a more advanced position. We recorded the speech of freshmen (in their early twenties) at Seoul National University. The cut-off frequency of the tokens judged by ear to be advanced was higher than that of the others. In particular, we found that women tend to advance the place of articulation of the affricates.


부사 및 부사구의 의미적 예측가능성과 피치액센트 실현의 상관관계 (Correlation between semantic predictability and pitch-accent realization)

  • 조상현;이주경
    • 대한음성학회: Conference Proceedings / Proceedings of the 2007 대한음성학회 and 한국음성과학회 Joint Conference / pp.281-284 / 2007
  • This experimental study examines the correlation between semantic predictability and pitch-accent realization. We classified predictability into three degrees: unpredictable, implicitly predictable, and explicitly predictable. Each degree was then divided into two subcategories: adverbs/adverbial phrases of time or place, and all other adverbs/adverbial phrases. The materials were 9 sentences for each subcategory. One male and one female native speaker of English participated in the experiment. Their read speech was recorded on Digital Audio Tape and analyzed with the Pitchworks program. The results show that the pitch-accent ratio is somewhat inversely proportional to the degree of predictability.


A Method of Evaluating Korean Articulation Quality for Rehabilitation of Articulation Disorder in Children

  • Lee, Keonsoo;Nam, Yunyoung
    • KSII Transactions on Internet and Information Systems (TIIS) / Vol. 14, No. 8 / pp.3257-3269 / 2020
  • Articulation disorders are characterized by an inability to achieve clear pronunciation due to misuse of the articulators. This paper proposes a method of detecting such disorders by comparison with standard pronunciations. The method defines the standard pronunciations from the speech of normal children by clustering it on three features: the Linear Predictive Cepstral Coefficient (LPCC), the Mel-Frequency Cepstral Coefficient (MFCC), and the Relative Spectral Analysis Perceptual Linear Prediction (RASTA-PLP). By calculating the distance between the centroid of the standard pronunciation and the input pronunciation, disordered speech whose features lie outside the cluster is detected. A total of 89 children (58 normal children and 31 children with disorders) were recruited. 35 U-TAP test words were selected; each word's standard pronunciation was built from the normal children and compared with each pronunciation by the children with disorders. In the experiments, the pronunciations with disorders were successfully distinguished from the standard pronunciations.
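The centroid-distance test described in the abstract above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the two-dimensional toy vectors stand in for LPCC/MFCC/RASTA-PLP features, the cluster boundary is assumed to be the largest centroid distance among the normal samples, and the names `centroid`, `fit_cluster`, and `is_outside` are hypothetical.

```python
import math


def centroid(vectors):
    """Component-wise mean of equally sized feature vectors."""
    dim = len(vectors[0])
    return [sum(v[k] for v in vectors) / len(vectors) for k in range(dim)]


def euclidean(u, v):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))


def fit_cluster(normal_vectors):
    """Return (centroid, radius) for one word's standard pronunciation.

    Assumption: the radius is the largest distance from the centroid
    among the normal samples.
    """
    c = centroid(normal_vectors)
    radius = max(euclidean(v, c) for v in normal_vectors)
    return c, radius


def is_outside(vector, c, radius):
    """True if the input pronunciation falls outside the standard cluster."""
    return euclidean(vector, c) > radius


# Toy feature vectors standing in for the normal children's pronunciations
normal = [[0.0, 0.0], [2.0, 0.0], [0.0, 2.0], [2.0, 2.0]]
c, radius = fit_cluster(normal)
print(is_outside([1.5, 1.0], c, radius), is_outside([5.0, 5.0], c, radius))
```

A stricter or looser boundary (for example a percentile of the normal distances) would trade missed detections against false alarms; the abstract does not say which rule was used.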

시간 영역에서의 무제한 고립어 합성을 위한 운율 요소 제어용 알고리즘 개발 (Development of an algorithm for the control of prosodic factors to synthesize unlimited isolated words in the time domain)

  • 강찬희
    • 전자공학회논문지C / Vol. 35C, No. 7 / pp.59-68 / 1998
  • This paper develops an algorithm for unlimited Korean speech synthesis. We present the results of controlling prosodic factors in the time domain with isolated words as the basic synthesis unit. Using a new pitch-synchronous, parametric speech synthesis method in the time domain, we mainly present the results of controlling prosodic factors such as pitch periods, energy envelopes, and durations, together with an evaluation of synthetic speech quality. In synthesis, connected words can be synthesized by controlling a continuous, unified prosody, which improves naturalness. The experiments also show improvements in pitch discontinuities and energy zeroing at the junctions of speech waveforms. In particular, it became possible to synthesize speech with unlimited durations and tones. The method also improves noisiness and clarity by reducing the degradation caused by phase distortion due to discontinuities at the waveform connection points.


Machine Learning Techniques for Speech Recognition using the Magnitude

  • Krishnan, C. Gopala;Robinson, Y. Harold;Chilamkurti, Naveen
    • Journal of Multimedia Information System / Vol. 7, No. 1 / pp.33-40 / 2020
  • Machine learning comprises supervised and unsupervised learning, of which supervised learning is used for speech recognition. Supervised learning is the data mining task of inferring a function from labeled training data. Speech recognition is a current trend that has gained attention over the decades. Most automation technologies use speech and speech recognition for various purposes. This paper gives an overview of the major technological perspectives and an appreciation of the fundamental development of speech recognition, and describes the methods developed at each stage of speech recognition using supervised learning. The project will use a DNN to recognize speech using magnitudes with large datasets.

고속 음성 문서 검색을 위한 Expected Matching Score 기반의 문서 확장 기법 (Expected Matching Score Based Document Expansion for Fast Spoken Document Retrieval)

  • 서민구;정규준;오영환
    • 대한음성학회: Conference Proceedings / Proceedings of the 대한음성학회 2006 Fall Conference / pp.71-74 / 2006
  • Much work has been done on retrieving audio segments that contain human speech without captions. To retrieve newly coined words and proper nouns, subwords have commonly been used as indexing units in conjunction with query or document expansion. Of these, document expansion with subwords has the serious drawback of large computational overhead. In this paper, we therefore propose Expected Matching Score based document expansion, which effectively reduces computational overhead without much loss in retrieval precision. Experiments show a 13.9-fold speed-up at a loss of 0.2% in retrieval precision.


Exploring the Microscopic Textual Characteristics of Japanese Prime Ministers' Diet Addresses by Measuring the Quantity and Diversity of Nouns

  • Suzuki, Takafumi;Kageura, Kyo
    • 한국언어정보학회: Conference Proceedings / 한국언어정보학회 2007 Regular Conference / pp.459-470 / 2007
  • This study explores the textual characteristics, more precisely the quantity and diversity of nouns, of Japanese prime ministers' Diet addresses. In the field of stylistics, textual characteristics independent of content have been examined with the aim of detecting the authors, genres, and chronological variations of texts. This study focuses instead on textual characteristics related to the content of texts, namely the quantity and diversity of nouns, because our aim is to analyze texts to better understand two political phenomena: (a) the difference between the two types of Diet addresses delivered by Japanese prime ministers, and (b) the perceived changes made to these addresses by two powerful prime ministers. It is a case study of the microscopic characterization of texts, which has become increasingly important with the expansion in the scope of stylistics and the production of a wide variety of new types of texts following the advent of the Web.
