• Title/Summary/Keyword: 발화 실험 (utterance experiment / ignition experiment)

Multi Domain Dialog State Tracking using Domain State (도메인 상태를 이용한 다중 도메인 대화 상태 추적)

  • Jeon, Hyunmin;Lee, Geunbae
    • Annual Conference on Human and Language Technology / 2020.10a / pp.421-426 / 2020
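  • In multi-domain task-oriented dialogue, existing deep-learning approaches to dialog state tracking have studied models that take the user-system dialogue accumulated over multiple turns as input and extract slot values. However, the computational cost of these models grows as the dialogue gets longer. This paper therefore proposes a method for extracting slot values in multi-domain dialogue without the accumulated dialogue history. Simply removing the history and taking only the current turn's utterance as input, however, loses contextual information. We therefore introduce a domain state and propose a model that tracks it together with the dialog state at every turn. Tracking the domain state identifies which domain the dialogue is currently about. In addition, the previous turn's dialog state and domain state, which carry condensed contextual information, are fed in together with the current turn's utterance to reduce information loss. Experiments on the representative datasets MultiWOZ 2.0 and MultiWOZ 2.1 confirm good dialog state tracking performance without using the dialogue history. Removing the dependency on system responses and past utterances is also expected to make extension to an end-to-end dialogue system easier.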
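
The history-free input described in this abstract can be pictured with a minimal sketch. The abstract does not give the actual input encoding, so the `TurnContext` class, the `build_input` function, and the `[DOMAIN]`/`[STATE]`/`[USER]` markers below are all hypothetical names chosen for illustration. The only idea taken from the abstract is that the previous turn's dialog state and domain state are combined with the current utterance in place of the full history.

```python
from dataclasses import dataclass, field


@dataclass
class TurnContext:
    """State carried over from the previous turn (hypothetical layout)."""
    dialog_state: dict[str, str] = field(default_factory=dict)   # slot -> value
    domain_state: dict[str, bool] = field(default_factory=dict)  # domain -> active?


def build_input(prev: TurnContext, user_utterance: str) -> str:
    """Combine previous states with the current utterance, without dialogue history."""
    # Serialize slot-value pairs in a stable order.
    slots = " ; ".join(f"{s} = {v}" for s, v in sorted(prev.dialog_state.items()))
    # Keep only the domains marked active in the previous turn.
    domains = " ".join(d for d, active in sorted(prev.domain_state.items()) if active)
    # The marker tokens are an assumption, not the paper's format.
    return f"[DOMAIN] {domains} [STATE] {slots} [USER] {user_utterance}"


prev = TurnContext(
    dialog_state={"hotel-area": "north"},
    domain_state={"hotel": True, "taxi": False},
)
model_input = build_input(prev, "I also need a taxi.")
```

The input length here depends on the number of tracked slots and domains rather than on the number of past turns, which is the cost argument the abstract makes.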

Perceptual discrimination of wh-scopes in Gyeongsang Korean (경상 방언 의문문 작용역의 지각 구분)

  • Yun, Weonhee
    • Phonetics and Speech Sciences / v.14 no.2 / pp.1-10 / 2022
  • A wh-phrase positioned in an embedded clause can be interpreted as having a matrix scope if the sentence is produced with proper prosodic structures such as the wh-intonation. In a previous experiment, a sentence with a wh-phrase in an embedded clause was given to 40 speakers of Gyeongsang Korean. A script containing the sentence was provided to induce a matrix scope interpretation for the wh-phrase. These 40 utterances were prepared as stimuli for a perception test to verify whether the wh-phrases in the stimuli were perceived as having matrix scopes. Each utterance was played thrice to 24 subjects. The results showed that more than half of the 72 responses indicated a preference for an embedded scope rather than a matrix scope in 20 of the utterances. A multiple linear regression analysis showed that the matrix scope responses were best predicted by the magnitude of the pitch prominence in a prosodic word consisting of an embedded verb and a complementizer. The pitch prominence was calculated by subtracting the fundamental frequency (F0) at the right edge of the prosodic word from the peak F0 in the same prosodic word. The smaller the magnitude, the more matrix responses there were. These results suggest that the categorical perception of wh-scopes is based on the magnitude of pitch prominence.
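
The pitch prominence measure defined in this abstract (peak F0 minus F0 at the right edge of the same prosodic word) can be written as a short sketch. It assumes the F0 contour of one prosodic word is already available as a list of values in Hz ordered in time. The function name and the sample contour are hypothetical.

```python
from typing import Sequence


def pitch_prominence(f0_hz: Sequence[float]) -> float:
    """Peak F0 minus F0 at the right edge of one prosodic word, in Hz."""
    if not f0_hz:
        raise ValueError("f0_hz must contain at least one F0 value")
    # The last sample is taken as the right edge of the prosodic word.
    return max(f0_hz) - f0_hz[-1]


# Hypothetical F0 track for an embedded verb + complementizer.
contour = [180.0, 210.0, 235.0, 220.0, 190.0]
prominence = pitch_prominence(contour)  # 235.0 - 190.0
```

According to the abstract, smaller values of this magnitude went with more matrix-scope responses.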

A Study on the Fireproof Characteristic and the Extinguishment by NAF S-III on a Molded Transformer in Substation (변전실용 몰드변압기의 난연성과 NAF S-III 소화에 관한 연구)

  • 이수경;신효섭
    • Fire Science and Engineering / v.15 no.4 / pp.78-85 / 2001
  • This paper studies the fireproof characteristics of a molded transformer and the extinguishing characteristics of NAF S-III. As the research method, the combustion process of epoxy resin, the main material of molded transformers, and the extinguishing process of NAF S-III, which has recently come into use as a clean extinguishing agent, were examined theoretically. To verify this, combustion and extinguishment experiments were performed on a molded transformer. An actual molded transformer was installed in a horizontal heating furnace artificially set up with conditions similar to those of an electrical substation, and was then ignited. The extinguishing process was observed in two cases: natural extinguishment of the ignited transformer, and extinguishment by injecting the NAF S-III agent. The volume of agent injected was the economical amount capable of extinguishing the burning molded transformer, calculated on the basis of the Announcement of the Ministry of Government Administration and Home Affairs. With the calculated amount of agent injected, the ignited transformer was completely extinguished within one minute.

Analysis on Vowel and Consonant Sounds of Patients' Speech with Velopharyngeal Insufficiency (VPI) and Simulated Speech (구개인두부전증 환자와 모의 음성의 모음과 자음 분석)

  • Sung, Mee Young;Kim, Heejin;Kwon, Tack-Kyun;Sung, Myung-Whun;Kim, Wooil
    • Journal of the Korea Institute of Information and Communication Engineering / v.18 no.7 / pp.1740-1748 / 2014
  • This paper focuses on listening tests and acoustic analysis of speech from patients with velopharyngeal insufficiency (VPI) and of VPI speech simulated by normal speakers. A set consisting of 50 words, vowels, and single syllables is defined for constructing the speech database. A web-based listening evaluation system is developed for a convenient, automated evaluation procedure. The analysis results show that the trends of incorrect recognition for VPI speech and for simulated speech are similar. This similarity is also confirmed by comparing the formant locations of vowels and the spectra of consonant sounds. These results show that the simulation method is effective at generating speech signals similar to actual VPI patients' speech. The simulated speech data is expected to be usable in future work such as acoustic model adaptation.

An Example-Based Natural Language Dialogue System for EPG Information Access (EPG 정보 검색을 위한 예제 기반 자연어 대화 시스템)

  • Kim, Seok-Hwan;Lee, Cheong-Jae;Jung, Sang-Keun;Lee, Gary Geun-Bae
    • Journal of KIISE:Software and Applications / v.34 no.2 / pp.123-130 / 2007
  • In this paper, we present an example-based natural language dialogue system for Electronic Program Guide (EPG) information access. We introduce an effective and practical dialogue management technique incorporating dialogue examples and situation-based rules. To generate cooperative responses that lead the dialogue with users smoothly, our system consists of a natural language understanding module, a dialogue manager, a system utterance generator, and an EPG database manager. Each module is designed and implemented to make the dialogue system effective and practical. In particular, to reflect up-to-date EPG information, which is updated frequently and periodically, we applied web-mining technology to the EPG database manager, which builds the content database from information automatically extracted from popular EPG websites. The automatically generated content database is used by the other modules in the system to build their own resources. Evaluations show that our system performs the EPG access task well and can be maintained at low cost.

Example-based Dialog System for English Conversation Tutoring (영어 회화 교육을 위한 예제 기반 대화 시스템)

  • Lee, Sung-Jin;Lee, Cheong-Jae;Lee, Geun-Bae
    • Journal of KIISE:Software and Applications / v.37 no.2 / pp.129-136 / 2010
  • In this paper, we present an example-based dialogue system for English conversation tutoring. It aims to provide intelligent one-to-one English conversation tutoring in place of old-fashioned language education with static multimedia materials. The system can understand students' poorly formed expressions, so beginners can engage in a dialogue despite their limited linguistic ability, which motivates them to learn a foreign language. The system also has educational functions for improving linguistic ability. To achieve these goals, we developed a statistical natural language understanding module for understanding poorly formed expressions, an example-based dialogue manager with high domain scalability, and several effective tutoring methods.

A Study on Ignitability and Heat Release Rate Characteristics of Rigid Polyurethane Foam (경질 폴리우레탄폼의 착화성 및 열방출특성 연구)

  • 공영건;이두형
    • Fire Science and Engineering / v.17 no.4 / pp.117-123 / 2003
  • In this study, the ignition and heat release rate characteristics of rigid polyurethane foam were investigated using a Setchkin ignition tester and a cone calorimeter, which is based on the oxygen consumption principle. In the ignition temperature study, the flash-ignition temperature was $383^{\circ}C$ to $390^{\circ}C$ and the self-ignition temperature was $493^{\circ}C$ to $495^{\circ}C$. The self-ignition temperature of rigid polyurethane foam was thus about $100^{\circ}C$ higher than the flash-ignition temperature. In the cone calorimeter study, the time to ignition of rigid polyurethane foam shortened as the external heat flux increased. At the same heat flux level, the time to ignition shortened as the density of the foam decreased. The heat release rate was largest at a heat flux of $50\ \mathrm{kW/m^2}$ and tended to increase with heat flux level and density. In terms of time to ignition and heat release rate, the fire performance of rigid polyurethane foam was influenced by the applied heat flux level and density, and the flashover propensity classified by Petrella's proposal was high.

Voice Features Extraction of Lung Diseases Based on the Analysis of Speech Rates and Intensity (발화속도 및 강도 분석에 기반한 폐질환의 음성적 특징 추출)

  • Kim, Bong-Hyun;Cho, Dong-Uk
    • The KIPS Transactions:PartB / v.16B no.6 / pp.471-478 / 2009
  • Lung diseases, classified as one of the six incurable diseases of modern times, are caused mostly by smoking and air pollution. These causes damage lung function and impair the exchange of carbon dioxide and oxygen in the alveoli, and interest in lung diseases is growing as they pose a risk to extended life expectancy. In this paper, we propose a method for diagnosing lung diseases that applies voice analysis parameters to extract voice features. First, we collected voice data from patients and from normal subjects of the same age and sex, forming two sample groups. We then analyzed the collected voice data using various voice analysis parameters. Among the analyzed parameters, speech rate and intensity were used to evaluate the significance of the relationship between the patient and normal groups. The patient group showed slower speech rates and greater intensity than the normal group. On this basis, we propose a method of voice feature extraction for lung diseases.

Comparison of Classification Performance Between Adult and Elderly Using Acoustic and Linguistic Features from Spontaneous Speech (자유대화의 음향적 특징 및 언어적 특징 기반의 성인과 노인 분류 성능 비교)

  • SeungHoon Han;Byung Ok Kang;Sunghee Dong
    • KIPS Transactions on Software and Data Engineering / v.12 no.8 / pp.365-370 / 2023
  • This paper compares the performance of classifying speech data into two groups, adult and elderly, based on acoustic and linguistic characteristics that change with aging, such as respiratory patterns, phonation, pitch, frequency, and language expression ability. For acoustic features, we used attributes related to the frequency, amplitude, and spectrum of the speech. For linguistic features, we extracted hidden state vector representations containing contextual information from the transcriptions of speech utterances using KoBERT, a Korean pre-trained language model that has shown excellent performance in natural language processing tasks. The classification performance of each model trained on acoustic or linguistic features was evaluated, and the F1 scores of each model for the two classes, adult and elderly, were examined after addressing the class imbalance problem by down-sampling. The experimental results showed that linguistic features classified adult and elderly speakers better than acoustic features, and that even when the class proportions were equal, classification performance for the adult class was higher than for the elderly class.
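
The evaluation procedure mentioned in this abstract (down-sampling to balance the classes, then a per-class F1 score) can be sketched in plain Python. The abstract does not specify the sampling details, so the `downsample` and `f1_for_class` helpers and the toy labels are assumptions made for illustration only.

```python
import random
from collections import defaultdict


def downsample(samples, labels, seed=0):
    """Randomly reduce every class to the size of the smallest class."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for sample, label in zip(samples, labels):
        by_class[label].append(sample)
    n_min = min(len(items) for items in by_class.values())
    balanced = []
    for label, items in by_class.items():
        balanced.extend((s, label) for s in rng.sample(items, n_min))
    rng.shuffle(balanced)
    return balanced


def f1_for_class(y_true, y_pred, cls):
    """F1 = 2TP / (2TP + FP + FN) for a single class."""
    tp = sum(t == cls and p == cls for t, p in zip(y_true, y_pred))
    fp = sum(t != cls and p == cls for t, p in zip(y_true, y_pred))
    fn = sum(t == cls and p != cls for t, p in zip(y_true, y_pred))
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Toy data: 7 adult and 3 elderly samples before balancing.
balanced = downsample(list(range(10)), ["adult"] * 7 + ["elderly"] * 3)

y_true = ["adult", "adult", "elderly", "elderly"]
y_pred = ["adult", "elderly", "elderly", "elderly"]
f1_adult = f1_for_class(y_true, y_pred, "adult")
f1_elderly = f1_for_class(y_true, y_pred, "elderly")
```

Reporting F1 separately for each class, as the abstract does, shows whether one group remains harder to classify even after the class sizes are equalized.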

Dialect classification based on the speed and the pause of speech utterances (발화 속도와 휴지 구간 길이를 사용한 방언 분류)

  • Jonghwan Na;Bowon Lee
    • Phonetics and Speech Sciences / v.15 no.2 / pp.43-51 / 2023
  • In this paper, we propose an approach for dialect classification based on the speed and pause of speech utterances as well as the age and gender of the speakers. Dialect classification is one of the important techniques for speech analysis. For example, an accurate dialect classification model can potentially improve the performance of speaker or speech recognition. According to previous studies, research based on deep learning using Mel-Frequency Cepstral Coefficients (MFCC) features has been the dominant approach. We focus on the acoustic differences between regions and conduct dialect classification based on the extracted features derived from the differences. In this paper, we propose an approach of extracting underexplored additional features, namely the speed and the pauses of speech utterances along with the metadata including the age and the gender of the speakers. Experimental results show that our proposed approach results in higher accuracy, especially with the speech rate feature, compared to the method only using the MFCC features. The accuracy improved from 91.02% to 97.02% compared to the previous method that only used MFCC features, by incorporating all the proposed features in this paper.
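
The speed and pause features named in this abstract can be illustrated with a small sketch. The abstract does not say how the features are computed, so everything here is an assumption: the input is taken to be time-aligned segments with syllable counts, speech rate is defined as syllables per second over the whole utterance, and a pause is any gap between segments of at least `min_pause` seconds.

```python
def speech_rate_and_pauses(segments, min_pause=0.2):
    """Return (syllables per second, mean pause length in seconds).

    segments: list of (start_s, end_s, n_syllables), sorted by start time.
    min_pause: hypothetical threshold below which a gap is not counted.
    """
    if not segments:
        raise ValueError("segments must not be empty")
    total_duration = segments[-1][1] - segments[0][0]
    n_syllables = sum(n for _, _, n in segments)
    # Gaps between the end of one segment and the start of the next.
    gaps = [nxt[0] - cur[1] for cur, nxt in zip(segments, segments[1:])]
    pauses = [g for g in gaps if g >= min_pause]
    rate = n_syllables / total_duration if total_duration > 0 else 0.0
    mean_pause = sum(pauses) / len(pauses) if pauses else 0.0
    return rate, mean_pause


# Hypothetical alignment: three segments with gaps of 0.5 s and about 0.1 s.
segments = [(0.0, 1.0, 4), (1.5, 2.5, 5), (2.6, 4.0, 6)]
rate, mean_pause = speech_rate_and_pauses(segments)
```

Features of this kind would be appended to the MFCC-based input alongside the age and gender metadata. How the paper actually combines them is not stated in the abstract.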