Search | Korea Science

A Study on the Multiple Pronunciation Dictionary for Spontaneous Speech Recognition (대화체 연속음성인식을 위한 확장 다중발음 사전에 관한 연구)

Kang ByungOk
- Proceedings of the KSPS conference / 2003.10a / pp.65-68 / 2003
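This paper extends the concept of the multiple pronunciation dictionary used in spontaneous continuous speech recognition so that it covers the irregular pronunciation variations that occur frequently in spontaneous utterances, and shows experimentally that applying this extended pronunciation dictionary improves recognition performance. Pronunciation variations that are frequent in spontaneous speech, such as phoneme contraction and deletion, typical mispronunciations, and the shift of bright (yang) vowels to dark (yin) vowels, lower the efficiency of the language model and increase the vocabulary size, which degrades recognition performance; they also prevent the recognition output from taking a standardized form. To address this, each variation is entered in the pronunciation dictionary as a variant pronunciation of its representative word, and the language model and vocabulary are built from the representative words only. The search module of the recognizer then searches the phone sequences of the variant pronunciations as well, but consults the language model and outputs the recognition result using the representative word, which improves recognition performance and yields a standardized output pattern. In this study, pronunciation variations were incorporated into pronunciation dictionaries at both the eojeol (word-phrase) level and the pseudo-morpheme [2] level. Experiments showed an error reduction rate (ERR) of 10.9% with the eojeol-level multiple pronunciation dictionary and 4.3% with the pseudo-morpheme-level dictionary.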
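The mapping from variant pronunciations to a representative word described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: the romanized words, the phone strings, and the helper names `build_variant_index` and `decode` are all hypothetical.

```python
# Hypothetical lexicon: representative word -> list of phone sequences.
# The first entry is the canonical pronunciation; the others stand in for
# variants (contraction, deletion, vowel shift). All entries are illustrative.
LEXICON = {
    "geurigo": ["g eu r i g o", "g eu r i g u", "g eu r i g"],
    "geureonde": ["g eu r eo n d e", "g eu n d e"],
}


def build_variant_index(lexicon):
    """Map every phone sequence back to its representative word."""
    index = {}
    for word, pronunciations in lexicon.items():
        for phones in pronunciations:
            index[phones] = word
    return index


def decode(phone_sequences, index):
    """Replace each recognized phone sequence with its representative word.

    Sequences that are not in the index are emitted as "<unk>".
    """
    return [index.get(phones, "<unk>") for phones in phone_sequences]


VARIANT_INDEX = build_variant_index(LEXICON)
```

In this sketch the vocabulary seen by a language model would be the keys of `LEXICON` only, so adding variants grows the pronunciation index without growing the vocabulary.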

A Study on Deep Learning Based RobotArm System (딥러닝 기반의 로봇팔 시스템 연구)

Shin, Jun-Ho;Shim, Gyu-Seok
- Proceedings of the Korea Information Processing Society Conference / 2020.11a / pp.901-904 / 2020
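This system is built by combining models in three stages. First, human speech is converted to text, and a natural language processing algorithm that understands human commands is built using a bag-of-words (BoW) method that classifies the intent of the user's utterance. Next, a real-time image processing model based on YOLOv3-tiny and an OctoMapping model are used to generate a 3D map of the surroundings, and a mechanism control algorithm operates on that map data. These components are organized under a manager system built with ROS actionlib, and the paper proposes a convenient human-robot interaction system that uses ROS and deep learning.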
https://doi.org/10.3745/PKIPS.y2020m11a.901
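The bag-of-words intent classification step mentioned in the abstract above might look something like the sketch below. The abstract does not give the label set, vocabulary, or scoring rule, so the intents, keywords, and the simple keyword-count scoring here are assumptions made only for illustration.

```python
from collections import Counter

# Hypothetical intents and keyword lists; the paper's actual labels and
# vocabulary are not stated in the abstract.
INTENT_KEYWORDS = {
    "pick": ["pick", "grab", "take"],
    "place": ["put", "place", "drop"],
    "stop": ["stop", "halt"],
}


def bag_of_words(text):
    """Count whitespace-separated, lowercased tokens; word order is ignored."""
    return Counter(text.lower().split())


def classify_intent(text):
    """Return the intent whose keywords occur most often, or "unknown"."""
    bow = bag_of_words(text)
    scores = {
        intent: sum(bow[word] for word in words)
        for intent, words in INTENT_KEYWORDS.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "unknown"
```

A trained classifier over BoW vectors would normally replace the keyword counts; the counting rule is used here only to keep the example self-contained.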

Frame Arguments Role Labeling for Event extraction in Dialogue (대화문에서의 이벤트 추출을 위한 프레임 논항 역할 분류기)

Heo, Cheolhun;Noh, Youngbin;Hahm, Younggyun;Choi, Key-Sun
- Annual Conference on Human and Language Technology / 2020.10a / pp.119-123 / 2020
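Event extraction is the task of analyzing structured events from text. To handle the wide variety of events that occur in dialogue, this paper adopts FrameNet as the event schema. In dialogue, an event's arguments can occur not only in the sentence where the event occurs but also in other sentences or among the speakers participating in the dialogue. Because annotated dialogue data is lacking, frame parsing on dialogue has not been studied. The proposed model identifies the role of an argument span when the event argument spans in a dialogue are given. It learns the relations among the event-triggering lexical unit, the argument span, and the argument role. To overcome the shortage of annotated dialogue data, transfer learning is performed using the Korean FrameNet, an annotated corpus of written text. The model achieves an accuracy of 51.21%.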

KE-T5-Based Text Emotion Classification in Korean Conversations (KE-T5 기반 한국어 대화 문장 감정 분류)

Lim, Yeongbeom;Kim, San;Jang, Jin Yea;Shin, Saim;Jung, Minyoung
- Annual Conference on Human and Language Technology / 2021.10a / pp.496-497 / 2021
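Emotion classification is an important key to distinguishing people's ways of thinking and behaving, and a wide range of research on emotion analysis has been carried out over the past several decades. As one way to raise the quality and accuracy of emotion classification, studies have proposed using multi-labeled rather than single-labeled datasets for emotion analysis. Using KE-T5, a T5 model trained on Korean and English corpora, this paper compares emotion classification performance on Korean utterance data with single labels and with multiple labels, and finds that the multi-label dataset yields 23.3% higher accuracy than the single-label dataset.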

A prototype of digital humans capable of emotionally using deep generative models (사전학습 기반 생성모델을 이용한 정서적 지지형 디지털 휴먼 프로토타입 구현)

Song, Chejung;Lee, Jee Hang
- Proceedings of the Korea Information Processing Society Conference / 2021.11a / pp.1005-1008 / 2021
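As the industrial and academic value of the metaverse grows, systems for interaction between real-world humans and digital humans in the metaverse are also drawing considerable attention. This paper introduces a digital human prototype that can provide emotional support in response to a human's utterances during human-digital human interaction. We adopt an open framework for building avatars that can generate motion according to the meaning of the dialogue, and integrate into it a dialogue system based on a deep dialogue generation model that builds on a pre-trained model and can provide emotional support. The result is an emotionally supportive digital human prototype that acts and converses according to the human's emotional state. With further refinement, this prototype could be extended to metaverse-based mental health care and digital therapeutics.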
https://doi.org/10.3745/PKIPS.y2021m11a.1005

Performance Improvement of Fast Speaker Adaptation Based on Dimensional Eigenvoice and Adaptation Mode Selection (차원별 Eigenvoice와 화자적응 모드 선택에 기반한 고속화자적응 성능 향상)

송화전;이윤근;김형순
- The Journal of the Acoustical Society of Korea / v.22 no.1 / pp.48-53 / 2003
The eigenvoice method is known to be well suited to fast speaker adaptation, but it shows little additional improvement as the amount of adaptation data increases. To address this problem, we propose a modified method that estimates the eigenvoice weights separately in each feature vector dimension. We also propose an adaptation mode selection scheme in which the best-performing of several adaptation methods is selected according to the amount of adaptation data. We used the POW DB to construct the speaker-independent model and eigenvoices; from the PBW 452 DB, 1 to 50 utterances were used for adaptation and the remaining 400 utterances for evaluation. As the amount of adaptation data increased, the proposed dimensional eigenvoice method outperformed both the conventional eigenvoice method and MLLR. Compared with the conventional eigenvoice method, adaptation mode selection between the eigenvoice and dimensional eigenvoice methods reduced the word error rate by up to 26%.

Analysis on Vowel and Consonant Sounds of Patient's Speech with Velopharyngeal Insufficiency (VPI) and Simulated Speech (구개인두부전증 환자와 모의 음성의 모음과 자음 분석)

Sung, Mee Young;Kim, Heejin;Kwon, Tack-Kyun;Sung, Myung-Whun;Kim, Wooil
- Journal of the Korea Institute of Information and Communication Engineering / v.18 no.7 / pp.1740-1748 / 2014
This paper presents a listening test and acoustic analysis of speech from patients with velopharyngeal insufficiency (VPI) and of simulated VPI speech produced by normal speakers. A set consisting of 50 words, vowels, and single syllables was defined for constructing the speech database. A web-based listening evaluation system was developed to make the evaluation procedure convenient and automated. The analysis shows that the patterns of incorrect recognition for VPI speech and for simulated speech are similar. This similarity is also confirmed by comparing vowel formant locations and consonant spectra. These results show that the simulation method is effective at generating speech signals similar to actual VPI patients' speech. The simulated speech data are expected to be useful in our future work, such as acoustic model adaptation.
https://doi.org/10.6109/jkiice.2014.18.7.1740

Implementation of Dynamic Context-Awareness Platform for Internet of Things(IoT) Loading Waste Fire-Prevention based on Universal Middleware (유니버설미들웨어기반의 IoT 적재폐기물 화재예방 동적 상황인지 플랫폼 구축)

Lee, Hae-Jun;Hwang, Chi-Gon;Yoon, Chang-Pyo
- Journal of the Korea Institute of Information and Communication Engineering / v.26 no.8 / pp.1231-1237 / 2022
A dynamic recognition system is needed that monitors, in real time, the loading height and pressure of loaded waste, the drying of wood, battery, and plastic wastes (representative waste components), and carbonization changes on the surface. The dynamic context-awareness service forms a platform based on a Universal Middleware system, using a BCN convergence communication service as an Ambient SDK model. A context-awareness system should be constructed to determine the cause of a fire from analysis data on fermentation heat points that lead to spontaneous ignition in the loaded waste. Furthermore, a real-time dynamic service platform that can be applied to configuring scenarios for each fire type, starting from early fire warning, should be built using Universal Middleware. Thus, an Internet of Things recognition platform for analyzing data on the possibility of low-temperature fires should be dynamically configured and presented.
https://doi.org/10.6109/jkiice.2022.26.8.1231

Performance Evaluation of Gypsum Board in Room Fires (석고보드의 실내화재 성능평가)

김충환;김종훈;김운형;하동명;이수경
- Proceedings of the Korean Institute of Industrial Safety Conference / 2000.06a / pp.190-195 / 2000
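In the United States, Europe, Japan, and elsewhere, new evaluation methods based on performance-based design for interior wall and ceiling finishing materials are being actively studied. For example, to reduce the time and cost of testing, research is under way on performance-based fire safety design of interior finishes, which predicts fire hazard from bench-scale test results. However, most zone model programs currently used to apply performance-based fire safety design do not adequately account for the ignition of combustible finishing materials, flame spread, and the associated hazards in compartment fires. (remainder omitted)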

Utterance Verification Using Anti-models Based on Neighborhood Information (이웃 정보에 기초한 반모델을 이용한 발화 검증)

Yun, Young-Sun
- MALSORI / no.67 / pp.79-102 / 2008
In this paper, we investigate the relation between the Bayes factor and likelihood ratio test (LRT) approaches and apply the neighborhood information of the Bayes factor to building an alternate hypothesis model for the LRT system. To make use of neighborhood information, we consider a distance measure between models and the algorithms to be applied. We also evaluate several methods for improving utterance verification performance using neighborhood information. Among these methods, the system that adopts anti-models built by collecting mixtures of neighborhood models achieves a maximum error rate reduction of 17% compared with the baseline and with the linear and weighted combinations of neighborhood models.
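A likelihood ratio test against an anti-model assembled from neighboring models can be sketched as below. This is only an illustration under simplifying assumptions: one-dimensional Gaussians stand in for acoustic models, the anti-model is an equally weighted mixture of the neighbor components, and the function names and the threshold of 0 are hypothetical rather than taken from the paper.

```python
import math


def log_gauss(x, mean, var):
    """Log density of a 1-D Gaussian with the given mean and variance."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)


def log_mixture(x, components):
    """Log-likelihood under an equally weighted Gaussian mixture.

    Uses log-sum-exp so that very small component likelihoods do not underflow.
    """
    logs = [log_gauss(x, mean, var) for mean, var in components]
    peak = max(logs)
    total = sum(math.exp(value - peak) for value in logs)
    return peak + math.log(total) - math.log(len(logs))


def verify(x, target, neighbors, threshold=0.0):
    """Return (log-likelihood ratio, accept flag) for one observation.

    target is a (mean, var) pair; neighbors is a list of (mean, var) pairs
    whose mixture serves as the anti-model (alternate hypothesis).
    """
    llr = log_gauss(x, *target) - log_mixture(x, neighbors)
    return llr, llr > threshold
```

An observation near the target mean gives a positive ratio and is accepted, while one near a neighbor's mean gives a negative ratio and is rejected.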

Search Result 205, Processing Time 0.035 seconds
