Search | Korea Science

Context-aware and controllable natural language generation model for task-oriented dialogue systems (목적 지향 대화 시스템을 위한 문맥 기반의 제어 가능한 자연어 생성 모델 )

Jina Ham;Jaewon Kim;Dongil Yang
- Annual Conference on Human and Language Technology / 2022.10a / pp.71-76 / 2022
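A task-oriented dialogue system is one that users employ to achieve a specific goal, and unlike open-domain conversation, it is important that the system conveys information clearly. Recent natural language generation models for task-oriented dialogue systems therefore use dialog acts, which carry intent and slot information, so that they can generate appropriate responses according to a predefined dialogue policy. However, while dialog acts control the generated sentences very well, they make it difficult to generate varied sentences that fit the flow and situation of the dialogue. To address this problem, this paper proposes a context-aware, controllable natural language generation model that uses dialog acts to express goal-relevant content clearly in natural language while also taking context into account, as open-domain dialogue generation models do, to generate natural sentences that fit the dialogue flow. The experiments used the KoGPT2 pre-trained model and a Korean dialogue dataset, and compared a dialog-act-based natural language generation model with the proposed context-aware controllable model. The results confirm that a model trained on dialog acts together with a certain amount of context shows a meaningful improvement in BLEU score over a model trained on dialog acts alone.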

The design of Plan based dialogue system in Task execution domain (작업수행영역에서 계획에 기반한 대화 시스템의 설계)

오종건;서정연
- Proceedings of the Korean Information Science Society Conference / 2000.04b / pp.450-452 / 2000
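A dialogue system is a program that exchanges information with humans or performs tasks using natural language. Because natural language is an easy and efficient interface for humans, the need for dialogue systems that use it is growing. This paper designs a system that generates utterances using a plan-based dialogue model, which until now has been studied mainly with a focus on recognition. The dialogue system designed in this paper is a cooperative dialogue system that not only responds to user queries but can also actively perform its own actions. With dialogue efficiency in mind, it also actively provides the information the user needs. To produce utterances that take dialogue efficiency into account, this paper defines new system actions and presents examples of dialogues that are actually possible.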

Gesture Recognition in Multiple People Environment (복수 등장인물을 대상으로 한 제스처 인식)

Hong, Seok-Ju;Setiawan, Nurul Arif;Kim, Song-Gook;Kim, Jang-Woon;Lee, Chil-Woo
- 한국HCI학회:학술대회논문집 / 2007.02a / pp.891-896 / 2007
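Gesture recognition research to date has assumed a static environment with a single person. This paper describes a method that selects a conversation partner and recognizes gestures in an environment where multiple people are present. First, the actor regions, excluding the background, are extracted from the scene containing multiple people. Then, while each actor is tracked, the actor closest to the camera is selected as the conversation partner. Once the conversation partner is selected, the face and both hands are taken as feature regions using silhouette images extracted from the stereo camera input, and each region is tracked in every frame with a Kalman filter. The 2D coordinates of the tracked feature regions are compared with the 2D coordinates of the model gestures, and the model gesture with the highest similarity is recognized as the input gesture. The method can effectively select the target actor and recognize gestures when multiple people are present. It also has the advantage of low computational cost because it uses simple queue matching (큐 매칭) for gesture recognition. Experiments demonstrate that applying the proposed method makes gesture recognition possible in environments with multiple people.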
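The selection and matching steps described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `depth` field, the single-region trajectories, the gesture names, and the similarity measure (negative mean Euclidean distance) are all assumptions made for illustration, and the Kalman-filter tracking stage is omitted.

```python
import math

def nearest_actor(actors):
    """Pick the actor closest to the camera (smallest assumed depth value)."""
    return min(actors, key=lambda a: a["depth"])

def similarity(track, model):
    """Assumed similarity: negative mean Euclidean distance between
    two equal-length sequences of 2D coordinates."""
    dists = [math.dist(p, q) for p, q in zip(track, model)]
    return -sum(dists) / len(dists)

def recognize(track, model_gestures):
    """Return the name of the model gesture most similar to the track."""
    return max(model_gestures,
               key=lambda name: similarity(track, model_gestures[name]))

# Hypothetical data: two tracked actors and two model gestures.
actors = [
    {"id": "A", "depth": 2.4, "track": [(0, 0), (1, 1), (2, 2)]},
    {"id": "B", "depth": 1.1, "track": [(0, 0), (1, 0), (2, 0)]},
]
models = {
    "swipe_right": [(0, 0), (1, 0), (2, 0)],
    "raise_diagonal": [(0, 0), (1, 1), (2, 2)],
}

partner = nearest_actor(actors)                 # actor B is nearest
gesture = recognize(partner["track"], models)   # best-matching model gesture
print(partner["id"], gesture)
```

In a fuller version, each frame would carry three feature regions (face and both hands) and the tracks would come from the tracker rather than being hard-coded.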

Developing an Adaptive Dialogue System Using External Information (외부 상황 정보를 활용하는 적응적 대화 모델의 구현)

Jang, Jin Yea;Jung, Minyoung;Park, Hanmu;Shin, Saim
- Annual Conference on Human and Language Technology / 2019.10a / pp.456-459 / 2019
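The act of conversing is more than a simple exchange of utterances: it can be seen as the result of a comprehensive judgment that takes into account various surrounding information about the speakers. This paper introduces a deep-learning-based dialogue model that generates adaptive utterances based on six types of external situational information. Using self-constructed dialogue data tagged with situational information, it compares the performance of models with various neural network architectures that use external situational information together with the user utterance against a model that does not use external situational information. The experimental results show the importance of using situational information in the dialogue model's utterance generation.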

Prediction of Domain Action Using a Neural Network (신경망을 이용한 영역 행위 예측)

Lee, Hyun-Jung;Seo, Jung-Yun;Kim, Hark-Soo
- Korean Journal of Cognitive Science / v.18 no.2 / pp.179-191 / 2007
In a goal-oriented dialogue, speakers' intentions can be represented by domain actions that consist of pairs of a speech act and a concept sequence. Predicting the domain action of a user's utterance is useful for correcting errors that occur during speech recognition, and predicting the domain action of a system's utterance is useful for generating flexible responses. In this paper, we propose a model that predicts the domain action of the next utterance using a neural network. The proposed model predicts the next domain action by using a dialogue history vector and the current domain action as inputs to the neural network. In the experiment, the proposed model achieved a precision of 80.02% in speech act prediction and a precision of 82.09% in concept sequence prediction.

Resolution of Anaphoric Noun Phrases using a Centering Algorithm with a Dual Cache Model in a Multimodal Dialogue System (다중모드 대화 시스템에서 이중 캐시 모델의 센터링 알고리즘을 이용한 명사 대용어구 처리)

Kim, Hak-Su;Seo, Jeong-Yeon
- Journal of KIISE:Software and Applications / v.27 no.11 / pp.1133-1140 / 2000
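Anaphora in multimodal dialogue takes very different forms and has very different characteristics from anaphora in language-only dialogue, because actions and visual information can serve as referring acts. This paper examines methods for handling the various anaphora that appear in a multimodal dialogue system for a home-shopping furniture store domain using a touchscreen interface. First, screen anaphora and referential anaphora are defined to classify the various forms of anaphora. Then two general methods for handling each type are proposed. One is a simple mapping algorithm that handles anaphora referring to items currently shown on the screen, uttered with or without an accompanying pointing action. The other is a centering algorithm with a dual cache structure that extends Walker's centering algorithm for multimodal dialogue systems. Because the extended centering algorithm can maintain utterance and visual information as well as screen transition times, it is well suited to handling the various anaphora that occur in multimodal dialogue. In the experiment, the proposed system resolved 387 of the 402 anaphora (0.54 per utterance) that appeared in 40 dialogues, an accuracy of 96.3%.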

A Domain Action Classification Model Using Conditional Random Fields (Conditional Random Fields를 이용한 영역 행위 분류 모델)

Kim, Hark-Soo
- Korean Journal of Cognitive Science / v.18 no.1 / pp.1-14 / 2007
In a goal-oriented dialogue, speakers' intentions can be represented by domain actions that consist of pairs of a speech act and a concept sequence. Therefore, to implement an intelligent dialogue system, it is very important to correctly infer the domain actions from surface utterances. In this paper, we propose a statistical model that determines speech acts and concept sequences at the same time using conditional random fields. To avoid biased learning problems, the proposed model uses low-level linguistic features such as lexical items and parts of speech. It then filters out uninformative features using the chi-square statistic. In experiments in a schedule arrangement domain, the proposed system performed well, with a precision of 93.0% on speech act classification and a precision of 90.2% on concept sequence classification.
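The chi-square filtering step mentioned in this abstract can be sketched as follows. This is a minimal illustration, not the paper's procedure: it uses the standard 2x2 contingency form of the chi-square statistic, and the toy utterances, label names, threshold, and the choice to keep a feature when its best score over all labels clears the threshold are assumptions.

```python
def chi_square(a, b, c, d):
    """Chi-square statistic for a 2x2 contingency table.
    a: samples with the feature and the label
    b: samples with the feature, without the label
    c: samples without the feature, with the label
    d: samples without the feature, without the label"""
    n = a + b + c + d
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    return 0.0 if denom == 0 else n * (a * d - c * b) ** 2 / denom

def select_features(samples, threshold):
    """Keep features whose highest chi-square score over all labels
    reaches the (assumed) threshold. samples: list of (feature_set, label)."""
    labels = {label for _, label in samples}
    vocab = set().union(*(feats for feats, _ in samples))
    kept = set()
    for f in vocab:
        best = 0.0
        for lab in labels:
            a = sum(1 for feats, l in samples if f in feats and l == lab)
            b = sum(1 for feats, l in samples if f in feats and l != lab)
            c = sum(1 for feats, l in samples if f not in feats and l == lab)
            d = sum(1 for feats, l in samples if f not in feats and l != lab)
            best = max(best, chi_square(a, b, c, d))
        if best >= threshold:
            kept.add(f)
    return kept

# Hypothetical lexical features and speech-act labels.
samples = [
    ({"when", "meeting", "the"}, "ask-ref"),
    ({"when", "lunch", "the"}, "ask-ref"),
    ({"cancel", "meeting", "the"}, "request"),
    ({"cancel", "lunch", "the"}, "request"),
]
kept = select_features(samples, threshold=2.0)
print(sorted(kept))
```

Features that occur equally often under every label (such as "the" here) score zero and are dropped, which is the sense in which the filter removes uninformative features.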

Examination of a Voice Interaction Model for Smart TV through Conversation Patterns (대화 패턴 연구를 통한 스마트TV 음성 상호작용 모델의 탐구)

Choi, Jinhae
- The Journal of the Korea Contents Association / v.17 no.2 / pp.96-104 / 2017
As new smart devices evolve into intelligent agents that can reflect user intention and usage context, user experience design for easy and convenient use becomes a core competitive edge. Under the assumption that human-centered natural interaction is necessary for an optimal smart TV experience, this study explores the types of voice interaction that are specific to the TV-watching context. To build a model that lets users interact naturally with a Smart TV, conversation patterns were collected by requesting key Smart TV features from an intelligent agent. The collected sentences were applied to the CfA model and classified by the responses needed to activate features. The classified conversation patterns were divided into feature activation and information search. This study found that CfC1 occurred when voice interaction between the Smart TV and users was vague, and CfC2 occurred when the requests were complex or conditional. In conclusion, the Simple Request Type is the most efficient model, and voice interaction is better suited to clarifying users' vague requests.
https://doi.org/10.5392/JKCA.2017.17.02.096

Plan-based Ellipsis Resolution for Utterances in Noun-Phrase-Form in Restricted Domain Dialogues (제한된 영역의 대화에서 체언구 형태의 발화 이해를 위한 계획기반 생략 처리)

윤철진;서정연
- Korean Journal of Cognitive Science / v.11 no.1 / pp.81-92 / 2000
Elliptical fragments are common in natural language dialogues between humans. Since most elliptical fragments should be interpreted within the context, it is not easy for computers to recognize the speaker's intention from the elliptical fragments. In this paper we propose a model to recognize the speaker's intention from elliptical fragments in Korean by expanding the tripartite plan-based model proposed by Lambert. We add new discourse recipes to define the user's discourse actions through elliptical fragments. In order to use the plan inference process, we must represent utterances as actions; e.g., elliptical fragments are represented as surface speech acts. In the surface speech act representation, we include the information of 'Josa' (case markers in Korean), because the information of 'Josa' plays a very important role in analysing speakers' intentions in Korean. Finally, by using an object and discourse focus theory, the system can recognize the intention that a user is trying to compare two plans by uttering elliptical fragments.

CNN Based Speech-act Classification Using Sentence Types and Modalities (문장 유형과 양태 정보를 이용한 합성곱 신경망 기반의 대화체 발화 화행 분석)

Park, Yongsin;Ko, Youngjoong
- Annual Conference on Human and Language Technology / 2018.10a / pp.642-644 / 2018
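A speech act is an action a speaker performs through an utterance to achieve some goal, and speech-act analysis is the task of determining the speech act of a given utterance. Sentence type and modality are kinds of speech act: sentence types can be divided into five categories according to the speaker's basic communicative intent (declarative, imperative, propositive, interrogative, and exclamatory), and modality refers to the opinion or attitude the speaker holds toward the proposition a sentence expresses or the situation the proposition describes. This paper shows how to improve speech-act analysis of conversational utterances by using sentence type and modality information, which can be extracted relatively simply from sentence-final endings and auxiliary predicates. The proposed model showed an improvement of 0.52 percentage points over a baseline model using a convolutional neural network (CNN).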

Search Result 22, Processing Time 0.022 seconds
