http://dx.doi.org/10.3745/KTSDE.2019.8.5.213

A Semi-Automatic Semantic Mark Tagging System for Building Dialogue Corpus  

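Park, Junhyeok (Department of Computer and Information Engineering, Korea National University of Transportation)
Lee, Songwook (Computer and Information Engineering Major, Korea National University of Transportation)
Lim, Yoonseob (Intelligent Robotics Research Group / Dementia DTC Convergence Research Group, Korea Institute of Science and Technology)
Choi, Jongsuk (HCI and Robotics Major, University of Science and Technology (UST))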
Publication Information
KIPS Transactions on Software and Data Engineering / v.8, no.5, 2019, pp. 213-222
Abstract
Determining the meaning of a keyword in a speech dialogue system is an important technology for the future implementation of an intelligent speech dialogue interface. After keywords are extracted from a user's utterance to grasp its intention, the intention of the utterance is determined using the semantic marks of those keywords. One keyword can have several semantic marks, and we treat the task of attaching the semantic mark that matches the user's intention to each such keyword as a word sense disambiguation problem. In this study, about 23% of all keywords in the corpus are manually tagged to build a semantic mark dictionary, a synonym dictionary, and a context vector dictionary, and the remaining 77% of the keywords are then tagged automatically. The semantic mark of a keyword is determined by calculating context vector similarity against the context vector dictionary. For an unregistered keyword, the semantic mark of the most similar keyword is attached using the synonym dictionary. We compare the performance of the system trained on the manually constructed training set with that of the system trained on the semi-automatically expanded training set, using 3 high-frequency keywords and 3 low-frequency keywords selected from the corpus. In the experiments, we obtained an accuracy of 54.4% with the manually constructed training set and 50.0% with the semi-automatically expanded training set.
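The tagging step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how context vectors are built or how the synonym dictionary is stored, so the sketch assumes bag-of-words count vectors, cosine similarity, and a synonym dictionary that maps an unregistered keyword to scored registered keywords. All names and data (`CONTEXT_VECTORS`, `SYNONYMS`, `tag`, the English toy entries) are hypothetical.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two sparse count vectors (dict-like)."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Hypothetical context vector dictionary: keyword -> semantic mark -> context vector.
# In the paper this would be built from the manually tagged part of the corpus.
CONTEXT_VECTORS = {
    "bank": {
        "FINANCE": Counter({"money": 3, "account": 2, "loan": 1}),
        "RIVER": Counter({"water": 2, "fishing": 1, "walk": 1}),
    },
}

# Hypothetical synonym dictionary: unregistered keyword -> [(keyword, similarity)].
SYNONYMS = {"lender": [("bank", 0.8), ("shore", 0.1)]}


def tag(keyword, context_words):
    """Return a semantic mark for keyword, or None if nothing applies."""
    if keyword not in CONTEXT_VECTORS:
        # Assumed fallback: switch to the most similar registered keyword.
        candidates = [(k, s) for k, s in SYNONYMS.get(keyword, [])
                      if k in CONTEXT_VECTORS]
        if not candidates:
            return None
        keyword = max(candidates, key=lambda pair: pair[1])[0]
    context = Counter(context_words)
    senses = CONTEXT_VECTORS[keyword]
    # Pick the mark whose stored context vector is closest to the utterance.
    return max(senses, key=lambda mark: cosine(context, senses[mark]))


print(tag("bank", ["open", "account", "money"]))
print(tag("lender", ["loan", "money"]))
```

When every similarity is zero, this sketch simply returns the first semantic mark listed for the keyword; a real system would probably need an explicit default or a rejection threshold.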
Keywords
Dialogue Corpus; Semantic Mark Tagging; Context Vector Similarity