Search | Korea Science

Kim, Sihyung;Lee, Hyeon-gu;Kim, Harksoo
- Journal of KIISE
- /
- v.45 no.2
- /
- pp.134-140
- /
- 2018
A chat system is a computer program that understands user's miscellaneous utterances and generates appropriate responses. Sometimes a chat system needs to answer users' simple information-seeking questions. However, previous generative chat systems do not consider how to embed knowledge entities (i.e., subjects and objects in triple knowledge), essential elements for question-answering. The previous chat models have a disadvantage that they generate same responses although knowledge entities in users' utterances are changed. To alleviate this problem, we propose a knowledge entity embedding method for improving question-answering accuracies of a generative chat system. The proposed method uses a Siamese recurrent neural network for embedding knowledge entities and their synonyms. For experiments, we implemented a sequence-to-sequence model in which subjects and predicates are encoded and objects are decoded. The proposed embedding method showed 12.48% higher accuracies than the conventional embedding method based on a convolutional neural network.
https://doi.org/10.5626/JOK.2018.45.2.134 인용 KSCI

Lim, Geun-Young;Cho, Young-Bok
- Journal of the Korea Institute of Information and Communication Engineering
- /
- v.22 no.10
- /
- pp.1300-1306
- /
- 2018
The purpose of this study is to see possibility of Char2Vec as alternative of Word2Vec that most famous word embedding model in Sentence Similarity Measure Problem by Deep-Learning. In experiment, we used the Siamese Ma-LSTM recurrent neural network architecture for measure similarity two random sentences. Siamese Ma-LSTM model was implemented with tensorflow. We train each model with 200 epoch on gpu environment and it took about 20 hours. Then we compared Word2Vec based model training result with Char2Vec based model training result. as a result, model of based with Char2Vec that initialized random weight record 75.1% validation dataset accuracy and model of based with Word2Vec that pretrained with 3 million words and phrase record 71.6% validation dataset accuracy. so Char2Vec is suitable alternate of Word2Vec to optimize high system memory requirements problem.
https://doi.org/10.6109/jkiice.2018.22.10.1300 인용 PDF KSCI