Multi-channel Long Short-Term Memory with Domain Knowledge for Context Awareness and User Intention

  • Received : 2021.01.07
  • Accepted : 2021.05.04
  • Published : 2021.10.31

Abstract

In context awareness and user intention tasks, dataset construction is expensive because specific domain data are required. Although pretraining with a large corpus can effectively resolve the issue of data scarcity, it ignores domain knowledge. Herein, we concentrate on the domain knowledge of the data while addressing data scarcity and accordingly propose a multi-channel long short-term memory (LSTM). Because the multi-channel LSTM integrates pretrained vectors that encode task-specific and general knowledge in separate channels, it effectively prevents catastrophic forgetting between the two types of knowledge and represents the context as a set of features. To evaluate the proposed model against the baseline, a single-channel LSTM, we performed two tasks: context-aware voice phishing detection and movie review sentiment classification. The results verified that the multi-channel LSTM outperforms the single-channel LSTM in both tasks. We further experimented with multi-channel LSTMs that vary in the domain and data size of the general knowledge and confirmed the effect of the multi-channel LSTM in integrating the two types of knowledge, learned from downstream task data and raw data, to overcome the lack of data.
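
To make the architecture concrete, the following is a minimal PyTorch sketch of a two-channel LSTM of the kind the abstract describes: one channel reads embeddings pretrained on the downstream task data (task knowledge), the other reads embeddings pretrained on a large raw corpus (general knowledge), and their final hidden states are concatenated into one feature set for classification. The class name, the dimensions, and the assumption that both channels index a single shared vocabulary are illustrative choices, not details taken from the paper.

import torch
import torch.nn as nn

class MultiChannelLSTM(nn.Module):
    def __init__(self, task_vectors, general_vectors, hidden_dim=128, num_classes=2):
        super().__init__()
        # Channel 1: embeddings pretrained on the downstream task data
        # (task knowledge); frozen so this knowledge is not overwritten.
        self.task_emb = nn.Embedding.from_pretrained(task_vectors, freeze=True)
        # Channel 2: embeddings pretrained on a large raw corpus
        # (general knowledge), likewise kept frozen.
        self.general_emb = nn.Embedding.from_pretrained(general_vectors, freeze=True)
        # One LSTM per channel; keeping the channels separate is what
        # keeps one kind of knowledge from catastrophically overwriting
        # the other during training on the downstream task.
        self.task_lstm = nn.LSTM(task_vectors.size(1), hidden_dim, batch_first=True)
        self.general_lstm = nn.LSTM(general_vectors.size(1), hidden_dim, batch_first=True)
        # The concatenated final hidden states represent the context as
        # a joint set of features drawn from both channels.
        self.classifier = nn.Linear(hidden_dim * 2, num_classes)

    def forward(self, token_ids):
        _, (h_task, _) = self.task_lstm(self.task_emb(token_ids))
        _, (h_gen, _) = self.general_lstm(self.general_emb(token_ids))
        features = torch.cat([h_task[-1], h_gen[-1]], dim=-1)
        return self.classifier(features)

# Toy usage: a vocabulary of 1,000 tokens with 100-dimensional vectors.
task_vecs = torch.randn(1000, 100)
general_vecs = torch.randn(1000, 100)
model = MultiChannelLSTM(task_vecs, general_vecs)
logits = model(torch.randint(0, 1000, (4, 20)))   # batch of 4 sequences, length 20
print(logits.shape)                               # torch.Size([4, 2])

A single-channel baseline, as used in the paper's comparison, would correspond to dropping one embedding/LSTM pair and halving the classifier input.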

Acknowledgments

This work was supported by the Ministry of the Republic of Korea and the National Research Foundation of Korea (No. NRF-2019S1A5A2A03046571).

References

  1. A. Gupta, P. Zhang, G. Lalwani, and M. Diab, "CASA-NLU: context-aware self-attentive natural language understanding for task-oriented chatbots," in Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), Hong Kong, 2019, pp. 1285-1290.
  2. H. W. Tung and V. W. Soo, "A personalized restaurant recommender agent for mobile e-service," in Proceedings of IEEE International Conference on e-Technology, e-Commerce and e-Service, Taipei, Taiwan, 2004, pp. 259-262.
  3. A. Pashtan, A. Heusser, and P. Scheuermann, "Personal service areas for mobile web applications," IEEE Internet Computing, vol. 8, no. 6, pp. 34-39, 2004. https://doi.org/10.1109/MIC.2004.69
  4. W. Wang, S. Hosseini, A. H. Awadallah, P. N. Bennett, and C. Quirk, "Context-aware intent identification in email conversations," in Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval, Paris, France, 2019, pp. 585-594.
  5. C. B. Shim, Y. W. Shin, and B. R. Park, "An Implementation of Context-Awareness Support System based on voice Service for Medical Environments," Journal of the Korea Society of Computer and Information, vol. 10, no. 4, pp. 29-36, 2005.
  6. D. B. Cho, H. Y. Lee, J. H. Park, and S. S. Kang, "Automatic bias classification of political news articles by using morpheme embedding and SVM," in Proceedings of the Korea Information Processing Society Conference, Virtual events, 2020, pp. 451-454.
  7. X. Chen and C. Cardie, "Multinomial adversarial networks for multi-domain text classification," in Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-HLT), New Orleans, LA, 2018, pp. 1226-1240.
  8. P. Wang and C. Domeniconi, "Building semantic kernels for text classification using Wikipedia," in Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Las Vegas, NV, 2008, pp. 713-721.
  9. D. Cho, H. Lee, and S. Kang, "Voice phishing with context-awareness using large corpus," in Proceedings of the Korea Software Congress, Pyeongchang, Korea, 2020, pp. 310-312.
  10. Y. Kim, "Convolutional neural networks for sentence classification," in Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), Doha, Qatar, 2014, pp. 1746-1751.
  11. M. Joshi, D. Chen, Y. Liu, D. S. Weld, L. Zettlemoyer, and O. Levy, "SpanBERT: improving pre-training by representing and predicting spans," Transactions of the Association for Computational Linguistics, vol. 8, pp. 64-77, 2020. https://doi.org/10.1162/tacl_a_00300
  12. S. Li and C. Zong, "Multi-domain adaptation for sentiment classification: using multiple classifier combining methods," in Proceedings of 2008 International Conference on Natural Language Processing and Knowledge Engineering, Beijing, China, 2008, pp. 1-8.
  13. X. Glorot, A. Bordes, and Y. Bengio, "Domain adaptation for large-scale sentiment classification: a deep learning approach," in Proceedings of the 28th International Conference on Machine Learning (ICML), Bellevue, WA, 2011, pp. 513-520.
  14. H. E. Kim, S. Kim, and J. Lee, "Keep and learn: continual learning by constraining the latent space for knowledge preservation in neural networks," in Medical Image Computing and Computer-Assisted Intervention - MICCAI 2018. Cham, Switzerland: Springer, 2018, pp. 520-528.
  15. M. Chen, K. Q. Weinberger, and J. Blitzer, "Co-training for domain adaptation," Advances in Neural Information Processing Systems, vol. 24, pp. 2456-2464, 2011.
  16. K. Nigam and R. Ghani, "Analyzing the effectiveness and applicability of co-training," in Proceedings of the 9th International Conference on Information and Knowledge Management, McLean, VA, 2000, pp. 86-93.
  17. Y. Wang, M. Huang, X. Zhu, and L. Zhao, "Attention-based LSTM for aspect-level sentiment classification," in Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, Austin, TX, 2016, pp. 606-615.
  18. D. Tang, F. Wei, N. Yang, M. Zhou, T. Liu, and B. Qin, "Learning sentiment-specific word embedding for twitter sentiment classification," in Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), Baltimore, MD, 2014, pp. 1555-1565.
  19. T. Mikolov, I. Sutskever, K. Chen, G. S. Corrado, and J. Dean, "Distributed representations of words and phrases and their compositionality," Advances in Neural Information Processing Systems, vol. 26, pp. 3111-3119, 2013.
  20. T. Mikolov, K. Chen, G. Corrado, and J. Dean, "Efficient estimation of word representations in vector space," in Proceedings of the 1st International Conference on Learning Representations (ICLR), Scottsdale, AZ, 2013.
  21. D. Lee, Y. Lim, and T. T. Kwon, "Morpheme-based efficient Korean word embedding," Journal of KIISE, vol. 45, no. 5, pp. 444-450, 2018. https://doi.org/10.5626/jok.2018.45.5.444
  22. A. Radford, J. Wu, R. Child, D. Luan, D. Amodei, and I. Sutskever, "Language models are unsupervised multitask learners," 2019 [Online]. Available: https://d4mucfpksywv.cloudfront.net/better-language-models/language-models.pdf.