http://dx.doi.org/10.7472/jksii.2021.22.6.141

Multi-source information integration framework using self-supervised learning-based language model  

Kim, Hanmin (Dept. of Industrial Engineering, Sungkyunkwan University)
Lee, Jeongbin (Dept. of Industrial Engineering, Sungkyunkwan University)
Park, Gyudong (Agency for Defense Development)
Sohn, Mye (Dept. of Industrial Engineering, Sungkyunkwan University)
Publication Information
Journal of Internet Computing and Services, v.22, no.6, 2021, pp. 141-150
Abstract
Artificial intelligence (AI)-enabled warfare is expected to become the main issue in future warfare. Natural language processing (NLP) is a core AI technology, and it can significantly reduce the burden on commanders and staff of understanding reports, information objects, and intelligence written in natural language. In this paper, we propose a Language model-based Multi-source Information Integration (LAMII) framework to reduce the information overload of commanders and to support rapid decision-making. The proposed LAMII framework consists of two key steps: representation learning based on language models trained in a self-supervised way, and document integration using autoencoders. In the first step, representation learning that can identify the similarity relationship between two heterogeneous sentences is performed using a self-supervised learning technique. In the second step, the learned model is used to find and integrate documents from multiple sources that convey similar contents or topics. At this stage, an autoencoder is used to measure the information redundancy of sentences so that duplicate sentences can be removed. To demonstrate the superiority of the proposed framework, we conducted comparison experiments against existing language models on the benchmark sets used to evaluate their performance. The experimental results show that the proposed LAMII framework predicts the similarity relationship between heterogeneous sentences more effectively than the other language models.
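The following is a minimal sketch of the two LAMII stages described above, written in PyTorch. It is illustrative only: the abstract does not publish the authors' exact architecture, so the InfoNCE-style contrastive objective, the embedding dimension (768, typical of BERT-style encoders), and all names (`contrastive_loss`, `SentenceAutoencoder`, `integrate`, `sim_threshold`) are assumptions, not the paper's implementation.

```python
# Hypothetical sketch of the two LAMII stages; all design choices here are
# assumptions made for illustration, not the authors' published method.
import torch
import torch.nn as nn
import torch.nn.functional as F

def contrastive_loss(z1, z2, temperature=0.07):
    """Step 1 (self-supervised representation learning): pull embeddings of
    paired similar sentences together and push apart the other sentences in
    the batch. An InfoNCE-style objective is assumed here."""
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    logits = z1 @ z2.T / temperature           # (B, B) pairwise similarities
    labels = torch.arange(z1.size(0))          # positives lie on the diagonal
    return F.cross_entropy(logits, labels)

class SentenceAutoencoder(nn.Module):
    """Step 2 helper: compress sentence embeddings into a low-dimensional
    code used to measure information redundancy between sentences."""
    def __init__(self, dim=768, code_dim=128):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(dim, 256), nn.ReLU(),
                                 nn.Linear(256, code_dim))
        self.dec = nn.Sequential(nn.Linear(code_dim, 256), nn.ReLU(),
                                 nn.Linear(256, dim))
    def forward(self, x):
        z = self.enc(x)
        return self.dec(z), z

def integrate(sent_embeddings, ae, sim_threshold=0.9):
    """Step 2 (document integration): keep a sentence only if its autoencoder
    code is not a near-duplicate (cosine similarity above the threshold) of
    the code of a sentence already kept."""
    kept, codes = [], []
    with torch.no_grad():
        _, z = ae(sent_embeddings)
        z = F.normalize(z, dim=1)
    for i in range(z.size(0)):
        if all(float(z[i] @ c) < sim_threshold for c in codes):
            kept.append(i)
            codes.append(z[i])
    return kept   # indices of non-redundant sentences to merge

# Toy usage: random vectors stand in for real sentence embeddings.
if __name__ == "__main__":
    emb = torch.randn(8, 768)
    ae = SentenceAutoencoder()
    print("kept sentences:", integrate(emb, ae))
```

In this sketch the redundancy test runs in the autoencoder's code space rather than on raw embeddings, mirroring the paper's use of an autoencoder to measure information redundancy before duplicate sentences are dropped.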
Keywords
Self-supervised learning; Language model; Similar relationship between sentences; Multi-source information integration