http://dx.doi.org/10.7472/jksii.2021.22.6.141

Multi-source information integration framework using self-supervised learning-based language model  

Kim, Hanmin (Dept. of Industrial Engineering, Sungkyunkwan University)
Lee, Jeongbin (Dept. of Industrial Engineering, Sungkyunkwan University)
Park, Gyudong (Agency for Defense Development)
Sohn, Mye (Dept. of Industrial Engineering, Sungkyunkwan University)
Publication Information
Journal of Internet Computing and Services, v.22, no.6, 2021, pp. 141-150
Abstract
Artificial intelligence (AI)-enabled warfare is expected to become the main issue in future warfare. Natural language processing (NLP) is a core AI technology, and it can significantly reduce the burden on commanders and staff of understanding reports, information objects, and intelligence written in natural language. In this paper, we propose a Language model-based Multi-source Information Integration (LAMII) framework to reduce the information overload of commanders and to support rapid decision-making. The proposed LAMII framework consists of two key steps: representation learning based on language models trained in a self-supervised way, and document integration using autoencoders. In the first step, representation learning that can identify the similarity relationship between two heterogeneous sentences is performed using a self-supervised learning technique. In the second step, the learned model is used to find and integrate documents from multiple sources that convey similar contents or topics. At this stage, an autoencoder is used to measure the information redundancy of sentences so that duplicate sentences can be removed. To demonstrate the superiority of the proposed framework, we conducted comparison experiments against existing language models on the benchmark sets used to evaluate their performance. The experimental results show that the proposed LAMII framework predicts the similarity relationship between heterogeneous sentences more effectively than the other language models.
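The following is a minimal sketch of the two LAMII stages described above, written in PyTorch. It is illustrative only: the abstract does not publish the authors' exact architecture, so the InfoNCE-style contrastive objective, the embedding dimension (768, typical of BERT-style encoders), and all names (`contrastive_loss`, `SentenceAutoencoder`, `integrate`, `sim_threshold`) are assumptions, not the paper's implementation.

```python
# Hypothetical sketch of the two LAMII stages; all design choices here are
# assumptions made for illustration, not the authors' published method.
import torch
import torch.nn as nn
import torch.nn.functional as F

def contrastive_loss(z1, z2, temperature=0.07):
    """Step 1 (self-supervised representation learning): pull embeddings of
    paired similar sentences together and push apart the other sentences in
    the batch. An InfoNCE-style objective is assumed here."""
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    logits = z1 @ z2.T / temperature           # (B, B) pairwise similarities
    labels = torch.arange(z1.size(0))          # positives lie on the diagonal
    return F.cross_entropy(logits, labels)

class SentenceAutoencoder(nn.Module):
    """Step 2 helper: compress sentence embeddings into a low-dimensional
    code used to measure information redundancy between sentences."""
    def __init__(self, dim=768, code_dim=128):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(dim, 256), nn.ReLU(),
                                 nn.Linear(256, code_dim))
        self.dec = nn.Sequential(nn.Linear(code_dim, 256), nn.ReLU(),
                                 nn.Linear(256, dim))
    def forward(self, x):
        z = self.enc(x)
        return self.dec(z), z

def integrate(sent_embeddings, ae, sim_threshold=0.9):
    """Step 2 (document integration): keep a sentence only if its autoencoder
    code is not a near-duplicate (cosine similarity above the threshold) of
    the code of a sentence already kept."""
    kept, codes = [], []
    with torch.no_grad():
        _, z = ae(sent_embeddings)
        z = F.normalize(z, dim=1)
    for i in range(z.size(0)):
        if all(float(z[i] @ c) < sim_threshold for c in codes):
            kept.append(i)
            codes.append(z[i])
    return kept   # indices of non-redundant sentences to merge

# Toy usage: random vectors stand in for real sentence embeddings.
if __name__ == "__main__":
    emb = torch.randn(8, 768)
    ae = SentenceAutoencoder()
    print("kept sentences:", integrate(emb, ae))
```

In this sketch the redundancy test runs in the autoencoder's code space rather than on raw embeddings, mirroring the paper's use of an autoencoder to measure information redundancy before duplicate sentences are dropped.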
Keywords
Self-supervised learning; Language model; Similar relationship between sentences; Multi-source information integration