
http://dx.doi.org/10.3837/tiis.2022.05.004

MalDC: Malicious Software Detection and Classification using Machine Learning

Moon, Jaewoong (Sejong University)
Kim, Subin (Sejong University)
Park, Jangyong (Sejong University)
Lee, Jieun (Sejong University)
Kim, Kyungshin (Convergence Technology Collaboration Directorate, Agency for Defense Development)
Song, Jaeseung (Sejong University)

Publication Information

KSII Transactions on Internet and Information Systems (TIIS) / v.16, no.5, 2022, pp. 1466-1488

Abstract

Recently, the importance and necessity of artificial intelligence (AI), especially machine learning, have been emphasized. Studies are actively underway to solve complex and challenging problems through the use of AI systems, such as intelligent CCTVs, intelligent AI security systems, and AI surgical robots. Information security, which involves analyzing and responding to software security vulnerabilities, is no exception and is recognized as one of the fields in which significant results are expected when AI is applied. This is because the frequency of malware incidents is gradually increasing, while the available security technologies are limited by their reliance on software security experts or source code analysis tools. We conducted a study on MalDC, a machine learning technique that converts malware into images. MalDC showed good performance and was able to analyze and classify different types of malware. MalDC applies a preprocessing step to minimize the noise generated in the image conversion process and employs an image augmentation technique to compensate for an insufficient dataset, thus improving the accuracy of the malware classification. To verify the feasibility of our method, we tested the malware classification technique used by MalDC on a dataset provided by Microsoft and on malware data collected by the Korea Internet & Security Agency (KISA). The method achieved an accuracy of 97%.
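
The image conversion and augmentation steps mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the image layout, the preprocessing, or the augmentation operations, so the fixed row width, the zero padding, and the horizontal flip below are assumptions chosen for illustration, and the function names `bytes_to_grayscale` and `flip_horizontal` are hypothetical.

```python
from typing import List

Image = List[List[int]]  # rows of grayscale pixel values in 0..255


def bytes_to_grayscale(data: bytes, width: int = 16) -> Image:
    """Map each byte of a binary to one grayscale pixel, row by row.

    Assumption: a fixed row width and zero padding of the last row;
    the paper's actual layout and preprocessing are not given here.
    """
    if width <= 0:
        raise ValueError("width must be positive")
    # Pad the tail with zero bytes so the final row is complete.
    padding = (-len(data)) % width
    padded = data + bytes(padding)
    return [list(padded[i:i + width]) for i in range(0, len(padded), width)]


def flip_horizontal(image: Image) -> Image:
    """One simple augmentation (assumed, not from the paper): mirror each row."""
    return [row[::-1] for row in image]


if __name__ == "__main__":
    sample = bytes([0x4D, 0x5A, 0x90, 0x00, 0x03])  # illustrative bytes only
    img = bytes_to_grayscale(sample, width=4)
    print(img)
    print(flip_horizontal(img))
```

In practice the resulting pixel grid would be resized and passed to an image classifier; that stage is omitted here because the abstract does not name the model.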

Keywords

artificial intelligence; machine learning; malware classification; MalDC
