http://dx.doi.org/10.13089/JKIISC.2022.32.5.933

CNN-Based Malware Detection Using Opcode Frequency-Based Image  

Ko, Seok Min (SoonChunHyang University)
Yang, JaeHyeok (SoonChunHyang University)
Choi, WonJun (SoonChunHyang University)
Kim, TaeGuen (SoonChunHyang University)
Abstract
As the Internet develops and computer use increases, the threats posed by malware keep growing. This creates demand for a system that can automatically analyze large volumes of malware. This paper introduces an automatic malware analysis technique based on a deep learning algorithm. The proposed method uses a CNN (Convolutional Neural Network) to analyze malicious features represented as images. To reflect the semantic information of malware in detection, the method generates images from the opcode frequency data of a binary rather than from its raw bytes. In experiments on datasets consisting of 20,000 samples, the proposed method detected malicious code with 91% accuracy.
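The abstract does not say how opcode frequencies are laid out as an image, so the following is only a minimal sketch under stated assumptions: one pixel per opcode in a fixed vocabulary, intensities scaled to 0-255 by the most frequent opcode, and row-major placement in the smallest square grid that fits. The function name `opcode_frequency_image`, the vocabulary, and the sample opcode sequence are all hypothetical and are not taken from the paper.

```python
from collections import Counter
from math import isqrt


def opcode_frequency_image(opcodes, vocabulary):
    """Return a square grid of 0-255 intensities, one pixel per vocabulary opcode.

    Assumed layout (not from the paper): row-major order, zero padding.
    """
    counts = Counter(opcodes)
    # Smallest square side that holds one pixel per vocabulary entry.
    side = isqrt(len(vocabulary) - 1) + 1 if vocabulary else 0
    # Scale by the most frequent vocabulary opcode so values fit in 0-255.
    peak = max((counts[op] for op in vocabulary), default=0)
    pixels = [round(255 * counts[op] / peak) if peak else 0 for op in vocabulary]
    pixels += [0] * (side * side - len(pixels))  # zero-pad unused cells
    return [pixels[r * side:(r + 1) * side] for r in range(side)]


# Hypothetical opcode sequence, as a disassembler might emit it.
sample = ["mov", "push", "mov", "call", "mov", "xor", "push"]
vocab = ["mov", "push", "call", "xor", "jmp"]
image = opcode_frequency_image(sample, vocab)
```

A grid like `image` could then be fed to a CNN as a single-channel input. Because the pixel position depends only on the opcode and not on byte offsets, two binaries with similar instruction mixes produce similar images under this assumed scheme.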
Keywords
Machine Learning; Convolutional Neural Network; Clustering; Malware Detection