http://dx.doi.org/10.7472/jksii.2021.22.2.1

A Study on Classification of Variant Malware Family Based on ResNet-Variational AutoEncoder  

Lee, Young-jeon (Department of Software, Gachon University)
Han, Myung-Mook (Department of Software, Gachon University)
Publication Information
Journal of Internet Computing and Services / v.22, no.2, 2021, pp. 1-9
Abstract
Traditionally, most malware has been analyzed using feature information extracted by domain experts. However, this feature-based analysis depends on the analyst's capabilities and is limited in detecting variant malware, that is, malware produced by modifying existing malicious code. In this study, we propose a ResNet-Variational AutoEncoder-based classification method that can classify the family of a variant malware sample without domain expert intervention. A Variational AutoEncoder network generates new data within a normal distribution and, while learning from the training data provided as input, captures the characteristics of that data well. In this study, important features of malicious code were extracted by taking the latent variables learned by the Variational AutoEncoder. In addition, transfer learning was performed to better learn the characteristics of the training data and to make learning more efficient: the parameters of a ResNet-152 model pre-trained on the ImageNet dataset were transferred to the Encoder Network. The ResNet-Variational AutoEncoder with transfer learning showed higher performance than the existing Variational AutoEncoder and learned more efficiently. An ensemble model, a Stacking Classifier, was used to classify the variant malware. Training the Stacking Classifier on the feature data extracted from variant malware by the Encoder Network of the ResNet-VAE model yielded an accuracy of 98.66% and an F1-Score of 98.68.
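The latent-variable step described in the abstract can be illustrated with a minimal sketch of the standard Variational AutoEncoder formulation: an encoder outputs a mean and a log-variance per latent dimension, a latent vector is sampled with the reparameterization trick, and a KL term pulls the latent distribution toward a standard normal. This is not the authors' implementation. The function names, the two-dimensional latent space, and the example values are assumptions made for illustration, and the abstract does not state the latent size or how features are read from the encoder.

```python
import math
import random


def reparameterize(mu, log_var, rng):
    """Sample z = mu + sigma * eps element-wise, with eps ~ N(0, 1).

    mu and log_var stand in for the two outputs of an encoder network
    (hypothetical names; the abstract does not specify them).
    """
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]


def kl_to_standard_normal(mu, log_var):
    """KL( N(mu, diag(sigma^2)) || N(0, I) ), summed over latent dimensions."""
    return -0.5 * sum(1.0 + lv - m * m - math.exp(lv)
                      for m, lv in zip(mu, log_var))


# Illustrative encoder outputs for one sample (assumed values).
rng = random.Random(0)
mu = [0.0, 1.0]
log_var = [0.0, 0.0]

z = reparameterize(mu, log_var, rng)      # stochastic latent vector
kl = kl_to_standard_normal(mu, log_var)   # regularization term of the VAE loss
```

In a pipeline like the one the abstract outlines, a vector such as `z` (or, as one common choice, the mean `mu`) would serve as the feature vector passed to the downstream classifier.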
Keywords
Variant Malware; Malware Classification; Variational AutoEncoder; Transfer Learning; Ensemble Learning