http://dx.doi.org/10.3837/tiis.2021.11.013

Resilience against Adversarial Examples: Data-Augmentation Exploiting Generative Adversarial Networks  

Kang, Mingu (Korea Advanced Institute of Science and Technology)
Kim, HyeungKyeom (Korea National University of Transportation)
Lee, Suchul (Korea National University of Transportation)
Han, Seokmin (Korea National University of Transportation)
Publication Information
KSII Transactions on Internet and Information Systems (TIIS) / v.15, no.11, 2021, pp. 4105-4121
Abstract
Recently, malware classification based on Deep Neural Networks (DNN) has gained significant attention due to the rise in popularity of artificial intelligence (AI). DNN-based malware classifiers are a novel solution for combating never-before-seen malware families because they classify malware by structural characteristics rather than requiring specific signatures, as traditional malware classifiers do. However, these DNN-based classifiers have been found to lack robustness against malware that is carefully crafted to evade detection. Such specially crafted malware samples are referred to as adversarial examples (AEs). We consider a clever adversary who has thorough knowledge of DNN-based malware classifiers and exploits it to craft malware that fools them. In this paper, we propose a DNN-based malware classifier that is made resilient to these kinds of attacks by exploiting Generative Adversarial Network (GAN) based data augmentation. The experimental results show that the proposed scheme classifies malware, including AEs, with a false positive rate (FPR) of 3.0% and a balanced accuracy of 70.16%, improvements of 26.1% and 18.5%, respectively, over a traditional DNN-based classifier that does not exploit GAN.
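The results are reported with two complementary metrics. As a point of reference only, the following minimal Python sketch (not the authors' evaluation code; the label arrays and the benign-vs-malware split are hypothetical illustrations) shows how balanced accuracy and FPR are commonly computed with scikit-learn:

    # Minimal sketch: computing balanced accuracy and false positive rate (FPR).
    # Hypothetical labels: 0 = benign, 1..N = malware families (BIG2015-style setup).
    import numpy as np
    from sklearn.metrics import balanced_accuracy_score, confusion_matrix

    y_true = np.array([0, 0, 1, 2, 3, 3, 4, 0, 2, 1])   # ground-truth class labels
    y_pred = np.array([0, 1, 1, 2, 3, 2, 4, 0, 2, 1])   # classifier predictions

    # Balanced accuracy = mean of per-class recalls, robust to class imbalance.
    bal_acc = balanced_accuracy_score(y_true, y_pred)

    # FPR for the benign-vs-malware decision: fraction of benign samples (class 0)
    # that the classifier flagged as some malware class.
    is_malware_true = y_true != 0
    is_malware_pred = y_pred != 0
    tn, fp, fn, tp = confusion_matrix(is_malware_true, is_malware_pred).ravel()
    fpr = fp / (fp + tn)

    print(f"balanced accuracy = {bal_acc:.4f}, FPR = {fpr:.4f}")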
Keywords
Malware classification; Microsoft Malware Classification Challenge (BIG2015); conditional GAN; data augmentation; adversarial examples