http://dx.doi.org/10.13089/JKIISC.2021.31.6.1227

Adversarial Example Detection and Classification Model Based on the Class Predicted by Deep Learning Model  

Ko, Eun-na-rae (Korea University)
Moon, Jong-sub (Korea University)
Abstract
An adversarial attack, one of the attacks on deep learning classification models, adds imperceptible perturbations to input data so that the model misclassifies the perturbed input. There are various adversarial attack algorithms, and many studies have been conducted on detecting adversarial examples, but few have attempted to classify which attack algorithm generated a given adversarial input. If adversarial attacks can be classified, a more robust deep learning classification model can be built by analyzing the differences between attacks. In this paper, we propose a model that detects and classifies adversarial attacks by training a random forest classifier on features extracted from a target deep learning model. In the feature extraction step, features are taken from the output of a hidden layer according to the class predicted by the target deep learning model. In experiments, the proposed model achieves 3.02% higher accuracy on clean data and 0.80% higher accuracy on adversarial data than previous studies, and it classifies a new adversarial attack that previous studies did not cover.
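A minimal sketch of the kind of pipeline the abstract describes, assuming a PyTorch target model and scikit-learn's RandomForestClassifier: hidden-layer activations of the target model, combined with its predicted class, serve as features for a random forest that distinguishes clean inputs from inputs generated by different attack algorithms. The network, the chosen layer, and the attack labels below are illustrative assumptions, not the authors' implementation.

# Illustrative sketch only; not the paper's exact feature-extraction scheme.
import numpy as np
import torch
import torchvision.models as models
from sklearn.ensemble import RandomForestClassifier

target = models.resnet18(weights=None)   # assumed target deep learning classifier
target.eval()

activations = {}
def hook(module, inputs, output):
    # store the hidden-layer output of the current batch
    activations["feat"] = output.detach().flatten(1)

# attach the hook to an assumed hidden layer (penultimate pooling layer)
target.avgpool.register_forward_hook(hook)

def extract_features(x):
    """Return hidden-layer features paired with the target model's predicted class."""
    with torch.no_grad():
        logits = target(x)
    preds = logits.argmax(dim=1, keepdim=True).float()
    # feature vector = hidden-layer output plus the predicted class, one way to
    # condition the features on the class predicted by the target model
    return torch.cat([activations["feat"], preds], dim=1).numpy()

# x_train: clean and adversarial inputs; y_train: 0 = clean, 1/2/... = attack algorithm
x_train = torch.randn(64, 3, 224, 224)        # placeholder inputs
y_train = np.random.randint(0, 3, size=64)    # placeholder attack labels

detector = RandomForestClassifier(n_estimators=100, random_state=0)
detector.fit(extract_features(x_train), y_train)

At test time, the same extract_features() output would be passed to detector.predict(), which both detects an adversarial example (any non-zero label) and identifies the attack algorithm assumed to have generated it.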
Keywords
Adversarial Attack; Evasion Attack; Deep Learning; Adversarial Example Detection