http://dx.doi.org/10.13089/JKIISC.2021.31.6.1227

Adversarial Example Detection and Classification Model Based on the Class Predicted by Deep Learning Model  

Ko, Eun-na-rae (Korea University)
Moon, Jong-sub (Korea University)
Abstract
An adversarial attack, one of the attacks on deep learning classification models, adds imperceptible perturbations to input data so that the model misclassifies the perturbed input. There are various adversarial attack algorithms, and many studies have been conducted on detecting adversarial examples, but few have attempted to classify which attack algorithm generated a given adversarial input. If adversarial attacks can be classified, a more robust deep learning classification model can be built by analyzing the differences between attacks. In this paper, we propose a model that detects and classifies adversarial attacks by training a random forest classifier on features extracted from a target deep learning model. In the feature extraction step, features are taken from the output of a hidden layer according to the class predicted by the target deep learning model. In experiments, the proposed model achieves 3.02% higher accuracy on clean data and 0.80% higher accuracy on adversarial data than previous studies, and it classifies a new adversarial attack that previous studies did not cover.
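A minimal sketch of the kind of pipeline the abstract describes, assuming a PyTorch target model and scikit-learn's RandomForestClassifier: hidden-layer activations of the target model, combined with its predicted class, serve as features for a random forest that distinguishes clean inputs from inputs generated by different attack algorithms. The network, the chosen layer, and the attack labels below are illustrative assumptions, not the authors' implementation.

# Illustrative sketch only; not the paper's exact feature-extraction scheme.
import numpy as np
import torch
import torchvision.models as models
from sklearn.ensemble import RandomForestClassifier

target = models.resnet18(weights=None)   # assumed target deep learning classifier
target.eval()

activations = {}
def hook(module, inputs, output):
    # store the hidden-layer output of the current batch
    activations["feat"] = output.detach().flatten(1)

# attach the hook to an assumed hidden layer (penultimate pooling layer)
target.avgpool.register_forward_hook(hook)

def extract_features(x):
    """Return hidden-layer features paired with the target model's predicted class."""
    with torch.no_grad():
        logits = target(x)
    preds = logits.argmax(dim=1, keepdim=True).float()
    # feature vector = hidden-layer output plus the predicted class, one way to
    # condition the features on the class predicted by the target model
    return torch.cat([activations["feat"], preds], dim=1).numpy()

# x_train: clean and adversarial inputs; y_train: 0 = clean, 1/2/... = attack algorithm
x_train = torch.randn(64, 3, 224, 224)        # placeholder inputs
y_train = np.random.randint(0, 3, size=64)    # placeholder attack labels

detector = RandomForestClassifier(n_estimators=100, random_state=0)
detector.fit(extract_features(x_train), y_train)

At test time, the same extract_features() output would be passed to detector.predict(), which both detects an adversarial example (any non-zero label) and identifies the attack algorithm assumed to have generated it.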
Keywords
Adversarial Attack; Evasion Attack; Deep Learning; Adversarial Example Detection