http://dx.doi.org/10.17661/jkiiect.2019.12.6.629

Random Noise Addition for Detecting Adversarially Generated Image Dataset  

Hwang, Jeonghwan (School of Cybersecurity, Korea University)
Yoon, Ji Won (School of Cybersecurity, Korea University)
Publication Information
The Journal of Korea Institute of Information, Electronics, and Communication Technology, v.12, no.6, 2019, pp. 629-635
Abstract
In deep learning models, gradients are computed by error back-propagation, which enables a model to learn from its error and update its parameters. Taking advantage of huge improvements in computing power, this allows even complex models to find globally (or locally) optimal parameter values. However, deliberately crafted data points can 'fool' models and degrade performance measures such as prediction accuracy. Not only do these adversarial examples reduce performance, but they are also not easily detectable by the human eye. In this work, we propose a method for detecting adversarial datasets through random noise addition. We exploit the fact that when random noise is added, the prediction accuracy on a non-adversarial dataset remains almost unchanged, while that on an adversarial dataset changes. In a simulation experiment, we set the attack method (FGSM, Saliency Map) and the noise level (0-19, with a maximum pixel value of 255) as independent variables, and the difference in prediction accuracy when noise is added as the dependent variable. We succeeded in extracting a threshold that separates non-adversarial from adversarial datasets, and we detected adversarial datasets using this threshold.
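A minimal sketch of this detection procedure in Python, assuming a Keras-style classifier exposing predict, a labeled image batch on the 0-255 pixel scale, and a hypothetical decision threshold tau standing in for the one extracted in the paper's experiment; all names here are illustrative, not the authors' implementation:

    import numpy as np

    def accuracy(model, images, labels):
        # Fraction of images whose predicted class matches the given label.
        preds = np.argmax(model.predict(images / 255.0), axis=1)
        return float(np.mean(preds == labels))

    def accuracy_difference(model, images, labels, noise_level, trials=5, seed=0):
        # Difference between clean accuracy and mean accuracy after adding
        # uniform integer noise in [-noise_level, +noise_level] on the 0-255
        # pixel scale, mirroring the noise levels 0-19 used in the experiment.
        rng = np.random.default_rng(seed)
        clean_acc = accuracy(model, images, labels)
        noisy_accs = []
        for _ in range(trials):
            noise = rng.integers(-noise_level, noise_level + 1, size=images.shape)
            noisy = np.clip(images.astype(np.int32) + noise, 0, 255)
            noisy_accs.append(accuracy(model, noisy.astype(images.dtype), labels))
        return abs(clean_acc - float(np.mean(noisy_accs)))

    def is_adversarial_dataset(model, images, labels, noise_level=10, tau=0.05):
        # On a non-adversarial dataset the accuracy barely moves under added
        # noise; on an adversarial dataset it shifts noticeably. tau is a
        # hypothetical threshold; the paper derives its value empirically.
        return accuracy_difference(model, images, labels, noise_level) > tau

The paper reads the threshold off the behaviour of the accuracy difference across noise levels 0-19; this sketch fixes a single level for brevity.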
Keywords
Adversarial examples; Adversarial attack detection; Convolutional neural network; Deep learning; Random noise addition;
References
1 G. Tao, S. Ma, Y. Liu, and X. Zhang, "Attacks meet interpretability: Attribute-steered detection of adversarial samples," Advances in Neural Information Processing Systems, pp. 7717-7728, 2018.
2 B. Liang, H. Li, M. Su, X. Li, W. Shi, and X. Wang, "Detecting adversarial image examples in deep neural networks with adaptive noise reduction," IEEE Transactions on Dependable and Secure Computing, 2018.
3 U. Hwang, J. Park, H. Jang, S. Yoon, and N. I. Cho, "PuVAE: A variational autoencoder to purify adversarial examples," arXiv preprint arXiv:1903.00585, 2019.
4 J. Rauber, W. Brendel, and M. Bethge, "Foolbox: A Python toolbox to benchmark the robustness of machine learning models," arXiv preprint arXiv:1707.04131, 2017.
5 A. Krizhevsky, I. Sutskever, and G. E. Hinton, "ImageNet classification with deep convolutional neural networks," Advances in Neural Information Processing Systems, pp. 1097-1105, 2012.
6 K. Simonyan and A. Zisserman, "Very deep convolutional networks for large-scale image recognition," arXiv preprint arXiv:1409.1556, 2014.
7 C. Szegedy, W. Zaremba, I. Sutskever, J. Bruna, D. Erhan, I. Goodfellow, and R. Fergus, "Intriguing properties of neural networks," arXiv preprint arXiv:1312.6199, 2013.
8 C. Szegedy, W. Liu, Y. Jia, P. Sermanet, S. Reed, D. Anguelov, D. Erhan, V. Vanhoucke, and A. Rabinovich, "Going deeper with convolutions," Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 1-9, 2015.
9 K. He, X. Zhang, S. Ren, and J. Sun, "Deep residual learning for image recognition," Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 770-778, 2016.
10 M. Bojarski, D. Del Testa, D. Dworakowski, B. Firner, B. Flepp, P. Goyal, L. D. Jackel, M. Monfort, U. Muller, J. Zhang, et al., "End to end learning for self-driving cars," arXiv preprint arXiv:1604.07316, 2016.
11 I. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville, and Y. Bengio, "Generative adversarial nets," Advances in Neural Information Processing Systems, pp. 2672-2680, 2014.
12 I. J. Goodfellow, J. Shlens, and C. Szegedy, "Explaining and harnessing adversarial examples," arXiv preprint arXiv:1412.6572, 2014.
13 N. Papernot, P. McDaniel, S. Jha, M. Fredrikson, Z. B. Celik, and A. Swami, "The limitations of deep learning in adversarial settings," 2016 IEEE European Symposium on Security and Privacy (EuroS&P), pp. 372-387, IEEE, 2016.
14 R. Feinman, R. R. Curtin, S. Shintre, and A. B. Gardner, "Detecting adversarial samples from artifacts," arXiv preprint arXiv:1703.00410, 2017.
15 K. Lee, K. Lee, H. Lee, and J. Shin, "A simple unified framework for detecting out-of-distribution samples and adversarial attacks," Advances in Neural Information Processing Systems, pp. 7167-7177, 2018.