• Title/Summary/Keyword: Adversarial defense


Adversarial Detection with Gaussian Process Regression-based Detector

  • Lee, Sangheon;Kim, Noo-ri;Cho, Youngwha;Choi, Jae-Young;Kim, Suntae;Kim, Jeong-Ah;Lee, Jee-Hyong
    • KSII Transactions on Internet and Information Systems (TIIS) / v.13 no.8 / pp.4285-4299 / 2019
  • An adversarial attack causes classification models to malfunction by adding noise that is imperceptible to humans, posing a threat to deep learning models. In this paper, we propose an efficient method to detect adversarial images using Gaussian process regression. Existing deep learning-based adversarial detection methods require numerous adversarial images for training. The proposed method overcomes this problem by performing classification based on statistical features of adversarial and clean images, extracted by Gaussian process regression from a small number of images. The technique determines whether an input image is adversarial by applying Gaussian process regression to the intermediate outputs of the classification model. Experimental results show that the proposed method achieves higher detection performance than other deep learning-based adversarial detection methods for powerful attacks. In particular, the Gaussian process regression-based detector outperforms the baseline models for most attacks when fewer adversarial examples are available.
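
A minimal sketch of the idea, using scikit-learn's GaussianProcessRegressor on intermediate-layer features; the feature extraction step, kernel choice, and 0/1 target encoding are illustrative assumptions, not the authors' exact setup.

```python
# Hypothetical sketch: fit a Gaussian process regressor on intermediate-layer
# features of the classifier and threshold its output to flag adversarial inputs.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

def fit_gpr_detector(clean_feats, adv_feats):
    """clean_feats, adv_feats: (n, d) intermediate activations of the classifier."""
    X = np.vstack([clean_feats, adv_feats])
    y = np.concatenate([np.zeros(len(clean_feats)), np.ones(len(adv_feats))])
    gpr = GaussianProcessRegressor(kernel=RBF(length_scale=1.0), alpha=1e-3)
    gpr.fit(X, y)
    return gpr

def is_adversarial(gpr, feats, threshold=0.5):
    """Flag an input as adversarial when the regressed score exceeds the threshold."""
    return gpr.predict(feats.reshape(1, -1))[0] > threshold
```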

Study on the White Noise effect Against Adversarial Attack for Deep Learning Model for Image Recognition (영상 인식을 위한 딥러닝 모델의 적대적 공격에 대한 백색 잡음 효과에 관한 연구)

  • Lee, Youngseok;Kim, Jongweon
    • The Journal of Korea Institute of Information, Electronics, and Communication Technology / v.15 no.1 / pp.27-35 / 2022
  • In this paper, we propose a method of adding white noise to prevent misclassification of deep learning systems caused by adversarial attacks. The proposed method adds white noise to the input image, whether it is a benign image or an adversarial example. Experimental results show that the proposed method is robust against three adversarial attacks: FGSM, BIM, and CW. The recognition accuracies of ResNet models with 18, 34, 50, and 101 layers are enhanced when white noise is added to the adversarial test set, while classification of the benign test set is unaffected. The proposed method is applicable as a defense against adversarial attacks and can replace time-consuming and expensive defense methods such as adversarial training and deep learning model replacement.
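
A minimal sketch of the described preprocessing step, assuming images are float tensors in [0, 1] and a fixed noise standard deviation; both are illustrative choices, not values from the paper.

```python
# Hypothetical sketch: add white (Gaussian) noise to every input image,
# benign or adversarial, before it reaches the classifier.
import torch

def add_white_noise(images: torch.Tensor, std: float = 0.05) -> torch.Tensor:
    """images: (N, C, H, W) tensor scaled to [0, 1]."""
    noisy = images + std * torch.randn_like(images)
    return noisy.clamp(0.0, 1.0)

# Usage: logits = resnet18(add_white_noise(batch))
```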

Effective Adversarial Training by Adaptive Selection of Loss Function in Federated Learning (연합학습에서의 손실함수의 적응적 선택을 통한 효과적인 적대적 학습)

  • Suchul Lee
    • Journal of Internet Computing and Services / v.25 no.2 / pp.1-9 / 2024
  • Although federated learning is designed to be safer than centralized methods in terms of security and privacy, it still has many vulnerabilities. An attacker performing an adversarial attack intentionally manipulates the deep learning model by injecting carefully crafted input data, that is, adversarial examples, into a client's training data to induce misclassification. A common defense strategy against this is so-called adversarial training, which trains the model on the characteristics of adversarial examples in advance. Existing research assumes a scenario in which all clients are under adversarial attack, but given that the number of clients in federated learning is very large, this is far from realistic. In this paper, we experimentally examine adversarial training in a scenario where only some of the clients are under attack. Through experiments, we found a trade-off in which classification accuracy on normal samples decreases as classification accuracy on adversarial examples increases. To exploit this trade-off effectively, we present a method that performs adversarial training by adaptively selecting the loss function depending on whether a client is under attack.
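
A minimal per-client training sketch under the assumption (mine, not stated in the abstract) that an attacked client uses an adversarial-training loss while a clean client uses standard cross-entropy; the attack indicator `is_attacked` and the perturbation helper `make_adv` are placeholders.

```python
# Hypothetical sketch: choose the local loss per client depending on whether
# that client's data is believed to contain adversarial examples.
import torch
import torch.nn.functional as F

def local_update(model, loader, optimizer, is_attacked: bool, make_adv=None):
    for x, y in loader:
        optimizer.zero_grad()
        if is_attacked and make_adv is not None:
            # adversarial-training loss: fit the model on perturbed inputs
            loss = F.cross_entropy(model(make_adv(model, x, y)), y)
        else:
            # standard loss on clean inputs preserves accuracy on normal samples
            loss = F.cross_entropy(model(x), y)
        loss.backward()
        optimizer.step()
    return model.state_dict()  # sent to the server for federated averaging
```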

Autoencoder-Based Defense Technique against One-Pixel Adversarial Attacks in Image Classification (이미지 분류를 위한 오토인코더 기반 One-Pixel 적대적 공격 방어기법)

  • Jeong-hyun Sim;Hyun-min Song
    • Journal of the Korea Institute of Information Security & Cryptology / v.33 no.6 / pp.1087-1098 / 2023
  • The rapid advancement of artificial intelligence (AI) technology has led to its active adoption across various fields. However, this widespread adoption of AI-based systems has raised concerns about the increasing threat of attacks on them. In particular, deep neural networks, commonly used in deep learning, have been found vulnerable to adversarial attacks that intentionally manipulate input data to induce model errors. In this study, we propose a method to protect image classification models from visually imperceptible One-Pixel attacks, in which only a single pixel of an image is altered. The proposed defense technique uses an autoencoder model to remove potential threat elements from input images before forwarding them to the classification model. Experimental results on the CIFAR-10 dataset demonstrate that the autoencoder-based defense significantly improves the robustness of pretrained image classification models against One-Pixel attacks, with an average improvement in defense rate of 81.2%, without any modification to the existing models.
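
A minimal sketch of the defense pipeline: reconstruct each input with an autoencoder before classification. The architecture below is an illustrative convolutional autoencoder for CIFAR-10-sized images, not the authors' exact model.

```python
# Hypothetical sketch: pass inputs through an autoencoder to wash out
# one-pixel perturbations, then classify the reconstruction.
import torch.nn as nn

class ConvAutoencoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(32, 16, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(16, 3, 4, stride=2, padding=1), nn.Sigmoid(),
        )

    def forward(self, x):
        return self.decoder(self.encoder(x))

def defended_predict(classifier, autoencoder, x):
    # The pretrained classifier itself is left unmodified.
    return classifier(autoencoder(x))
```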

Resolution Conversion of SAR Target Images Using Conditional GAN (Conditional GAN을 이용한 SAR 표적영상의 해상도 변환)

  • Park, Ji-Hoon;Seo, Seung-Mo;Choi, Yeo-Reum;Yoo, Ji Hee
    • Journal of the Korea Institute of Military Science and Technology / v.24 no.1 / pp.12-21 / 2021
  • For successful automatic target recognition (ATR) with synthetic aperture radar (SAR) imagery, the SAR target images in the database should have a resolution identical or highly similar to that of the images collected from SAR sensors. However, it is time-consuming or infeasible to construct multiple databases with different resolutions for each operating SAR system. In this paper, an approach for resolution conversion of SAR target images is proposed based on a conditional generative adversarial network (cGAN). First, a number of pairs of SAR target images at two different resolutions are obtained via SAR simulation and used to train the cGAN model. The trained model then generates SAR target images whose resolution is converted from the original. A similarity analysis is performed to validate the reliability of the generated images. The cGAN model is further applied to measured MSTAR SAR target images in order to estimate its potential for real application.
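
A compressed sketch of the kind of paired conditional-GAN training step the abstract describes (the same target imaged at two resolutions); the generator G, discriminator D, and the L1 weight are assumptions for illustration, not the authors' exact configuration.

```python
# Hypothetical sketch: conditional GAN trained on paired SAR images so that
# G maps an image at the source resolution to the target resolution.
import torch
import torch.nn.functional as F

def cgan_step(G, D, opt_g, opt_d, src, tgt, lambda_l1=100.0):
    # --- discriminator: real pairs (src, tgt) vs. fake pairs (src, G(src)) ---
    fake = G(src).detach()
    pred_real, pred_fake = D(src, tgt), D(src, fake)
    d_loss = F.binary_cross_entropy_with_logits(pred_real, torch.ones_like(pred_real)) \
           + F.binary_cross_entropy_with_logits(pred_fake, torch.zeros_like(pred_fake))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # --- generator: fool D while staying close to the target-resolution image ---
    fake = G(src)
    pred_fake = D(src, fake)
    g_loss = F.binary_cross_entropy_with_logits(pred_fake, torch.ones_like(pred_fake)) \
           + lambda_l1 * F.l1_loss(fake, tgt)
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()
```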

Detecting Adversarial Example Using Ensemble Method on Deep Neural Network (딥뉴럴네트워크에서의 적대적 샘플에 관한 앙상블 방어 연구)

  • Kwon, Hyun;Yoon, Joonhyeok;Kim, Junseob;Park, Sangjun;Kim, Yongchul
    • Convergence Security Journal / v.21 no.2 / pp.57-66 / 2021
  • Deep neural networks (DNNs) provide excellent performance for image, speech, and pattern recognition. However, DNNs sometimes misrecognize certain adversarial examples. An adversarial example is a sample created by adding optimized noise to original data, which causes the DNN to misclassify it even though it appears unchanged to the human eye. Therefore, studies on defense against adversarial example attacks are required. In this paper, we experimentally analyze the detection success rate for adversarial examples while adjusting various parameters. The performance of the ensemble defense method was analyzed using the fast gradient sign method, the DeepFool method, and the Carlini & Wagner method as adversarial example attacks. We used MNIST as the experimental dataset and TensorFlow as the machine learning library. As an experimental method, we carried out performance analysis over the three attack methods, the detection threshold, the number of models, and the amount of random noise. As a result, with 7 models and a threshold of 1, the detection rate for adversarial examples is 98.3% while an accuracy of 99.2% on original samples is maintained.
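
A minimal sketch of one plausible reading of the detection rule (my interpretation, not the paper's exact definition): an input is flagged when at least `threshold` ensemble members disagree with the majority vote; each model is assumed to return a predicted class label.

```python
# Hypothetical sketch: flag an input as adversarial when enough ensemble
# members disagree about its label.
from collections import Counter

def detect_adversarial(models, x, threshold=1):
    """models: list of classifiers, each returning a predicted label for x."""
    preds = [m(x) for m in models]
    _, majority_count = Counter(preds).most_common(1)[0]
    disagreements = len(preds) - majority_count
    return disagreements >= threshold
```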

Study on Neuron Activities for Adversarial Examples in Convolutional Neural Network Model by Population Sparseness Index (개체군 희소성 인덱스에 의한 컨벌루션 신경망 모델의 적대적 예제에 대한 뉴런 활동에 관한 연구)

  • Youngseok Lee
    • The Journal of Korea Institute of Information, Electronics, and Communication Technology / v.16 no.1 / pp.1-7 / 2023
  • Convolutional neural networks have already been applied to various image processing tasks beyond human visual processing capabilities. However, they are exposed to a severe risk of degraded performance due to adversarial attacks. In addition, defense techniques designed against a particular adversarial attack are effective against that attack but remain vulnerable to other types of attacks. Therefore, to respond to adversarial attacks, it is necessary to analyze how an adversarial attack degrades performance as the input propagates through the convolutional neural network. In this study, adversarial attacks on the AlexNet and VGG11 models were analyzed using the population sparseness index, a measure of neuronal activity in neurophysiology. The analysis showed that, in each layer, the population sparseness index for adversarial examples differs from that for benign examples.
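
The abstract does not give the formula; as an illustration, the commonly used Treves–Rolls population sparseness, S = (Σᵢ rᵢ / n)² / (Σᵢ rᵢ² / n), is assumed here and computed over one layer's activations.

```python
# Hypothetical sketch: Treves-Rolls population sparseness of a layer's
# activations for a single input (values near 0 = sparse, near 1 = dense).
import numpy as np

def population_sparseness(activations: np.ndarray) -> float:
    """activations: non-negative, firing-rate-like responses of n units."""
    r = activations.ravel().astype(float)
    n = r.size
    numerator = (r.sum() / n) ** 2
    denominator = (r ** 2).sum() / n
    return float(numerator / denominator) if denominator > 0 else 0.0

# Compare the index layer by layer for benign vs. adversarial inputs.
```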

An Adversarial Attack Type Classification Method Using Linear Discriminant Analysis and k-means Algorithm (선형 판별 분석 및 k-means 알고리즘을 이용한 적대적 공격 유형 분류 방안)

  • Choi, Seok-Hwan;Kim, Hyeong-Geon;Choi, Yoon-Ho
    • Journal of the Korea Institute of Information Security & Cryptology / v.31 no.6 / pp.1215-1225 / 2021
  • Although Artificial Intelligence (AI) techniques have shown impressive performance in various fields, they are vulnerable to adversarial examples, which induce misclassification by adding human-imperceptible perturbations to the input. Previous studies on defending against adversarial examples can be classified into three categories: (1) model retraining methods; (2) input transformation methods; and (3) adversarial example detection methods. However, although defense methods against adversarial examples have been constantly proposed, there has been no research on classifying the type of adversarial attack. In this paper, we propose an adversarial attack family classification method based on dimensionality reduction and clustering. Specifically, after extracting the adversarial perturbation from an adversarial example, we perform Linear Discriminant Analysis (LDA) to reduce the dimensionality of the perturbation and apply the k-means algorithm to classify the adversarial attack family. Experimental results on the MNIST and CIFAR-10 datasets show that the proposed method can efficiently classify five types of adversarial attacks (FGSM, BIM, PGD, DeepFool, C&W). We also show that the proposed method provides good classification performance even when the legitimate input corresponding to the adversarial example is unknown.
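
A minimal sketch of the described pipeline using scikit-learn; the use of training labels for LDA, the number of clusters, and the flattening of perturbations are assumptions for illustration.

```python
# Hypothetical sketch: reduce adversarial perturbations with LDA, then cluster
# them with k-means to separate attack families (FGSM, BIM, PGD, DeepFool, C&W).
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.cluster import KMeans

def fit_attack_classifier(adv_images, clean_images, attack_labels, n_families=5):
    perturbations = (adv_images - clean_images).reshape(len(adv_images), -1)
    lda = LinearDiscriminantAnalysis(n_components=n_families - 1)
    reduced = lda.fit_transform(perturbations, attack_labels)
    kmeans = KMeans(n_clusters=n_families, n_init=10).fit(reduced)
    return lda, kmeans

def classify_attack(lda, kmeans, adv_image, clean_estimate):
    perturbation = (adv_image - clean_estimate).reshape(1, -1)
    return kmeans.predict(lda.transform(perturbation))[0]
```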

AI Security Vulnerabilities in Fully Unmanned Stores: Adversarial Patch Attacks on Object Detection Model & Analysis of the Defense Effectiveness of Data Augmentation (완전 무인 매장의 AI 보안 취약점: 객체 검출 모델에 대한 Adversarial Patch 공격 및 Data Augmentation의 방어 효과성 분석)

  • Won-ho Lee;Hyun-sik Na;So-hee Park;Dae-seon Choi
    • Journal of the Korea Institute of Information Security & Cryptology / v.34 no.2 / pp.245-261 / 2024
  • The COVID-19 pandemic has led to the widespread adoption of contactless transactions, resulting in a noticeable increase in the trend towards fully unmanned stores. In such stores, all operational processes are automated, primarily using artificial intelligence (AI) technology. However, this AI technology has several security vulnerabilities, which can be critical in the environment of fully unmanned stores. This paper analyzes the security vulnerabilities that AI-based fully unmanned stores may face, focusing particularly on the object detection model YOLO, demonstrating that Hiding Attacks and Altering Attacks using adversarial patches are possible. It is confirmed that objects with adversarial patches attached may not be recognized by the detection model or may be incorrectly recognized as other objects. Furthermore, the paper analyzes how Data Augmentation techniques can mitigate security threats by providing a defensive effect against adversarial patch attacks. Based on these results, we emphasize the need for proactive research into defensive measures to address the inherent security threats in AI technology used in fully unmanned stores.
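
The abstract does not specify which augmentations were tested; as one illustrative example, a cutout-style random-patch augmentation applied during training is sketched below, one common way to build tolerance to patch-like occlusions.

```python
# Hypothetical sketch: cutout-style augmentation that occludes a random square
# region with noise during training.
import torch

def random_patch_augment(images: torch.Tensor, patch_frac: float = 0.2) -> torch.Tensor:
    """images: (N, C, H, W) tensor in [0, 1]."""
    out = images.clone()
    n, c, h, w = out.shape
    ph, pw = int(h * patch_frac), int(w * patch_frac)
    for i in range(n):
        top = torch.randint(0, h - ph + 1, (1,)).item()
        left = torch.randint(0, w - pw + 1, (1,)).item()
        out[i, :, top:top + ph, left:left + pw] = torch.rand(c, ph, pw)
    return out
```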

Detecting Adversarial Examples Using Edge-based Classification

  • Jaesung Shim;Kyuri Jo
    • Journal of the Korea Society of Computer and Information / v.28 no.10 / pp.67-76 / 2023
  • Although deep learning models are making innovative achievements in the field of computer vision, their vulnerability to adversarial examples remains a problem. Adversarial examples are an attack method that injects fine noise into images to induce misclassification, which can pose a serious threat to the application of deep learning models in the real world. In this paper, we propose a model that detects adversarial examples using the difference in predictions between an edge-trained classification model and the underlying classification model. The simple process of extracting the edges of objects and reflecting them in training can increase the robustness of the classification model, and economical and efficient detection is possible by comparing the predictions of the two models. In our experiments, the base model showed accuracies of {49.9%, 29.84%, 18.46%, 4.95%, 3.36%} on adversarial examples (eps = {0.02, 0.05, 0.1, 0.2, 0.3}), whereas the Canny edge model showed accuracies of {82.58%, 65.96%, 46.71%, 24.94%, 13.41%}, with the other edge models at a similar level, indicating that the edge models were more robust against adversarial examples. In addition, adversarial example detection using the difference in predictions between models achieved detection rates of {85.47%, 84.64%, 91.44%, 95.47%, 87.61%} for the respective epsilon values. We expect this study to contribute to improving the reliability of deep learning models in related research and application fields such as medicine, autonomous driving, security, and national defense.
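
A minimal sketch of the detection rule, assuming OpenCV's Canny for edge extraction and two already-trained classifiers exposing a `predict` method (placeholders); thresholds and preprocessing are illustrative.

```python
# Hypothetical sketch: flag an input as adversarial when the base classifier and
# a classifier trained on Canny edge maps disagree on the predicted label.
import cv2
import numpy as np

def to_edge_map(image_uint8: np.ndarray) -> np.ndarray:
    """image_uint8: (H, W, 3) BGR image; returns an (H, W) edge map."""
    gray = cv2.cvtColor(image_uint8, cv2.COLOR_BGR2GRAY)
    return cv2.Canny(gray, 100, 200)

def is_adversarial(base_model, edge_model, image_uint8) -> bool:
    base_pred = base_model.predict(image_uint8)
    edge_pred = edge_model.predict(to_edge_map(image_uint8))
    return base_pred != edge_pred  # disagreement signals a possible adversarial input
```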