• Title/Summary/Keyword: Adversarial example

Defending and Detecting Audio Adversarial Example using Frame Offsets

  • Gong, Yongkang;Yan, Diqun;Mao, Terui;Wang, Donghua;Wang, Rangding
    • KSII Transactions on Internet and Information Systems (TIIS) / v.15 no.4 / pp.1538-1552 / 2021
  • Machine learning models are vulnerable to adversarial examples generated by adding a deliberately designed perturbation to a benign sample. In particular, for an automatic speech recognition (ASR) system, a benign audio clip that sounds normal could be decoded as a harmful command due to potential adversarial attacks. In this paper, we focus on countermeasures against audio adversarial examples. By analyzing the characteristics of ASR systems, we find that the frame offset introduced by a silence clip appended at the beginning of an audio clip can degrade adversarial perturbations into ordinary noise. For various scenarios, we exploit frame offsets with different strategies, such as defending, detecting, and a hybrid strategy. Compared with previous methods, the proposed method can defend against audio adversarial examples in a simpler, more generic, and more efficient way. Evaluated on three state-of-the-art adversarial attacks against different ASR systems, the experimental results demonstrate that the proposed method can effectively improve the robustness of ASR systems.
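
A minimal sketch of the frame-offset idea as the abstract describes it, assuming a placeholder `transcribe` function standing in for the ASR system (hypothetical interface; the paper evaluates real ASR systems and also proposes defending and hybrid strategies beyond plain detection, and the silence length below is illustrative):

```python
import numpy as np

def transcribe(audio: np.ndarray) -> str:
    """Placeholder for the ASR system's decoder (hypothetical interface)."""
    raise NotImplementedError

def detect_with_frame_offset(audio: np.ndarray,
                             sample_rate: int = 16000,
                             silence_ms: int = 25) -> bool:
    """Flag a clip as likely adversarial if shifting its frame alignment
    (by prepending a short silence clip) changes the transcription."""
    silence = np.zeros(int(sample_rate * silence_ms / 1000), dtype=audio.dtype)
    shifted = np.concatenate([silence, audio])
    # Benign audio should decode consistently, while perturbations tuned to a
    # specific framing tend to collapse into ordinary noise under the offset.
    return transcribe(audio) != transcribe(shifted)
```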

Perceptual Ad-Blocker Design For Adversarial Attack

  • Kim, Min-jae;Kim, Bo-min;Hur, Junbeom
    • Journal of the Korea Institute of Information Security & Cryptology / v.30 no.5 / pp.871-879 / 2020
  • Perceptual Ad-Blocking is a new advertisement-blocking technique that detects online advertisements using an artificial intelligence-based advertising image classification model. A recent study showed that these Perceptual Ad-Blocking models are vulnerable to adversarial attacks that add noise to images, causing them to be misclassified. In this paper, we show that the existing Perceptual Ad-Blocking technique is weak against several adversarial examples, and that Defense-GAN and MagNet, which performed well on the MNIST and CIFAR-10 datasets, also work well on an advertising dataset. Based on this, we present a new advertising image classification model that is robust against adversarial attacks using the Defense-GAN and MagNet techniques. According to experiments with various existing adversarial attack techniques, the techniques proposed in this paper maintained accuracy and performance through robust image classification, and furthermore defended to a certain level against white-box attacks by attackers who knew the details of the defense techniques.
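
Of the two defenses the abstract names, MagNet is the simpler to illustrate. Below is a hedged sketch of a MagNet-style detector (a reconstruction-error test with an autoencoder trained on benign ad images), with `reconstruct` as a hypothetical stand-in for the trained autoencoder and the threshold assumed to be chosen on benign validation data; the paper's exact detector configuration and the Defense-GAN pipeline are not reproduced here.

```python
import numpy as np

def reconstruct(images: np.ndarray) -> np.ndarray:
    """Placeholder for an autoencoder trained on benign advertising images
    (hypothetical stand-in for the MagNet detector network)."""
    raise NotImplementedError

def magnet_style_detect(images: np.ndarray, threshold: float) -> np.ndarray:
    """Flag images whose reconstruction error is unusually large; adversarial
    inputs tend to lie off the benign data manifold and reconstruct poorly.
    `images` is assumed to be a batch shaped (N, H, W, C)."""
    errors = np.mean((images - reconstruct(images)) ** 2, axis=(1, 2, 3))
    return errors > threshold
```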

An Adversarial Attack Type Classification Method Using Linear Discriminant Analysis and k-means Algorithm

  • Choi, Seok-Hwan;Kim, Hyeong-Geon;Choi, Yoon-Ho
    • Journal of the Korea Institute of Information Security & Cryptology / v.31 no.6 / pp.1215-1225 / 2021
  • Although Artificial Intelligence (AI) techniques have shown impressive performance in various fields, they are vulnerable to adversarial examples, which induce misclassification by adding human-imperceptible perturbations to the input. Previous studies on defending against adversarial examples can be classified into three categories: (1) model retraining methods; (2) input transformation methods; and (3) adversarial example detection methods. However, even though defense methods against adversarial examples have constantly been proposed, there has been no research on classifying the type of adversarial attack. In this paper, we propose an adversarial attack family classification method based on dimensionality reduction and clustering. Specifically, after extracting the adversarial perturbation from an adversarial example, we apply Linear Discriminant Analysis (LDA) to reduce the dimensionality of the perturbation and then apply the k-means algorithm to classify the adversarial attack family. Experimental results on the MNIST and CIFAR-10 datasets show that the proposed method can efficiently classify five types of adversarial attacks (FGSM, BIM, PGD, DeepFool, C&W). We also show that the proposed method provides good classification performance even when the legitimate input corresponding to the adversarial example is unknown.
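
A small sketch of the pipeline the abstract describes, using scikit-learn; the labeled training perturbations for fitting LDA, the number of families, and the clustering settings are assumptions rather than the paper's exact configuration.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.cluster import KMeans

def classify_attack_family(clean_train, adv_train, attack_labels,
                           clean_test, adv_test, n_families=5):
    """Extract perturbations, reduce them with LDA (fit on labeled training
    perturbations), then assign test perturbations to attack families with
    k-means. Inputs are numpy arrays of matching shapes."""
    # Perturbation = adversarial example minus the corresponding benign input.
    pert_train = (adv_train - clean_train).reshape(len(adv_train), -1)
    pert_test = (adv_test - clean_test).reshape(len(adv_test), -1)

    lda = LinearDiscriminantAnalysis(n_components=n_families - 1)
    reduced_train = lda.fit_transform(pert_train, attack_labels)
    reduced_test = lda.transform(pert_test)

    kmeans = KMeans(n_clusters=n_families, n_init=10, random_state=0)
    kmeans.fit(reduced_train)
    return kmeans.predict(reduced_test)
```

The returned cluster indices would still need to be mapped to attack names, for example by majority vote over the labeled training perturbations falling in each cluster.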

Detecting Adversarial Example Using Ensemble Method on Deep Neural Network

  • Kwon, Hyun;Yoon, Joonhyeok;Kim, Junseob;Park, Sangjun;Kim, Yongchul
    • Convergence Security Journal / v.21 no.2 / pp.57-66 / 2021
  • Deep neural networks (DNNs) provide excellent performance for image, speech, and pattern recognition. However, DNNs sometimes misrecognize certain adversarial examples. An adversarial example is a sample created by adding optimized noise to the original data, which causes the DNN to misclassify it even though it looks unchanged to the human eye. Therefore, studies on defense against adversarial example attacks are required. In this paper, we experimentally analyze the detection success rate for adversarial examples while adjusting various parameters. The performance of the ensemble defense method was analyzed using the fast gradient sign method, the DeepFool method, and the Carlini & Wagner method as adversarial example attacks. We used MNIST as the experimental data and TensorFlow as the machine learning library. The performance analysis covers the three attack methods, the threshold, the number of models, and random noise. As a result, with 7 models and a threshold of 1, the detection rate for adversarial examples is 98.3%, while 99.2% accuracy on the original samples is maintained.
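
The abstract does not spell out the voting rule, so the sketch below assumes a simple disagreement count over the ensemble's per-model predictions: with the reported setting of 7 models and a threshold of 1, a single dissenting model is enough to flag the input.

```python
from collections import Counter

def ensemble_detect(predictions: list[int], threshold: int = 1) -> bool:
    """Flag an input as adversarial when the number of ensemble members that
    disagree with the majority vote reaches the threshold."""
    _, majority_count = Counter(predictions).most_common(1)[0]
    return len(predictions) - majority_count >= threshold

# Example: seven models, one dissenting vote -> flagged as adversarial.
assert ensemble_detect([3, 3, 3, 3, 3, 3, 5], threshold=1)
assert not ensemble_detect([3, 3, 3, 3, 3, 3, 3], threshold=1)
```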

Adversarial Example Detection and Classification Model Based on the Class Predicted by Deep Learning Model

  • Ko, Eun-na-rae;Moon, Jong-sub
    • Journal of the Korea Institute of Information Security & Cryptology / v.31 no.6 / pp.1227-1236 / 2021
  • An adversarial attack, one of the attacks on deep learning classification models, adds indistinguishable perturbations to input data and causes the deep learning classification model to misclassify the input. There are various adversarial attack algorithms, and accordingly many studies have been conducted on detecting adversarial attacks, but few have attempted to classify which adversarial attack algorithm generated the adversarial input. If adversarial attacks can be classified, a more robust deep learning classification model can be built by analyzing the differences between attacks. In this paper, we propose a model that detects and classifies adversarial attacks by constructing a random forest classification model with input features extracted from a target deep learning model. The features are extracted from the output values of a hidden layer, based on the class predicted by the target deep learning model. In experiments, the proposed model showed 3.02% higher accuracy on clean data and 0.80% higher accuracy on adversarial data than previous studies, and classified new adversarial attacks that were not classified in previous studies.
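
A hedged sketch of the detection/classification stage: a random forest trained on hidden-layer activations of the target model, with the predicted class carried along as an extra feature. How exactly the paper conditions the features on the predicted class is not specified in the abstract, so the concatenation below is an assumption, and `extract_features` is a hypothetical interface to the target model.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def extract_features(model, images):
    """Placeholder: return (hidden_activations, predicted_classes) from the
    target deep learning model for a batch of inputs (hypothetical interface)."""
    raise NotImplementedError

def train_attack_classifier(hidden_activations: np.ndarray,
                            predicted_classes: np.ndarray,
                            attack_labels: np.ndarray) -> RandomForestClassifier:
    """Train a random forest to label each input as clean or as a particular
    adversarial attack, using hidden-layer features plus the predicted class."""
    features = np.concatenate(
        [hidden_activations, predicted_classes.reshape(-1, 1)], axis=1)
    clf = RandomForestClassifier(n_estimators=100, random_state=0)
    clf.fit(features, attack_labels)
    return clf
```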

Rapid Misclassification Sample Generation Attack on Deep Neural Network

  • Kwon, Hyun;Park, Sangjun;Kim, Yongchul
    • Convergence Security Journal / v.20 no.2 / pp.111-121 / 2020
  • Deep neural networks (DNNs) provide good performance for machine learning tasks such as image recognition and object recognition. However, DNNs are vulnerable to adversarial examples. An adversarial example is an attack sample that causes the neural network to misclassify it by adding minimal noise to the original sample. Its drawback is that it takes a long time to generate. Therefore, in some cases an attack is needed that quickly causes the neural network to misclassify. In this paper, we propose a fast misclassification sample generation method that can rapidly attack neural networks. The proposed method does not consider the distortion of the original sample when adding noise. We used MNIST and CIFAR10 as experimental data and TensorFlow as the machine learning library. Experimental results show that the fast misclassification samples generated by the proposed method require 50% and 80% fewer iterations for MNIST and CIFAR10, respectively, than the conventional Carlini method, and achieve a 100% attack success rate.
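
The abstract does not give the update rule, only that distortion is not optimized; the sketch below is therefore a generic iterative, loss-maximizing attack written in PyTorch (the paper itself used TensorFlow), stepping along the sign of the gradient until the single input in the batch is misclassified.

```python
import torch
import torch.nn.functional as F

def fast_misclassification_sample(model, x, label, step_size=0.05, max_iters=100):
    """Generate a misclassified sample quickly by repeatedly adding
    loss-increasing noise, without any distortion penalty.
    x: input tensor with a batch dimension of 1; label: tensor([class_index])."""
    adv = x.clone().detach()
    for _ in range(max_iters):
        adv.requires_grad_(True)
        loss = F.cross_entropy(model(adv), label)
        grad = torch.autograd.grad(loss, adv)[0]
        # Step in the direction that increases the classification loss.
        adv = (adv + step_size * grad.sign()).clamp(0, 1).detach()
        with torch.no_grad():
            if model(adv).argmax(dim=1).item() != label.item():
                break
    return adv
```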

Adversarial Example Detection Based on Symbolic Representation of Image

  • Park, Sohee;Kim, Seungjoo;Yoon, Hayeon;Choi, Daeseon
    • Journal of the Korea Institute of Information Security & Cryptology / v.32 no.5 / pp.975-986 / 2022
  • Deep learning is attracting great attention and shows excellent performance in image processing, but it is vulnerable to adversarial attacks that cause a model to misclassify through perturbations of the input data. Adversarial examples generated by adversarial attacks are perturbed so minimally that the changes are difficult to identify, so the visual features of the images are generally unchanged. Unlike deep learning models, people are not fooled by adversarial examples, because they classify images based on such visual features. This paper proposes an adversarial example detection method using Symbolic Representation, that is, visual and symbolic features of an image such as color and shape. We detect adversarial examples by comparing the Symbolic Representation derived from the classification result for the input image with the Symbolic Representation extracted from the input image itself. Measuring performance on adversarial examples generated by various attack methods, the detection rate differed depending on the attack target and method, but reached up to 99.02% for a specific targeted attack.
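
A very rough sketch of the comparison logic, with a hypothetical symbol table and a toy color extractor; the paper's actual Symbolic Representation (colors, shapes, and how symbols are derived from the classification result) is richer than this.

```python
import numpy as np

# Hypothetical mapping from predicted class to the symbols it implies.
EXPECTED_SYMBOLS = {
    "stop_sign": {"dominant_color": "red"},
    "grass": {"dominant_color": "green"},
}

def dominant_color(image: np.ndarray) -> str:
    """Toy extractor: report the RGB channel with the largest mean value."""
    channel_means = image.reshape(-1, 3).mean(axis=0)
    return ("red", "green", "blue")[int(np.argmax(channel_means))]

def detect_by_symbolic_mismatch(image: np.ndarray, predicted_class: str) -> bool:
    """Flag as adversarial when the symbols extracted from the image disagree
    with the symbols implied by the model's predicted class."""
    expected = EXPECTED_SYMBOLS.get(predicted_class, {})
    extracted = {"dominant_color": dominant_color(image)}
    return any(extracted.get(key) != value for key, value in expected.items())
```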

A Study on the Efficacy of Edge-Based Adversarial Example Detection Model: Across Various Adversarial Algorithms

  • Jaesung Shim;Kyuri Jo
    • Journal of the Korea Society of Computer and Information / v.29 no.2 / pp.31-41 / 2024
  • Deep learning models show excellent performance in tasks such as image classification and object detection in the field of computer vision, and are used in various ways at actual industrial sites. Recently, it has been pointed out that these deep learning models are vulnerable to adversarial examples, and research on improving their robustness has been actively conducted. An adversarial example is an image to which small noise is added to induce misclassification, and it can pose a significant threat when a deep learning model is applied to a real environment. In this paper, we evaluate the robustness of edge-learned classification models, and the performance of an adversarial example detection model built on them, against adversarial examples from various algorithms. In the robustness experiments, the basic classification model showed about 17% accuracy against the FGSM algorithm, while the edge-learned models maintained accuracy in the 60-70% range; against the PGD/DeepFool/CW algorithms, the basic classification model showed accuracy in the 0-1% range, while the edge-learned models maintained accuracy of 80-90%. In the adversarial example detection experiment, a high detection rate of 91-95% was confirmed for all of the FGSM/PGD/DeepFool/CW algorithms. By demonstrating the possibility of defending against various adversarial algorithms, this study is expected to improve the safety and reliability of deep learning models in the various industries that use computer vision.
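
A minimal sketch of the edge-learning preprocessing step, assuming OpenCV's Canny detector (the related entry below reports results for a Canny edge model; the thresholds here are illustrative). The classification model is then trained on these edge maps instead of raw pixels.

```python
import cv2
import numpy as np

def to_edge_input(image: np.ndarray, low: int = 100, high: int = 200) -> np.ndarray:
    """Convert an H x W x 3 uint8 RGB image into a Canny edge map so that a
    classifier can be trained on object contours rather than raw intensities."""
    gray = cv2.cvtColor(image, cv2.COLOR_RGB2GRAY)
    edges = cv2.Canny(gray, low, high)
    # Scale to [0, 1] and add a channel axis for the downstream classifier.
    return edges.astype(np.float32)[..., None] / 255.0
```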

Detecting Adversarial Examples Using Edge-based Classification

  • Jaesung Shim;Kyuri Jo
    • Journal of the Korea Society of Computer and Information / v.28 no.10 / pp.67-76 / 2023
  • Although deep learning models are making innovative achievements in the field of computer vision, the problem of their vulnerability to adversarial examples continues to be raised. Adversarial examples are an attack method that injects subtle noise into images to induce misclassification, and they can pose a serious threat to the application of deep learning models in the real world. In this paper, we propose a model that detects adversarial examples using the difference in predictions between an edge-learned classification model and the underlying classification model. The simple process of extracting the edges of objects and reflecting them in training can increase the robustness of the classification model, and economical and efficient detection is possible by flagging adversarial examples through differences in predictions between the models. In our experiments, the general model showed accuracies of {49.9%, 29.84%, 18.46%, 4.95%, 3.36%} on adversarial examples (eps = {0.02, 0.05, 0.1, 0.2, 0.3}), whereas the Canny edge model showed accuracies of {82.58%, 65.96%, 46.71%, 24.94%, 13.41%}, and the other edge models showed similar levels of accuracy, indicating that the edge models were more robust against adversarial examples. In addition, adversarial example detection using the difference in predictions between models yielded detection rates of {85.47%, 84.64%, 91.44%, 95.47%, 87.61%} for the adversarial examples at each epsilon. This study is expected to contribute to improving the reliability of deep learning models in related research and application industries such as medicine, autonomous driving, security, and national defense.
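
The detection rule itself reduces to a prediction-disagreement check between the two classifiers; below is a minimal sketch over batched logits (the array shapes and interface are assumptions).

```python
import numpy as np

def detect_by_model_disagreement(base_logits: np.ndarray,
                                 edge_logits: np.ndarray) -> np.ndarray:
    """Flag inputs where the base classifier and the edge-trained classifier
    predict different classes (both arguments shaped (N, num_classes)).
    Benign images tend to agree, while noise crafted against the base model
    often does not survive edge extraction."""
    return base_logits.argmax(axis=1) != edge_logits.argmax(axis=1)
```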

StarGAN-Based Detection and Purification Studies to Defend against Adversarial Attacks

  • Sungjune Park;Gwonsang Ryu;Daeseon Choi
    • Journal of the Korea Institute of Information Security & Cryptology / v.33 no.3 / pp.449-458 / 2023
  • Artificial intelligence provides convenience in various fields using big data and deep learning technologies. However, deep learning technology is highly vulnerable to adversarial examples, which can cause misclassification by classification models. This study proposes a method to detect and purify various adversarial attacks using StarGAN. The proposed method trains a StarGAN model with an added Categorical Entropy loss, using adversarial examples generated by various attack methods, so that the Discriminator can detect adversarial examples and the Generator can purify them. Experimental results on the CIFAR-10 dataset showed an average detection performance of approximately 68.77%, an average purification performance of approximately 72.20%, and an average defense performance of approximately 93.11%, derived from the restoration and detection performance.
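
The training details (StarGAN with an added Categorical Entropy loss) are beyond a short sketch, but the inference-time flow implied by the abstract can be outlined with placeholder `discriminator`, `generator`, and `classify` callables (all hypothetical interfaces).

```python
import numpy as np

def discriminator(images: np.ndarray) -> np.ndarray:
    """Placeholder: per-image adversarial score from the trained Discriminator."""
    raise NotImplementedError

def generator(images: np.ndarray) -> np.ndarray:
    """Placeholder: purified images from the trained Generator."""
    raise NotImplementedError

def defend(images: np.ndarray, classify, score_threshold: float = 0.5):
    """Detect adversarial inputs with the Discriminator, purify the flagged
    ones with the Generator, then classify the resulting batch."""
    flagged = discriminator(images) > score_threshold
    purified = images.copy()
    if flagged.any():
        purified[flagged] = generator(images[flagged])
    return classify(purified), flagged
```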