• Title/Summary/Keyword: data augmentation


A Substitute Model Learning Method Using Data Augmentation with a Decay Factor and Adversarial Data Generation Using Substitute Model (감쇠 요소가 적용된 데이터 어그멘테이션을 이용한 대체 모델 학습과 적대적 데이터 생성 방법)

  • Min, Jungki;Moon, Jong-sub
    • Journal of the Korea Institute of Information Security & Cryptology / v.29 no.6 / pp.1383-1392 / 2019
  • An adversarial attack, which generates adversarial data to make a target model misclassify its input, can disrupt real-life applications of classification models and cause severe damage to the classification system. A black-box adversarial attack trains a substitute model whose decision boundary is similar to that of the target model, and then generates adversarial data with the substitute model. Jacobian-based data augmentation is used to synthesize the training data for the substitute, but it has the drawback that the synthesized data become increasingly distorted as the training loop proceeds. We propose data augmentation with a 'decay factor' to alleviate this problem. The results show that the attack success rate of our method is higher (by around 8.5%) than that of the existing method.
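
The augmentation loop described in this abstract can be illustrated with a minimal sketch. It assumes a linear softmax substitute model F(x) = softmax(Wx + b) so that the Jacobian has a closed form, and it assumes a geometric schedule for the step size (`base_step * decay ** round_idx`). The abstract does not state how the decay factor is applied, so that schedule and all function names here are hypothetical, not the authors' method.

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax over the last axis.
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def jacobian_row(W, b, x, k):
    # Gradient of the class-k probability with respect to x
    # for the assumed substitute F(x) = softmax(W x + b).
    p = softmax(W @ x + b)
    return p[k] * (W[k] - p @ W)

def decayed_step(base_step, decay, round_idx):
    # Assumed schedule: the step shrinks geometrically each round.
    return base_step * decay ** round_idx

def augment(W, b, X, oracle_labels, step):
    # Jacobian-based augmentation: move each sample by `step` along the
    # sign of the gradient of its oracle-assigned class, and keep the
    # original samples as well, so the set doubles in size.
    new = [x + step * np.sign(jacobian_row(W, b, x, k))
           for x, k in zip(X, oracle_labels)]
    return np.vstack([X, np.array(new)])
```

In a full attack the substitute would be retrained on the enlarged set after every round, with labels obtained by querying the target model; that part is omitted here.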

Bio-signal Data Augumentation Technique for CNN based Human Activity Recognition (CNN 기반 인간 동작 인식을 위한 생체신호 데이터의 증강 기법)

  • Gerelbat BatGerel;Chun-Ki Kwon
    • Journal of the Institute of Convergence Signal Processing / v.24 no.2 / pp.90-96 / 2023
  • Securing a large amount of training data is important for deep neural networks, including convolutional neural networks, both to avoid overfitting and to achieve good performance. In practice, however, the availability of labeled training data is very limited. To overcome this, several augmentation methods have been proposed in the literature that generate a large amount of additional training data by transforming or manipulating the training data already acquired. However, unlike for training data such as images and text, it is hard to find an augmentation method in the literature that generates additional bio-signal training data for convolutional neural network based human activity recognition. Thus, this study proposes a simple but effective augmentation method for bio-signal training data for convolutional neural network based human activity recognition. The usefulness of the proposed augmentation method is validated by showing that a convolutional neural network trained with the augmented bio-signal training data recognizes human activity with high accuracy.

A study on the performance improvement of learning based on consistency regularization and unlabeled data augmentation (일치성규칙과 목표값이 없는 데이터 증대를 이용하는 학습의 성능 향상 방법에 관한 연구)

  • Kim, Hyunwoong;Seok, Kyungha
    • The Korean Journal of Applied Statistics / v.34 no.2 / pp.167-175 / 2021
  • Semi-supervised learning uses both labeled and unlabeled data. Consistency regularization has recently become very popular in semi-supervised learning. Unsupervised data augmentation (UDA), which augments unlabeled data, is also based on consistency regularization. In UDA training, the Kullback-Leibler divergence is used as the loss for unlabeled data and cross-entropy as the loss for labeled data. UDA uses techniques such as training signal annealing (TSA) and confidence-based masking to improve performance. In this study, we propose using the Jensen-Shannon divergence instead of the Kullback-Leibler divergence, using reverse-TSA, and not using confidence-based masking. Through experiments, we show that the proposed technique performs better than UDA.
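
To make the difference between the two divergences concrete, the following is a small NumPy sketch of a consistency loss between predictions on an unlabeled sample and on its augmented copy. The function names and the plain mean over the batch are illustrative assumptions; TSA, reverse-TSA, and confidence-based masking are not modeled.

```python
import numpy as np

def softmax(logits):
    # Numerically stable softmax over the last axis.
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def kl_divergence(p, q, eps=1e-12):
    # KL(p || q) per row; clipping avoids log(0).
    p = np.clip(p, eps, 1.0)
    q = np.clip(q, eps, 1.0)
    return np.sum(p * np.log(p / q), axis=-1)

def js_divergence(p, q, eps=1e-12):
    # Jensen-Shannon divergence: symmetric and bounded above by ln 2.
    m = 0.5 * (p + q)
    return 0.5 * kl_divergence(p, m, eps) + 0.5 * kl_divergence(q, m, eps)

def consistency_loss(logits_orig, logits_aug, divergence=js_divergence):
    # Mean divergence between predictions on the original unlabeled
    # inputs and on their augmented versions.
    return float(np.mean(divergence(softmax(logits_orig), softmax(logits_aug))))
```

Passing `divergence=kl_divergence` gives the Kullback-Leibler variant, which, unlike the Jensen-Shannon version, changes value when the two arguments are swapped.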

Data augmentation in voice spoofing problem (데이터 증강기법을 이용한 음성 위조 공격 탐지모형의 성능 향상에 대한 연구)

  • Choi, Hyo-Jung;Kwak, Il-Youp
    • The Korean Journal of Applied Statistics / v.34 no.3 / pp.449-460 / 2021
  • ASVspoof 2017 deals with the detection of replay attacks and aims to distinguish real human voices from fake voices. A spoofed voice is one that reproduces the original voice through different types of microphones and speakers. Data augmentation research on image data has been actively conducted, and several studies have attempted data augmentation on voice. However, there have been few attempts to augment data for voice replay attacks, so this paper explores how audio modification through data augmentation techniques affects the detection of replay attacks. A total of 7 data augmentation techniques were applied, and among them, dynamic value change (DVC) and pitch techniques helped improve performance. DVC and pitch improved the base model EER by about 8%, and DVC in particular showed a noticeable improvement in accuracy in some environments among the 57 replay configurations. The greatest increase was achieved in RC53, where DVC led to an approximately 45% improvement in base model accuracy. High-end recording and playback devices that were previously difficult to detect were well identified. Based on this study, we found that the DVC and pitch data augmentation techniques help improve performance in the voice spoofing detection problem.

Development of Augmentation Method of Ballistic Missile Trajectory using Variational Autoencoder (변이형 오토인코더를 이용한 탄도미사일 궤적 증강기법 개발)

  • Dong Kyu Lee;Dong Wg Hong
    • Journal of the Korean Society of Systems Engineering / v.19 no.2 / pp.145-156 / 2023
  • The trajectory of a ballistic missile is defined by its inherent flight dynamics, which determine its range and maneuvering characteristics. Predicting the range and maneuvering characteristics of ballistic missiles is crucial in KAMD (Korea Air and Missile Defense) to minimize damage from ballistic missile attacks. The need to apply AI (Artificial Intelligence) technologies is increasing due to rapid developments in DNN (Deep Neural Network) technologies. Applying these DNN technologies requires a large amount of data for supervised learning, but trajectory data of ballistic missiles is limited because of security issues. Trajectory data can be considered a multivariate time series comprising many variables, and augmentation of time series data is a developing area of research. In this paper, we tried to augment trajectory data of ballistic missiles using recently developed methods. We used the TimeVAE (Time Variational AutoEncoder) and TimeGAN (Time Generative Adversarial Networks) methods to synthesize missile trajectory data. We also compare the results of the two methods and analyze them for future work.

Data Augmentation Techniques for Deep Learning-Based Medical Image Analyses (딥러닝 기반 의료영상 분석을 위한 데이터 증강 기법)

  • Mingyu Kim;Hyun-Jin Bae
    • Journal of the Korean Society of Radiology / v.81 no.6 / pp.1290-1304 / 2020
  • Medical image analyses have been widely used to differentiate normal and abnormal cases, detect lesions, segment organs, etc. Recently, owing to many breakthroughs in artificial intelligence techniques, medical image analyses based on deep learning have been actively studied. However, sufficient medical data are difficult to obtain, and data imbalance between classes hinders the improvement of deep learning performance. To resolve these issues, various studies have been performed, and data augmentation has been found to be a solution. In this review, we introduce data augmentation techniques, including image processing methods such as rotation, shift, and intensity variation, generative adversarial network-based methods, and image property mixing methods. Subsequently, we examine various deep learning studies based on data augmentation techniques. Finally, we discuss the necessity and future directions of data augmentation.

A Study on Visual Emotion Classification using Balanced Data Augmentation (균형 잡힌 데이터 증강 기반 영상 감정 분류에 관한 연구)

  • Jeong, Chi Yoon;Kim, Mooseop
    • Journal of Korea Multimedia Society / v.24 no.7 / pp.880-889 / 2021
  • In everyday life, recognizing people's emotions from their frames is essential and is a popular research domain in the area of computer vision. Visual emotion data have a severe class imbalance in which most of the data are distributed in specific categories. Existing methods do not consider class imbalance and use accuracy as the performance metric, which is not suitable for evaluating performance on an imbalanced dataset. Therefore, we propose a method for recognizing visual emotion using balanced data augmentation to address the class imbalance. The proposed method generates a balanced dataset by adopting random over-sampling and image transformation methods. The proposed method also uses the Focal loss as its loss function, which can mitigate the class imbalance by down-weighting the well-classified samples. EfficientNet, which is the state-of-the-art method for image classification, is used to recognize visual emotion. We compare the performance of the proposed method with that of conventional methods using a public dataset. The experimental results show that the proposed method increases the F1 score by 40% compared with the method without data augmentation, mitigating class imbalance without loss of classification accuracy.
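
The two imbalance-handling ingredients named in this abstract, random over-sampling and the Focal loss, can be sketched in a few lines of NumPy. This is an illustrative sketch only: the function names, the default `gamma=2.0`, and the choice to balance every class up to the size of the largest class are assumptions, and the image transformations that would be applied to the duplicated samples are left out.

```python
import numpy as np

def focal_loss(probs, labels, gamma=2.0, eps=1e-12):
    # probs: (N, C) predicted class probabilities; labels: (N,) int classes.
    # The factor (1 - p_t)^gamma down-weights well-classified samples;
    # gamma = 0 reduces to ordinary cross-entropy.
    labels = np.asarray(labels)
    p_t = np.clip(probs[np.arange(len(labels)), labels], eps, 1.0)
    return float(np.mean(-((1.0 - p_t) ** gamma) * np.log(p_t)))

def random_oversample(labels, rng):
    # Return sample indices in which every class appears as often as the
    # largest class, by re-drawing minority samples with replacement.
    labels = np.asarray(labels)
    classes, counts = np.unique(labels, return_counts=True)
    target = counts.max()
    picked = []
    for c, n in zip(classes, counts):
        idx = np.flatnonzero(labels == c)
        extra = rng.choice(idx, size=target - n, replace=True)
        picked.append(np.concatenate([idx, extra]))
    return np.concatenate(picked)
```

A typical use would be `idx = random_oversample(y, np.random.default_rng(0))` followed by indexing the image array with `idx` and applying a random transformation to each repeated image, so that duplicates are not pixel-identical.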

Text Augmentation Using Hierarchy-based Word Replacement

  • Kim, Museong;Kim, Namgyu
    • Journal of the Korea Society of Computer and Information / v.26 no.1 / pp.57-67 / 2021
  • Recently, multi-modal deep learning techniques that combine heterogeneous data for deep learning analysis have been widely used. In particular, studies on text-to-image synthesis, which automatically generates images from text, are being actively conducted. Deep learning for image synthesis requires a vast amount of data consisting of pairs of images and text describing each image. Therefore, various data augmentation techniques have been devised to generate a large amount of data from a small dataset. A number of text augmentation techniques based on synonym replacement have been proposed so far. However, these techniques share a limitation: replacing a noun with a synonym may produce text that no longer matches the content of the image. In this study, we propose a text augmentation method that replaces nouns using word hierarchy information. Additionally, we performed experiments using MSCOCO data to evaluate the performance of the proposed methodology.
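
One way to read "replacement using word hierarchy information" is to swap a noun for a more general term above it in a hierarchy, which keeps a caption consistent with its image. The sketch below follows that reading with a tiny hand-written hierarchy. The dictionary `TOY_HYPERNYMS`, the function name, and the whitespace tokenization are all hypothetical; the abstract does not specify which hierarchy or replacement rule the authors use.

```python
# Hypothetical toy hierarchy: each noun maps to its parent (more general) term.
TOY_HYPERNYMS = {
    "puppy": "dog",
    "dog": "animal",
    "kitten": "cat",
    "cat": "animal",
    "sedan": "car",
    "car": "vehicle",
}

def augment_caption(caption, hypernyms, levels=1):
    # Replace every token found in the hierarchy by the term `levels`
    # steps above it; tokens outside the hierarchy are left unchanged.
    out = []
    for token in caption.split():
        word = token
        for _ in range(levels):
            word = hypernyms.get(word, word)
        out.append(word)
    return " ".join(out)
```

For example, `augment_caption("the puppy sits in the sedan", TOY_HYPERNYMS)` yields a caption that still describes the same image, only less specifically.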

3D Medical Image Data Augmentation for CT Image Segmentation (CT 이미지 세그멘테이션을 위한 3D 의료 영상 데이터 증강 기법)

  • Seonghyeon Ko;Huigyu Yang;Moonseong Kim;Hyunseung Choo
    • Journal of Internet Computing and Services / v.24 no.4 / pp.85-92 / 2023
  • Deep learning applications are increasingly being leveraged for disease detection tasks in medical imaging modalities such as X-ray, Computed Tomography (CT), and Magnetic Resonance Imaging (MRI). Most data-centric deep learning challenges necessitate the use of supervised learning methodologies to attain high accuracy and to facilitate performance evaluation through comparison with the ground truth. Supervised learning requires a substantial number of image and label sets; however, procuring an adequate volume of medical imaging data for training is a formidable task. Various data augmentation strategies can mitigate the underfitting issue inherent in supervised learning-based models that are trained on limited medical image and label sets. This research investigates the enhancement of a deep learning-based rib fracture segmentation model and the efficacy of data augmentation techniques such as left-right flipping, rotation, and scaling. Datasets augmented with L/R flipping and rotations (30°, 60°) increased model performance, whereas datasets augmented with 90° rotation and ×0.5 rescaling decreased it. This indicates that data augmentation methods should be chosen to suit the dataset and the task.
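
For segmentation, any geometric augmentation has to be applied identically to the image volume and its label mask. The sketch below shows this for left-right flipping and for a 90° in-plane rotation using NumPy only. It assumes volumes are stored as (depth, height, width) with left-right along the last axis, and the function names are illustrative. Rotations by 30° or 60° and rescaling require interpolation and are not shown.

```python
import numpy as np

def flip_left_right(volume, mask):
    # Mirror both arrays along the assumed left-right axis (last axis).
    return np.flip(volume, axis=2).copy(), np.flip(mask, axis=2).copy()

def rotate_90_axial(volume, mask, k=1):
    # Rotate every axial slice by k * 90 degrees in the (height, width)
    # plane, applying the same rotation to the label mask.
    rv = np.rot90(volume, k=k, axes=(1, 2)).copy()
    rm = np.rot90(mask, k=k, axes=(1, 2)).copy()
    return rv, rm
```

Because both functions transform the image and mask together, a voxel labeled as fracture before augmentation is still labeled as fracture afterwards.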

GAN based Data Augmentation of Channel Data for the Application of RF Finger-printing in NFC (NFC에서 무선 핑거프린팅 기술 적용을 위한 GAN 기반 채널데이터 증강방안)

  • Lee, Woongsup
    • Journal of the Korea Institute of Information and Communication Engineering / v.25 no.9 / pp.1271-1274 / 2021
  • RF fingerprinting based on deep learning (DL) has gained interest as a means to improve the security of near field communication (NFC) by allowing identification of NFC tags based on their unique physical characteristics. To achieve high accuracy in identifying NFC tags, it is crucial to use a large amount of training data; however, such a dataset is hard to collect in practice. In this study, we present a new methodology for generating RF waveforms of NFC tags, i.e., data augmentation, based on a conditional generative adversarial network (CGAN). Using RF waveforms of NFC tags collected from a testbed with a software defined radio (SDR), we confirmed that realistic RF waveforms can be generated with our proposed scheme.