• 제목/요약/키워드: data augmentation

검색결과 572건 처리시간 0.025초

딥러닝을 이용한 당뇨성황반부종 등급 분류의 정확도 개선을 위한 검증 데이터 증강 기법 (Validation Data Augmentation for Improving the Grading Accuracy of Diabetic Macular Edema using Deep Learning)

  • 이태수
    • 대한의용생체공학회:의공학회지
    • /
    • 제40권2호
    • /
    • pp.48-54
    • /
    • 2019
  • This paper proposed a method of validation data augmentation for improving the grading accuracy of diabetic macular edema (DME) using deep learning. The data augmentation technique is basically applied in order to secure diversity of data by transforming one image to several images through random translation, rotation, scaling and reflection in preparation of input data of the deep neural network (DNN). In this paper, we apply this technique in the validation process of the trained DNN, and improve the grading accuracy by combining the classification results of the augmented images. To verify the effectiveness, 1,200 retinal images of Messidor dataset was divided into training and validation data at the ratio 7:3. By applying random augmentation to 359 validation data, $1.61{\pm}0.55%$ accuracy improvement was achieved in the case of six times augmentation (N=6). This simple method has shown that the accuracy can be improved in the N range from 2 to 6 with the correlation coefficient of 0.5667. Therefore, it is expected to help improve the diagnostic accuracy of DME with the grading information provided by the proposed DNN.

준 지도 이상 탐지 기법의 성능 향상을 위한 섭동을 활용한 초구 기반 비정상 데이터 증강 기법 (Abnormal Data Augmentation Method Using Perturbation Based on Hypersphere for Semi-Supervised Anomaly Detection)

  • 정병길;권준형;민동준;이상근
    • 정보보호학회논문지
    • /
    • 제32권4호
    • /
    • pp.647-660
    • /
    • 2022
  • 최근 정상 데이터와 일부 비정상 데이터를 보유한 환경에서 딥러닝 기반 준 지도 학습 이상 탐지 기법이 매우 효과적으로 동작함이 알려져 있다. 하지만 사이버 보안 분야와 같이 실제 시스템에 대한 알려지지 않은 공격 등 비정상 데이터 확보가 어려운 환경에서는 비정상 데이터 부족이 발생할 가능성이 있다. 본 논문은 비정상 데이터가 정상 데이터보다 극히 작은 환경에서 준 지도 이상 탐지 기법에 적용 가능한 섭동을 활용한 초구 기반 비정상 데이터 증강 기법인 ADA-PH(Abnormal Data Augmentation Method using Perturbation based on Hypersphere)를 제안한다. ADA-PH는 정상 데이터를 잘 표현할 수 있는 초구의 중심으로부터 상대적으로 먼 거리에 위치한 샘플에 대해 적대적 섭동을 추가함으로써 비정상 데이터를 생성한다. 제안하는 기법은 비정상 데이터가 극소수로 존재하는 네트워크 침입 탐지 데이터셋에 대하여 데이터 증강을 수행하지 않았을 경우보다 평균적으로 23.63% 향상된 AUC가 도출되었고, 다른 증강 기법들과 비교했을 때 가장 높은 AUC가 또한 도출되었다. 또한, 실제 비정상 데이터에 유사한지에 대한 정량적 및 정성적 분석을 수행하였다.

안저 영상에서 당뇨병 망막병증 등급을 위한 data augmentation (Data Augmentation for Diabetic Retinopathy Grading in Fundus Images)

  • ;추현승
    • 한국정보처리학회:학술대회논문집
    • /
    • 한국정보처리학회 2022년도 추계학술발표대회
    • /
    • pp.556-558
    • /
    • 2022
  • Diabetic retinopathy (DR) is one of the leading diseases causing vision loss. Early detection of this disease has a crucial role in protecting patients' eyes. Recent works have achieved impressive result when performing DR detection on fundus images using deep learning. In the deep learning-based approach, data augmentation has significant impact on the result. Recently, many data augmentation policies have been proposed and achieved state-of-the-art performance on different tasks. In this work, we compare effects of three data augmentation policies on DR grading in fundus images.

변분 오토인코더와 비교사 데이터 증강을 이용한 음성인식기 준지도 학습 (Semi-supervised learning of speech recognizers based on variational autoencoder and unsupervised data augmentation)

  • 조현호;강병옥;권오욱
    • 한국음향학회지
    • /
    • 제40권6호
    • /
    • pp.578-586
    • /
    • 2021
  • 종단간 음성인식기의 성능향상을 위한 변분 오토인코더(Variational AutoEncoder, VAE) 및 비교사 데이터 증강(Unsupervised Data Augmentation, UDA) 기반의 준지도 학습 방법을 제안한다. 제안된 방법에서는 먼저 원래의 음성데이터를 이용하여 VAE 기반 증강모델과 베이스라인 종단간 음성인식기를 학습한다. 그 다음, 학습된 증강모델로부터 증강된 데이터를 이용하여 베이스라인 종단간 음성인식기를 다시 학습한다. 마지막으로, 학습된 증강모델 및 종단간 음성인식기를 비교사 데이터 증강 기반의 준지도 학습 방법으로 다시 학습한다. 컴퓨터 모의실험 결과, 증강모델은 기존의 종단간 음성인식기의 단어오류율(Word Error Rate, WER)을 개선하였으며, 비교사 데이터 증강학습방법과 결합함으로써 성능을 더욱 개선하였다.

A Study on Improving the Accuracy of Medical Images Classification Using Data Augmentation

  • Cheon-Ho Park;Min-Guan Kim;Seung-Zoon Lee;Jeongil Choi
    • 한국컴퓨터정보학회논문지
    • /
    • 제28권12호
    • /
    • pp.167-174
    • /
    • 2023
  • 본 연구는 합성곱 신경망 모델에서 이미지 데이터 증강을 통하여 대장암 진단 모델의 정확도를 개선하고자 하였다. 이미지 데이터 증강은 기초 이미지 조작 방법을 이용하여 뒤집기, 회전, 이동, 밀림, 주밍을 사용하였다. 본 연구에서는 실험설계를 위해 보유하고 있는 5000개의 이미지 데이터에 대해 훈련 데이터와 평가 데이터로 각각 4000개와 1000개로 나누었으며, 훈련 데이터 4000개에 대해 이미지 데이터 증강 기법으로 4000개와 8000개의 이미지를 추가하여 모델을 학습시켰다. 평가 결과는 훈련 데이터 4000개, 8000개, 12000개에 대한 분류 정확도가 각각 85.1%, 87.0%, 90.2%로 나왔으며 이미지 데이터 증강에 따른 개선 효과를 확인하였다.

표면 결함 검출을 위한 데이터 확장 및 성능분석 (Performance Analysis of Data Augmentation for Surface Defects Detection)

  • 김준봉;서기성
    • 전기학회논문지
    • /
    • 제67권5호
    • /
    • pp.669-674
    • /
    • 2018
  • Data augmentation is an efficient way to reduce overfitting on models and to improve a performance supplementing extra data for training. It is more important in deep learning based industrial machine vision. Because deep learning requires huge scale of learning data to learn a model, but acquisition of data can be limited in most of industrial applications. A very generic method for augmenting image data is to perform geometric transformations, such as cropping, rotating, translating and adjusting brightness of the image. The effectiveness of data augmentation in image classification has been reported, but it is rare in defect inspections. We explore and compare various basic augmenting operations for the metal surface defects. The experiments were executed for various types of defects and different CNN networks and analysed for performance improvements by the data augmentations.

소량 데이터 딥러닝 기반 강판 표면 결함 검출 시스템 개발 (Development of a Steel Plate Surface Defect Detection System Based on Small Data Deep Learning)

  • 게이뷸라예프 압둘라지즈;이나현;이기환;김태형
    • 대한임베디드공학회논문지
    • /
    • 제17권3호
    • /
    • pp.129-138
    • /
    • 2022
  • Collecting and labeling sufficient training data, which is essential to deep learning-based visual inspection, is difficult for manufacturers to perform because it is very expensive. This paper presents a steel plate surface defect detection system with industrial-grade detection performance by training a small amount of steel plate surface images consisting of labeled and non-labeled data. To overcome the problem of lack of training data, we propose two data augmentation techniques: program-based augmentation, which generates defect images in a geometric way, and generative model-based augmentation, which learns the distribution of labeled data. We also propose a 4-step semi-supervised learning using pseudo labels and consistency training with fixed-size augmentation in order to utilize unlabeled data for training. The proposed technique obtained about 99% defect detection performance for four defect types by using 100 real images including labeled and unlabeled data.

Experimental Analysis of Equilibrization in Binary Classification for Non-Image Imbalanced Data Using Wasserstein GAN

  • Wang, Zhi-Yong;Kang, Dae-Ki
    • International Journal of Internet, Broadcasting and Communication
    • /
    • 제11권4호
    • /
    • pp.37-42
    • /
    • 2019
  • In this paper, we explore the details of three classic data augmentation methods and two generative model based oversampling methods. The three classic data augmentation methods are random sampling (RANDOM), Synthetic Minority Over-sampling Technique (SMOTE), and Adaptive Synthetic Sampling (ADASYN). The two generative model based oversampling methods are Conditional Generative Adversarial Network (CGAN) and Wasserstein Generative Adversarial Network (WGAN). In imbalanced data, the whole instances are divided into majority class and minority class, where majority class occupies most of the instances in the training set and minority class only includes a few instances. Generative models have their own advantages when they are used to generate more plausible samples referring to the distribution of the minority class. We also adopt CGAN to compare the data augmentation performance with other methods. The experimental results show that WGAN-based oversampling technique is more stable than other approaches (RANDOM, SMOTE, ADASYN and CGAN) even with the very limited training datasets. However, when the imbalanced ratio is too small, generative model based approaches cannot achieve satisfying performance than the conventional data augmentation techniques. These results suggest us one of future research directions.

SinGAN기반 데이터 증강과 random forest알고리즘을 이용한 고무 오링 결함 검출 시스템 (A rubber o-ring defect detection system using data augmentation based on the SinGAN and random forest algorithm)

  • 이용은;이한성;김대원;김경천
    • 한국가시화정보학회지
    • /
    • 제19권3호
    • /
    • pp.63-68
    • /
    • 2021
  • In this study, data was augmentation through the SinGAN algorithm using small image data, and defects in rubber O-rings were detected using the random forest algorithm. Unlike the commonly used data augmentation image rotation method to solve the data imbalance problem, the data imbalance problem was solved by using the SinGAN algorithm. A study was conducted to distinguish between normal products and defective products of rubber o-ring by using the random forest algorithm. A total of 20,000 image date were divided into transit and testing datasets, and an accuracy result was obtained to distinguish 97.43% defects as a result of the test.

1D-CNN-LSTM Hybrid-Model-Based Pet Behavior Recognition through Wearable Sensor Data Augmentation

  • Hyungju Kim;Nammee Moon
    • Journal of Information Processing Systems
    • /
    • 제20권2호
    • /
    • pp.159-172
    • /
    • 2024
  • The number of healthcare products available for pets has increased in recent times, which has prompted active research into wearable devices for pets. However, the data collected through such devices are limited by outliers and missing values owing to the anomalous and irregular characteristics of pets. Hence, we propose pet behavior recognition based on a hybrid one-dimensional convolutional neural network (CNN) and long short- term memory (LSTM) model using pet wearable devices. An Arduino-based pet wearable device was first fabricated to collect data for behavior recognition, where gyroscope and accelerometer values were collected using the device. Then, data augmentation was performed after replacing any missing values and outliers via preprocessing. At this time, the behaviors were classified into five types. To prevent bias from specific actions in the data augmentation, the number of datasets was compared and balanced, and CNN-LSTM-based deep learning was performed. The five subdivided behaviors and overall performance were then evaluated, and the overall accuracy of behavior recognition was found to be about 88.76%.