• Title/Abstract/Keywords: data augmentation method

Search results: 203 items (processing time: 0.026 s)

K-겹 교차 검증과 서포트 벡터 머신을 이용한 고무 오링결함 검출 시스템 (Rubber O-ring defect detection system using K-fold cross validation and support vector machine)

  • 이용은;최낙준;변영후;김대원;김경천
    • 한국가시화정보학회지 / Vol. 19, No. 1 / pp. 68-73 / 2021
  • In this study, rubber O-ring defects are detected using k-fold cross validation and a Support Vector Machine (SVM). The data are processed in three steps. First, frame alignment removes regions that are unnecessary for learning. Second, the images are converted to grayscale to reduce computation. Finally, image augmentation is applied to prevent overfitting. After processing, an SVM is used to obtain the accuracy of classifying normal and defective parts. The SVM is also trained with k-fold cross validation to compare classification accuracy. The results show that applying k-fold cross validation yields better performance.
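
The k-fold procedure mentioned in the abstract above can be sketched as follows. This is a minimal illustration, not the authors' code: the function names, the fixed random seed, and the `fit`/`predict` callables (stand-ins for any classifier, such as an SVM) are all assumptions introduced here.

```python
import random

def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_idx, test_idx) pairs for k-fold cross validation."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must satisfy 2 <= k <= n_samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Striding by k gives folds whose sizes differ by at most one.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test_idx = folds[i]
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, test_idx

def cross_validate(features, labels, k, fit, predict):
    """Return the mean test accuracy over k folds.

    `fit` and `predict` are hypothetical callables standing in for a
    classifier; they are not taken from the paper.
    """
    scores = []
    for train_idx, test_idx in k_fold_indices(len(labels), k):
        model = fit([features[i] for i in train_idx],
                    [labels[i] for i in train_idx])
        hits = sum(predict(model, features[i]) == labels[i] for i in test_idx)
        scores.append(hits / len(test_idx))
    return sum(scores) / len(scores)
```

Each sample is used for testing exactly once, so the averaged accuracy depends less on one particular train/test split than a single hold-out estimate does.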

히스토그램 등화와 데이터 증강 기법을 이용한 개선된 음성 감정 인식 (Improved speech emotion recognition using histogram equalization and data augmentation techniques)

  • 허운행;권오욱
    • 말소리와 음성과학 / Vol. 9, No. 2 / pp. 77-83 / 2017
  • We propose a new method to reduce emotion recognition errors caused by variation in speaker characteristics and speech rate. First, to reduce variation in speaker characteristics, we adjust the features of a test speaker to fit the distribution of all training data using the histogram equalization (HE) algorithm. Second, to deal with variation in speech rate, we augment the training data with speech generated at various speech rates. In computer experiments using EMO-DB, KRN-DB, and eNTERFACE-DB, the proposed method improves weighted accuracy by 34.7%, 23.7%, and 28.1% in relative terms, respectively.
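
One common way to realize histogram equalization of a single feature dimension is empirical quantile mapping, sketched below. The abstract does not specify the exact variant used, so the function name and the rank-based mapping are assumptions made for illustration only.

```python
import bisect

def histogram_equalize(test_values, train_values):
    """Map each test value to the training value at the same empirical quantile.

    Assumed variant: rank-based quantile mapping on one feature dimension.
    """
    train_sorted = sorted(train_values)
    test_sorted = sorted(test_values)
    n_train, n_test = len(train_sorted), len(test_sorted)
    mapped = []
    for v in test_values:
        # Rank of v among the test speaker's own values (1..n_test).
        rank = bisect.bisect_right(test_sorted, v)
        # Index of the matching training quantile, using integer ceiling
        # division to avoid floating-point rounding issues.
        idx = (rank * n_train + n_test - 1) // n_test - 1
        mapped.append(train_sorted[idx])
    return mapped
```

The mapping is monotonic, so the ordering of a speaker's feature values is preserved while their distribution is pulled toward that of the training data.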

A Bayesian joint model for continuous and zero-inflated count data in developmental toxicity studies

  • Hwang, Beom Seuk
    • Communications for Statistical Applications and Methods / Vol. 29, No. 2 / pp. 239-250 / 2022
  • In many applications, we encounter correlated multiple outcomes measured on the same subject. Joint modeling of such outcomes can improve the efficiency of inference compared to independent modeling. For instance, in developmental toxicity studies, fetal weight and the number of malformed pups are measured on pregnant dams exposed to different levels of a toxic substance, and the association between these outcomes should be taken into account in the model. The number of malformations may contain many zeros, which should be analyzed with zero-inflated count models. Motivated by applications in developmental toxicity studies, we propose a Bayesian joint modeling framework for continuous and count outcomes with excess zeros. In our model, a zero-inflated Poisson (ZIP) regression model describes the count data, and subject-specific random effects account for the correlation between the two outcomes. We implement a Bayesian approach using an MCMC procedure with a data augmentation method and adaptive rejection sampling. We apply the proposed model to a dose-response analysis in a developmental toxicity study to estimate the benchmark dose for risk assessment.
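
For reference, the standard zero-inflated Poisson probability mass function can be written as a short sketch. The parameter names `lam` (Poisson rate) and `pi` (extra-zero probability) are chosen here for illustration; the regression structure and random effects of the paper's joint model are not reproduced.

```python
import math

def zip_pmf(k, lam, pi):
    """P(Y = k) for a zero-inflated Poisson distribution.

    lam: Poisson rate (> 0); pi: probability of a structural zero (0..1).
    """
    if k < 0 or lam <= 0 or not 0.0 <= pi <= 1.0:
        raise ValueError("require k >= 0, lam > 0, and 0 <= pi <= 1")
    poisson = math.exp(-lam) * lam ** k / math.factorial(k)
    if k == 0:
        # A zero arises either structurally or from the Poisson component.
        return pi + (1.0 - pi) * poisson
    return (1.0 - pi) * poisson
```

With `pi = 0` this reduces to the ordinary Poisson distribution, and the mean of the mixture is `(1 - pi) * lam`.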

Research on data augmentation algorithm for time series based on deep learning

  • Shiyu Liu;Hongyan Qiao;Lianhong Yuan;Yuan Yuan;Jun Liu
    • KSII Transactions on Internet and Information Systems (TIIS) / Vol. 17, No. 6 / pp. 1530-1544 / 2023
  • Data monitoring is an important foundation of modern science. In most cases, monitoring data are time series with high application value. Deep learning algorithms have a strong nonlinear fitting capability, which enables time series recognition by capturing anomalous information in the series. Research on deep learning-based time series recognition is therefore especially important for data monitoring. Deep learning algorithms require a large amount of training data. However, abnormal samples are rare in time series, and the resulting class imbalance can seriously reduce the accuracy of the recognition algorithm. To increase the number of abnormal samples, a data augmentation method called GANBATS (GAN-based Bi-LSTM and Attention for Time Series) is proposed. In GANBATS, a Bi-LSTM extracts temporal features and passes them to the generator network. GANBATS also modifies the discriminator network by adding an attention mechanism to achieve global attention over the time series. At the end of the discriminator, GANBATS adds an average pooling layer, which merges temporal features to improve operational efficiency. Four time series datasets and five data augmentation algorithms are used in comparison experiments. The generated data are evaluated with PRD (Percent Root Mean Square Difference) and DTW (Dynamic Time Warping). The experimental results show that GANBATS reduces the PRD metric by up to 26.22 and the DTW metric by up to 9.45. In addition, the datasets are reconstructed with the different algorithms and compared by classification accuracy. The classification accuracy improves by 6.44%-12.96% on the four time series datasets.
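
The two evaluation metrics named in the abstract above can be sketched with their commonly used definitions. The abstract does not give the formulas, so the PRD normalization and the absolute-difference local cost in DTW are assumptions, as are the function names.

```python
import math

def prd(original, generated):
    """Percent root-mean-square difference between two equal-length series."""
    if len(original) != len(generated):
        raise ValueError("series must have equal length")
    num = sum((o - g) ** 2 for o, g in zip(original, generated))
    den = sum(o ** 2 for o in original)
    return 100.0 * math.sqrt(num / den)

def dtw_distance(a, b):
    """Dynamic time warping distance with absolute-difference local cost."""
    inf = float("inf")
    # cost[i][j] holds the best alignment cost of a[:i] with b[:j].
    cost = [[inf] * (len(b) + 1) for _ in range(len(a) + 1)]
    cost[0][0] = 0.0
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # stretch b
                                 cost[i][j - 1],      # stretch a
                                 cost[i - 1][j - 1])  # advance both
    return cost[len(a)][len(b)]
```

Lower values of both metrics indicate that a generated series is closer to the reference; DTW additionally tolerates local shifts along the time axis.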

Data Augmentation Techniques of Power Facilities for Improve Deep Learning Performance

  • 장승민;손승우;김봉석
    • KEPCO Journal on Electric Power and Energy / Vol. 7, No. 2 / pp. 323-328 / 2021
  • Diagnostic models are required. Data augmentation is one of the best ways to improve deep learning performance. Traditional augmentation techniques that modify image brightness or spatial information rarely achieve large improvements. To overcome this, generative adversarial network (GAN) technology, which generates virtual data to increase deep learning performance, has emerged. A GAN can create realistic-looking fake images by training two networks competitively: a generator that creates fakes and a discriminator that determines whether an image is real or was made by the generator. GANs are used in computer vision, IT solutions, and medical imaging. It is essential to secure additional training data to advance deep learning-based fault diagnosis solutions in the power industry, where facilities are maintained more strictly than in other industries. In this paper, we propose a method for generating power facility images using a GAN and a strategy for improving performance when only a small amount of data is available. Finally, we analyze the augmented images to determine whether they can be used in a deep learning-based diagnosis system.

RAYLEIGH와 ERLANG 추세를 가진 혼합 고장모형에 대한 베이지안 추론에 관한 연구 (Bayesian Inference for Mixture Failure Model of Rayleigh and Erlang Pattern)

  • 김희철;이승주
    • 응용통계연구 / Vol. 13, No. 2 / pp. 505-514 / 2000
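  • Among Markov chain Monte Carlo methods, Gibbs sampling is applied to a mixture failure model. Taking into account the computational problems and theoretical justification involved in determining the posterior distribution from conditional distributions in Bayesian inference, we use the Gibbs sampling algorithm to examine Bayesian computation and the reliability trend for a mixture model with Rayleigh and Erlang patterns, which belong to the gamma family. Numerical computations are carried out on simulated data, and the results are presented.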

A Practical Implementation of Deep Learning Method for Supporting the Classification of Breast Lesions in Ultrasound Images

  • Han, Seokmin;Lee, Suchul;Lee, Jun-Rak
    • International journal of advanced smart convergence / Vol. 8, No. 1 / pp. 24-34 / 2019
  • In this research, we propose a practical deep learning framework to differentiate lesions and nodules in the breast imaged with ultrasound. A total of 7408 ultrasound breast images from 5151 patient cases were collected. All cases were biopsy proven, and lesions were semi-automatically segmented. To compensate for the shift caused by segmentation, the boundary of each lesion was drawn using a Fully Convolutional Network (FCN) segmentation method based on a point specified by the radiologist. The data set consists of 4254 benign and 3154 malignant lesions. Of the 7408 images, 6579 are used for training and 829 for testing. The training images were augmented by varying the margin between the boundary of each lesion and the boundary of the image itself. The images were processed through histogram equalization, image cropping, and margin augmentation. The networks trained with and without augmentation all achieved an AUC over 0.95. The network exhibited about 90% accuracy, 0.86 sensitivity, and 0.95 specificity. Although the proposed framework still requires a radiologist to point to the location of the target ROI, the results are promising. The framework supports human radiologists in performing well and helps create a fluent diagnostic workflow that meets the fundamental purpose of CADx.
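
One plausible reading of the margin augmentation described above is to crop the same lesion several times with different amounts of surrounding context. The sketch below follows that reading; the function name, the box convention, and the margin values are assumptions, not details from the paper.

```python
def margin_crops(bbox, image_size, margins):
    """Return one crop box per margin, each enclosing the lesion bounding box.

    bbox is (left, top, right, bottom) in pixels; image_size is (width, height).
    The margins are illustrative values, not those used in the paper.
    """
    left, top, right, bottom = bbox
    width, height = image_size
    crops = []
    for m in margins:
        # Grow the box by m pixels on every side, clipped to the image.
        crops.append((max(0, left - m), max(0, top - m),
                      min(width, right + m), min(height, bottom + m)))
    return crops
```

For example, `margin_crops((40, 50, 100, 120), (128, 128), [0, 10, 50])` yields three crop boxes of increasing size around the same lesion, the largest clipped to the full image.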

Convolutional Neural Network Model Using Data Augmentation for Emotion AI-based Recommendation Systems

  • Ho-yeon Park;Kyoung-jae Kim
    • 한국컴퓨터정보학회논문지 / Vol. 28, No. 12 / pp. 57-66 / 2023
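  • In this study, we propose a new research framework for recommendation systems that applies deep learning and emotion AI to estimate the user's emotional state and reflect it in the recommendation process. To this end, we build an emotion classification model that classifies seven emotions (anger, disgust, fear, happiness, sadness, surprise, and neutral) and propose a model that can reflect its results in the recommendation process. In typical emotion classification data, however, the proportions of the labels differ widely, so generalized classification results can be difficult to obtain. Because emotion image data often contain too few samples of emotions such as disgust, this study uses data augmentation. Finally, we propose a method for reflecting a data-driven emotion prediction model, built with image augmentation, in the recommendation system.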

Evaluation of Deep Learning Model for Scoliosis Pre-Screening Using Preprocessed Chest X-ray Images

  • Min Gu Jang;Jin Woong Yi;Hyun Ju Lee;Ki Sik Tae
    • 대한의용생체공학회:의공학회지 / Vol. 44, No. 4 / pp. 293-301 / 2023
  • Scoliosis is a three-dimensional deformity of the spine, induced by physical or disease-related causes, in which the spine is rotated abnormally. Early detection has a significant influence on the possibility of nonsurgical treatment. This study trains a deep learning model with preprocessed images and evaluates the results with and without data augmentation, with the aim of enabling the diagnosis of scoliosis from a chest X-ray image alone. The preprocessed images, in which only the spine, rib contours, and some hard tissues were retained from the original chest image, were used for training along with the original images, and three Convolutional Neural Network (CNN) models (VGG16, ResNet152, and EfficientNet) were selected for training. Training with the preprocessed images gave higher accuracy than training with the original images. When scoliosis images were added through data augmentation, the accuracy improved further, ultimately reaching a classification accuracy of 93.56% with the ResNet152 model on test data. With further research, the proposed method is expected to allow the early diagnosis of scoliosis and to reduce cost by lowering the burden of additional radiographic imaging for disease detection.

드론 영상 분석과 자료 증가 방법을 통한 건설 자재 수량 측정 (Measurement of Construction Material Quantity through Analyzing Images Acquired by Drone And Data Augmentation)

  • 문지환;송누리;최재갑;박진호;김계영
    • 정보처리학회논문지:소프트웨어 및 데이터공학 / Vol. 9, No. 1 / pp. 33-38 / 2020
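  • In this paper, we propose a technique for measuring the quantity of construction materials by analyzing images acquired by a drone. The proposed technique uses the drone log, which contains drone and camera information; an RCNN that predicts the type and region of construction material piles in the image; and photogrammetry for the actual quantity calculation. In previous work, a shortage of training data led to a large error range in predicting material types and material pile regions. To reduce this error range and increase prediction stability, this paper enlarges the training data using data augmentation. To prevent overfitting of the trained model, only rotation-based augmentation is used. For the quantity calculation, we use the drone log, which contains drone and camera information such as yaw and FOV, together with an RCNN model that finds the material pile regions in the image and predicts their types; all of this information is combined and applied to the formula proposed in the paper to calculate the actual quantity of the material piles. The superiority of the proposed method is confirmed through experiments.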
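
Rotation-only augmentation of the kind described above can be sketched for the simplest case of right-angle rotations. The abstract does not state which angles were used, so the choice of 0, 90, 180, and 270 degrees, the nested-list image representation, and the function names are assumptions.

```python
def rotate90(image):
    """Rotate a row-major 2D image (list of rows) 90 degrees clockwise."""
    # Reversing the row order and then transposing gives a clockwise turn.
    return [list(row) for row in zip(*image[::-1])]

def rotation_augment(image):
    """Return the image rotated by 0, 90, 180, and 270 degrees clockwise."""
    rotated = [image]
    for _ in range(3):
        rotated.append(rotate90(rotated[-1]))
    return rotated
```

For a detection model such as an RCNN, the region annotations would need to be rotated by the same transform as the pixels; that step is omitted here.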