• Title/Abstract/Keywords: Convolutional neural net

Search results: 299 items; processing time: 0.024 s

Tomato Crop Diseases Classification Models Using Deep CNN-based Architectures

  • 김삼근;안재근
    • 한국산학기술학회논문지 / Vol. 22, No. 5 / pp. 7-14 / 2021
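  • Tomato crops are highly susceptible to diseases and pests, and failing to prevent them can cause heavy losses to the agricultural economy. A system that diagnoses the various tomato diseases and pests quickly and accurately is therefore needed. This paper proposes a system that applies deep learning-based CNN models pre-trained on the ImageNet dataset to classify tomato leaves into nine disease and pest classes plus a healthy class. Tomato leaf images extracted from the PlantVillage dataset are used as input to three deep learning-based CNN architectures: ResNet, Xception, and DenseNet. Each proposed model is built by adding a top-level classifier to a base CNN model and is trained with 5-fold cross-validation on the training dataset. All three proposed models are trained in two stages: transfer learning, in which the layers of the base CNN model are frozen, followed by fine-tuning, in which the layers are unfrozen and trained with a very small learning rate. SGD, RMSprop, and Adam are applied as optimization algorithms. In the experiments, the DenseNet CNN model with the RMSprop algorithm achieved the best result, with an accuracy of 98.63%.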
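The 5-fold cross-validation mentioned in the abstract can be sketched as follows. This is a minimal illustration in plain Python, not the authors' procedure: the abstract does not say whether the folds were shuffled or stratified, so the shuffling, the `seed` parameter, and the sample count of 23 are assumptions made for the example.

```python
import random

def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_idx, val_idx) pairs for k-fold cross-validation."""
    indices = list(range(n_samples))
    # Shuffling with a fixed seed is an assumption; the abstract does not specify it.
    random.Random(seed).shuffle(indices)
    # Spread samples as evenly as possible: the first n_samples % k folds get one extra.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        val_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, val_idx
        start += size

# Hypothetical dataset of 23 images, split into 5 folds.
folds = list(k_fold_indices(n_samples=23, k=5))
for i, (train_idx, val_idx) in enumerate(folds):
    print(f"fold {i}: train={len(train_idx)} val={len(val_idx)}")
```

Each sample appears in exactly one validation fold, so every image is used for validation once and for training in the other four rounds.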

A Deep Learning-based Hand Gesture Recognition Robust to External Environments

  • 오동한;이병희;김태영
    • 한국차세대컴퓨팅학회논문지 / Vol. 14, No. 5 / pp. 31-39 / 2018
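  • Research on recognizing a user's hand gestures with deep learning to provide user-friendly interfaces in virtual reality environments has recently been active. However, most studies use separate sensors to obtain hand information or apply preprocessing for efficient training. They also fail to account for changes in the external environment, such as varying illumination or partial occlusion of the hand. This paper proposes a deep learning-based hand gesture recognition method that is robust to the external environment and works on RGB images from an ordinary webcam without separate preprocessing. The VGGNet and GoogLeNet architectures are modified to serve as the deep learning models, and the performance of the two is compared. In experiments on data that included hand images taken in dim lighting, with the hand partially occluded, or with the hand partly out of view, the proposed VGGNet and GoogLeNet architectures achieved recognition rates of 93.88% and 93.75%, respectively. In terms of memory and speed, GoogLeNet used about three times less memory than VGGNet and was more than ten times faster. The method runs in real time and can serve as a hand gesture interface in virtual reality applications such as games, education, and medicine.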

Feasibility of Deep Learning-Based Analysis of Auscultation for Screening Significant Stenosis of Native Arteriovenous Fistula for Hemodialysis Requiring Angioplasty

  • Jae Hyon Park;Insun Park;Kichang Han;Jongjin Yoon;Yongsik Sim;Soo Jin Kim;Jong Yun Won;Shina Lee;Joon Ho Kwon;Sungmo Moon;Gyoung Min Kim;Man-deuk Kim
    • Korean Journal of Radiology / Vol. 23, No. 10 / pp. 949-958 / 2022
  • Objective: To investigate the feasibility of using a deep learning-based analysis of auscultation data to predict significant stenosis of arteriovenous fistulas (AVF) in patients undergoing hemodialysis requiring percutaneous transluminal angioplasty (PTA). Materials and Methods: Forty patients (24 male and 16 female; median age, 62.5 years) with dysfunctional native AVF were prospectively recruited. Digital sounds from the AVF shunt were recorded using a wireless electronic stethoscope before (pre-PTA) and after PTA (post-PTA), and the audio files were subsequently converted to mel spectrograms, which were used to construct various deep convolutional neural network (DCNN) models (DenseNet201, EfficientNetB5, and ResNet50). The performance of these models for diagnosing ≥ 50% AVF stenosis was assessed and compared. The ground truth for the presence of ≥ 50% AVF stenosis was obtained using digital subtraction angiography. Gradient-weighted class activation mapping (Grad-CAM) was used to produce visual explanations for DCNN model decisions. Results: Eighty audio files were obtained from the 40 recruited patients and pooled for the study. Mel spectrograms of "pre-PTA" shunt sounds showed patterns corresponding to abnormal high-pitched bruits with systolic accentuation observed in patients with stenotic AVF. The ResNet50 and EfficientNetB5 models yielded an area under the receiver operating characteristic curve of 0.99 and 0.98, respectively, at optimized epochs for predicting ≥ 50% AVF stenosis. However, Grad-CAM heatmaps revealed that only ResNet50 highlighted areas relevant to AVF stenosis in the mel spectrogram. Conclusion: Mel spectrogram-based DCNN models, particularly ResNet50, successfully predicted the presence of significant AVF stenosis requiring PTA in this feasibility study and may potentially be used in AVF surveillance.
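The abstract converts audio recordings to mel spectrograms, which space frequency bands evenly on the perceptual mel scale rather than in Hz. The sketch below shows only that frequency mapping, using the widely used formula m = 2595 log10(1 + f/700). Which mel variant and which band settings the study used is not stated, so the formula choice, the 4000 Hz upper limit, and the eight bands are assumptions.

```python
import math

def hz_to_mel(freq_hz):
    """Convert Hz to mels with the common 2595 * log10(1 + f / 700) formula."""
    return 2595.0 * math.log10(1.0 + freq_hz / 700.0)

def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def mel_band_edges(f_min, f_max, n_bands):
    """Return n_bands + 2 frequencies (Hz) equally spaced on the mel scale."""
    m_min, m_max = hz_to_mel(f_min), hz_to_mel(f_max)
    step = (m_max - m_min) / (n_bands + 1)
    return [mel_to_hz(m_min + i * step) for i in range(n_bands + 2)]

# Illustrative settings only: 8 bands between 0 Hz and 4000 Hz.
edges = mel_band_edges(0.0, 4000.0, n_bands=8)
print([round(f, 1) for f in edges])
```

Because the edges are equally spaced in mels, the bands are narrow at low frequencies and widen toward high frequencies, which gives low-frequency content finer resolution in the resulting spectrogram.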

Comparison and optimization of deep learning-based radiosensitivity prediction models using gene expression profiling in National Cancer Institute-60 cancer cell line

  • Kim, Euidam;Chung, Yoonsun
    • Nuclear Engineering and Technology / Vol. 54, No. 8 / pp. 3027-3033 / 2022
  • Background: In this study, various types of deep-learning models for predicting in vitro radiosensitivity from gene-expression profiling were compared. Methods: The clonogenic surviving fractions at 2 Gy from previous publications and microarray gene-expression data from the National Cancer Institute-60 cell lines were used to measure the radiosensitivity. Seven prediction models, comprising three distinct multi-layered perceptrons (MLP) and four different convolutional neural networks (CNN), were compared. Folded cross-validation was applied to train and evaluate model performance. The criteria for correct prediction were absolute error < 0.02 or relative error < 10%. The models were compared in terms of prediction accuracy, training time per epoch, training fluctuations, and required calculation resources. Results: The strength of MLP-based models was their fast initial convergence and short training time per epoch. Their prediction accuracy differed significantly depending on the model configuration. The CNN-based models showed relatively high prediction accuracy, low training fluctuations, and a relatively small increase in the memory requirement as the model deepens. Conclusion: Our findings suggest that a CNN-based model with moderate depth would be appropriate when the prediction accuracy is important, and a shallow MLP-based model can be recommended when either the training resources or time are limited.
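The correctness criterion stated in the abstract (absolute error < 0.02 or relative error < 10%) can be written as a small check. This is a sketch under two assumptions that the abstract does not confirm: relative error is measured against the observed value, and an observed value of zero falls back to the absolute test alone. The surviving-fraction values below are made up for illustration.

```python
def is_correct(predicted, observed, abs_tol=0.02, rel_tol=0.10):
    """Return True if the prediction meets either error criterion."""
    abs_err = abs(predicted - observed)
    if abs_err < abs_tol:
        return True
    # Relative error is undefined when observed == 0, so only the absolute test applies.
    return observed != 0 and abs_err / abs(observed) < rel_tol

def prediction_accuracy(predictions, observations):
    """Fraction of predictions counted as correct."""
    hits = sum(is_correct(p, o) for p, o in zip(predictions, observations))
    return hits / len(predictions)

# Made-up surviving fractions at 2 Gy (observed) and model outputs (predicted).
observed = [0.50, 0.10, 0.10, 0.80]
predicted = [0.54, 0.115, 0.13, 0.70]
print(prediction_accuracy(predicted, observed))
```

The "or" matters for small targets: a prediction of 0.115 against 0.10 is off by 15% in relative terms but still counts as correct because its absolute error is below 0.02.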

Prediction of Laser Process Parameters using Bead Image Data

  • 전예랑;최해운
    • 한국기계가공학회지 / Vol. 21, No. 6 / pp. 8-14 / 2022
  • In this study, experiments were conducted to determine the quality of weld beads joining dissimilar materials, Al and Cu. Non-destructive testing using deep learning was performed to determine the quality of beads welded with the ARM laser, one of the lasers used to make battery cells for electric vehicles. Deep learning was performed using the AlexNet algorithm, which has a convolutional neural network structure. The quality identification results were divided into good and bad, and agreement of 94% or more was obtained in all cases. Overall, the best welding quality was obtained in the experiment with fixed ring beam output and variable center beam output, at a fixed (ring) beam of 500 W and a variable (center) beam of 1,050 W, where weld bead failure was seldom observed. The tensile test conducted to confirm the reliability of the welds showed an average tensile force of 2.5 kgf/mm or more in all sections.

One-step deep learning-based method for pixel-level detection of fine cracks in steel girder images

  • Li, Zhihang;Huang, Mengqi;Ji, Pengxuan;Zhu, Huamei;Zhang, Qianbing
    • Smart Structures and Systems / Vol. 29, No. 1 / pp. 153-166 / 2022
  • Identifying fine cracks in steel bridge facilities is a challenging task of structural health monitoring (SHM). This study proposed an end-to-end crack image segmentation framework based on a one-step Convolutional Neural Network (CNN) for pixel-level object recognition with high accuracy. To address the challenges of detecting small objects against complex backgrounds, the loss function was selected with sample imbalance in mind, and modules were modified to improve generalization to complicated images. Specifically, loss functions were compared among alternatives including the Binary Cross Entropy (BCE), Focal, Tversky and Dice loss, with the last three specialized for biased sample distribution. Structural modifications with dilated convolution, Spatial Pyramid Pooling (SPP) and Feature Pyramid Network (FPN) were also performed to form a new backbone termed CrackDet. Models with various loss functions and feature extraction modules were trained on crack images and tested on full-scale images collected on steel box girders. The CNN model that incorporated the classic U-Net as its backbone and Dice loss as its loss function achieved the highest mean Intersection-over-Union (mIoU) of 0.7571 on full-scale pictures. In contrast, the best performance on cropped crack images was achieved by integrating CrackDet with Dice loss, at an mIoU of 0.7670.
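To illustrate why Dice and Tversky losses suit the biased sample distribution described above, here is a minimal sketch on flattened pixel lists in plain Python. It uses the standard textbook definitions, not the paper's implementation; the `eps` smoothing term, the `alpha`/`beta` defaults, and the 100-pixel toy image are assumptions.

```python
def tversky_index(pred, target, alpha=0.5, beta=0.5, eps=1e-7):
    """Tversky index between predicted probabilities and binary targets."""
    tp = sum(p * t for p, t in zip(pred, target))
    fp = sum(p * (1 - t) for p, t in zip(pred, target))
    fn = sum((1 - p) * t for p, t in zip(pred, target))
    return (tp + eps) / (tp + alpha * fp + beta * fn + eps)

def dice_loss(pred, target):
    """Dice loss is the Tversky loss with alpha = beta = 0.5."""
    return 1.0 - tversky_index(pred, target, alpha=0.5, beta=0.5)

def tversky_loss(pred, target, alpha=0.3, beta=0.7):
    """beta > alpha penalizes missed crack pixels more than false alarms."""
    return 1.0 - tversky_index(pred, target, alpha=alpha, beta=beta)

def pixel_accuracy(pred, target):
    """Fraction of pixels classified correctly at a 0.5 threshold."""
    return sum((p >= 0.5) == (t == 1) for p, t in zip(pred, target)) / len(target)

# Toy image: 100 pixels, only 2 of which belong to a crack.
target = [0] * 98 + [1] * 2
all_background = [0.0] * 100          # a model that never predicts a crack
missed_one = [0.0] * 98 + [1.0, 0.0]  # one crack pixel missed
one_false_alarm = [0.0] * 97 + [1.0, 1.0, 1.0]  # both found, one extra pixel

print(pixel_accuracy(all_background, target), dice_loss(all_background, target))
print(tversky_loss(missed_one, target), tversky_loss(one_false_alarm, target))
```

The all-background prediction scores 98% pixel accuracy yet has a Dice loss close to 1, which is why overlap-based losses give a more useful training signal when cracks occupy very few pixels.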

Enhanced 3D Residual Network for Human Fall Detection in Video Surveillance

  • Li, Suyuan;Song, Xin;Cao, Jing;Xu, Siyang
    • KSII Transactions on Internet and Information Systems (TIIS) / Vol. 16, No. 12 / pp. 3991-4007 / 2022
  • In public healthcare, a computational system that can automatically and efficiently detect and classify falls from a video sequence has significant potential. With the advancement of deep learning, approaches that can extract temporal and spatial information have become more widespread. However, traditional 3D CNNs, which usually adopt shallow networks, cannot reach the recognition accuracy of deeper networks. Additionally, experience with neural networks shows that gradient explosion can occur as the number of network layers increases. To address these issues, an enhanced three-dimensional ResNet-based method for fall detection (3D-ERes-FD) is proposed that directly extracts spatio-temporal features. In our method, a 50-layer 3D residual network is used to deepen the network and improve fall recognition accuracy. Furthermore, enhanced residual units with four convolutional layers are developed to efficiently reduce the number of parameters and increase the depth of the network. According to the experimental results, the proposed method outperformed several state-of-the-art methods.

Automatic assessment of post-earthquake buildings based on multi-task deep learning with auxiliary tasks

  • Zhihang Li;Huamei Zhu;Mengqi Huang;Pengxuan Ji;Hongyu Huang;Qianbing Zhang
    • Smart Structures and Systems / Vol. 31, No. 4 / pp. 383-392 / 2023
  • Post-earthquake building condition assessment is crucial for subsequent rescue and remediation and can be automated by emerging computer vision and deep learning technologies. This study is based on an endeavour for the 2nd International Competition of Structural Health Monitoring (IC-SHM 2021). The task package includes five image segmentation objectives: defects (crack/spall/rebar exposure), structural component, and damage state. The structural component and damage state tasks are identified as the priority because they can form actionable decisions. A multi-task Convolutional Neural Network (CNN) is proposed to conduct the two major tasks simultaneously. The remaining three sub-tasks (spall/crack/rebar exposure) were incorporated as auxiliary tasks. By synchronously learning defect information (spall/crack/rebar exposure), the multi-task CNN model outperforms the counterpart single-task models in recognizing structural components and estimating damage states. In particular, the mIoU (mean intersection over union) of pixel-level damage state estimation improves from 0.5855 to 0.6374. For the defect detection tasks, rebar exposure is omitted due to the extremely biased sample distribution. The segmentation of crack and spall is automated by a single-task U-Net, with extra effort to resample the provided data. The segmentation of small objects (spall and crack) benefits from the resampling method, with a substantial IoU increment of nearly 10%.
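The mIoU figures quoted above average the per-class intersection over union. The sketch below computes it for flattened label maps in plain Python. How the competition aggregated IoU (per image or over the whole dataset, and how absent classes were handled) is not stated in the abstract, so skipping classes absent from both maps is an assumption, as are the toy labels.

```python
def mean_iou(pred_labels, true_labels, num_classes):
    """Mean of per-class IoU over classes present in prediction or ground truth."""
    ious = []
    for c in range(num_classes):
        pairs = list(zip(pred_labels, true_labels))
        intersection = sum(1 for p, t in pairs if p == c and t == c)
        union = sum(1 for p, t in pairs if p == c or t == c)
        if union > 0:  # assumption: classes absent from both maps are skipped
            ious.append(intersection / union)
    return sum(ious) / len(ious) if ious else 0.0

# Toy example: six pixels, three hypothetical damage-state classes (0, 1, 2).
true_labels = [0, 0, 0, 1, 1, 2]
pred_labels = [0, 0, 1, 1, 1, 2]
print(round(mean_iou(pred_labels, true_labels, num_classes=3), 4))
```

In the toy example a single mislabeled pixel lowers the IoU of both class 0 and class 1 to 2/3, which shows how the metric penalizes errors on small regions more heavily than pixel accuracy does.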

Performance Analysis of Anomaly Area Segmentation in Industrial Products Based on Self-Attention Deep Learning Model

  • 박창준;김남중;박준휘;이재현;곽정환
    • 한국컴퓨터정보학회:학술대회논문집 / Proceedings of the 69th Winter Conference of 한국컴퓨터정보학회 (2024), Vol. 32, No. 1 / pp. 45-46 / 2024
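  • This paper applies the Dense Prediction Transformer (DPT), a self-attention-based deep learning model, to the MVTec Anomaly Detection (MVTec AD) dataset to segment anomalous regions in images of real industrial products. Applying the DPT model mitigated the problems caused by local feature extraction and a fixed receptive field, which are limitations of existing Convolutional Neural Network (CNN)-based anomaly detection methods. When segmenting anomalies in real industrial product data, it outperformed the best-performing model built on the U-Net architecture, the existing mainstream approach, by 1.14%, demonstrating that self-attention-based deep learning is effective for anomaly segmentation in industrial products.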

Approach to diagnosing multiple abnormal events with single-event training data

  • Ji Hyeon Shin;Seung Gyu Cho;Seo Ryong Koo;Seung Jun Lee
    • Nuclear Engineering and Technology / Vol. 56, No. 2 / pp. 558-567 / 2024
  • Diagnostic support systems are being researched to assist operators in identifying and responding to abnormal events in a nuclear power plant. Most studies to date have considered single abnormal events only, for which it is relatively straightforward to obtain data to train the deep learning model of the diagnostic support system. However, cases in which multiple abnormal events occur must also be considered, for which obtaining training data becomes difficult due to the large number of combinations of possible abnormal events. This study proposes an approach to maintain diagnostic performance for multiple abnormal events by training a deep learning model with data on single abnormal events only. The proposed approach is applied to an existing algorithm that can perform feature selection and multi-label classification. We choose an extremely randomized trees classifier to select dedicated monitoring parameters for target abnormal events. In diagnosing each event occurrence independently, two-channel convolutional neural networks are employed as sub-models. The algorithm was tested in a case study with various scenarios, including single and multiple abnormal events. Results demonstrated that the proposed approach maintained diagnostic performance for 15 single abnormal events and significantly improved performance for 105 multiple abnormal events compared to the base model.
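The idea of diagnosing each event independently can be sketched as one score per event with a separate threshold decision, so that several events can be flagged at once without ever training on combined-event data. This is an illustration only: the event names, scores, and 0.5 threshold are hypothetical, and the stand-in scores replace the two-channel CNN sub-models described in the abstract. The count at the end shows how fast combinations grow; 15 events give 105 distinct pairs, which coincides with the 105 multiple-event cases reported, although the abstract does not state how those scenarios were constructed.

```python
from math import comb

def diagnose(event_scores, threshold=0.5):
    """Return every event whose independent sub-model score passes the threshold."""
    return {event for event, score in event_scores.items() if score >= threshold}

# Hypothetical sub-model outputs for one plant state (names and values are made up).
scores = {"event_A": 0.91, "event_B": 0.12, "event_C": 0.77}
print(sorted(diagnose(scores)))

# Training data for combined events grows combinatorially with the number of events.
n_events = 15
print(comb(n_events, 2), comb(n_events, 3))
```

Because each decision depends only on its own score, adding a new event type means adding one sub-model rather than collecting data for every combination it could appear in.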