• Title/Summary/Keyword: VGG-16


Enhanced Deep Learning for Animal Image Patch Classification (동물 이미지 패치 분류를 위한 향상된 딥 러닝)

  • Shin, Seong-Yoon;Lee, Hyun-Chang;Shin, Kwang-Seong
    • Proceedings of the Korean Society of Computer Information Conference
    • /
    • 2022.01a
    • /
    • pp.389-390
    • /
    • 2022
  • This paper proposes an enhanced deep learning method for animal image classification based on a small dataset. First, a training model is built on the small dataset using a CNN, and data augmentation is used to expand the training samples. Next, a network pre-trained on a large dataset, such as VGG16, is used to extract bottleneck features from the small dataset, and these are saved in two NumPy files as new training and test datasets. Finally, a fully connected network is trained on the stored features.
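
The bottleneck-feature workflow described above can be sketched in NumPy. The `bottleneck_features` stand-in below is hypothetical (a fixed random projection); in the paper it would be VGG16's frozen convolutional base:

```python
import os
import tempfile

import numpy as np

# Hypothetical stand-in for VGG16's frozen conv base: a fixed random
# projection followed by ReLU. Shapes and data are synthetic.
rng = np.random.default_rng(0)
W_base = rng.standard_normal((3072, 64)) * 0.01  # 32x32x3 image -> 64-d feature

def bottleneck_features(images):
    """Pass flattened images through the frozen base (ReLU activation)."""
    return np.maximum(images.reshape(len(images), -1) @ W_base, 0.0)

# Tiny synthetic "small dataset": 40 training / 10 test images, 2 classes.
x_train = rng.standard_normal((40, 32, 32, 3))
y_train = rng.integers(0, 2, 40).astype(float)
x_test = rng.standard_normal((10, 32, 32, 3))

# Extract features once and store them in two NumPy files, as in the paper.
out_dir = tempfile.mkdtemp()
train_path = os.path.join(out_dir, "train_feats.npy")
test_path = os.path.join(out_dir, "test_feats.npy")
np.save(train_path, bottleneck_features(x_train))
np.save(test_path, bottleneck_features(x_test))

# Train a small fully connected head (logistic regression) on saved features.
feats = np.load(train_path)
w = np.zeros(feats.shape[1])
for _ in range(200):
    p = 1.0 / (1.0 + np.exp(-(feats @ w)))
    w -= 0.1 * feats.T @ (p - y_train) / len(y_train)

test_feats = np.load(test_path)
print(test_feats.shape)  # features ready for final prediction
```

Extracting features once and reusing the saved `.npy` files keeps the expensive base network out of the training loop.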


Performance Comparison of Neural Network Models for Adversarial Attacks by Autonomous Ships (자율주행 선박의 적대적 공격에 대한 신경망 모델의 성능 비교)

  • Tae-Hoon Her;Ju-Hyeong Kim;Na-Hyun Kim;So-Yeon Kim
    • Proceedings of the Korea Information Processing Society Conference
    • /
    • 2023.11a
    • /
    • pp.1106-1107
    • /
    • 2023
  • As autonomous-ship technology advances, the risk posed by adversarial attacks is growing. To address this, this study systematically compared and analyzed how well various neural network models detect adversarial attacks. Experiments were conducted with CNN, GRU, LSTM, and VGG16 models, and among these the VGG16 model showed the highest detection performance. The results are intended to offer a reliable direction for building security models applicable to autonomous ships.
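
For context on what such detectors face, the simplest adversarial perturbation, the fast gradient sign method (FGSM), can be sketched on a hypothetical logistic-regression stand-in (the paper's models are far larger, but the attack idea is the same):

```python
import numpy as np

# Hypothetical stand-in classifier: logistic regression on a flattened
# input. FGSM perturbs the input along the sign of the loss gradient.
rng = np.random.default_rng(1)
w = rng.standard_normal(64)          # fixed "model" weights
x = rng.standard_normal(64)          # a clean input, flattened
y = 1.0                              # true label

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Gradient of the logistic loss with respect to the input x.
grad_x = (sigmoid(w @ x) - y) * w

eps = 0.1
x_adv = x + eps * np.sign(grad_x)    # FGSM adversarial example

loss_clean = -np.log(sigmoid(w @ x))
loss_adv = -np.log(sigmoid(w @ x_adv))
print(loss_clean, loss_adv)          # the perturbation raises the loss
```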

A study on classification of textile design and extraction of regions of interest (텍스타일 디자인 분류 및 관심 영역 도출에 대한 연구)

  • Chae, Seung Wan;Lee, Woo Chang;Lee, Byoung Woo;Lee, Choong Kwon
    • Smart Media Journal
    • /
    • v.10 no.2
    • /
    • pp.70-75
    • /
    • 2021
  • Grouping and classifying similar designs increases management efficiency and makes the designs more convenient to use. Using artificial intelligence algorithms, this study attempted to classify textile designs into four categories: dots, flower patterns, stripes, and geometry. In particular, we explored whether the regions of interest underlying classification can be found and explained from the perspective of artificial intelligence. We randomly split a total of 4,536 designs at a ratio of 8:2, comprising 3,629 for training and 907 for testing. The models used for classification were VGG-16 and ResNet-34, which showed strong classification performance, with precision on flower-pattern designs of 0.79 and 0.89 and recall of 0.95 and 0.38, respectively. Analysis using the Local Interpretable Model-agnostic Explanations (LIME) technique showed that for geometry and flower-pattern designs, the regions of interest on which classification was based corresponded to shapes and petals.
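
The LIME analysis mentioned above can be illustrated with a minimal sketch: perturb patch-level "superpixels", query the model (a hypothetical toy here), and fit a linear surrogate whose coefficients rank each patch's importance:

```python
import numpy as np

# Minimal LIME-style sketch with four 4x4 "superpixels" on an 8x8 image.
# The toy model responds only to the top-left patch, which the surrogate
# should therefore identify as the region of interest.
rng = np.random.default_rng(2)

def model(img):
    # Hypothetical classifier: scores only the top-left 4x4 patch.
    return img[:4, :4].sum()

base = rng.random((8, 8))
patches = [(0, 0), (0, 4), (4, 0), (4, 4)]  # four 4x4 superpixels

masks = rng.integers(0, 2, (200, 4))        # random on/off perturbations
scores = []
for m in masks:
    img = base.copy()
    for bit, (r, c) in zip(m, patches):
        if bit == 0:
            img[r:r + 4, c:c + 4] = 0.0     # "remove" the patch
    scores.append(model(img))

# Linear surrogate: least-squares fit of the scores on the binary masks.
coef, *_ = np.linalg.lstsq(masks.astype(float), np.array(scores), rcond=None)
region_of_interest = int(np.argmax(coef))
print(region_of_interest)                   # the top-left patch
```

Real LIME also weights perturbations by proximity to the original image and uses proper superpixel segmentation; the fitting idea is the same.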

Deep Learning-based Person Analysis in Oriental Painting for Supporting Famous Painting Habruta (명화 하브루타 지원을 위한 딥러닝 기반 동양화 인물 분석)

  • Moon, Hyeyoung;Kim, Namgyu
    • The Journal of the Korea Contents Association
    • /
    • v.21 no.9
    • /
    • pp.105-116
    • /
    • 2021
  • Habruta is question-based learning in which pairs talk, discuss, and debate. In particular, famous-painting Habruta is practiced to strengthen the ability to appreciate paintings and to enrich expressive power through questions and answers about famous paintings. In this study, to support famous-painting Habruta for oriental paintings, we propose a method of automatically generating questions about the gender of depicted figures using current deep learning technology. Specifically, we propose a model that effectively analyzes the features of Asian paintings by fine-tuning the pre-trained VGG16 model. In addition, we classify the questions used in famous-painting Habruta into three types, fact, imagination, and applied questions, and subdivide each type according to the depicted figure to derive a total of nine question patterns. To verify the feasibility of the proposed methodology, we conducted an experiment analyzing 300 figures in actual oriental paintings. The results confirmed that the gender classification model built with our methodology shows higher accuracy than the existing model.
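
The fine-tuning recipe described above can be sketched with a hypothetical two-layer stand-in for VGG16: train only the new head on frozen "pre-trained" weights first, then unfreeze the base and update it with a much smaller learning rate:

```python
import numpy as np

# Toy fine-tuning sketch on synthetic data; the two-layer net is a
# hypothetical stand-in for a pre-trained VGG16 plus a new classifier head.
rng = np.random.default_rng(3)
X = rng.standard_normal((64, 10))
y = (X[:, 0] > 0).astype(float)

W1 = rng.standard_normal((10, 8)) * 0.5    # "pre-trained" base weights
w2 = np.zeros(8)                           # new classifier head

def forward(X, W1, w2):
    h = np.maximum(X @ W1, 0.0)            # base stand-in (ReLU features)
    return 1.0 / (1.0 + np.exp(-(h @ w2))), h

def bce(p, y):
    p = np.clip(p, 1e-9, 1 - 1e-9)
    return -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))

loss_start = bce(forward(X, W1, w2)[0], y)

for _ in range(300):                       # phase 1: head only, W1 frozen
    p, h = forward(X, W1, w2)
    w2 -= 0.5 * h.T @ (p - y) / len(y)

for _ in range(100):                       # phase 2: fine-tune the base too
    p, h = forward(X, W1, w2)
    g = (p - y) / len(y)
    w2 -= 0.05 * h.T @ g
    W1 -= 0.01 * X.T @ (np.outer(g, w2) * (h > 0))  # backprop through ReLU

loss_end = bce(forward(X, W1, w2)[0], y)
print(loss_start, loss_end)                # training loss drops
```

The small phase-2 learning rate is the usual guard against destroying the pre-trained features while adapting them to the new domain.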

Rock Classification Prediction in Tunnel Excavation Using CNN (CNN 기법을 활용한 터널 암판정 예측기술 개발)

  • Kim, Hayoung;Cho, Laehun;Kim, Kyu-Sun
    • Journal of the Korean Geotechnical Society
    • /
    • v.35 no.9
    • /
    • pp.37-45
    • /
    • 2019
  • Quick identification of the tunnel-face condition and optimized determination of support patterns during tunnel excavation in underground construction projects help engineers prevent tunnel collapse and excavate tunnels safely. This study investigates a CNN technique for rapid rock quality classification based on the condition of the tunnel face, and presents a classification procedure using deep learning together with improvements for accurate prediction. The VGG16 model, pre-trained on tens of thousands of images, was used for deep learning, and 1,469 tunnel face images were used to classify five rock quality conditions. The prediction accuracy of this technique reached up to 83.9%. It is expected that the technique can support an error-minimizing rock quality classification system that does not depend solely on experienced professionals in rock quality rating.

Grad-CAM based deep learning network for location detection of the main object (주 객체 위치 검출을 위한 Grad-CAM 기반의 딥러닝 네트워크)

  • Kim, Seon-Jin;Lee, Jong-Keun;Kwak, Nae-Jung;Ryu, Sung-Pil;Ahn, Jae-Hyeong
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.24 no.2
    • /
    • pp.204-211
    • /
    • 2020
  • In this paper, we propose a deep learning network architecture for detecting the location of the main object through weakly supervised learning. The proposed network adds convolution blocks to improve the localization accuracy for the main object; the additional network consists of five blocks of convolutional layers built on VGG-16. The network was trained by weakly supervised learning, which does not require ground-truth location information for objects. In addition, Grad-CAM was used to compensate for the weakness of global average pooling (GAP) in CAM, one of the weakly supervised learning methods. The proposed network was tested on the CUB-200-2011 dataset, achieving a top-1 localization error of 50.13%, and shows higher accuracy in detecting the main object than the existing method.
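
The Grad-CAM step mentioned above can be sketched on synthetic activations (shapes here are illustrative): channel weights come from globally average-pooling the class-score gradients, and the heatmap is the ReLU of the weighted sum of feature maps:

```python
import numpy as np

# Grad-CAM sketch on synthetic data. In practice fmaps would be the last
# conv-layer activations and grads the backpropagated class-score gradients.
rng = np.random.default_rng(4)
fmaps = rng.random((8, 7, 7))            # feature maps (C, H, W)
grads = rng.standard_normal((8, 7, 7))   # d(class score) / d(fmaps)

alpha = grads.mean(axis=(1, 2))          # GAP of gradients -> channel weights
cam = np.maximum((alpha[:, None, None] * fmaps).sum(axis=0), 0.0)
cam /= cam.max() + 1e-8                  # normalise to [0, 1] for display
print(cam.shape)                         # one heatmap at conv resolution
```

The heatmap is then upsampled to the input resolution and thresholded to produce a location estimate.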

Performance Comparison of the Optimizers in a Faster R-CNN Model for Object Detection of Metaphase Chromosomes (중기 염색체 객체 검출을 위한 Faster R-CNN 모델의 최적화기 성능 비교)

  • Jung, Wonseok;Lee, Byeong-Soo;Seo, Jeongwook
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.23 no.11
    • /
    • pp.1357-1363
    • /
    • 2019
  • In this paper, we compare the performance of gradient descent optimizers in the Faster Region-based Convolutional Neural Network (R-CNN) model for chromosome object detection in digital images of human metaphase chromosomes. In Faster R-CNN, a gradient descent optimizer is used to minimize the objective function of the region proposal network (RPN) module and of the classification score and bounding box regression blocks. Through performance comparisons among four gradient descent optimizers in our experiments, we found that the Adamax optimizer achieved a mean average precision (mAP) of about 52% with VGG16 as the base network of Faster R-CNN, while with ResNet50 as the base network, the Adadelta optimizer achieved an mAP of about 58%.
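
The two optimizers named in the abstract, Adamax and Adadelta, can be sketched side by side on a toy quadratic so their update rules are easy to compare (hyperparameters below are common defaults, not the paper's settings):

```python
import numpy as np

def adamax(x, steps=200, lr=0.1, b1=0.9, b2=0.999):
    """Adamax: Adam variant using an infinity-norm second-moment term."""
    m, u = np.zeros_like(x), np.zeros_like(x)
    for t in range(1, steps + 1):
        g = 2 * x                              # gradient of f(x) = ||x||^2
        m = b1 * m + (1 - b1) * g              # first-moment EMA
        u = np.maximum(b2 * u, np.abs(g))      # infinity-norm accumulator
        x = x - (lr / (1 - b1 ** t)) * m / (u + 1e-8)
    return x

def adadelta(x, steps=200, rho=0.95, eps=1e-6):
    """Adadelta: per-parameter step sizes from running averages, no lr."""
    eg, ed = np.zeros_like(x), np.zeros_like(x)
    for _ in range(steps):
        g = 2 * x
        eg = rho * eg + (1 - rho) * g ** 2     # running avg of grad^2
        dx = -np.sqrt(ed + eps) / np.sqrt(eg + eps) * g
        ed = rho * ed + (1 - rho) * dx ** 2    # running avg of update^2
        x = x + dx
    return x

x0 = np.array([3.0, -2.0])
print(adamax(x0), adadelta(x0))                # both move toward the origin
```

Adadelta's slow start on this toy (it builds its step size from past updates) hints at why the best optimizer can differ between base networks.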

Arabic Words Extraction and Character Recognition from Picturesque Image Macros with Enhanced VGG-16 based Model Functionality Using Neural Networks

  • Ayed Ahmad Hamdan Al-Radaideh;Mohd Shafry bin Mohd Rahim;Wad Ghaban;Majdi Bsoul;Shahid Kamal;Naveed Abbas
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.17 no.7
    • /
    • pp.1807-1822
    • /
    • 2023
  • Innovation and rapidly increasing functionality in user-friendly smartphones have encouraged shutterbugs to capture picturesque image macros in work environments or during travel. Formal signboards are placed for marketing purposes and are enriched with text to attract people. Extracting and recognizing text from natural images is an emerging research issue that needs consideration. Compared to conventional optical character recognition (OCR), the complex backgrounds, implicit noise, lighting, and orientation of these scenic text photos make the problem more difficult, and Arabic scene-text extraction and recognition adds further complications. The method described in this paper uses a two-phase approach to extract Arabic text, with word-boundary awareness, from scenic images with varying text orientations. The first stage uses a convolutional autoencoder, and the second uses Arabic Character Segmentation (ACS) followed by conventional two-layer neural networks for recognition. This study also presents how an Arabic training and synthetic dataset can be created to exemplify superimposed text in different scene images. For this purpose, a dataset of 10k cropped images in which Arabic text was found was created for the detection phase, along with a 127k Arabic character dataset for the recognition phase. The phase-1 labels were generated from an Arabic corpus of 15k quotes and sentences. The Arabic Word Awareness Region Detection (AWARD) approach is used to detect complex Arabic scene text with high flexibility, including texts that are arbitrarily oriented, curved, or deformed. Our experiments show that the system achieves 91.8% word segmentation accuracy and 94.2% character recognition accuracy. We believe that future researchers will advance image processing of text images in any language, improving or reducing noise in scene images by enhancing the functionality of the VGG-16 based model using neural networks.
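
A classic baseline for the character-segmentation stage is vertical projection (the paper's ACS is more elaborate): columns with no ink separate glyphs, so runs of non-empty columns become character boxes. A minimal sketch on a synthetic binary image:

```python
import numpy as np

# Synthetic binary "text line" with three glyphs separated by blank columns.
img = np.zeros((10, 20), dtype=int)
img[2:8, 2:5] = 1      # first glyph
img[2:8, 8:12] = 1     # second glyph
img[2:8, 15:18] = 1    # third glyph

profile = img.sum(axis=0)              # ink count per column
inked = profile > 0

segments, start = [], None
for col, on in enumerate(inked):
    if on and start is None:
        start = col                    # run of inked columns begins
    elif not on and start is not None:
        segments.append((start, col))  # [start, col) is one character box
        start = None
if start is not None:
    segments.append((start, len(inked)))

print(segments)
```

Projection profiles break down for connected Arabic script, which is exactly why learned segmentation such as ACS is needed.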

Analysis of Transfer Learning Effect for Automatic Dog Breed Classification (반려견 자동 품종 분류를 위한 전이학습 효과 분석)

  • Lee, Dongsu;Park, Gooman
    • Journal of Broadcast Engineering
    • /
    • v.27 no.1
    • /
    • pp.133-145
    • /
    • 2022
  • Compared to the continuously increasing dog population and industry size in Korea, systematic analysis of the related data and research on breed classification methods are very insufficient. In this paper, an automatic breed classification method using deep learning is proposed for 14 major dog breeds raised domestically. To this end, dog images were collected and a dataset built for deep learning training, and a breed classification algorithm was created by transfer learning with VGG-16 and ResNet-34 as backbone networks. To examine the transfer learning effect of the two models on dog images, we compared using the pre-trained weights as-is against updating them. When fine-tuning was performed on the VGG-16 backbone network, the final model reached a Top-1 accuracy of about 89% and a Top-3 accuracy of about 94%. The dog breed classification method and dataset construction proposed in this paper have the potential to serve various applications, such as classifying abandoned and lost dogs in animal protection centers or uses in the pet-food industry.
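
The reported Top-1/Top-3 figures follow the usual top-k accuracy definition: a sample counts for Top-k if the true label is among the k highest-scoring classes. A sketch on synthetic scores:

```python
import numpy as np

def top_k_accuracy(scores, labels, k):
    """Fraction of samples whose true label is among the k top scores."""
    topk = np.argsort(scores, axis=1)[:, -k:]   # k best classes per row
    hits = [label in row for label, row in zip(labels, topk)]
    return np.mean(hits)

# Three samples, three classes (synthetic scores).
scores = np.array([[0.1, 0.7, 0.2],
                   [0.5, 0.3, 0.2],
                   [0.2, 0.3, 0.5]])
labels = np.array([1, 2, 0])

print(top_k_accuracy(scores, labels, 1))   # only the first sample is a hit
print(top_k_accuracy(scores, labels, 3))   # with k = num classes, always 1.0
```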

Handwriting Thai Digit Recognition Using Convolution Neural Networks (다양한 컨볼루션 신경망을 이용한 태국어 숫자 인식)

  • Onuean, Athita;Jung, Hanmin;Kim, Taehong
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference
    • /
    • 2021.05a
    • /
    • pp.15-17
    • /
    • 2021
  • Handwriting recognition research has focused mainly on deep learning techniques and has achieved great performance in the last few years. In particular, handwritten Thai digit recognition is an important research area covering generic numerical information such as Thai official government documents and receipts, yet it has remained a challenging task for a long time. To address the unavailability of a large Thai digit dataset, this paper constructs its own dataset and trains several models on it: decision tree, K-nearest neighbors, AlexNet, LeNet-5, and VGG (11, 13, 16, 19). The experimental results, using accuracy as the metric, show a maximum accuracy of 98.29% with VGG-13 plus batch normalization.
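
The batch normalization that distinguished the best VGG-13 run can be sketched as a forward pass over a batch of feature vectors (simplified: no running statistics, and gamma/beta fixed rather than learned):

```python
import numpy as np

def batch_norm(x, gamma, beta, eps=1e-5):
    """Normalise each feature over the batch, then rescale and shift."""
    mu = x.mean(axis=0)
    var = x.var(axis=0)
    x_hat = (x - mu) / np.sqrt(var + eps)
    return gamma * x_hat + beta

rng = np.random.default_rng(5)
x = rng.standard_normal((32, 4)) * 10 + 3      # skewed activations
out = batch_norm(x, gamma=np.ones(4), beta=np.zeros(4))
print(out.mean(axis=0), out.std(axis=0))       # ~0 and ~1 per feature
```

Keeping activations normalised layer by layer stabilises training of deeper stacks such as VGG, which is the usual explanation for the accuracy edge.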
