• Title/Summary/Keyword: Image Dataset (이미지 데이터 셋)

Search Results: 294

Real-time mask facial expression recognition using Tiny-YOLOv3 and ResNet50 (Tiny-YOLOv3와 ResNet50을 이용한 실시간 마스크 표정인식)

  • Park, Gyuri; Park, Nayeon; Kim, Seungwoo; Kim, Seunghye; Kim, Jinsan; Ko, Byungchul
    • Proceedings of the Korean Society of Broadcast Engineers Conference / fall / pp.232-234 / 2021
  • Research on facial expression recognition has recently become active in areas such as human-computer interfaces, virtual reality, augmented reality, and intelligent vehicles. Most facial expression recognition studies target bare faces, but as mask wearing has become widespread due to COVID-19, the need for expression recognition while a mask is worn has grown. Aiming at a system that can classify expressions in real time even when a mask is worn, this paper surveys the algorithms needed to build such a system and adopts Tiny-YOLOv3 and ResNet50 among them. The system was run on image data collected from face and expression datasets, and its suitability and performance were evaluated.

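A minimal PyTorch sketch of the kind of two-stage pipeline this abstract describes: a face detector feeds crops to a ResNet50 expression classifier. The `detect_faces` stub stands in for Tiny-YOLOv3, and the expression labels and input handling are illustrative assumptions, not the paper's configuration.

```python
# Sketch: two-stage masked-expression pipeline (detector -> classifier).
# `detect_faces` is a placeholder for a Tiny-YOLOv3 face detector; the
# expression classes below are illustrative, not taken from the paper.
import torch
import torch.nn as nn
from torchvision import models, transforms

EXPRESSIONS = ["neutral", "happy", "surprise", "angry"]  # illustrative labels

classifier = models.resnet50(weights=None)  # load trained weights in practice
classifier.fc = nn.Linear(classifier.fc.in_features, len(EXPRESSIONS))
classifier.eval()

preprocess = transforms.Compose([
    transforms.ToPILImage(),
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
])

def detect_faces(frame):
    """Placeholder for Tiny-YOLOv3: return a list of (x1, y1, x2, y2) boxes."""
    return []  # replace with real detections

@torch.no_grad()
def classify_frame(frame):
    """Classify the expression of every detected face in one RGB frame (HWC array)."""
    results = []
    for (x1, y1, x2, y2) in detect_faces(frame):
        crop = preprocess(frame[y1:y2, x1:x2]).unsqueeze(0)
        probs = classifier(crop).softmax(dim=1)
        results.append(EXPRESSIONS[int(probs.argmax())])
    return results
```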

Prediction of Doodle Images Using Neural Networks

  • Hae-Chan Lee; Kyu-Cheol Cho
    • Journal of the Korea Society of Computer and Information / v.28 no.5 / pp.29-38 / 2023
  • Doodles often possess irregular shapes and patterns, making it challenging for artificial intelligence to mechanically recognize and predict patterns in random doodles. Unlike humans, who can effortlessly recognize and predict doodles even when circles are imperfect or lines are not perfectly straight, artificial intelligence must learn from given training data to recognize and predict them. In this paper, we leverage a diverse dataset of doodle images drawn by individuals of various nationalities, cultures, and handedness (left- and right-handed). After training two neural networks, we determine which network offers higher accuracy and is more suitable for doodle image prediction. The motivation behind predicting doodle images with artificial intelligence lies in providing a unique perspective on human expression and intent through neural networks. For instance, by using the various images generated by artificial intelligence from human-drawn doodles, we expect to foster diversity in artistic expression and expand the creative domain.
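
As a rough illustration of the paper's two-model comparison, the following PyTorch sketch defines an MLP and a small CNN for doodle bitmaps plus a shared accuracy routine; the 28x28 grayscale input size, class count, and architectures are assumptions, not the paper's setup.

```python
# Sketch: comparing two networks on doodle bitmaps (assumed 28x28 grayscale).
import torch
import torch.nn as nn

N_CLASSES = 10  # assumed number of doodle categories

mlp = nn.Sequential(
    nn.Flatten(),
    nn.Linear(28 * 28, 256), nn.ReLU(),
    nn.Linear(256, N_CLASSES),
)

cnn = nn.Sequential(
    nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Flatten(),
    nn.Linear(64 * 7 * 7, N_CLASSES),
)

def accuracy(model, loader, device="cpu"):
    """Fraction of correctly classified doodles; used to compare the two models."""
    model.eval().to(device)
    correct = total = 0
    with torch.no_grad():
        for x, y in loader:
            pred = model(x.to(device)).argmax(dim=1)
            correct += (pred == y.to(device)).sum().item()
            total += y.numel()
    return correct / total
```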

Scene Text Recognition Performance Improvement through an Add-on of an OCR based Classifier (OCR 엔진 기반 분류기 애드온 결합을 통한 이미지 내부 텍스트 인식 성능 향상)

  • Chae, Ho-Yeol; Seok, Ho-Sik
    • Journal of IKEEE / v.24 no.4 / pp.1086-1092 / 2020
  • An autonomous agent for the real world should be able to recognize text in scenes. With the advancement of deep learning, various DNN models have been utilized for transformation, feature extraction, and prediction. However, existing state-of-the-art STR (Scene Text Recognition) engines do not achieve the performance required for real-world applications. In this paper, we introduce a performance-improvement method based on an add-on, composed of an OCR (Optical Character Recognition) engine and a classifier, attached to an STR engine. On instances from the IC13 and IC15 datasets that an STR engine failed to recognize, our method correctly recognizes 10.92% of the previously unrecognized characters.
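
A short Python sketch of the add-on idea: the STR engine answers first, and only rejected predictions are routed to the OCR engine plus classifier. All arguments here (`str_engine`, `ocr_engine`, `char_classifier`, `reject`) are hypothetical stand-ins, not the paper's actual components or decision rule.

```python
# Sketch: OCR-plus-classifier add-on used as a fallback behind an STR engine.
def recognize_text(image, str_engine, ocr_engine, char_classifier, reject):
    text, confidence = str_engine(image)      # primary scene-text prediction
    if not reject(text, confidence):
        return text                           # STR result accepted as-is
    # Add-on path: OCR proposes character regions, a classifier re-reads each one.
    chars = [char_classifier(crop) for crop in ocr_engine(image)]
    return "".join(chars)
```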

Detail Focused Image Classifier Model for Traditional Images (전통문화 이미지를 위한 세부 자질 주목형 이미지 자동 분석기)

  • Kim, Kuekyeng; Hur, Yuna; Kim, Gyeongmin; Yu, Wonhee; Lim, Heuiseok
    • Journal of the Korea Convergence Society / v.8 no.12 / pp.85-92 / 2017
  • As accessibility to traditional cultural content drops while its production increases, higher accessibility is needed for continued management and research. To this end, this paper introduces an image classifier model for traditional images based on artificial neural networks: the input image's features are converted into a vector space, and an RNN-based model recognizes and compares the details of the input, enabling the classification of traditional images. Focusing on details allows the classifier to distinguish similar-looking traditional images more precisely. For training this model, a wide range of images was collected and organized based on the format of the Korean information culture field, which contributes to other research involving traditional cultural images. This research also contributes to further stimulating demand, supply, and research related to traditional culture.
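
A sketch of the CNN-encoder-plus-RNN arrangement the abstract outlines, assuming a ResNet-18 backbone and a GRU that scans the spatial feature map; the backbone, hidden sizes, and scanning order are illustrative choices, not the paper's exact configuration.

```python
# Sketch: CNN features treated as a sequence of spatial positions for an RNN,
# so the classifier can attend to local details before predicting a class.
import torch
import torch.nn as nn
from torchvision import models

class DetailFocusedClassifier(nn.Module):
    def __init__(self, n_classes):
        super().__init__()
        backbone = models.resnet18(weights=None)
        self.cnn = nn.Sequential(*list(backbone.children())[:-2])  # keep spatial map
        self.rnn = nn.GRU(input_size=512, hidden_size=256, batch_first=True)
        self.head = nn.Linear(256, n_classes)

    def forward(self, x):                        # x: (B, 3, H, W)
        fmap = self.cnn(x)                       # (B, 512, h, w)
        seq = fmap.flatten(2).transpose(1, 2)    # (B, h*w, 512): positions as a sequence
        _, hidden = self.rnn(seq)                # hidden: (1, B, 256)
        return self.head(hidden[-1])             # class scores
```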

A Study on Flame Detection using Faster R-CNN and Image Augmentation Techniques (Faster R-CNN과 이미지 오그멘테이션 기법을 이용한 화염감지에 관한 연구)

  • Kim, Jae-Jung; Ryu, Jin-Kyu; Kwak, Dong-Kurl; Byun, Sun-Joon
    • Journal of IKEEE / v.22 no.4 / pp.1079-1087 / 2018
  • Recently, deep learning based computer vision has become a hot topic across various image analysis domains. In this study, flames are detected in fire images using Faster R-CNN, an object detection algorithm among the various deep learning based image recognition algorithms. To improve fire detection accuracy from a small training dataset, we apply image augmentation techniques, divide the augmentations into six types, and compare accuracy, precision, and detection rate. As a result, the detection rate increases as more augmentation types are applied. However, as with the accuracy and detection rate of object detection models in general, the false detection rate also increases, from 10% to 30%.
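
The study's comparison of augmentation variants could be organized roughly as below; the six variants, their parameters, and the torchvision transforms are illustrative, and geometric transforms would additionally require remapping the ground-truth boxes when training a detector.

```python
# Sketch: named augmentation variants so detection metrics can be compared per variant.
from torchvision import transforms

AUGMENTATION_VARIANTS = {
    "none":       [],
    "flip":       [transforms.RandomHorizontalFlip(p=0.5)],
    "brightness": [transforms.ColorJitter(brightness=0.3)],
    "contrast":   [transforms.ColorJitter(contrast=0.3)],
    "rotation":   [transforms.RandomRotation(degrees=10)],
    "combined":   [transforms.RandomHorizontalFlip(p=0.5),
                   transforms.ColorJitter(brightness=0.3, contrast=0.3),
                   transforms.RandomRotation(degrees=10)],
}

def build_pipeline(name):
    """Compose one augmentation variant ahead of the usual tensor conversion."""
    return transforms.Compose(AUGMENTATION_VARIANTS[name] + [transforms.ToTensor()])
```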

Korean Text Image Super-Resolution for Improving Text Recognition Accuracy (텍스트 인식률 개선을 위한 한글 텍스트 이미지 초해상화)

  • Junhyeong Kwon; Nam Ik Cho
    • Journal of Broadcast Engineering / v.28 no.2 / pp.178-184 / 2023
  • Finding text in general scene images and recognizing its contents is a very important task that can serve as a basis for robot vision, visual assistance, and so on. However, in low-resolution text images, degradations such as noise or blur are more noticeable, which leads to severe loss of text recognition accuracy. In this paper, we propose a new Korean text image super-resolution method based on a Transformer, which generally shows higher performance than convolutional neural networks. In experiments, we show that text recognition accuracy for Korean text images improves when our proposed text image super-resolution method is used. We also propose a new Korean text image dataset for training our model, which contains a large number of HR-LR Korean text image pairs.
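
One common way to form the HR-LR pairs such a dataset needs is to degrade high-resolution crops synthetically; the Pillow sketch below (mild blur plus bicubic downscaling) is an assumed degradation, not the paper's actual pipeline.

```python
# Sketch: derive a low-resolution counterpart from a high-resolution text crop.
from PIL import Image, ImageFilter

def make_lr(hr: Image.Image, scale: int = 2) -> Image.Image:
    """Blur then downscale an HR text image to create its LR training partner."""
    blurred = hr.filter(ImageFilter.GaussianBlur(radius=1.0))
    w, h = hr.size
    return blurred.resize((w // scale, h // scale), Image.BICUBIC)
```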

Cascade Fusion-Based Multi-Scale Enhancement of Thermal Image (캐스케이드 융합 기반 다중 스케일 열화상 향상 기법)

  • Kyung-Jae Lee
    • The Journal of the Korea Institute of Electronic Communication Sciences / v.19 no.1 / pp.301-307 / 2024
  • This study introduces a novel cascade fusion architecture aimed at enhancing thermal images across various scale conditions. The processing of thermal images at multiple scales has been challenging due to the limitations of existing methods that are designed for specific scales. To overcome these limitations, this paper proposes a unified framework that utilizes cascade feature fusion to effectively learn multi-scale representations. Confidence maps from different image scales are fused in a cascaded manner, enabling scale-invariant learning. The architecture comprises end-to-end trained convolutional neural networks to enhance image quality by reinforcing mutual scale dependencies. Experimental results indicate that the proposed technique outperforms existing methods in multi-scale thermal image enhancement. Performance evaluation results are provided, demonstrating consistent improvements in image quality metrics. The cascade fusion design facilitates robust generalization across scales and efficient learning of cross-scale representations.
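
A toy PyTorch sketch of cascaded multi-scale fusion in the spirit of the abstract: confidence maps predicted per scale are upsampled and merged, and the fused map reweights the input. The per-scale network and the fusion rule here are placeholders, not the paper's architecture.

```python
# Sketch: confidence maps from several image scales fused in a cascade.
import torch
import torch.nn as nn
import torch.nn.functional as F

class CascadeFusion(nn.Module):
    def __init__(self, scales=(0.25, 0.5, 1.0)):
        super().__init__()
        self.scales = scales
        self.per_scale = nn.ModuleList(
            nn.Conv2d(1, 1, kernel_size=3, padding=1) for _ in scales
        )

    def forward(self, x):                        # x: (B, 1, H, W) thermal image
        fused = None
        for scale, net in zip(self.scales, self.per_scale):
            xs = x if scale == 1.0 else F.interpolate(
                x, scale_factor=scale, mode="bilinear", align_corners=False)
            conf = torch.sigmoid(net(xs))        # confidence map at this scale
            conf = F.interpolate(conf, size=x.shape[-2:], mode="bilinear",
                                 align_corners=False)
            fused = conf if fused is None else 0.5 * (fused + conf)  # cascaded merge
        return x * fused                         # reweight input as a simple enhancement
```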

Tomato Crop Diseases Classification Models Using Deep CNN-based Architectures (심층 CNN 기반 구조를 이용한 토마토 작물 병해충 분류 모델)

  • Kim, Sam-Keun; Ahn, Jae-Geun
    • Journal of the Korea Academia-Industrial Cooperation Society / v.22 no.5 / pp.7-14 / 2021
  • Tomato crops are highly affected by tomato diseases, and if not prevented, a disease can cause severe losses for the agricultural economy. Therefore, there is a need for a system that quickly and accurately diagnoses various tomato diseases. In this paper, we propose a system that classifies nine diseases as well as healthy tomato plants by applying various pretrained deep learning-based CNN models trained on the ImageNet dataset. The tomato leaf image dataset obtained from PlantVillage is provided as input to ResNet, Xception, and DenseNet, which have deep learning-based CNN architectures. The proposed models were constructed by adding a top-level classifier to the basic CNN model, and they were trained with a 5-fold cross-validation strategy. All three proposed models were trained in two stages: transfer learning (which freezes the layers of the basic CNN model and trains only the top-level classifier), and fine-tuning (which unfreezes the basic CNN layers and trains with a very small learning rate). SGD, RMSprop, and Adam were applied as optimization algorithms. The experimental results show that the DenseNet model combined with the RMSprop algorithm produced the best results, with 98.63% accuracy.
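
The two-stage procedure described here, freeze-and-train-the-head followed by low-learning-rate fine-tuning, can be sketched in PyTorch as below; the DenseNet-121 backbone, class count, and learning rates are illustrative assumptions rather than the paper's exact settings.

```python
# Sketch: stage 1 trains only a new top classifier on frozen ImageNet features,
# stage 2 unfreezes the backbone and fine-tunes with a very small learning rate.
import torch.nn as nn
import torch.optim as optim
from torchvision import models

N_CLASSES = 10  # nine diseases plus healthy

model = models.densenet121(weights="IMAGENET1K_V1")   # ImageNet-pretrained backbone
model.classifier = nn.Linear(model.classifier.in_features, N_CLASSES)

# Stage 1: transfer learning -- freeze the backbone, train only the top classifier.
for p in model.features.parameters():
    p.requires_grad = False
stage1_opt = optim.RMSprop(model.classifier.parameters(), lr=1e-3)
# ... run the stage-1 training loop here ...

# Stage 2: fine-tuning -- unfreeze everything and use a much smaller learning rate.
for p in model.parameters():
    p.requires_grad = True
stage2_opt = optim.RMSprop(model.parameters(), lr=1e-5)
# ... run the stage-2 training loop here ...
```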

Object classification for domestic waste based on Convolutional neural networks (심층 신경망 기반의 생활폐기물 자동 분류)

  • Nam, Junyoung; Lee, Christine; Patankar, Asif Ashraf; Wang, Hanxiang; Li, Yanfen; Moon, Hyeonjoon
    • Proceedings of the Korean Society of Broadcast Engineers Conference / 2019.11a / pp.83-86 / 2019
  • As urbanization progresses, the problem of domestic waste in cities is growing rapidly, and ineffective waste management aggravates urban pollution and can cause severe environmental and economic problems. In addition, bulky waste items that are difficult to handle are increasing and hinder urban development. In waste processing, large waste items are disposed of for a fee, and manually classifying the many types of large domestic waste is time-consuming and costly. It is therefore important to introduce a system that classifies large domestic waste automatically. This paper proposes such a classification system, and its contributions are fourfold. 1) Among Convolutional Neural Network (CNN) models suited to high accuracy and robust classification, the accuracy and speed of VGG-19, Inception-V3, and ResNet50 are compared; on the proposed 20-class large domestic waste dataset, the highest classification accuracy is 86.19%. 2) Two methods, Class Weight VGG-19 (CW-VGG-19) and Extreme Gradient Boosting VGG-19, are used to handle the imbalanced-data problem. 3) A dataset covering 20 classes was collected and verified manually, with more than 500 color images per class. 4) A deep learning based mobile application was developed.

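A minimal sketch of the class-weighting idea mentioned for the imbalanced waste dataset: the classification loss is weighted inversely to class frequency so rare categories are not ignored. The per-class counts below are placeholders, and the weighting formula is one common choice, not necessarily the paper's.

```python
# Sketch: frequency-balanced class weights for an imbalanced classification loss.
import torch
import torch.nn as nn

class_counts = torch.tensor([900., 520., 610., 750.])  # illustrative per-class image counts
weights = class_counts.sum() / (len(class_counts) * class_counts)
criterion = nn.CrossEntropyLoss(weight=weights)         # used with a VGG-19-style classifier
```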

SKU-Net: Improved U-Net using Selective Kernel Convolution for Retinal Vessel Segmentation

  • Hwang, Dong-Hwan; Moon, Gwi-Seong; Kim, Yoon
    • Journal of the Korea Society of Computer and Information / v.26 no.4 / pp.29-37 / 2021
  • In this paper, we propose a deep learning-based retinal vessel segmentation model for handling multi-scale information in fundus images. We integrate selective kernel convolution into a U-Net-based convolutional neural network. The proposed model extracts and segments retinal blood vessels of various shapes and sizes, which is important information for diagnosing eye-related diseases from fundus images. The model consists of standard convolutions and selective kernel convolutions: while a standard convolutional layer extracts information with a single kernel size, selective kernel convolution extracts information from branches with different kernel sizes and combines them adaptively through split-attention. To evaluate the performance of the proposed model, we used the DRIVE and CHASE DB1 datasets; the model achieved F1 scores of 82.91% and 81.71% on the two datasets, respectively, confirming that it is effective in segmenting retinal blood vessels.
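
A compact PyTorch sketch of a selective kernel block in the SK-Net style the abstract refers to: two branches with different kernel sizes are fused, globally pooled, and recombined with a softmax (split) attention over the branches. The branch count, kernel sizes, and reduction ratio are illustrative, not SKU-Net's exact design.

```python
# Sketch: selective kernel convolution with split-attention across two branches.
import torch
import torch.nn as nn

class SelectiveKernelConv(nn.Module):
    def __init__(self, channels, reduction=4):
        super().__init__()
        self.branch3 = nn.Conv2d(channels, channels, 3, padding=1)
        self.branch5 = nn.Conv2d(channels, channels, 5, padding=2)
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(),
            nn.Linear(channels // reduction, channels * 2),
        )

    def forward(self, x):                          # x: (B, C, H, W)
        u3, u5 = self.branch3(x), self.branch5(x)  # different receptive field sizes
        s = (u3 + u5).mean(dim=(2, 3))             # global pooling of the fused map: (B, C)
        attn = self.fc(s).view(x.size(0), 2, -1)   # (B, 2 branches, C)
        attn = attn.softmax(dim=1)                 # split-attention over the branches
        a3 = attn[:, 0].unsqueeze(-1).unsqueeze(-1)
        a5 = attn[:, 1].unsqueeze(-1).unsqueeze(-1)
        return a3 * u3 + a5 * u5                   # adaptively recombined features
```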