• Title/Summary/Keyword: learning through the image


In-Loop Filtering with a Deep Network in HEVC (깊은 신경망을 사용한 HEVC의 루프 내 필터링)

  • Kim, Dongsin;Lee, So Yoon;Yang, Yoonmo;Oh, Byung Tae
    • Proceedings of the Korean Society of Broadcast Engineers Conference
    • /
    • 2020.11a
    • /
    • pp.145-147
    • /
    • 2020
  • As deep learning technology advances, there have been many attempts to improve video codecs such as High Efficiency Video Coding (HEVC) using deep learning. One of the most actively researched approaches is improving the filters inside codecs by drawing on image restoration research. In this paper, we propose a method of replacing the sample adaptive offset (SAO) filtering with a deep neural network. The proposed method uses the network to find the optimal offset value. The network consists of two subnetworks that find the offset value and the offset type of the signal, allowing it to restore nonlinear and complex types of error. Experimental results show that the performance is better than that of conventional HEVC in low-delay P and random access modes.
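The two-subnetwork idea described above (one head for the offset value, one for the offset type) can be sketched as a tiny NumPy forward pass. All layer sizes and the number of offset types here are illustrative placeholders, not the authors' architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(x, 0.0)

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

# Shared feature extractor followed by two heads, mirroring the
# "offset value" and "offset type" subnetworks. Sizes are illustrative.
W_shared = rng.standard_normal((16, 32)) * 0.1
W_value  = rng.standard_normal((32, 1)) * 0.1   # regresses the SAO offset value
W_type   = rng.standard_normal((32, 4)) * 0.1   # classifies the offset type

def forward(patch_features):
    h = relu(patch_features @ W_shared)
    offset = h @ W_value                  # predicted offset value per block
    type_prob = softmax(h @ W_type)       # probability over offset types
    return offset, type_prob

x = rng.standard_normal((8, 16))          # 8 reconstructed-block feature vectors
offset, type_prob = forward(x)
print(offset.shape, type_prob.shape)      # (8, 1) (8, 4)
```

In a trained version, the offset head would be fit against the encoder-side optimal offsets and the type head against their categories; here only the two-branch shape of the network is shown.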


Implementation of handwritten digit recognition CNN structure using GPGPU and Combined Layer (GPGPU와 Combined Layer를 이용한 필기체 숫자인식 CNN구조 구현)

  • Lee, Sangil;Nam, Kihun;Jung, Jun Mo
    • The Journal of the Convergence on Culture Technology
    • /
    • v.3 no.4
    • /
    • pp.165-169
    • /
    • 2017
  • CNN (Convolutional Neural Network) is one of the machine learning algorithms that shows superior performance in image recognition and classification. CNN is structurally simple, but it involves a large amount of computation and therefore takes a long time. Consequently, in this paper we implemented parallel processing of the convolution layer, the pooling layer, and the fully connected layer, which consume most of the processing time in a CNN, through the SIMT (Single Instruction Multiple Thread) structure of a GPGPU (General-Purpose computing on Graphics Processing Units). We also improve performance by reducing the number of memory accesses: the output of the convolution layer is used directly by the pooling layer rather than being stored first. We use the MNIST dataset to verify the experiment and confirm that the proposed CNN structure performs 12.38% better than the existing structure.
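The "combined layer" idea above, that is, consuming the convolution output directly in the pooling step instead of storing it, can be illustrated with a scalar NumPy sketch. This is not the paper's GPGPU kernel; it only contrasts a baseline that materializes the convolution map with a fused version that never does.

```python
import numpy as np

rng = np.random.default_rng(1)

def conv2d(image, kernel):
    """Valid-mode 2D correlation; stores the full feature map."""
    kh, kw = kernel.shape
    oh, ow = image.shape[0] - kh + 1, image.shape[1] - kw + 1
    out = np.empty((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(image[i:i+kh, j:j+kw] * kernel)
    return out

def conv_then_pool(image, kernel):
    """Baseline: materialize the convolution output, then 2x2 max-pool it."""
    c = conv2d(image, kernel)
    oh, ow = c.shape
    c = c[:oh - oh % 2, :ow - ow % 2]
    return c.reshape(oh // 2, 2, ow // 2, 2).max(axis=(1, 3))

def combined_conv_pool(image, kernel):
    """Fused 'combined layer': compute the four conv outputs of each 2x2
    pooling window on the fly and keep only their max, so the intermediate
    convolution map is never written to memory."""
    kh, kw = kernel.shape
    oh, ow = image.shape[0] - kh + 1, image.shape[1] - kw + 1
    out = np.empty((oh // 2, ow // 2))
    for i in range(0, oh - oh % 2, 2):
        for j in range(0, ow - ow % 2, 2):
            out[i // 2, j // 2] = max(
                np.sum(image[i+di:i+di+kh, j+dj:j+dj+kw] * kernel)
                for di in (0, 1) for dj in (0, 1))
    return out

img = rng.standard_normal((9, 9))
ker = rng.standard_normal((2, 2))
print(np.allclose(conv_then_pool(img, ker), combined_conv_pool(img, ker)))  # True
```

On a GPU, each SIMT thread would compute one fused output element, which is where the saved memory traffic pays off.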

Study on driver's distraction research trend and deep learning based behavior recognition model

  • Han, Sangkon;Choi, Jung-In
    • Journal of the Korea Society of Computer and Information
    • /
    • v.26 no.11
    • /
    • pp.173-182
    • /
    • 2021
  • In this paper, we analyzed the driver and passenger motions that cause driver distraction and recognized 10 driver behaviors related to mobile phones. First, distraction-inducing behaviors were classified by environment and factor, and recent related papers were analyzed. Based on the analyzed papers, 10 driver behaviors related to cell phones, the main causes of distraction, were recognized. The experiment was conducted on about 100,000 images. Features were extracted through SURF and tested with three models (CNN, ResNet-101, and an improved ResNet-101). The improved ResNet-101 model reduced training and validation errors by 8.2 times and 44.6 times compared to CNN, and the average precision and F1-score were maintained at a high level of 0.98. In addition, using CAM (class activation maps), we reviewed whether the deep learning model used the cell phone object and its location as the decisive cue when judging the driver's distraction behavior.
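The CAM check mentioned above follows the standard class-activation-map construction: weight each final convolutional feature map by the predicted class's weight in the last fully connected layer and sum. A minimal NumPy sketch, with illustrative channel and map sizes rather than the paper's actual ResNet-101 dimensions:

```python
import numpy as np

rng = np.random.default_rng(3)

def class_activation_map(feature_maps, class_weights):
    """CAM: weight each final convolutional feature map by the class's
    weight in the last fully connected layer and sum them, highlighting
    the image regions (e.g. the cell phone) that drove the prediction."""
    # feature_maps: (channels, H, W); class_weights: (channels,)
    return np.tensordot(class_weights, feature_maps, axes=1)

fmaps = rng.random((64, 7, 7))   # illustrative final-conv activations
w = rng.random(64)               # FC weights for the predicted class
cam = class_activation_map(fmaps, w)
print(cam.shape)                 # (7, 7)
```

The resulting low-resolution map is then upsampled and overlaid on the input image to visualize which regions the model attended to.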

Study on Detection Technique for Sea Fog by using CCTV Images and Convolutional Neural Network (CCTV 영상과 합성곱 신경망을 활용한 해무 탐지 기법 연구)

  • Kim, Na-Kyeong;Bak, Su-Ho;Jeong, Min-Ji;Hwang, Do-Hyun;Enkhjargal, Unuzaya;Park, Mi-So;Kim, Bo-Ram;Yoon, Hong-Joo
    • The Journal of the Korea institute of electronic communication sciences
    • /
    • v.15 no.6
    • /
    • pp.1081-1088
    • /
    • 2020
  • In this paper, a method of detecting sea fog from CCTV images is proposed based on convolutional neural networks. The study data consisted of 1,0004 images, sea fog and non-sea fog, randomly extracted from a total of 11 ports and beaches (Busan Port, Busan New Port, Pyeongtaek Port, Incheon Port, Gunsan Port, Daesan Port, Mokpo Port, Yeosu Gwangyang Port, Ulsan Port, Pohang Port, and Haeundae Beach) using a visibility threshold of 1 km. 80% of the 1,0004 images were used to train the convolutional neural network model. The model has 16 convolutional layers and 3 fully connected layers, with Softmax classification performed in the last fully connected layer. Model accuracy was evaluated on the remaining 20%, yielding a classification accuracy of about 96%.
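The evaluation protocol above (80/20 split, two-class Softmax output) can be sketched in NumPy. The features and final-layer weights below are random stand-ins, not the paper's trained 16-convolutional-layer network; only the split and the Softmax decision are shown.

```python
import numpy as np

rng = np.random.default_rng(42)

def softmax(z):
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

# Dummy stand-ins for the network's penultimate features; real inputs would
# be CCTV frames passed through the 16 convolutional layers described above.
n = 100
X = rng.standard_normal((n, 8))
y = rng.integers(0, 2, size=n)          # 0 = no sea fog, 1 = sea fog

# 80/20 train/evaluation split, as in the paper.
idx = rng.permutation(n)
split = int(0.8 * n)
train_idx, test_idx = idx[:split], idx[split:]

W = rng.standard_normal((8, 2)) * 0.1   # stands in for a trained final layer
probs = softmax(X[test_idx] @ W)        # Softmax over the two classes
pred = probs.argmax(axis=1)
accuracy = (pred == y[test_idx]).mean()
print(len(train_idx), len(test_idx))    # 80 20
```

With random weights the accuracy is near chance; the point is only the shape of the pipeline, split, classify, score on the held-out 20%.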

Development of Face Recognition System based on Real-time Mini Drone Camera Images (실시간 미니드론 카메라 영상을 기반으로 한 얼굴 인식 시스템 개발)

  • Kim, Sung-Ho
    • Journal of Convergence for Information Technology
    • /
    • v.9 no.12
    • /
    • pp.17-23
    • /
    • 2019
  • In this paper, I propose a system development methodology that receives, in real time, images taken by the camera attached to a mini drone while controlling it, and recognizes and confirms the face of a specific person. For the development of this system, OpenCV, Python-related libraries, and the drone SDK are used. To increase the face recognition rate for a specific person in real-time drone images, a deep learning-based facial recognition algorithm is used, employing the principle of triplets in particular. To check the performance of the system, 30 face recognition experiments based on the author's face showed a recognition rate of about 95% or higher. The research results of this paper could be used to quickly find a specific person with a drone at tourist sites and festival venues.
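The "principle of triplets" mentioned above is, in deep face recognition, usually the triplet loss: pull an anchor embedding toward a same-identity (positive) embedding and push it from a different-identity (negative) one by a margin. A minimal NumPy sketch of that standard loss, assuming this is the variant the paper uses:

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Standard triplet loss on face embeddings: the anchor should be at
    least `margin` closer (in squared distance) to the same-identity
    positive than to the different-identity negative."""
    d_pos = np.sum((anchor - positive) ** 2, axis=-1)
    d_neg = np.sum((anchor - negative) ** 2, axis=-1)
    return np.maximum(d_pos - d_neg + margin, 0.0)

a = np.array([[0.0, 0.0]])    # anchor embedding (2-D for illustration)
p = np.array([[0.1, 0.0]])    # same person, close embedding
n = np.array([[1.0, 1.0]])    # different person, far embedding
print(triplet_loss(a, p, n))  # satisfied triple -> [0.]
```

Training minimizes this loss over many triplets so that, at recognition time, a simple embedding-distance threshold identifies the target person in each drone frame.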

A Study of Tram-Pedestrian Collision Prediction Method Using YOLOv5 and Motion Vector (YOLOv5와 모션벡터를 활용한 트램-보행자 충돌 예측 방법 연구)

  • Kim, Young-Min;An, Hyeon-Uk;Jeon, Hee-gyun;Kim, Jin-Pyeong;Jang, Gyu-Jin;Hwang, Hyeon-Chyeol
    • KIPS Transactions on Software and Data Engineering
    • /
    • v.10 no.12
    • /
    • pp.561-568
    • /
    • 2021
  • In recent years, autonomous driving has become a high-value-added technology that attracts attention in science and industry. For smooth self-driving, it is necessary to accurately detect objects and estimate their movement speed in real time. CNN-based deep learning algorithms and conventional dense optical flow are time-consuming, making it difficult to detect objects and estimate their speed in real time. In this paper, using a single camera image, fast object detection is performed with the YOLOv5 deep learning algorithm, and the speed of each object is estimated quickly with a local dense optical flow, a modification of conventional dense optical flow restricted to the detected object. Based on this algorithm, we present a system that can predict the collision time and probability, and through this system we intend to contribute to the prevention of tram accidents.
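Once a pedestrian box and its motion vector are available, a collision time can be extrapolated. The sketch below is a hypothetical constant-velocity simplification in image coordinates, not the paper's prediction model: the motion vector stands in for the local dense optical flow of the detected box.

```python
def time_to_collision(ped_pos, motion_vec, tram_line_y, fps=30.0):
    """Estimate seconds until a pedestrian's track crosses the tram path,
    assuming constant per-frame velocity. `ped_pos` is the bounding-box
    center in pixels, `motion_vec` its per-frame displacement (a stand-in
    for the local dense optical flow), `tram_line_y` the tram path's
    image row."""
    vy = motion_vec[1]
    if vy <= 0:                      # moving away from or parallel to the track
        return float("inf")
    frames = (tram_line_y - ped_pos[1]) / vy
    return frames / fps if frames > 0 else float("inf")

# Pedestrian at y=100 px, tram path at y=400 px, moving 10 px/frame toward it.
t = time_to_collision((250.0, 100.0), (0.0, 10.0), 400.0, fps=30.0)
print(t)  # 30 frames / 30 fps = 1.0 s
```

A collision probability could then be derived from this time and the track uncertainty, but that step is specific to the paper's system and is not reproduced here.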

CNN-based Recommendation Model for Classifying HS Code (HS 코드 분류를 위한 CNN 기반의 추천 모델 개발)

  • Lee, Dongju;Kim, Gunwoo;Choi, Keunho
    • Management & Information Systems Review
    • /
    • v.39 no.3
    • /
    • pp.1-16
    • /
    • 2020
  • The current tariff return system requires taxpayers to calculate the tax amount themselves and pay it on their own responsibility. In other words, in principle, the duty and responsibility of the reporting payment system are imposed only on the taxpayer, who is required to calculate and pay the tax accurately. If the taxpayer fails to fulfill this duty and responsibility, additional tax is imposed by collecting the tax shortfall. For this reason, item classification, together with tariff assessment, is the most difficult task and could pose a significant risk to entities if items are misclassified; import reports are therefore consigned, for a substantial fee, to customs brokers, who are customs experts. The purpose of this study is to classify the HS items to be reported upon import declaration and to indicate the HS codes to be recorded on the import declaration. HS items were classified using the images attached to the Korea Customs Service's item classification decision cases. For image classification, CNN, a deep learning algorithm commonly used for image recognition, was used, and the Vgg16, Vgg19, ResNet50, and Inception-V3 models were used among CNN models. To improve classification accuracy, two datasets were created: Dataset1 selected the five HS code types with the most images, and Dataset2 was divided into the five most frequent types within Chapter 87, the chapter with the most cases among 2-digit HS codes. Classification accuracy was highest when HS item classification was trained on Dataset2; the best-performing model was Inception-V3, and ResNet50 had the lowest classification accuracy. This study identified the possibility of HS item classification based on the first item image registered in a classification decision case, and its second contribution is that HS item classification, which had not been attempted before, was attempted with CNN models.

Guidelines for Data Construction when Estimating Traffic Volume based on Artificial Intelligence using Drone Images (드론영상과 인공지능 기반 교통량 추정을 위한 데이터 구축 가이드라인 도출 연구)

  • Han, Dongkwon;Kim, Doopyo;Kim, Sungbo
    • Journal of the Korean Society of Surveying, Geodesy, Photogrammetry and Cartography
    • /
    • v.40 no.3
    • /
    • pp.147-157
    • /
    • 2022
  • Recently, many studies have analyzed traffic or performed object recognition to classify vehicles through artificial intelligence-based prediction models using CCTV (Closed-Circuit Television) or drone images. Developing an object recognition deep learning model for accurate traffic estimation requires systematic data construction, but related standardized guidelines are insufficient. In this study, previous studies were analyzed to derive guidelines for building artificial intelligence training data for traffic estimation using drone images, with reference to business reports and quality management guidelines for AI training data. The guidelines for data construction are divided into data acquisition, preprocessing, and validation, and notes and evaluation indices are presented for each item. The guidelines aim to assist in developing a robust and generalized artificial intelligence model for estimating road traffic from drone images.

Prediction of Doodle Images Using Neural Networks

  • Hae-Chan Lee;Kyu-Cheol Cho
    • Journal of the Korea Society of Computer and Information
    • /
    • v.28 no.5
    • /
    • pp.29-38
    • /
    • 2023
  • Doodles often possess irregular shapes and patterns, making it challenging for artificial intelligence to mechanically recognize and predict patterns in random doodles. Unlike humans, who can effortlessly recognize and predict doodles even when circles are imperfect or lines are not perfectly straight, artificial intelligence must learn from training data to recognize and predict them. In this paper, we leverage a diverse dataset of doodle images from individuals of various nationalities, cultures, and left- or right-handedness. After training two neural networks, we determine which network offers higher accuracy and is more suitable for doodle image prediction. The motivation behind predicting doodle images with artificial intelligence lies in providing a unique perspective on human expression and intent through neural networks. For instance, by using the various images generated by artificial intelligence from human-drawn doodles, we expect to foster diversity in artistic expression and expand the creative domain.

AI-Based Object Recognition Research for Augmented Reality Character Implementation (증강현실 캐릭터 구현을 위한 AI기반 객체인식 연구)

  • Seok-Hwan Lee;Jung-Keum Lee;Hyun Sim
    • The Journal of the Korea institute of electronic communication sciences
    • /
    • v.18 no.6
    • /
    • pp.1321-1330
    • /
    • 2023
  • This study attempts to address the problem of 3D pose estimation for multiple human objects from a single image generated during the character development process, for use in augmented reality. In the existing top-down method, all objects in the image are first detected and then each is reconstructed independently; the problem is that inconsistent results may occur due to overlap or depth-order mismatch between the reconstructed objects. The goal of this study is to solve these problems and develop a single network that provides consistent 3D reconstruction of all humans in a scene. Integrating a human body model based on the SMPL parametric model into a top-down framework was a key design choice. Through this, two losses were introduced: a collision loss based on distance fields and a loss that considers depth order. The first prevents overlap between reconstructed people, and the second adjusts their depth ordering so that occlusion is rendered consistently with the annotated instance segmentation. This method allows depth information to be provided to the network without explicit 3D annotation of the images. Experimental results show that the methodology performs better than existing methods on standard 3D pose benchmarks, and the proposed losses enable more consistent reconstruction from natural images.
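The depth-order constraint described above can be illustrated with a toy NumPy penalty: where person A is annotated as occluding person B, A's rendered depth should be smaller at the overlapping pixels. This is a simplified hypothetical form, not the paper's actual loss, which operates on differentiably rendered depth maps.

```python
import numpy as np

def depth_order_loss(depth_front, depth_back, overlap_mask):
    """Penalize pixels where the person annotated as occluding (front)
    renders *behind* the occluded person (back) -- a simplified sketch
    of the depth-ordering loss described above."""
    violation = np.maximum(depth_front - depth_back, 0.0)  # front should be nearer
    return (violation * overlap_mask).sum() / max(overlap_mask.sum(), 1)

front = np.array([[2.0, 3.0], [1.0, 5.0]])  # rendered depths of person A
back  = np.array([[4.0, 2.0], [3.0, 1.0]])  # rendered depths of person B
mask  = np.array([[1.0, 1.0], [0.0, 1.0]])  # pixels where A occludes B

print(depth_order_loss(front, back, mask))  # mean masked violation, 5/3
```

Minimizing such a term during training nudges the network to order people in depth consistently with the 2D occlusion annotations, without requiring explicit 3D labels.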