• Title/Summary/Keyword: Object Detection Deep Learning Model

The Effect of Hyperparameter Choice on ReLU and SELU Activation Function

  • Kevin, Pratama;Kang, Dae-Ki
    • International journal of advanced smart convergence / v.6 no.4 / pp.73-79 / 2017
  • The Convolutional Neural Network (CNN) has shown excellent performance in computer vision tasks. Applications of CNNs include image classification, object detection in images, and autonomous driving. This paper evaluates the performance of a CNN model with ReLU and SELU as activation functions. The evaluation covers four hyperparameter choices: initialization method, network configuration, optimization technique, and regularization. We ran experiments on each hyperparameter choice and show how it influences network convergence and test accuracy. In these experiments, we also observe a performance improvement when using SELU instead of ReLU as the activation function.
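
For readers unfamiliar with the two activation functions compared in this abstract, the following is a minimal sketch of their standard scalar definitions. It is not taken from the paper: the function names `relu` and `selu` and the constant names are chosen here for illustration, and the numeric constants are the commonly published SELU values.

```python
import math

# Commonly published SELU constants (assumed here, not quoted from the abstract).
SELU_ALPHA = 1.6732632423543772
SELU_SCALE = 1.0507009873554805


def relu(x: float) -> float:
    """ReLU: pass positive inputs through, clamp negatives to zero."""
    return max(0.0, x)


def selu(x: float) -> float:
    """SELU: scaled identity for x > 0, scaled exponential curve otherwise."""
    if x > 0:
        return SELU_SCALE * x
    return SELU_SCALE * SELU_ALPHA * (math.exp(x) - 1.0)


if __name__ == "__main__":
    # Print both activations side by side for a few sample inputs.
    for x in (-2.0, -0.5, 0.0, 1.0):
        print(f"x={x:+.1f}  relu={relu(x):.4f}  selu={selu(x):.4f}")
```

Unlike ReLU, SELU keeps a non-zero (negative) output for negative inputs, which is the property usually credited with its self-normalizing behavior.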

Implementation of Yolov3-tiny Object Detection Deep Learning Model over RISC-V Virtual Platform (RISC-V 가상플랫폼 기반 Yolov3-tiny 물체 탐지 딥러닝 모델 구현)

  • Kim, DoYoung;Seol, Hui-Gwan;Lim, Seung-Ho
    • Proceedings of the Korea Information Processing Society Conference / 2022.05a / pp.576-578 / 2022
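  • Advances in deep learning have dramatically improved the performance of object recognition and image analysis. However, applying image processing and deep learning models on constrained edge devices, rather than in computing environments with high-performance GPUs, requires a deep learning execution environment on the edge device and an analysis of that environment. In this paper, we port a yolov3-tiny based object recognition system, at the software level, to a RISC-V virtual platform that implements the RISC-V ISA, and we obtain results by applying the network's deep learning computation and the object recognition algorithm to sample images. This work can be used to analyze the execution of deep learning network computation and object recognition algorithms on RISC-V based embedded edge device platforms, and to study algorithms for optimizing deep learning computation.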

Segmentation-Based Depth Map Adjustment for Improved Grasping Pose Detection (물체 파지점 검출 향상을 위한 분할 기반 깊이 지도 조정)

  • Hyunsoo Shin;Muhammad Raheel Afzal;Sungon Lee
    • The Journal of Korea Robotics Society / v.19 no.1 / pp.16-22 / 2024
  • Robotic grasping in unstructured environments poses a significant challenge, demanding precise estimation of gripping positions for diverse and unknown objects. The Generative Grasping Convolutional Neural Network (GG-CNN) can estimate, from a three-dimensional depth map, the position and direction at which a robot gripper can grip an unknown object. Since GG-CNN uses only a depth map as input, the precision of the depth map is the most critical factor affecting the result. To address the challenge of depth map precision, we integrate the Segment Anything Model, which is renowned for its robust zero-shot performance across various segmentation tasks. We adjust the components corresponding to the segmented areas in the depth map aligned through external calibration. The proposed method was validated on the Cornell dataset and the SurgicalKit dataset. Quantitative analysis compared to existing methods showed a 49.8% improvement on the dataset including surgical instruments. The results highlight the practical importance of our approach, especially in scenarios involving thin and metallic objects.

Automatic Estimation of Tillers and Leaf Numbers in Rice Using Deep Learning for Object Detection

  • Hyeokjin Bak;Ho-young Ban;Sungryul Chang;Dongwon Kwon;Jae-Kyeong Baek;Jung-Il Cho;Wan-Gyu Sang
    • Proceedings of the Korean Society of Crop Science Conference / 2022.10a / pp.81-81 / 2022
  • Recently, many studies on big-data-based smart farming have been conducted. Research to quantify morphological characteristics using image data from various crops in smart farming is underway. Rice is one of the most important food crops in the world. Much research has been done to predict and model rice crop yield. The number of productive tillers per plant is one of the important agronomic traits associated with the grain yield of rice. However, modeling the basic growth characteristics of rice requires accurate data measurements. The existing method of manual measurement is not only labor intensive but also prone to human error. Therefore, conversion to digital data is necessary to obtain accurate phenotyping data quickly. In this study, we present an image-based method to predict leaf number and evaluate tiller number of individual rice plants using the YOLOv5 deep learning network. We ran experiments with several variants of the YOLOv5 network and compared them to identify the one with the highest prediction accuracy. We also performed data augmentation, a method used to complement small datasets. Based on leaf and tiller counts actually measured in rice plants, the leaf number predicted by the model from the image data was used with an existing regression equation to evaluate the tiller number from the image data.

Metal Surface Defect Detection and Classification using EfficientNetV2 and YOLOv5 (EfficientNetV2 및 YOLOv5를 사용한 금속 표면 결함 검출 및 분류)

  • Alibek, Esanov;Kim, Kang-Chul
    • The Journal of the Korea institute of electronic communication sciences / v.17 no.4 / pp.577-586 / 2022
  • Detection and classification of steel surface defects are critical for product quality control in the steel industry. However, due to its low accuracy and slow speed, the traditional approach cannot be used effectively on a production line. The current, widely used deep learning-based algorithms have accuracy problems, and there is still room for improvement. This paper proposes a method of steel surface defect detection that combines EfficientNetV2 for image classification with YOLOv5 as an object detector. Shorter training time and high accuracy are the advantages of this model. First, the image is input into the EfficientNetV2 model, which classifies defect classes and predicts the probability that the sample has defects. If the probability of having a defect is less than 0.25, the algorithm directly concludes that the sample has no defects. Otherwise, the sample is passed on to YOLOv5 to carry out defect detection on the metal surface. Experiments show that the proposed model performs well on the NEU dataset, with an accuracy of 98.3%. In addition, its average training time is shorter than that of other models.
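
The two-stage flow described in this abstract (classify first, run the detector only when the defect probability reaches 0.25) can be sketched as follows. This is an illustrative outline only, not the authors' implementation: `inspect_sample`, `classify`, `detect`, and the `Box` tuple layout are hypothetical names introduced here, and the two models are passed in as plain callables so that no real EfficientNetV2 or YOLOv5 API is assumed.

```python
from typing import Any, Callable, List, Tuple

# Hypothetical detection record: (x1, y1, x2, y2, label).
Box = Tuple[float, float, float, float, str]

# Threshold stated in the abstract: below this, the sample is treated as defect-free.
DEFECT_THRESHOLD = 0.25


def inspect_sample(
    image: Any,
    classify: Callable[[Any], float],
    detect: Callable[[Any], List[Box]],
    threshold: float = DEFECT_THRESHOLD,
) -> List[Box]:
    """Run the classifier first and call the detector only when needed.

    `classify` stands in for the classification stage and must return the
    probability that the image contains a defect. `detect` stands in for the
    object detector and returns the located defects.
    """
    defect_probability = classify(image)
    if defect_probability < threshold:
        # Treated as defect-free: the detector is skipped entirely.
        return []
    return detect(image)
```

Gating the detector behind the classifier means defect-free samples incur only the cost of the classification stage.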

The improved facial expression recognition algorithm for detecting abnormal symptoms in infants and young children (영유아 이상징후 감지를 위한 표정 인식 알고리즘 개선)

  • Kim, Yun-Su;Lee, Su-In;Seok, Jong-Won
    • Journal of IKEEE / v.25 no.3 / pp.430-436 / 2021
  • The non-contact body temperature measurement system, which uses optical and thermal imaging cameras, is one of the key tools for managing febrile diseases in mass facilities. Conventional systems can only be used for simple body temperature measurement in the face area, because they use only a deep learning-based face detection algorithm. As a result, they are limited in detecting abnormal symptoms in infants and young children, who have difficulty expressing themselves. This paper proposes an improved facial expression recognition algorithm for detecting abnormal symptoms in infants and young children. The proposed method uses an object detection model to detect infants and young children in an image, then acquires the coordinates of the eyes, nose, and mouth, which are key elements of facial expression recognition. Finally, facial expression recognition is performed by applying a selective sharpening filter based on the obtained coordinates. According to the experimental results, the proposed algorithm achieved improvements of 2.52%, 1.12%, and 2.29%, respectively, for the three expressions of neutral, happy, and sad in the UTK dataset.

Survey on Deep Learning-based Panoptic Segmentation Methods (딥 러닝 기반의 팬옵틱 분할 기법 분석)

  • Kwon, Jung Eun;Cho, Sung In
    • IEMEK Journal of Embedded Systems and Applications / v.16 no.5 / pp.209-214 / 2021
  • Panoptic segmentation, which is now widely used in computer vision applications such as medical image analysis and autonomous driving, helps in understanding an image from a holistic view. It labels each pixel with a class ID and an instance ID. Specifically, it can distinguish 'thing' from 'stuff' and provide pixel-wise results of semantic prediction and object detection. As a result, it can solve both semantic segmentation and instance segmentation tasks through a single unified model, producing two different contexts for the two segmentation tasks. The semantic segmentation task focuses on how to obtain multi-scale features from a large receptive field without losing low-level features. The instance segmentation task, on the other hand, focuses on how to separate 'thing' from 'stuff' and how to produce the representation of detected objects. With advances in both segmentation techniques, several panoptic segmentation models have been proposed. Many researchers try to resolve discrepancies between the results of the two segmentation branches, which can arise at object boundaries. In this survey paper, we introduce the concept of panoptic segmentation, categorize existing methods into two representative approaches, top-down and bottom-up, and explain how each operates. We then analyze the performance of various methods using experimental results.
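
The idea of giving each pixel both a class ID and an instance ID can be illustrated with a small sketch. It is not drawn from the survey: the function `merge_panoptic`, the constant `LABEL_DIVISOR`, and the encoding `class_id * LABEL_DIVISOR + instance_id` are assumptions made here for illustration, and 'stuff' pixels are given instance ID 0 because stuff regions are not split into instances.

```python
from typing import List, Set

# Assumed encoding for this sketch: panoptic_id = class_id * LABEL_DIVISOR + instance_id.
LABEL_DIVISOR = 1000


def merge_panoptic(
    semantic: List[List[int]],
    instance: List[List[int]],
    thing_classes: Set[int],
    label_divisor: int = LABEL_DIVISOR,
) -> List[List[int]]:
    """Combine a per-pixel class map and instance map into one panoptic map.

    Pixels whose class is in `thing_classes` keep their instance ID; all other
    ('stuff') pixels get instance ID 0.
    """
    panoptic = []
    for sem_row, inst_row in zip(semantic, instance):
        row = []
        for class_id, instance_id in zip(sem_row, inst_row):
            kept_instance = instance_id if class_id in thing_classes else 0
            row.append(class_id * label_divisor + kept_instance)
        panoptic.append(row)
    return panoptic


if __name__ == "__main__":
    semantic = [[0, 1], [1, 1]]   # class 0 = a 'stuff' class, class 1 = a 'thing' class
    instance = [[5, 1], [2, 2]]   # the instance ID on the stuff pixel is ignored
    print(merge_panoptic(semantic, instance, thing_classes={1}))
```

Under this assumed encoding, `divmod(panoptic_id, LABEL_DIVISOR)` recovers the class ID and instance ID of a pixel.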

YOLO Based Automatic Sorting System for Plastic Recycling (플라스틱 재활용을 위한 YOLO기반의 자동 분류시스템)

  • Kim, Yong jun;Cho, Taeuk;Park, Hyung-kun
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2021.10a / pp.382-384 / 2021
  • In this study, we implement a system that automatically classifies types of plastics using YOLO (You Only Look Once), a real-time object recognition algorithm. The system consists of an Nvidia Jetson Nano, a small computer for deep learning and computer vision, running a model trained with YOLO to recognize the separation marks printed on plastics. Using a webcam, the recycling marks on plastic waste were recognized as PET, HDPE, or PP, and motors were controlled to sort the waste by type. This automatic classifier reduces the human labor of sorting plastics by their separation marks and increases recycling efficiency through accurate sorting.

Crack Detection on the Road in Aerial Image using Mask R-CNN (Mask R-CNN을 이용한 항공 영상에서의 도로 균열 검출)

  • Lee, Min Hye;Nam, Kwang Woo;Lee, Chang Woo
    • Journal of Korea Society of Industrial Information Systems / v.24 no.3 / pp.23-29 / 2019
  • Conventional crack detection methods consume a great deal of labor, time, and cost. To solve these problems, an automatic detection system is needed to detect cracks in images obtained using vehicles or UAVs (unmanned aerial vehicles). In this paper, we study road crack detection in unmanned aerial photographs. The aerial images are preprocessed and labeled to generate data sets containing morphological information about cracks. The generated data set was applied to the Mask R-CNN model to obtain a new model that learned various crack information. Experimental results show that cracks in the aerial images were detected with an accuracy of 73.5%, and that some of them were predicted as a certain type of crack region.

Training of a Siamese Network to Build a Tracker without Using Tracking Labels (샴 네트워크를 사용하여 추적 레이블을 사용하지 않는 다중 객체 검출 및 추적기 학습에 관한 연구)

  • Kang, Jungyu;Song, Yoo-Seung;Min, Kyoung-Wook;Choi, Jeong Dan
    • The Journal of The Korea Institute of Intelligent Transport Systems / v.21 no.5 / pp.274-286 / 2022
  • Multi-object tracking has long been studied in computer vision and plays a critical role in applications such as autonomous driving and driving assistance. Multi-object tracking techniques generally consist of a detector that detects objects and a tracker that tracks the detected objects. Various publicly available datasets allow us to train a detector model without much effort. However, there are relatively few publicly available datasets for training a tracker model, and building one's own tracker dataset takes a long time compared to building a detector dataset. Hence, the detector is often developed separately from the tracker module. However, the separate tracker must then be adjusted whenever the detector model changes. This study proposes a system that can train a model that performs detection and tracking simultaneously using only detector training datasets. In particular, a Siamese network with augmentation is used to compose the detector and tracker. Experiments are conducted on public datasets to verify that the proposed algorithm can produce a real-time multi-object tracker comparable to state-of-the-art tracker models.