Title/Summary/Keyword: Yolo-Cnn

Comparison of CNN and YOLO for Object Detection (객체 검출을 위한 CNN과 YOLO 성능 비교 실험)

  • Lee, Yong-Hwan;Kim, Youngseop
    • Journal of the Semiconductor & Display Technology, v.19 no.1, pp.85-92, 2020
  • Object detection plays a critical role in computer vision, and research in this area has grown rapidly since 2012 with the adoption of convolutional neural networks and their modified structures. Two representative families of object detection algorithms are the CNN-based detectors and YOLO. This paper reviews both series: CNN-based detectors and YOLO, which addresses the bounding-box problem of the CNN approach. We compare the performance of the two series in terms of accuracy, speed, and cost. Compared with the latest advanced solutions, YOLO v3 achieves a good trade-off between speed and accuracy.
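
A minimal timing sketch of the kind of speed comparison this survey reports, using torchvision's Faster R-CNN as the CNN-series detector and a YOLO model from the ultralytics package as a stand-in for YOLO v3 (which the paper evaluates but whose code is not given here); the synthetic image and single-run timing are illustrative assumptions only:

```python
import time
import torch
import torchvision
from ultralytics import YOLO

image = torch.rand(3, 640, 640)                     # synthetic test image in [0, 1]

# Two-stage CNN-series detector vs. single-shot YOLO-series detector.
frcnn = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights="DEFAULT").eval()
yolo = YOLO("yolov8n.pt")                           # stand-in weights, not YOLO v3

with torch.no_grad():
    t0 = time.time()
    frcnn([image])                                  # expects a list of CHW tensors
    t_frcnn = time.time() - t0

img_np = (image.permute(1, 2, 0).numpy() * 255).astype("uint8")   # HWC uint8 frame
t0 = time.time()
yolo(img_np, verbose=False)
t_yolo = time.time() - t0

print(f"Faster R-CNN: {t_frcnn:.3f}s per image, YOLO: {t_yolo:.3f}s per image")
```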

Object Tracking Method using Deep Learning and Kalman Filter (딥 러닝 및 칼만 필터를 이용한 객체 추적 방법)

  • Kim, Gicheol;Son, Sohee;Kim, Minseop;Jeon, Jinwoo;Lee, Injae;Cha, Jihun;Choi, Haechul
    • Journal of Broadcast Engineering, v.24 no.3, pp.495-505, 2019
  • Typical deep learning algorithms include CNNs (Convolutional Neural Networks), used mainly for image recognition, and RNNs (Recurrent Neural Networks), used mainly for speech recognition and natural language processing. Among them, CNNs automatically learn feature-extracting filters from data, which has made them the mainstream approach for image recognition with excellent performance. Various algorithms such as R-CNN have since been proposed to improve detection performance, and more recently YOLO (You Only Look Once) and SSD (Single Shot Multi-box Detector) have appeared. However, because these deep learning detectors decide detections on still images, stable object tracking and detection in video requires a separate tracking capability. This paper therefore proposes combining a Kalman filter with a deep learning detection network to improve object tracking and detection performance in video. The detection network is YOLO v2, which is capable of real-time processing; the proposed method achieved a 7.7% IoU improvement over the baseline YOLO v2 network and a processing speed of 20 fps on FHD images.
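
A minimal sketch of the detector-plus-Kalman-filter idea described above, assuming a constant-velocity model over the box centre, width, and height; the state layout and noise settings are illustrative assumptions, not the authors' exact formulation:

```python
import numpy as np

class BoxKalman:
    """Tracks (cx, cy, w, h) of one bounding box with constant-velocity dynamics."""
    def __init__(self, box, dt=1.0):
        cx, cy, w, h = box
        # State: [cx, cy, w, h, vx, vy]
        self.x = np.array([cx, cy, w, h, 0.0, 0.0])
        self.P = np.eye(6) * 10.0                       # state covariance
        self.F = np.eye(6)                              # constant-velocity transition
        self.F[0, 4] = self.F[1, 5] = dt
        self.H = np.zeros((4, 6)); self.H[:4, :4] = np.eye(4)   # measure box only
        self.Q = np.eye(6) * 1e-2                       # process noise (assumed)
        self.R = np.eye(4) * 1.0                        # measurement noise (assumed)

    def predict(self):
        self.x = self.F @ self.x
        self.P = self.F @ self.P @ self.F.T + self.Q
        return self.x[:4]

    def update(self, box):
        z = np.asarray(box, dtype=float)
        y = z - self.H @ self.x                         # innovation
        S = self.H @ self.P @ self.H.T + self.R
        K = self.P @ self.H.T @ np.linalg.inv(S)        # Kalman gain
        self.x = self.x + K @ y
        self.P = (np.eye(6) - K @ self.H) @ self.P
        return self.x[:4]

# Usage: predict every frame; update only on frames where the detector fires.
tracker = BoxKalman([320, 240, 80, 120])
predicted = tracker.predict()
smoothed = tracker.update([324, 238, 82, 118])
```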

Real-time Human Detection under Omni-directional Camera based on CNN with Unified Detection and AGMM for Visual Surveillance

  • Nguyen, Thanh Binh;Nguyen, Van Tuan;Chung, Sun-Tae;Cho, Seongwon
    • Journal of Korea Multimedia Society, v.19 no.8, pp.1345-1360, 2016
  • In this paper, we propose a new real-time human detection method for omni-directional cameras for visual surveillance, based on a unified-detection CNN combined with AGMM. Compared to CNN-based state-of-the-art object detection methods, the YOLO-style unified detection model offers very fast detection but lower accuracy. The proposed method adapts the unified-detection CNN of the YOLO model and strengthens it with additional foreground contextual information obtained from a preceding AGMM stage. The additional computation incurred by the AGMM stage is compensated by the speed-up gained from using 2-D input data, consisting of grey-level image data and foreground context information, instead of 3-D color input data. Various experiments show that the proposed method is more accurate and more robust to environmental changes than the YOLO-based human detection method, while running at a similar speed. Thus, it can be successfully employed in embedded surveillance applications.
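
A minimal sketch of the 2-D input construction described above (grey level plus foreground context), using OpenCV's MOG2 background subtractor as a stand-in for the paper's AGMM stage; the video path and network input size are assumptions:

```python
import cv2
import numpy as np

subtractor = cv2.createBackgroundSubtractorMOG2(history=500, detectShadows=False)

def make_two_channel_input(frame_bgr, size=(416, 416)):
    grey = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)
    fg = subtractor.apply(frame_bgr)              # adaptive GMM foreground mask
    grey = cv2.resize(grey, size)
    fg = cv2.resize(fg, size)
    # Stack grey level and foreground context as the detector's 2-channel input.
    return np.dstack([grey, fg]).astype(np.float32) / 255.0

cap = cv2.VideoCapture("omni_camera.mp4")         # assumed input source
ok, frame = cap.read()
if ok:
    x = make_two_channel_input(frame)
    print(x.shape)                                # (416, 416, 2)
cap.release()
```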

A Comparative Study of Deep Learning Techniques for Alzheimer's disease Detection in Medical Radiography

  • Amal Alshahrani;Jenan Mustafa;Manar Almatrafi;Layan Albaqami;Raneem Aljabri;Shahad Almuntashri
    • International Journal of Computer Science & Network Security, v.24 no.5, pp.53-63, 2024
  • Alzheimer's disease is a progressive brain disorder that affects millions of people around the world. It leads to a gradual deterioration in memory, thinking ability, and behavioral and social skills until the person can no longer function in society. Technological progress in medical imaging and the use of artificial intelligence have made it possible to detect Alzheimer's disease from medical images such as magnetic resonance imaging (MRI). Deep learning algorithms, especially convolutional neural networks (CNNs), have shown great success in analyzing medical images for disease diagnosis and classification, because CNNs can recognize patterns and objects in images, which makes them well suited for this study. In this paper, we compare the performance of two deep learning methods for Alzheimer's disease detection: You Only Look Once (YOLO), a CNN-based object recognition algorithm, and VGG16 (Visual Geometry Group), a deep convolutional neural network used primarily for image classification. We compare results from these modern models instead of using only a plain CNN as in previous research. The results show different levels of accuracy across the YOLO versions and the VGG16 model: YOLO v5 reached 56.4% accuracy at 50 epochs and 61.5% at 100 epochs; YOLO v8, used for classification, reached 84% overall accuracy at 100 epochs; and YOLO v9, used for object detection, reached an overall accuracy of 84.6%. The VGG16 model reached 99% training accuracy after 25 epochs but only 78% testing accuracy. Hence, the best model overall is YOLO v9, with the highest overall accuracy of 86.1%.
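
A minimal sketch of a VGG16 transfer-learning classifier of the kind benchmarked above, written with Keras; the class count, input size, and training settings are illustrative assumptions rather than the paper's configuration:

```python
import tensorflow as tf

NUM_CLASSES = 4  # assumed dementia-stage labels; the paper's label set is not given here

base = tf.keras.applications.VGG16(weights="imagenet", include_top=False,
                                   input_shape=(224, 224, 3))
base.trainable = False                      # freeze the convolutional backbone

model = tf.keras.Sequential([
    base,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(256, activation="relu"),
    tf.keras.layers.Dropout(0.5),
    tf.keras.layers.Dense(NUM_CLASSES, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

# train_ds / val_ds would come from tf.keras.utils.image_dataset_from_directory
# pointed at the MRI dataset; paths are placeholders and are therefore omitted.
# model.fit(train_ds, validation_data=val_ds, epochs=25)
```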

Development of an Efficient 3D Object Recognition Algorithm for Robotic Grasping in Cluttered Environments (혼재된 환경에서의 효율적 로봇 파지를 위한 3차원 물체 인식 알고리즘 개발)

  • Song, Dongwoon;Yi, Jae-Bong;Yi, Seung-Joon
    • The Journal of Korea Robotics Society, v.17 no.3, pp.255-263, 2022
  • 3D object detection pipelines often incorporate RGB-based object detection methods such as YOLO, which detects the object classes and bounding boxes from the RGB image. However, in complex environments where objects are heavily cluttered, bounding-box approaches may show degraded performance due to overlapping bounding boxes. Mask-based methods such as Mask R-CNN can handle such situations better thanks to their detailed object masks, but they require a much longer time for data preparation compared to bounding-box-based approaches. In this paper, we present a 3D object recognition pipeline which uses either the YOLO or Mask R-CNN real-time object detection algorithm, a K-nearest clustering algorithm, a mask reduction algorithm, and finally a Principal Component Analysis (PCA) algorithm to efficiently detect the 3D poses of objects in a complex environment. Furthermore, we also present an improved YOLO-based 3D object detection algorithm that uses a prioritized heightmap clustering algorithm to handle overlapping bounding boxes. The suggested algorithms were successfully used at the Artificial-Intelligence Robot Challenge (ARC) 2021 competition with excellent results.
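
A minimal sketch of the final PCA step of the pipeline described above: estimating an object's centroid and principal axes (its approximate pose) from a segmented 3-D point set; the synthetic points stand in for a real object segment:

```python
import numpy as np

def pca_pose(points):
    """points: (N, 3) array of one object's segmented 3D points."""
    centroid = points.mean(axis=0)
    centered = points - centroid
    cov = np.cov(centered.T)
    eigvals, eigvecs = np.linalg.eigh(cov)      # eigenvalues in ascending order
    axes = eigvecs[:, ::-1]                     # columns = major..minor axes
    if np.linalg.det(axes) < 0:                 # keep a right-handed frame
        axes[:, -1] *= -1
    return centroid, axes                       # 3-vector and 3x3 rotation matrix

# Synthetic elongated segment as a placeholder for a clustered object point cloud.
rng = np.random.default_rng(0)
segment = rng.normal(size=(500, 3)) * [0.10, 0.03, 0.02] + [0.4, 0.0, 0.2]
center, rotation = pca_pose(segment)
print(center, rotation)
```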

A Beverage Can Recognition System Based on Deep Learning for the Visually Impaired (시각장애인을 위한 딥러닝 기반 음료수 캔 인식 시스템)

  • Lee Chanbee;Sim Suhyun;Kim Sunhee
    • Journal of Korea Society of Digital Industry and Information Management, v.19 no.1, pp.119-127, 2023
  • Recently, deep learning has been used in the development of various assistive devices and services that help visually impaired people in their daily lives. This is because not only are there few products and facility guides written in braille, but also fewer than 10% of visually impaired people can read braille. In this paper, we propose a system that recognizes beverage cans in real time and announces the beverage can name as sound, for the convenience of the visually impaired. Five commercially available beverage cans were selected, and a CNN model and a YOLO model were designed to recognize them. After augmenting the image data, the models were trained. The accuracy of the proposed CNN model and YOLO model is 91.2% and 90.8%, respectively. For practical verification, a system was built by attaching a camera and speaker to a Raspberry Pi, and the YOLO model was applied in the system. It was confirmed that beverage cans were recognized and announced as sound in real time in various environments.
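
A minimal sketch of the camera-to-speech loop described above, using a YOLO model through the ultralytics package and the espeak command-line synthesizer; the weights file, camera index, and speech pipeline are assumptions, since the paper's trained model is not published here:

```python
import subprocess
import cv2
from ultralytics import YOLO

model = YOLO("beverage_cans.pt")        # placeholder for the trained can-recognition weights
cap = cv2.VideoCapture(0)               # Raspberry Pi camera (assumed index)
last_spoken = None

while True:
    ok, frame = cap.read()
    if not ok:
        break
    results = model(frame, verbose=False)[0]
    for box in results.boxes:
        name = results.names[int(box.cls)]
        if name != last_spoken:
            # Announce the detected can name through the attached speaker.
            subprocess.run(["espeak", name], check=False)
            last_spoken = name
    cv2.imshow("view", frame)
    if cv2.waitKey(1) == 27:             # stop on Esc
        break
cap.release()
```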

Research on railroad track object detection and classification based on mask R-CNN (mask R-CNN 기반의 철도선로 객체검출 및 분류에 관한 연구)

  • Seung-Shin Lee;Jong-Won Choi;Ryum-Duck Oh
    • Proceedings of the Korean Society of Computer Information Conference, 2024.01a, pp.81-83, 2024
  • In this paper, we propose a method for identifying and classifying railroad tracks using the image segmentation capability of Mask R-CNN. Unlike the R-CNN algorithm, which identifies objects in an image through bounding boxes, image segmentation with Mask R-CNN detects and classifies the objects of interest at the pixel level, allowing more precise object identification than plain object detection. In this study, 24,205 sets of high-speed railway data in Pascal VOC format were preprocessed and converted to the MS COCO format, and Mask R-CNN in MMDetection was used to identify railroad tracks at the pixel level and classify their normal/defective condition. A previous study used YOLO and reduced polygon-shaped annotations to bounding boxes; by using Mask R-CNN, this study identifies railroad tracks more precisely, while the normal/defective classification shows performance similar to YOLO.
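
A minimal sketch of running a Mask R-CNN model through MMDetection's high-level inference API, as the study describes; the config and checkpoint paths are placeholders, and the authors' trained railway model is not available here:

```python
from mmdet.apis import init_detector, inference_detector

config_file = "configs/mask_rcnn/mask-rcnn_r50_fpn_1x_coco.py"    # assumed config
checkpoint_file = "checkpoints/mask_rcnn_rail.pth"                 # assumed weights
model = init_detector(config_file, checkpoint_file, device="cuda:0")

# Per-pixel instance masks plus class scores for one track image.
result = inference_detector(model, "rail_sample.jpg")
print(result)
```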

Separation of Touching Pigs using YOLO-based Bounding Box (YOLO 기반 외곽 사각형을 이용한 근접 돼지 분리)

  • Seo, J.;Ju, M.;Choi, Y.;Lee, J.;Chung, Y.;Park, D.
    • Journal of Korea Multimedia Society, v.21 no.2, pp.77-86, 2018
  • Although separating touching pigs in real time is an important issue for a 24-h pig monitoring system, it is challenging to separate touching pigs accurately in a crowded pig room. In this study, we propose a separation method for touching pigs using the information generated by a Convolutional Neural Network (CNN). In particular, we apply one of the CNN-based object detection methods, YOLO (You Only Look Once), to solve the touching-object separation problem in an active manner. First, we evaluate and select the bounding boxes generated by YOLO, and then separate touching pigs by analyzing the relations between the selected bounding boxes. Our experimental results show that the proposed method is more effective than widely used methods for separating touching pigs, in terms of both accuracy and execution time.
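
A minimal sketch of the kind of bounding-box relation analysis referred to above: computing pairwise IoU between selected boxes and flagging pairs whose overlap suggests touching animals; the 0.2 threshold is an assumption:

```python
def iou(a, b):
    """a, b: boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    iw, ih = max(0.0, ix2 - ix1), max(0.0, iy2 - iy1)
    inter = iw * ih
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter + 1e-9)

def touching_pairs(boxes, threshold=0.2):
    """Return index pairs of boxes that overlap enough to count as touching."""
    pairs = []
    for i in range(len(boxes)):
        for j in range(i + 1, len(boxes)):
            if iou(boxes[i], boxes[j]) > threshold:
                pairs.append((i, j))
    return pairs

boxes = [(10, 10, 60, 80), (35, 15, 95, 85), (200, 40, 260, 120)]
print(touching_pairs(boxes))    # [(0, 1)]: boxes 0 and 1 overlap, box 2 stands alone
```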

Improving Efficiency of Object Detection using Multiple Neural Networks (다중 신경망을 이용한 객체 탐지 효율성 개선방안)

  • Park, Dae-heum;Lim, Jong-hoon;Jang, Si-Woong
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference, 2022.05a, pp.154-157, 2022
  • In the existing TensorFlow CNN environment, object labeling and detection are performed by TensorFlow itself. With the advent of YOLO, however, the efficiency of image object detection has increased: deeper layers can be built than in existing neural networks, and the image object recognition rate can be raised. In this paper, we therefore design an object detection system based on Darknet and YOLO, build and train multi-layer networks on top of the existing convolutional neural network, and compare and analyze detection capability and speed. On this basis, we present a neural network methodology that uses Darknet training efficiently.

Vehicle Manufacturer Recognition using Deep Learning and Perspective Transformation

  • Ansari, Israfil;Shim, Jaechang
    • Journal of Multimedia Information System, v.6 no.4, pp.235-238, 2019
  • Object detection in real-world images is an active research topic aimed at understanding the different objects present in a scene. Various models have been presented in the past with significant results. In this paper we present vehicle logo detection using established object detection models, You Only Look Once (YOLO) and Faster Region-based CNN (F-RCNN). Both the front and rear views of the vehicles were used for training and testing the proposed method. Along with deep learning, an image pre-processing algorithm called perspective transformation is applied to all test images. Using perspective transformation, top-view images are transformed into front-view images. This pre-processing yields a higher detection rate than using the raw images. Furthermore, the YOLO model gives better results than the F-RCNN model.
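
A minimal sketch of the perspective-transformation pre-processing step described above, warping a top-view test image toward a front-view geometry with OpenCV; the image path, output size, and the four source corner points are placeholders that a real pipeline would determine per image:

```python
import cv2
import numpy as np

img = cv2.imread("top_view_car.jpg")              # assumed test image path
if img is not None:
    h, w = 480, 640                               # assumed output size

    # Source quadrilateral (placeholder corners of the vehicle region) mapped
    # onto the full output rectangle to simulate a front-facing viewpoint.
    src = np.float32([[120, 80], [520, 80], [600, 460], [40, 460]])
    dst = np.float32([[0, 0], [w, 0], [w, h], [0, h]])

    M = cv2.getPerspectiveTransform(src, dst)
    front_like = cv2.warpPerspective(img, M, (w, h))
    cv2.imwrite("front_view_car.jpg", front_like)
```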