• Title/Summary/Keyword: You Only Look Once


Object Detection for the Visually Impaired in a Voice Guidance System (시각장애인을 위한 보행 안내 시스템의 객체 인식)

  • Soo-Yeon Son;Eunho-Jeong;Hyon Hee Kim
    • Proceedings of the Korea Information Processing Society Conference / 2023.11a / pp.1206-1207 / 2023
  • Restricted mobility makes independent living difficult for visually impaired people and has a major impact on their safety. This paper presents a method that uses YOLOv5 (You Only Look Once version 5) to support safe walking. The proposed method recognizes people and moving objects such as cars, bicycles, and electric kickboards in real time and notifies the visually impaired pedestrian, and is expected to help visually impaired people walk safely.
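The alerting step this abstract describes can be sketched as a simple filter over detector output. This is a hypothetical illustration, not the paper's code: the class names and confidence threshold are assumptions.

```python
# Minimal sketch: turn YOLO-style (class, confidence) detections into
# alert messages for a visually impaired pedestrian.
# HAZARD_CLASSES and min_conf are illustrative assumptions.

HAZARD_CLASSES = {"car", "bicycle", "kickboard", "person"}

def alerts(detections, min_conf=0.5):
    """detections: list of (class_name, confidence) tuples from a detector."""
    out = []
    for cls, conf in detections:
        if cls in HAZARD_CLASSES and conf >= min_conf:
            out.append(f"Caution: {cls} ahead")
    return out

print(alerts([("car", 0.91), ("tree", 0.88), ("bicycle", 0.42)]))
# only the confident hazard-class detection passes the filter
```

In practice the message list would be handed to a text-to-speech engine rather than printed.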

Research on Digital Construction Site Management Using Drone and Vision Processing Technology (드론 및 비전 프로세싱 기술을 활용한 디지털 건설현장 관리에 대한 연구)

  • Seo, Min Jo;Park, Kyung Kyu;Lee, Seung Been;Kim, Si Uk;Choi, Won Jun;Kim, Chee Kyeung
    • Proceedings of the Korean Institute of Building Construction Conference / 2023.11a / pp.239-240 / 2023
  • Construction site management involves overseeing tasks from the construction phase to the maintenance stage, and digitalization of construction sites is necessary for digital construction site management. In this study, we aim to conduct research on object recognition at construction sites using drones. Images of construction sites captured by drones are reconstructed into BIM (Building Information Modeling) models, and objects are recognized after partially rendering the models using artificial intelligence. For the photorealistic rendering of the BIM models, both traditional filtering techniques and the generative adversarial network (GAN) model were used, while the YOLO (You Only Look Once) model was employed for object recognition. This study is expected to provide insights into the research direction of digital construction site management and help assess the potential and future value of introducing artificial intelligence in the construction industry.


A Study on Biomass Estimation Technique of Invertebrate Grazers Using Multi-object Tracking Model Based on Deep Learning (딥러닝 기반 다중 객체 추적 모델을 활용한 조식성 무척추동물 현존량 추정 기법 연구)

  • Bak, Suho;Kim, Heung-Min;Lee, Heeone;Han, Jeong-Ik;Kim, Tak-Young;Lim, Jae-Young;Jang, Seon Woong
    • Korean Journal of Remote Sensing / v.38 no.3 / pp.237-250 / 2022
  • In this study, we propose a method to estimate the biomass of invertebrate grazers from videos taken by underwater drones, using a deep learning-based multi-object tracking model. To detect invertebrate grazers by class we used YOLOv5 (You Only Look Once version 5), and for biomass estimation we used DeepSORT (Deep Simple Online and Realtime Tracking). The performance of each model was evaluated on a workstation with a GPU accelerator. YOLOv5 achieved a mean Average Precision (mAP) of 0.9 or higher, and the combination of the YOLOv5s model and the DeepSORT algorithm ran at about 59 fps at 4K resolution. When the proposed method was applied in the field, biomass tended to be overestimated by about 28%, but the error was low compared to estimation using an object detection model alone. A follow-up study is needed to improve accuracy in cases where frame images go out of focus continuously or the underwater drone turns rapidly. Should these issues be resolved, the method could be used to produce decision-support data for the control and monitoring of invertebrate grazers.
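The counting idea behind tracking-based biomass estimation can be sketched as follows: DeepSORT assigns a persistent ID to each tracked individual, so the number of unique IDs approximates the number of animals, and the ~28% overestimation reported in the field test could then be corrected as a relative bias. The function and variable names are illustrative, not from the paper's code.

```python
# Hedged sketch: count individuals as unique track IDs across frames, then
# apply a bias correction for the ~28% overestimation observed in the field.

def estimate_count(frames, overestimation=0.28):
    """frames: iterable of per-frame lists of track IDs."""
    unique_ids = set()
    for ids in frames:
        unique_ids.update(ids)
    raw = len(unique_ids)
    corrected = raw / (1.0 + overestimation)  # undo the relative bias
    return raw, corrected

raw, corrected = estimate_count([[1, 2], [2, 3], [3, 4, 5]])
print(raw)  # 5 unique individuals tracked across the three frames
```

Whether a constant multiplicative correction is appropriate depends on how stable the bias is across sites, which is exactly what the follow-up study would need to establish.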

Application of Deep Learning Method for Real-Time Traffic Analysis using UAV (UAV를 활용한 실시간 교통량 분석을 위한 딥러닝 기법의 적용)

  • Park, Honglyun;Byun, Sunghoon;Lee, Hansung
    • Journal of the Korean Society of Surveying, Geodesy, Photogrammetry and Cartography / v.38 no.4 / pp.353-361 / 2020
  • Due to rapid urbanization, various traffic problems such as congestion during commuting hours and recurring traffic jams are occurring. To solve these problems, traffic volume must be estimated and analyzed quickly and accurately. ITS (Intelligent Transportation System) performs optimal traffic management by utilizing the latest ICT (Information and Communications Technology), and research has been conducted to analyze traffic volume quickly and accurately through various techniques. In this study, we propose a deep learning-based vehicle detection method using UAV (Unmanned Aerial Vehicle) video for highly accurate real-time traffic analysis. A UAV was used to capture the orthogonal (nadir-view) videos needed for training and validation at intersections where various vehicles pass, and vehicles were labeled into three classes: sedan, truck, and bus. The experiment on the UAV dataset was carried out using YOLOv3 (You Only Look Once v3), a deep learning-based object detection technique, and achieved an overall object detection rate of 90.21%, a precision of 95.10%, and a recall of 85.79%.
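Turning per-class detections into traffic volume amounts to counting each vehicle once per class. A minimal sketch, assuming detections have already been associated with track IDs (the paper reports detection only; the ID association here is an added assumption for illustration):

```python
# Illustrative sketch: aggregate tracked YOLO detections into traffic counts
# per class (sedan / truck / bus), counting each vehicle ID once.

from collections import Counter

def traffic_volume(tracked_detections):
    """tracked_detections: list of (track_id, class_name) pairs over the video."""
    seen = {}
    for tid, cls in tracked_detections:
        seen.setdefault(tid, cls)  # first classification wins for each vehicle
    return Counter(seen.values())

counts = traffic_volume([(1, "sedan"), (1, "sedan"), (2, "bus"),
                         (3, "truck"), (2, "bus")])
print(counts["sedan"], counts["bus"], counts["truck"])  # each counted once
```

Keeping the first classification per ID is one simple policy; a majority vote over a vehicle's frames would be more robust against per-frame misclassification.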

Object Tracking Method using Deep Learning and Kalman Filter (딥 러닝 및 칼만 필터를 이용한 객체 추적 방법)

  • Kim, Gicheol;Son, Sohee;Kim, Minseop;Jeon, Jinwoo;Lee, Injae;Cha, Jihun;Choi, Haechul
    • Journal of Broadcast Engineering / v.24 no.3 / pp.495-505 / 2019
  • Typical deep learning algorithms include CNNs (Convolutional Neural Networks), mainly used for image recognition, and RNNs (Recurrent Neural Networks), mainly used for speech recognition and natural language processing. Among them, the CNN automatically learns the filters that generate its feature maps from data, and its excellent performance has made it the mainstream approach in image recognition. Building on CNNs, various algorithms such as R-CNN have appeared to improve object detection performance, and algorithms such as YOLO (You Only Look Once) and SSD (Single Shot Multi-box Detector) have been proposed more recently. However, because these deep learning-based detection algorithms operate on still images independently, stable object tracking and detection in video requires a separate tracking capability. This paper therefore proposes a method that combines a Kalman filter with a deep learning-based detection network to improve object tracking and detection performance in video. The detection network is YOLO v2, which is capable of real-time processing; the proposed method yields a 7.7% IoU improvement over the plain YOLO v2 network and a processing speed of 20 fps on FHD images.
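The role of the Kalman filter here is to smooth and predict box coordinates between (possibly noisy or missing) detections. A minimal scalar sketch for one box coordinate under a random-walk motion model; the paper's actual state model and tuning are not specified here, so the noise parameters are assumptions.

```python
# Minimal 1-D Kalman filter (random-walk model) for one bounding-box
# coordinate; a sketch of the smoothing idea, not the paper's implementation.

class Kalman1D:
    def __init__(self, x0, p=1.0, q=0.01, r=0.25):
        self.x = x0      # state estimate (e.g., box center x)
        self.p = p       # estimate uncertainty
        self.q = q       # process noise (motion uncertainty per frame)
        self.r = r       # measurement noise (detector jitter)

    def predict(self):
        self.p += self.q          # uncertainty grows between frames
        return self.x

    def update(self, z):
        k = self.p / (self.p + self.r)   # Kalman gain in [0, 1]
        self.x += k * (z - self.x)       # pull estimate toward measurement
        self.p *= (1.0 - k)
        return self.x

kf = Kalman1D(0.0)
for _ in range(20):
    kf.predict()
    est = kf.update(10.0)       # repeated detections at x = 10
print(est)  # estimate converges toward the measured position
```

A full tracker would run one such filter per coordinate (or a joint constant-velocity filter over the state vector) and use the prediction step alone on frames where the detector misses the object.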

Fundamental Study on Algorithm Development for Prediction of Smoke Spread Distance Based on Deep Learning (딥러닝 기반의 연기 확산거리 예측을 위한 알고리즘 개발 기초연구)

  • Kim, Byeol;Hwang, Kwang-Il
    • Journal of the Korean Society of Marine Environment &amp; Safety / v.27 no.1 / pp.22-28 / 2021
  • This is a basic study on the development of deep learning-based algorithms to detect smoke before a smoke detector operates in the event of a ship fire, to analyze and utilize the detected data, and to support fire suppression and evacuation by predicting smoke spread before it reaches remote areas. The proposed algorithms were reviewed according to the following procedure. In the first step, smoke images obtained through fire simulation were applied to YOLO (You Only Look Once), a deep learning-based object detection algorithm. The mean average precision (mAP) of the trained YOLO model was measured at 98.71%, and smoke was detected at a processing speed of 9 frames per second (FPS). In the second step, the smoke spread was estimated from the bounding-box coordinates extracted from the YOLO output, and this smoke geometry was applied to a time-series prediction algorithm, long short-term memory (LSTM). Smoke-spread data obtained from the bounding-box coordinates between the estimated fire occurrence and 30 s were entered into the LSTM model to predict the smoke spread from 31 s to 90 s in the smoke video of a fast fire obtained from fire simulation. The root mean square error between the estimated smoke spread and its predicted value was 2.74.
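The data preparation behind this kind of time-series prediction can be sketched generically: slice the observed spread distances into fixed-length input windows with targets, and score predictions with RMSE. The LSTM itself is omitted; the window length of 30 mirrors the 30 s observation period, but everything else is an illustrative assumption.

```python
# Sketch of sliding-window preparation for a time-series model, plus the RMSE
# used to score predictions. Not the paper's code; the LSTM is not included.

def make_windows(series, n_in=30, n_out=1):
    """Slice a 1-D series into (input window, target) training pairs."""
    pairs = []
    for i in range(len(series) - n_in - n_out + 1):
        pairs.append((series[i:i + n_in], series[i + n_in:i + n_in + n_out]))
    return pairs

def rmse(pred, true):
    return (sum((p - t) ** 2 for p, t in zip(pred, true)) / len(true)) ** 0.5

series = [0.5 * t for t in range(90)]   # toy linearly spreading smoke front
pairs = make_windows(series)
print(len(pairs))  # 60 training pairs from 90 samples
```

Each `pairs` element would become one LSTM training sample: 30 consecutive spread values in, the next value out.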

Detection of Urban Trees Using YOLOv5 from Aerial Images (항공영상으로부터 YOLOv5를 이용한 도심수목 탐지)

  • Park, Che-Won;Jung, Hyung-Sup
    • Korean Journal of Remote Sensing / v.38 no.6_2 / pp.1633-1641 / 2022
  • Urban population concentration and indiscriminate development are causing various environmental problems such as air pollution and the heat island phenomenon, and are aggravating the damage caused by natural disasters. Urban trees have been proposed as a solution to these urban problems and indeed play an important role, for example by improving the environment. Accordingly, quantitative measurement and analysis of individual urban trees are required to understand the effect of trees on the urban environment. However, the complexity and diversity of urban trees lower the accuracy of individual tree detection. We therefore conducted a study to effectively detect trees in Dongjak-gu using high-resolution aerial images, which enable effective detection of tree objects, and You Only Look Once version 5 (YOLOv5), which has shown excellent performance in object detection. Labeling guidelines for building a tree AI training dataset were drawn up, and box annotation of Dongjak-gu trees was performed accordingly. We tested YOLOv5 models of various scales on the constructed dataset and adopted the optimal model for more efficient urban tree detection, achieving a mean Average Precision (mAP) of 0.663.
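The mAP figure rests on Intersection-over-Union matching between predicted and annotated boxes. A generic sketch of that overlap criterion (standard formula, not tied to this paper's evaluation code):

```python
# Intersection-over-Union between two axis-aligned boxes (x1, y1, x2, y2),
# the overlap criterion underlying mAP evaluation.

def iou(a, b):
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter) if inter else 0.0

print(iou((0, 0, 2, 2), (1, 1, 3, 3)))  # 1 / 7
```

A prediction typically counts as a true positive when its IoU with an annotated tree crown exceeds a threshold such as 0.5.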

A Study on the Compensation Methods of Object Recognition Errors for Using Intelligent Recognition Model in Sports Games (스포츠 경기에서 지능인식모델을 이용하기 위한 대상체 인식오류 보상방법에 관한 연구)

  • Han, Junsu;Kim, Jongwon
    • Journal of the Korea Academia-Industrial cooperation Society / v.22 no.5 / pp.537-542 / 2021
  • This paper studies a method of collecting semantic data by improving the recognition of fast-moving objects with the YOLO (You Only Look Once) deep learning model in image-based object recognition applications. Two kinds of recognition errors were identified: objects left unrecognized because of the mismatch between the camera frame rate and the speed of the moving object, and misrecognition caused by similar objects adjacent to the target. To minimize these errors, the proposed data collection method compensates for unrecognized and misrecognized objects by applying vision processing techniques to images acquired from a sport (tennis matches) that represents a realistic environment, and the effectiveness of secondary data collection was improved through research on the collection method and processing structure. By applying the proposed data collection method, ordinary people can collect and manage data from simple smartphone camera footage to improve their health and athletic performance in the sports and health industries.
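One straightforward way to compensate for frames where a fast-moving object goes unrecognized is to interpolate its position between the nearest detected frames. This is a hedged sketch of that idea; the paper's exact compensation method may differ, and the names are illustrative.

```python
# Hedged sketch: fill frames where detection failed (None) by linear
# interpolation between the nearest detected (x, y) positions.

def fill_gaps(track):
    """track: list of (x, y) or None per frame; returns a gap-filled copy."""
    out = list(track)
    known = [i for i, p in enumerate(out) if p is not None]
    for a, b in zip(known, known[1:]):
        for i in range(a + 1, b):
            t = (i - a) / (b - a)
            out[i] = (out[a][0] + t * (out[b][0] - out[a][0]),
                      out[a][1] + t * (out[b][1] - out[a][1]))
    return out

print(fill_gaps([(0.0, 0.0), None, None, None, (4.0, 8.0)]))
```

Linear interpolation assumes roughly constant velocity over the gap, which is plausible for a ball between two nearby frames but degrades over long gaps.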

A Technique for Interpreting and Adjusting Depth Information of each Plane by Applying an Object Detection Algorithm to Multi-plane Light-field Image Converted from Hologram Image (Light-field 이미지로 변환된 다중 평면 홀로그램 영상에 대해 객체 검출 알고리즘을 적용한 평면별 객체의 깊이 정보 해석 및 조절 기법)

  • Young-Gyu Bae;Dong-Ha Shin;Seung-Yeol Lee
    • Journal of Broadcast Engineering / v.28 no.1 / pp.31-41 / 2023
  • Directly converting the focal depth and image size of a computer-generated hologram (CGH), which is obtained by calculating the interference pattern of light from a 3D image, is known to be quite difficult because of the low similarity between the CGH and the original image. This paper proposes a method for separately converting each focal depth of a given CGH composed of multi-depth images. First, the proposed technique converts the 3D image reproduced from the CGH into a Light-Field (LF) image composed of a set of 2D images observed from various angles, and the position of the object in each observed view is determined using the object detection algorithm YOLOv5 (You Only Look Once version 5). Then, by adjusting the positions of the objects, the depth-transformed LF image and CGH are generated. Numerical simulations and experimental results show that the proposed technique can change the focal length within a range of about 3 cm without significant loss of image quality when applied to an image with an original depth of 10 cm, using a spatial light modulator with a pixel size of 3.6 ㎛ and a resolution of 3840 × 2160.
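The geometric intuition for "adjusting object positions per view to change depth" is ordinary parallax: an object's apparent depth is encoded in how far it shifts between views observed at different angles. The toy model below (shift by tan(view angle) times the depth change) is an assumption for illustration only, not the paper's formulation.

```python
import math

# Toy parallax model: shifting an object by tan(view_angle) * delta_z in each
# LF view corresponds to moving its reconstructed depth by delta_z.
# This simplified geometry is an illustrative assumption.

def shift_views(x_positions, view_angles_deg, delta_z):
    """Return per-view x positions after a depth change of delta_z."""
    return [x + math.tan(math.radians(a)) * delta_z
            for x, a in zip(x_positions, view_angles_deg)]

shifted = shift_views([10.0, 10.0, 10.0], [-5.0, 0.0, 5.0], 20.0)
print(shifted[1])  # the central (0 degree) view is unchanged
```

Views to either side of center shift in opposite directions by equal amounts, which is exactly the disparity pattern a reconstruction would read as a depth change.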

Assessment of the Object Detection Ability of Interproximal Caries on Primary Teeth in Periapical Radiographs Using Deep Learning Algorithms (유치의 치근단 방사선 사진에서 딥 러닝 알고리즘을 이용한 모델의 인접면 우식증 객체 탐지 능력의 평가)

  • Hongju Jeon;Seonmi Kim;Namki Choi
    • Journal of the Korean Academy of Pediatric Dentistry / v.50 no.3 / pp.263-276 / 2023
  • The purpose of this study was to evaluate the performance of a You Only Look Once (YOLO) model for object detection of proximal caries in periapical radiographs of children. A total of 2016 periapical radiographs of primary dentition were selected from the M6 database as learning material, of which 1143 were labeled as proximal caries by an experienced dentist using an annotation tool. After converting the annotations into a training dataset, YOLO was trained on the dataset using a single convolutional neural network (CNN) model. Accuracy, recall, specificity, precision, negative predictive value (NPV), F1-score, the precision-recall curve, and average precision (AP, the area under the precision-recall curve) were calculated to evaluate the object detection model's performance on the 187-image test dataset. The results showed that the CNN-based object detection model performed well in detecting proximal caries, with a diagnostic accuracy of 0.95, a recall of 0.94, a specificity of 0.97, a precision of 0.82, an NPV of 0.96, and an F1-score of 0.81. The AP was 0.83. This model could be a valuable tool for dentists in detecting carious lesions in periapical radiographs.
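All of the reported diagnostic metrics derive from a single confusion matrix. A generic sketch of those standard definitions (the counts below are made-up examples, not this study's data):

```python
# Standard diagnostic metrics from confusion-matrix counts:
# tp/fp/fn/tn = true/false positives, false/true negatives.

def metrics(tp, fp, fn, tn):
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)                 # sensitivity
    specificity = tn / (tn + fp)
    npv = tn / (tn + fn)                    # negative predictive value
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    f1 = 2 * precision * recall / (precision + recall)
    return {"accuracy": accuracy, "recall": recall,
            "specificity": specificity, "precision": precision,
            "npv": npv, "f1": f1}

m = metrics(tp=90, fp=10, fn=10, tn=90)   # illustrative counts only
print(m["precision"], m["recall"])
```

Note that F1 is the harmonic mean of precision and recall, so it always lies between the two.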