• Title/Summary/Keyword: faster r-cnn


Deep Window Detection in Street Scenes

  • Ma, Wenguang;Ma, Wei
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.14 no.2
    • /
    • pp.855-870
    • /
    • 2020
  • Windows are key components of building facades. Detecting windows, which is crucial to 3D semantic reconstruction and scene parsing, is a challenging task in computer vision. Early methods tried to solve window detection using hand-crafted features and traditional classifiers. However, these methods are unable to handle the diversity of window instances in real scenes and suffer from heavy computational costs. Recently, convolutional neural network-based object detection algorithms have attracted much attention due to their good performance. Unfortunately, directly training them for the challenging window detection task does not achieve satisfactory results. In this paper, we propose an approach for window detection. It involves an improved Faster R-CNN architecture featuring a window region proposal network, RoI feature fusion, and a context enhancement module. In addition, a post-optimization process, based on the regular distribution of windows, is designed to refine the detection results obtained by the improved deep architecture. Furthermore, we present a newly collected dataset, the largest to date for window detection in real street scenes. Experimental results on both existing datasets and the new dataset show that the proposed method has outstanding performance.
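
The post-optimization step exploits the fact that windows on a facade are laid out in regular rows. A minimal sketch of one way such a refinement could work (illustrative only, not the authors' implementation): group detected boxes into rows by vertical centre, then snap each row's members to a shared top and bottom.

```python
def snap_rows(boxes, tol=10):
    """Group boxes (x1, y1, x2, y2) into rows by vertical centre, then
    snap every box in a row to the row's mean top and bottom edge."""
    rows = []  # each row is a list of box indices
    for i, (_, y1, _, y2) in enumerate(boxes):
        cy = (y1 + y2) / 2
        for row in rows:
            ry = sum((boxes[j][1] + boxes[j][3]) / 2 for j in row) / len(row)
            if abs(cy - ry) <= tol:  # close enough to an existing row
                row.append(i)
                break
        else:
            rows.append([i])  # start a new row
    refined = list(boxes)
    for row in rows:
        top = sum(boxes[j][1] for j in row) / len(row)
        bot = sum(boxes[j][3] for j in row) / len(row)
        for j in row:
            x1, _, x2, _ = boxes[j]
            refined[j] = (x1, top, x2, bot)
    return refined
```

The tolerance `tol` and the row-mean snapping rule are assumptions made for the sketch; the paper's actual refinement may use a different regularity model.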

Application of Deep Learning Algorithm for Detecting Construction Workers Wearing Safety Helmet Using Computer Vision (건설현장 근로자의 안전모 착용 여부 검출을 위한 컴퓨터 비전 기반 딥러닝 알고리즘의 적용)

  • Kim, Myung Ho;Shin, Sung Woo;Suh, Yong Yoon
    • Journal of the Korean Society of Safety
    • /
    • v.34 no.6
    • /
    • pp.29-37
    • /
    • 2019
  • Since construction sites are exposed to outdoor environments, working conditions are significantly dangerous. Thus, wearing personal protective equipment such as a safety helmet is very important for worker safety. However, construction workers often take off their helmets because they are inconvenient and uncomfortable, and a small mistake may then lead to a serious accident. Therefore, checking whether safety helmets are worn is an important task for safety managers in the field. However, due to limited time and manpower, this check cannot be carried out for every individual worker spread over a large construction site. If an automatic checking system were available, field safety management could be performed more effectively and efficiently. In this study, the applicability of deep learning-based computer vision technology is investigated for automatically checking the wearing of safety helmets on construction sites. The Faster R-CNN deep learning algorithm for object detection and classification is employed to develop the automatic checking model. Digital camera images captured at a real construction site are used to validate the proposed model. Based on the results, it is concluded that the proposed model may be used effectively for automatic checking of safety helmet wearing on construction sites.
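
The core of such an automatic check is deciding, per worker, whether a detected helmet overlaps a detected head. A hedged sketch of that matching logic (box format and overlap threshold are illustrative assumptions, not taken from the paper):

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union else 0.0

def unprotected(heads, helmets, thresh=0.3):
    """Indices of head boxes with no helmet box overlapping above thresh."""
    return [i for i, h in enumerate(heads)
            if all(iou(h, m) < thresh for m in helmets)]
```

A safety manager's dashboard would then only need to review the frames in which `unprotected` returns a non-empty list.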

A Study on the Motion Object Detection Method for Autonomous Driving (자율주행을 위한 동적 객체 인식 방법에 관한 연구)

  • Park, Seung-Jun;Park, Sang-Bae;Kim, Jung-Ha
    • Journal of the Korean Society of Industry Convergence
    • /
    • v.24 no.5
    • /
    • pp.547-553
    • /
    • 2021
  • Dynamic object recognition is an important task for autonomous vehicles. Since dynamic objects pose a higher collision risk than static objects, the ego vehicle's trajectory should be planned to match the future state of moving elements in the scene. Temporal information such as optical flow can be used to recognize movement. Existing optical flow calculations are based only on camera sensors and are prone to errors in low-light conditions. To improve recognition performance in low-light environments, we therefore applied a normalization filter and a gamma-value correction function to the input images. Applying this low-light enhancement algorithm yields more accurate detection of an object's bounding box for the vehicle. The experiments confirmed that image preprocessing and deep learning using YOLO are important for object recognition.
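
Gamma correction with gamma < 1 lifts dark pixel values, which is the idea behind the low-light enhancement described above. A minimal grayscale sketch using a lookup table (pure Python, illustrative only; the paper's exact filter and gamma value are not specified here):

```python
def gamma_correct(pixels, gamma=0.5):
    """Brighten a low-light grayscale image (2D list, values 0-255)
    with a power-law curve; gamma < 1 lifts dark regions."""
    lut = [round(255 * (v / 255) ** gamma) for v in range(256)]  # lookup table
    return [[lut[v] for v in row] for row in pixels]
```

Because the curve is precomputed into a 256-entry lookup table, the per-pixel cost is a single index, which matters when preprocessing every video frame before detection.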

Vehicle License Plate Text Recognition Algorithm Using Object Detection and Handwritten Hangul Recognition Algorithm (객체 검출과 한글 손글씨 인식 알고리즘을 이용한 차량 번호판 문자 추출 알고리즘)

  • Na, Min Won;Choi, Ha Na;Park, Yun Young
    • Journal of Information Technology Services
    • /
    • v.20 no.6
    • /
    • pp.97-105
    • /
    • 2021
  • Recently, with the development of IT technology, unmanned systems are being introduced in many industrial fields, and one of the most important factors for introducing unmanned systems in the automotive field is vehicle license plate recognition (VLPR). Existing VLPR algorithms use image processing tailored to a specific type of license plate to segment the individual character areas within the plate and recognize each character. However, as the number of Korean vehicle license plates increases and the law is amended, old-style plates, new plates, and different plate types for each vehicle category are all in use. The VLPR system therefore has to be updated every time, which incurs costs. In this paper, we use an object detection algorithm to detect characters regardless of the vehicle license plate format, and apply a handwritten Hangul recognition (HHR) algorithm to enhance the recognition accuracy of a single Hangul character, called a Hangul unit. Since a Hangul unit is recognized by combining an initial consonant, a medial vowel, and a final consonant, it is possible to handle Hangul units beyond the 40 used on Korean vehicle license plates.
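
The Hangul-unit structure the authors rely on is encoded directly in Unicode: every precomposed syllable is an arithmetic combination of one of 19 initial consonants, 21 medial vowels, and 28 finals (including "none"). A small decomposition sketch using the standard Unicode arithmetic (not the paper's code):

```python
CHO = "ㄱㄲㄴㄷㄸㄹㅁㅂㅃㅅㅆㅇㅈㅉㅊㅋㅌㅍㅎ"              # 19 initial consonants
JUNG = "ㅏㅐㅑㅒㅓㅔㅕㅖㅗㅘㅙㅚㅛㅜㅝㅞㅟㅠㅡㅢㅣ"        # 21 medial vowels
JONG = [""] + list("ㄱㄲㄳㄴㄵㄶㄷㄹㄺㄻㄼㄽㄾㄿㅀㅁㅂㅄㅅㅆㅇㅈㅊㅋㅌㅍㅎ")  # 27 finals + none

def decompose(syllable):
    """Split a precomposed Hangul syllable into (initial, medial, final).
    Syllables occupy U+AC00..U+D7A3, indexed as (cho*21 + jung)*28 + jong."""
    idx = ord(syllable) - 0xAC00
    assert 0 <= idx < 11172, "not a precomposed Hangul syllable"
    return CHO[idx // 588], JUNG[(idx % 588) // 28], JONG[idx % 28]
```

This is why recognizing jamo components generalizes: a recognizer trained on components can reassemble any of the 11,172 syllables, not only the 40 units printed on plates.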

Intelligent Face Mosaicing Method in Video for Personal Information Protection (개인정보 보호를 위한 비디오에서의 지능형 얼굴 모자이킹 방법)

  • Lim, Hyuk;Choi, Minseok;Choi, Seungbi;Choi, Haechul
    • Proceedings of the Korean Society of Broadcast Engineers Conference
    • /
    • 2020.11a
    • /
    • pp.338-339
    • /
    • 2020
  • With the spread of personal broadcasting, the faces of ordinary people are frequently exposed in videos distributed over the Internet or on television, and broadcasting a face without consent can cause social problems such as infringement of personal portrait rights. To address this problem, this paper proposes a method that detects the faces of ordinary people in video and masks them. The proposed method first uses deep learning-based Faster R-CNN to train on a large number of face images, covering both specific persons who should not be mosaicked and non-specific persons who should be. Using the trained network, human faces are detected in the input video and the specific persons are identified among the detection results. Finally, mosaicking is applied to all detected faces except the specific persons, so that the faces of non-specific persons are intelligently hidden in the video. Experimental results show 99% accuracy for face detection covering both specific and non-specific persons, and 86% accuracy for correctly identifying the specific persons among the detected faces. The proposed method is expected to be useful for personal information protection in Internet video services and broadcasting.

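
Mosaicking itself is typically block pixelation: every tile of the detected face region is replaced by its average value, destroying identity while keeping the frame watchable. A minimal grayscale sketch of that step (illustrative, not the paper's implementation):

```python
def mosaic(img, x1, y1, x2, y2, block=8):
    """Pixelate the rectangle (x1, y1)-(x2, y2) of a grayscale image
    (2D list, values 0-255) by replacing each tile with its mean."""
    out = [row[:] for row in img]  # copy so the input is untouched
    for by in range(y1, y2, block):
        for bx in range(x1, x2, block):
            ys = range(by, min(by + block, y2))
            xs = range(bx, min(bx + block, x2))
            mean = sum(img[y][x] for y in ys for x in xs) // (len(ys) * len(xs))
            for y in ys:
                for x in xs:
                    out[y][x] = mean
    return out
```

In the pipeline described above, this function would be called once per detected non-specific face box, per frame.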

Analysis of the Effect of Deep-learning Super-resolution for Fragments Detection Performance Enhancement (파편 탐지 성능 향상을 위한 딥러닝 초해상도화 효과 분석)

  • Yuseok Lee
    • Journal of the Korea Institute of Military Science and Technology
    • /
    • v.26 no.3
    • /
    • pp.234-245
    • /
    • 2023
  • The Arena Fragmentation Test (AFT) is designed to analyze warhead performance by measuring fragmentation data. To evaluate the results of the AFT, a set of AFT images is captured by high-speed cameras. To detect objects in the AFT image set, a ResNet-50-based Faster R-CNN is used as the detection model. However, because of the low resolution of the AFT image set, the detection model has shown low performance. To enhance its performance, super-resolution (SR) methods are used to increase the resolution of the AFT image set. To this end, the bicubic method and three SR models, ZSSR, EDSR, and SwinIR, are used. The use of SR images results in an increase in the performance of the detection model. While the increase in the number of pixels representing a fragment flame in the AFT images improves the recall of the detection model, the number of pixels representing noise also increases, leading to a slight decrease in precision. Consequently, the F1 score increases by up to 9%, demonstrating the effectiveness of SR in enhancing the performance of the detection model.
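
The trade-off reported above is mediated by the F1 score, the harmonic mean of precision and recall. A tiny illustration with hypothetical numbers (not the paper's measurements) shows how a recall gain can outweigh a small precision loss:

```python
def f1(precision, recall):
    """F1 score: harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# Hypothetical values only: after SR, recall rises a lot while
# precision drops slightly, and F1 still improves overall.
before = f1(0.90, 0.60)
after = f1(0.88, 0.75)
```

Because the harmonic mean is dominated by the smaller of the two inputs, lifting a weak recall helps F1 more than a mild precision drop hurts it.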

ANALYSIS OF THE FLOOR PLAN DATASET WITH YOLO V5

  • MYUNGHYUN JUNG;MINJUNG GIM;SEUNGHWAN YANG
    • Journal of the Korean Society for Industrial and Applied Mathematics
    • /
    • v.27 no.4
    • /
    • pp.311-323
    • /
    • 2023
  • This paper introduces the industrial problem, the solution, and the results of research conducted with Define Inc. The client company wanted to improve the performance of an object detection model on a floor plan dataset. To solve the problem, we analyzed the operating principles, advantages, and disadvantages of the existing object detection model, identified the characteristics of the floor plan dataset, and proposed the use of YOLO v5 as an appropriate object detection model for training on the dataset. We compared the performance of the existing model and the proposed model using mAP@60 and verified the object detection results with real test data, finding that mAP@60 increased by 0.08 with a 25% shorter inference time. We also found that the training time of the proposed YOLO v5 was 71% shorter than that of the existing model because of its simpler structure. In this paper, we have shown that an object detection model for the floor plan dataset can achieve better performance while reducing training time. We expect this to be useful for solving other industrial problems related to object detection, and we believe the result can be extended to object recognition in 3D floor plan datasets.
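
mAP@60 counts a prediction as correct only when it overlaps an unmatched ground-truth box with IoU of at least 0.6. A sketch of that matching criterion (greedy matching with predictions assumed pre-sorted by confidence; illustrative only, not the evaluation code used in the paper):

```python
def iou(a, b):
    """Intersection-over-union of boxes (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union else 0.0

def match_at(preds, gts, thresh=0.6):
    """Greedy TP/FP count at one IoU threshold (the rule behind mAP@60)."""
    used, tp = set(), 0
    for p in preds:                      # preds sorted by confidence, descending
        best, best_iou = None, thresh
        for gi, g in enumerate(gts):
            if gi not in used and iou(p, g) >= best_iou:
                best, best_iou = gi, iou(p, g)
        if best is not None:             # matched an unused ground-truth box
            used.add(best)
            tp += 1
    return tp, len(preds) - tp           # (true positives, false positives)
```

Averaging precision over the confidence-ranked precision/recall curve built from these TP/FP labels, and then over classes, yields the reported mAP@60.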

False Positive Fire Detection Improvement Research using the Frame Similarity Principle based on Deep Learning (딥러닝 기반의 프레임 유사성을 이용한 화재 오탐 검출 개선 연구)

  • Lee, Yeung-Hak;Shim, Jae-Chang
    • Journal of IKEEE
    • /
    • v.23 no.1
    • /
    • pp.242-248
    • /
    • 2019
  • Fire flame and smoke detection is a challenging task in computer vision due to the variety of shapes, rapid spread, and colors of fire. The performance of a typical sensor-based fire detection system is largely limited by environmental factors (indoor settings and fire locations). To solve this problem, a deep learning method is applied. However, because such a method extracts object features in several ways, a similar shape appearing in a frame can be detected as a false positive. This study proposes a new algorithm that uses frame similarity before applying deep learning in order to reduce the false detection rate. Experimental results show that fire detection performance is maintained while false positives are reduced by the proposed method, confirming that it performs excellently at suppressing false detections.
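
One simple way to measure frame similarity is histogram intersection between consecutive frames: a static scene scores near 1.0, so fire-like but unchanging regions can be filtered out before the deep network runs. This is a hedged sketch of the idea, not the authors' algorithm:

```python
def frame_similarity(a, b, bins=16):
    """Histogram-intersection similarity between two grayscale frames
    (2D lists, values 0-255); 1.0 means identical intensity histograms."""
    def hist(img):
        h = [0] * bins
        for row in img:
            for v in row:
                h[v * bins // 256] += 1
        return h
    ha, hb = hist(a), hist(b)
    return sum(min(x, y) for x, y in zip(ha, hb)) / sum(ha)

def is_static(prev, cur, thresh=0.95):
    """Flag frame pairs that barely change; real flames flicker, so a
    near-identical pair suggests a static (likely false) fire candidate."""
    return frame_similarity(prev, cur) >= thresh
```

The bin count and the 0.95 threshold are assumptions for illustration; in practice they would be tuned against recorded fire and non-fire footage.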

Two person Interaction Recognition Based on Effective Hybrid Learning

  • Ahmed, Minhaz Uddin;Kim, Yeong Hyeon;Kim, Jin Woo;Bashar, Md Rezaul;Rhee, Phill Kyu
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.13 no.2
    • /
    • pp.751-770
    • /
    • 2019
  • Action recognition is an essential task in computer vision due to its variety of prospective applications, such as security surveillance, machine learning, and human-computer interaction. The availability of more video data than ever before and the lofty performance of deep convolutional neural networks also make it essential for action recognition in video. Unfortunately, limited hand-crafted video features and the scarcity of benchmark datasets make it challenging to address the multi-person action recognition task in video data. In this work, we propose a deep convolutional neural network-based Effective Hybrid Learning (EHL) framework for two-person interaction classification in video data. Our approach exploits a pre-trained network model (VGG16 from the University of Oxford Visual Geometry Group) and extends Faster R-CNN, a state-of-the-art region-based convolutional neural network detector. We combine a semi-supervised learning method with an active learning method to improve overall performance. Numerous types of two-person interactions exist in the real world, which makes this a challenging task. In our experiments, we consider a limited number of actions, such as hugging, fighting, linking arms, talking, and kidnapping, in two environments, simple and complex. We show that our trained model with an active semi-supervised learning architecture gradually improves performance. In the simple environment, using the Intelligent Technology Laboratory (ITLab) dataset from Inha University, accuracy increased to 95.6%, and in the complex environment it reached 81%. Compared to supervised learning methods, our method reduces data-labeling time on the ITLab dataset. We also conduct extensive experiments on human action recognition benchmarks such as the UT-Interaction and HMDB51 datasets and obtain better performance than state-of-the-art approaches.
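
A common active learning rule consistent with the approach described above is uncertainty sampling: send the samples the current model is least confident about to a human annotator, and add the rest to the training set with their predicted labels. A minimal sketch (illustrative only; the EHL framework's actual selection rule is not specified here):

```python
def least_confident(probs, k=2):
    """Return indices of the k samples whose top class probability is
    lowest -- a basic uncertainty-sampling rule for active learning.
    `probs` is a list of per-sample class probability distributions."""
    ranked = sorted(range(len(probs)), key=lambda i: max(probs[i]))
    return ranked[:k]
```

Labeling only the selected indices each round is what reduces annotation time relative to fully supervised training.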