• Title/Summary/Keyword: 3D Object Detection

Search Result 243, Processing Time 0.03 seconds

ANALYSIS OF THE FLOOR PLAN DATASET WITH YOLO V5

  • MYUNGHYUN JUNG;MINJUNG GIM;SEUNGHWAN YANG
    • Journal of the Korean Society for Industrial and Applied Mathematics
    • /
    • v.27 no.4
    • /
    • pp.311-323
    • /
    • 2023
  • This paper introduces the industrial problem, the solution, and the results of the research conducted with Define Inc. The client company wanted to improve the performance of an object detection model on the floor plan dataset. To solve the problem, we analyzed the operational principles, advantages, and disadvantages of the existing object detection model, identified the characteristics of the floor plan dataset, and proposed to use of YOLO v5 as an appropriate object detection model for training the dataset. We compared the performance of the existing model and the proposed model using mAP@60, and verified the object detection results with real test data, and found that the performance increase of mAP@60 was 0.08 higher with a 25% shorter inference time. We also found that the training time of the proposed YOLO v5 was 71% shorter than the existing model because it has a simpler structure. In this paper, we have shown that the object detection model for the floor plan dataset can achieve better performance while reducing the training time. We expect that it will be useful for solving other industrial problems related to object detection in the future. We also believe that this result can be extended to study object recognition in 3D floor plan dataset.

Stereo Vision-Based 3D Pose Estimation of Product Labels for Bin Picking (빈피킹을 위한 스테레오 비전 기반의 제품 라벨의 3차원 자세 추정)

  • Udaya, Wijenayake;Choi, Sung-In;Park, Soon-Yong
    • Journal of Institute of Control, Robotics and Systems
    • /
    • v.22 no.1
    • /
    • pp.8-16
    • /
    • 2016
  • In the field of computer vision and robotics, bin picking is an important application area in which object pose estimation is necessary. Different approaches, such as 2D feature tracking and 3D surface reconstruction, have been introduced to estimate the object pose accurately. We propose a new approach where we can use both 2D image features and 3D surface information to identify the target object and estimate its pose accurately. First, we introduce a label detection technique using Maximally Stable Extremal Regions (MSERs) where the label detection results are used to identify the target objects separately. Then, the 2D image features on the detected label areas are utilized to generate 3D surface information. Finally, we calculate the 3D position and the orientation of the target objects using the information of the 3D surface.

EMOS: Enhanced moving object detection and classification via sensor fusion and noise filtering

  • Dongjin Lee;Seung-Jun Han;Kyoung-Wook Min;Jungdan Choi;Cheong Hee Park
    • ETRI Journal
    • /
    • v.45 no.5
    • /
    • pp.847-861
    • /
    • 2023
  • Dynamic object detection is essential for ensuring safe and reliable autonomous driving. Recently, light detection and ranging (LiDAR)-based object detection has been introduced and shown excellent performance on various benchmarks. Although LiDAR sensors have excellent accuracy in estimating distance, they lack texture or color information and have a lower resolution than conventional cameras. In addition, performance degradation occurs when a LiDAR-based object detection model is applied to different driving environments or when sensors from different LiDAR manufacturers are utilized owing to the domain gap phenomenon. To address these issues, a sensor-fusion-based object detection and classification method is proposed. The proposed method operates in real time, making it suitable for integration into autonomous vehicles. It performs well on our custom dataset and on publicly available datasets, demonstrating its effectiveness in real-world road environments. In addition, we will make available a novel three-dimensional moving object detection dataset called ETRI 3D MOD.

Semi-Supervised Domain Adaptation on LiDAR 3D Object Detection with Self-Training and Knowledge Distillation (자가학습과 지식증류 방법을 활용한 LiDAR 3차원 물체 탐지에서의 준지도 도메인 적응)

  • Jungwan Woo;Jaeyeul Kim;Sunghoon Im
    • The Journal of Korea Robotics Society
    • /
    • v.18 no.3
    • /
    • pp.346-351
    • /
    • 2023
  • With the release of numerous open driving datasets, the demand for domain adaptation in perception tasks has increased, particularly when transferring knowledge from rich datasets to novel domains. However, it is difficult to solve the change 1) in the sensor domain caused by heterogeneous LiDAR sensors and 2) in the environmental domain caused by different environmental factors. We overcome domain differences in the semi-supervised setting with 3-stage model parameter training. First, we pre-train the model with the source dataset with object scaling based on statistics of the object size. Then we fine-tine the partially frozen model weights with copy-and-paste augmentation. The 3D points in the box labels are copied from one scene and pasted to the other scenes. Finally, we use the knowledge distillation method to update the student network with a moving average from the teacher network along with a self-training method with pseudo labels. Test-Time Augmentation with varying z values is employed to predict the final results. Our method achieved 3rd place in ECCV 2022 workshop on the 3D Perception for Autonomous Driving challenge.

2D Human Pose Estimation based on Object Detection using RGB-D information

  • Park, Seohee;Ji, Myunggeun;Chun, Junchul
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.12 no.2
    • /
    • pp.800-816
    • /
    • 2018
  • In recent years, video surveillance research has been able to recognize various behaviors of pedestrians and analyze the overall situation of objects by combining image analysis technology and deep learning method. Human Activity Recognition (HAR), which is important issue in video surveillance research, is a field to detect abnormal behavior of pedestrians in CCTV environment. In order to recognize human behavior, it is necessary to detect the human in the image and to estimate the pose from the detected human. In this paper, we propose a novel approach for 2D Human Pose Estimation based on object detection using RGB-D information. By adding depth information to the RGB information that has some limitation in detecting object due to lack of topological information, we can improve the detecting accuracy. Subsequently, the rescaled region of the detected object is applied to ConVol.utional Pose Machines (CPM) which is a sequential prediction structure based on ConVol.utional Neural Network. We utilize CPM to generate belief maps to predict the positions of keypoint representing human body parts and to estimate human pose by detecting 14 key body points. From the experimental results, we can prove that the proposed method detects target objects robustly in occlusion. It is also possible to perform 2D human pose estimation by providing an accurately detected region as an input of the CPM. As for the future work, we will estimate the 3D human pose by mapping the 2D coordinate information on the body part onto the 3D space. Consequently, we can provide useful human behavior information in the research of HAR.

A Collision detection from division space for performance improvement of MMORPG game engine (MMORPG 게임엔진의 성능개선을 위한 분할공간에서의 충돌검출)

  • Lee, Sung-Ug
    • The KIPS Transactions:PartB
    • /
    • v.10B no.5
    • /
    • pp.567-574
    • /
    • 2003
  • Application field of third dimension graphic is becoming diversification by the fast development of hardware recently. Various theory of details technology necessary to design game such as 3D MMORPG (Massive Multi-play Online Role Flaying Game) that do with third dimension. Cyber city should be absorbed. It is the detection speed that this treatise is necessary in game engine design. 3D MMORPG game engine has much factor that influence to speed as well as rendering processing because it express huge third dimension city´s grate many building and individual fast effectively by real time. This treatise nay get concept about the collision in 3D MMORPG and detection speed elevation of game engine through improved detection method. Space division is need to process fast dynamically wide outside that is 3D MMORPG´s main detection target. 3D is constructed with tree construct individual that need collision using processing geometry dataset that is given through new graph. We may search individual that need in collision detection and improve the collision detection speed as using hierarchical bounding box that use it with detection volume. Octree that will use by division octree is used mainly to express rightly static object but this paper use limited OSP by limited space division structure to use this in dynamic environment. Limited OSP space use limited space with method that divide square to classify typically complicated 3D space´s object. Through this detection, this paper propose follow contents, first, this detection may judge collision detection at early time without doing all polygon´s collision examination. Second, this paper may improve detection efficiency of game engine through and then reduce detection time because detection time of bounding box´s collision detection.

Object Detection Using Combined Random Fern for RGB-D Image Format (RGB-D 영상 포맷을 위한 결합형 무작위 Fern을 이용한 객체 검출)

  • Lim, Seung-Ouk;Kim, Yu-Seon;Lee, Si-Woong
    • The Journal of the Korea Contents Association
    • /
    • v.16 no.9
    • /
    • pp.451-459
    • /
    • 2016
  • While an object detection algorithm plays a key role in many computer vision applications, it requires extensive computation to show robustness under varying lightning and geometrical distortions. Recently, some approaches formulate the problem in a classification framework and show improved performances in object recognition. Among them, random fern algorithm drew a lot of attention because of its simple structure and high recognition rates. However, it reveals performance degradation under the illumination changes and noise addition, since it computes patch features based only on pixel intensities. In this paper, we propose a new structure of combined random fern which incorporates depth information into the conventional random fern reflecting 3D structure of the patch. In addition, a new structure of object tracker which exploits the combined random fern is also introduced. Experiments show that the proposed method provides superior performance of object detection under illumination change and noisy condition compared to the conventional methods.

3D Coordinates Acquisition by using Multi-view X-ray Images (다시점 X선 영상을 이용한 3차원 좌표 획득)

  • Yi, Sooyeong;Rhi, Jaeyoung;Kim, Soonchul;Lee, Jeonggyu
    • Journal of Institute of Control, Robotics and Systems
    • /
    • v.19 no.10
    • /
    • pp.886-890
    • /
    • 2013
  • In this paper, a 3D coordinates acquisition method for a mechanical assembly is developed by using multiview X-ray images. The multi-view X-ray images of an object are obtained by a rotary table. From the rotation transformation, it is possible to obtain the 3D coordinates of corresponding edge points on multi-view X-ray images by triangulation. The edge detection algorithm in this paper is based on the attenuation characteristic of the X-ray. The 3D coordinates of the object points are represented on a graphic display, which is used for the inspection of a mechanical assembly.

A New Object Region Detection and Classification Method using Multiple Sensors on the Driving Environment (다중 센서를 사용한 주행 환경에서의 객체 검출 및 분류 방법)

  • Kim, Jung-Un;Kang, Hang-Bong
    • Journal of Korea Multimedia Society
    • /
    • v.20 no.8
    • /
    • pp.1271-1281
    • /
    • 2017
  • It is essential to collect and analyze target information around the vehicle for autonomous driving of the vehicle. Based on the analysis, environmental information such as location and direction should be analyzed in real time to control the vehicle. In particular, obstruction or cutting of objects in the image must be handled to provide accurate information about the vehicle environment and to facilitate safe operation. In this paper, we propose a method to simultaneously generate 2D and 3D bounding box proposals using LiDAR Edge generated by filtering LiDAR sensor information. We classify the classes of each proposal by connecting them with Region-based Fully-Covolutional Networks (R-FCN), which is an object classifier based on Deep Learning, which uses two-dimensional images as inputs. Each 3D box is rearranged by using the class label and the subcategory information of each class to finally complete the 3D bounding box corresponding to the object. Because 3D bounding boxes are created in 3D space, object information such as space coordinates and object size can be obtained at once, and 2D bounding boxes associated with 3D boxes do not have problems such as occlusion.

Visual Positioning System based on Voxel Labeling using Object Simultaneous Localization And Mapping

  • Jung, Tae-Won;Kim, In-Seon;Jung, Kye-Dong
    • International Journal of Advanced Culture Technology
    • /
    • v.9 no.4
    • /
    • pp.302-306
    • /
    • 2021
  • Indoor localization is one of the basic elements of Location-Based Service, such as indoor navigation, location-based precision marketing, spatial recognition of robotics, augmented reality, and mixed reality. We propose a Voxel Labeling-based visual positioning system using object simultaneous localization and mapping (SLAM). Our method is a method of determining a location through single image 3D cuboid object detection and object SLAM for indoor navigation, then mapping to create an indoor map, addressing it with voxels, and matching with a defined space. First, high-quality cuboids are created from sampling 2D bounding boxes and vanishing points for single image object detection. And after jointly optimizing the poses of cameras, objects, and points, it is a Visual Positioning System (VPS) through matching with the pose information of the object in the voxel database. Our method provided the spatial information needed to the user with improved location accuracy and direction estimation.