• Title/Summary/Keyword: Human Instance Segmentation

Search Result 11

Automatic Dataset Generation of Object Detection and Instance Segmentation using Mask R-CNN (Mask R-CNN을 이용한 물체인식 및 개체분할의 학습 데이터셋 자동 생성)

  • Jo, HyunJun;Kim, Dawit;Song, Jae-Bok
    • The Journal of Korea Robotics Society / v.14 no.1 / pp.31-39 / 2019
  • Robots commonly rely on artificial neural network (ANN)-based object detection and instance segmentation algorithms to recognize objects, but building datasets for these algorithms is costly because the data must be labeled manually. To lower the labeling cost, a new scheme is proposed that automatically generates training images and labels them for specific objects. The scheme uses an instance segmentation algorithm trained to produce masks of unknown objects, so that the masks can be obtained in a simple environment. These masks are used to extract RGB images of the objects, and the object classes are then labeled under human supervision. The object images are synthesized with various background images to create new images, and the synthesized images are labeled automatically using the masks and the previously entered object classes. Human intervention is further reduced by using a robot arm to collect the object images. Experiments show that instance segmentation trained with the proposed method performs on par with training on a real dataset, and that the time required to generate the dataset is significantly reduced.
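
The synthesis and automatic labeling steps described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `composite`, the nested-list image representation, the class name "cup", and the annotation layout are all assumptions made for illustration.

```python
def composite(background, obj, mask, top, left):
    """Paste `obj` onto a copy of `background` wherever `mask` is truthy.

    Images are row-major nested lists; `top`/`left` give the paste offset.
    Returns the new image, a full-size mask, and an
    (x_min, y_min, x_max, y_max) bounding box derived from that mask.
    """
    out = [row[:] for row in background]          # leave the input untouched
    full_mask = [[0] * len(background[0]) for _ in background]
    xs, ys = [], []
    for r, mask_row in enumerate(mask):
        for c, inside in enumerate(mask_row):
            y, x = top + r, left + c
            if inside and 0 <= y < len(out) and 0 <= x < len(out[0]):
                out[y][x] = obj[r][c]
                full_mask[y][x] = 1
                xs.append(x)
                ys.append(y)
    box = (min(xs), min(ys), max(xs), max(ys)) if xs else None
    return out, full_mask, box


background = [[0] * 5 for _ in range(4)]   # hypothetical 4x5 background
obj = [[7, 7], [7, 7]]                     # hypothetical 2x2 object crop
mask = [[1, 0], [1, 1]]                    # mask from the segmentation step
image, label_mask, box = composite(background, obj, mask, top=1, left=2)
# The class comes from the earlier human-supervised step; the rest is automatic.
annotation = {"class": "cup", "bbox": box, "mask": label_mask}
```

Because the mask travels with the object crop, the label for each synthesized image follows directly from the paste position, which is why no further manual labeling is needed at this stage.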

Context-Dependent Video Data Augmentation for Human Instance Segmentation (인물 개체 분할을 위한 맥락-의존적 비디오 데이터 보강)

  • HyunJin Chun;JongHun Lee;InCheol Kim
    • KIPS Transactions on Software and Data Engineering / v.12 no.5 / pp.217-228 / 2023
  • Video instance segmentation is a highly complex visual task because it requires not only object instance segmentation in each frame of a video but also accurate tracking of instances throughout the frame sequence. In particular, human instance segmentation in drama videos has the unique requirement of accurately tracking several main characters who interact across various places and times. It is also characterized by a class imbalance problem, because main characters appear far more frequently than supporting or auxiliary characters. In this paper, we introduce a new human instance dataset called MHIS, built from videos of the drama Misaeng, and propose a novel video data augmentation method, CDVA, to overcome the data imbalance between character classes. Unlike previous video data augmentation methods, CDVA generates more realistic augmented videos by choosing the optimal location within the background clip at which to insert a target human instance, taking into account the rich spatio-temporal context embedded in the videos. As a result, CDVA can improve the performance of a deep neural network model for video instance segmentation. Through both quantitative and qualitative experiments on the MHIS dataset, we demonstrate the usefulness and effectiveness of the proposed video data augmentation method.

A Basic Study on the Instance Segmentation with Surveillance Cameras at Construction Sites using Deep Learning based Computer Vision (건설 현장 CCTV 영상에서 딥러닝을 이용한 사물 인식 기초 연구)

  • Kang, Kyung-Su;Cho, Young-Woon;Ryu, Han-Guk
    • Proceedings of the Korean Institute of Building Construction Conference / 2020.11a / pp.55-56 / 2020
  • The construction industry has the highest rates of accident-related occupational fatalities and injuries of any industry. Accordingly, safety managers install surveillance cameras at construction sites and monitor them closely to prevent accidents in real time. However, because of the limits of human cognition, one person cannot monitor many video feeds simultaneously, and the people monitoring the cameras become highly fatigued. Thus, to help safety managers monitor work and reduce the occupational accident rate, a study was conducted on object recognition at construction sites using surveillance cameras. In this study, we applied instance segmentation to classify and locate objects and to extract their size and shape at construction sites. This research considers ways in which deep learning-based computer vision technology can be applied to safety management on a construction site.

A Suggestion for Worker Feature Extraction and Multiple-Object Tracking Method in Apartment Construction Sites (아파트 건설 현장 작업자 특징 추출 및 다중 객체 추적 방법 제안)

  • Kang, Kyung-Su;Cho, Young-Woon;Ryu, Han-Guk
    • Proceedings of the Korean Institute of Building Construction Conference / 2021.05a / pp.40-41 / 2021
  • The construction industry has the highest rate of occupational accidents and injuries among all industries. The Korean government has installed surveillance camera systems at construction sites to reduce occupational accident rates. Construction safety managers monitor potential hazards at the sites through these surveillance systems; however, human monitoring by eye has critical limitations. Therefore, this study proposes a deep learning-based safety monitoring system that obtains information on the recognition, location, and identification of workers and heavy equipment at construction sites by applying multiple-object tracking with instance segmentation. To evaluate the system's performance, we used the MS COCO and MOT Challenge metrics. The results indicate that the system is well suited to efficiently automating surveillance monitoring tasks at construction sites.

A Study on Automatic Classification of Characterized Ground Regions on Slopes by a Deep Learning based Image Segmentation (딥러닝 영상처리를 통한 비탈면의 지반 특성화 영역 자동 분류에 관한 연구)

  • Lee, Kyu Beom;Shin, Hyu-Soung;Kim, Seung Hyeon;Ha, Dae Mok;Choi, Isu
    • Tunnel and Underground Space / v.29 no.6 / pp.508-522 / 2019
  • Slope failure can cause not only property damage but also casualties, so slope stability analysis should be conducted to predict failure and plan reinforcement of the slope. This paper defines the ground regions in slope images that can be characterized in terms of slope failure, such as Rockmass jointset, Rockmass fault, Soil, Leakage water, and Crush zone. The results show that a deep learning instance segmentation network can recognize and automatically segment the precise shape of ground regions with different characteristics in an image. This indicates that such a network could support slope mapping work and automatically compute the ground characteristic information needed for decisions such as slope reinforcement.

Extraction of Workers and Heavy Equipment and Multi-Object Tracking using Surveillance System in Construction Sites (건설 현장 CCTV 영상을 이용한 작업자와 중장비 추출 및 다중 객체 추적)

  • Cho, Young-Woon;Kang, Kyung-Su;Son, Bo-Sik;Ryu, Han-Guk
    • Journal of the Korea Institute of Building Construction / v.21 no.5 / pp.397-408 / 2021
  • The construction industry has the highest rate of occupational accidents and injuries, and the most fatalities, among all industries. The Korean government has installed surveillance camera systems at construction sites to reduce occupational accident rates. Construction safety managers monitor potential hazards at the sites through these surveillance systems; however, human monitoring by eye has critical limitations. Watching a surveillance system for long periods causes severe physical fatigue, and a person cannot catch every accident in real time. Therefore, this study aims to build a deep learning-based safety monitoring system that obtains information on the recognition, location, and identification of workers and heavy equipment at construction sites by applying multiple object tracking with instance segmentation. To evaluate the system's performance, we used the Microsoft Common Objects in Context (MS COCO) and Multiple Object Tracking (MOT) Challenge metrics. The results indicate that the system is well suited to efficiently automating surveillance monitoring tasks at construction sites.
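
The abstract does not say which specific metrics were reported. As a hedged illustration only: COCO-style segmentation evaluation is built on mask intersection over union (IoU), and one commonly reported MOT Challenge metric is MOTA, defined as 1 - (FN + FP + IDSW) / GT. The function names and the sample numbers below are assumptions, not values from the paper.

```python
def mask_iou(mask_a, mask_b):
    """Intersection over union of two binary masks given as nested lists."""
    inter = union = 0
    for row_a, row_b in zip(mask_a, mask_b):
        for a, b in zip(row_a, row_b):
            inter += 1 if (a and b) else 0
            union += 1 if (a or b) else 0
    return inter / union if union else 0.0


def mota(false_negatives, false_positives, id_switches, num_gt):
    """MOTA = 1 - (FN + FP + IDSW) / GT, with counts summed over all frames."""
    if num_gt <= 0:
        raise ValueError("num_gt must be positive")
    return 1.0 - (false_negatives + false_positives + id_switches) / num_gt


# Hypothetical predicted and ground-truth masks for one worker instance.
predicted = [[1, 1, 0], [0, 1, 0]]
truth = [[1, 0, 0], [0, 1, 1]]
iou = mask_iou(predicted, truth)      # 2 shared pixels out of 4 covered
score = mota(false_negatives=10, false_positives=5, id_switches=5, num_gt=100)
```

IoU scores each detection against its ground-truth mask, while MOTA aggregates misses, false alarms, and identity switches across the whole sequence, so the two families of metrics capture segmentation quality and tracking consistency respectively.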

Human Instance Segmentation using Video Data Augmentation (비디오 데이터 보강을 이용한 인물 개체 분할)

  • Chun, Hyun-Jin;Kim, Incheol
    • Annual Conference of KIPS / 2022.11a / pp.532-534 / 2022
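  • In this paper, we introduce MHIS, a video human instance segmentation dataset built from videos of the drama Misaeng, and propose CDVA, a new video data augmentation method for effectively resolving the severe data imbalance between character classes. Unlike existing video data augmentation methods, CDVA fully considers the spatio-temporal context of a video when generating additional training video data for underrepresented character classes, and can thereby effectively improve the performance of neural network models for video instance segmentation. Through quantitative and qualitative experiments, we demonstrate the superiority of the proposed video data augmentation method.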

Human Assisted Fitting and Matching Primitive Objects to Sparse Point Clouds for Rapid Workspace Modeling in Construction Automation (-건설현장에서의 시공 자동화를 위한 Laser Sensor기반의 Workspace Modeling 방법에 관한 연구-)

  • KWON SOON-WOOK
    • Korean Journal of Construction Engineering and Management / v.5 no.5 s.21 / pp.151-162 / 2004
  • Current methods for construction site modeling employ large, expensive laser range scanners that produce dense range point clouds of a scene from different perspectives. Days of skilled interpretation and of automatic segmentation may be required to convert the clouds to a finished CAD model. The dynamic nature of the construction environment requires that a real-time local area modeling system be capable of handling a rapidly changing and uncertain work environment. However, in practice, large, simple, and reasonably accurate embodying volumes are adequate feedback to an operator who, for instance, is attempting to place materials in the midst of obstacles with an occluded view. For real-time obstacle avoidance and automated equipment control functions, such volumes also facilitate computational tractability. In this research, a human operator's ability to quickly evaluate and associate objects in a scene is exploited. The operator directs a laser range finder mounted on a pan and tilt unit to collect range points on objects throughout the workspace. These groups of points form sparse range point clouds. These sparse clouds are then used to create geometric primitives for visualization and modeling purposes. Experimental results indicate that these models can be created rapidly and with sufficient accuracy for automated obstacle avoidance and equipment control functions.
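
To make the idea of a simple enclosing volume built from a sparse point cloud concrete, here is a minimal sketch. The abstract does not specify which primitives or fitting procedure the paper uses, so the axis-aligned box, the `margin` parameter, and both function names are assumptions chosen for illustration.

```python
def bounding_box(points, margin=0.0):
    """Axis-aligned box (min_corner, max_corner) enclosing 3D points.

    `margin` pads the box on every side to allow for sparse sampling.
    """
    if not points:
        raise ValueError("need at least one point")
    mins = tuple(min(p[i] for p in points) - margin for i in range(3))
    maxs = tuple(max(p[i] for p in points) + margin for i in range(3))
    return mins, maxs


def boxes_overlap(box_a, box_b):
    """True if two axis-aligned boxes intersect or touch on every axis."""
    (a_min, a_max), (b_min, b_max) = box_a, box_b
    return all(a_min[i] <= b_max[i] and b_min[i] <= a_max[i] for i in range(3))


# Hypothetical range points the operator collected on two objects.
pipe_points = [(0.0, 0.0, 0.0), (1.0, 2.0, 0.5), (0.5, 1.0, 1.0)]
crate_points = [(2.0, 2.0, 2.0), (3.0, 3.0, 3.0)]
pipe = bounding_box(pipe_points)
crate = bounding_box(crate_points)
clear = not boxes_overlap(pipe, crate)
```

A box like this needs only a handful of range points and supports a constant-time overlap test, which matches the abstract's point that coarse volumes keep obstacle avoidance computationally tractable.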

3D Clothes Modeling of Virtual Human for Metaverse (메타버스를 위한 가상 휴먼의 3차원 의상 모델링)

  • Kim, Hyun Woo;Kim, Dong Eon;Kim, Yujin;Park, In Kyu
    • Journal of Broadcast Engineering / v.27 no.5 / pp.638-653 / 2022
  • In this paper, we propose a new method for creating a 3D virtual human that reflects the pattern of the clothes worn by a person in a high-resolution, full-body frontal image, together with the person's body shape data. To obtain the clothing pattern, we perform instance segmentation and clothes parsing using Cascade Mask R-CNN. We then use Pix2Pix to blur the boundaries and estimate the background color, and obtain the UV map of the 3D clothes mesh through UV-map-based warping. We also obtain body shape data using SMPL-X and deform the original clothes and body meshes accordingly. With the UV map of the clothes and the deformed clothes and body meshes, the user can finally view an animation of a 3D virtual human reflecting the user's appearance, rendered with a state-of-the-art game engine, i.e., Unreal Engine.

AI-Based Object Recognition Research for Augmented Reality Character Implementation (증강현실 캐릭터 구현을 위한 AI기반 객체인식 연구)

  • Seok-Hwan Lee;Jung-Keum Lee;Hyun Sim
    • The Journal of the Korea institute of electronic communication sciences / v.18 no.6 / pp.1321-1330 / 2023
  • This study addresses the problem of 3D pose estimation for multiple people from a single image, a problem that arises when developing characters for use in augmented reality. In the existing top-down method, all objects in the image are first detected, and then each is reconstructed independently. The problem is that the results may be inconsistent because of overlap or depth order mismatch between the reconstructed objects. The goal of this study is to solve these problems and develop a single network that provides consistent 3D reconstruction of all humans in a scene. An important design choice was to integrate a human body model based on the SMPL parametric system into a top-down framework. This made it possible to introduce two losses: a collision loss based on a distance field and a loss that accounts for depth order. The first loss prevents overlap between reconstructed people, and the second adjusts the depth ordering of people so that the rendered occlusions are consistent with the annotated instance segmentation. This method allows depth information to be provided to the network without explicit 3D annotation of the image. Experimental results show that this study's methodology performs better than existing methods on standard 3D pose benchmarks, and that the proposed losses enable more consistent reconstruction from natural images.
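
The depth-ordering idea in this abstract can be sketched as follows. The abstract does not give the loss formula, so the softplus penalty, the flat per-pixel lists, and the name `depth_order_loss` are assumptions for illustration rather than the paper's definition.

```python
import math


def depth_order_loss(annotated_ids, rendered_ids, depth_maps):
    """Sum a softplus penalty over pixels whose front-most person is wrong.

    annotated_ids / rendered_ids: per-pixel person index (None = background).
    depth_maps: depth_maps[i][p] is person i's rendered depth at pixel p.
    """
    loss = 0.0
    for p, (want, got) in enumerate(zip(annotated_ids, rendered_ids)):
        if want is None or got is None or want == got:
            continue  # no disagreement between annotation and rendering here
        # Person `want` should be visible, so its depth should be the smaller one.
        loss += math.log1p(math.exp(depth_maps[want][p] - depth_maps[got][p]))
    return loss


# Hypothetical 3-pixel scene with two people (indices 0 and 1).
annotated = [0, 0, 1]            # instance segmentation annotation
rendered = [0, 1, 1]             # who is front-most in the current rendering
depths = [[1.0, 3.0, 5.0],       # person 0 depth at each pixel
          [2.0, 2.0, 2.0]]       # person 1 depth at each pixel
loss = depth_order_loss(annotated, rendered, depths)
```

Only the middle pixel contributes here, because the annotation says person 0 is visible while the rendering puts person 1 in front. The penalty shrinks as person 0's depth moves below person 1's, which uses the 2D instance masks as the only supervision for depth order.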