• Title/Summary/Keyword: Mask R-CNN, Deep Learning


Abnormal Behavior Detection and Localization Using Aspect Ratio Based on Mask R-CNN (Mask R-CNN 기반 Aspect Ratio를 활용한 이상행동 검출 및 영역화 방법)

  • Lim, Hyunseok;Hu, Xufeng;Gwak, Jeonghwan
    • Proceedings of the Korean Society of Computer Information Conference / 2022.01a / pp.99-101 / 2022
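  • Deep learning-based systems for detecting abnormal behavior track moving objects in video data, analyze their behavior, and flag as anomalous any region whose pattern falls outside the normal range of behavior. In particular, generative adversarial networks (GANs) and optical flow estimation are used to extract motion features, which are then learned to model behavior patterns. When the dataset used for training and testing has low resolution, or lacks features that characterize abnormal behavior, final model performance suffers. Abnormal objects whose displacement, as represented by optical flow, differs little from that of normal objects are especially hard to detect accurately. This study proposes a post-processing module that computes the average aspect ratio of the objects appearing in video frames and treats objects that deviate from the normal ratio as samples exhibiting abnormal behavior, thereby improving final model performance.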
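
The aspect-ratio post-processing idea in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the box format `(x1, y1, x2, y2)`, the function names, the use of a separate set of reference boxes to estimate the normal ratio, and the relative `tolerance` of 0.5 are all assumptions made for illustration.

```python
from statistics import mean

def aspect_ratio(box):
    """Width-to-height ratio of an (x1, y1, x2, y2) bounding box."""
    x1, y1, x2, y2 = box
    return (x2 - x1) / (y2 - y1)

def flag_abnormal(reference_boxes, boxes, tolerance=0.5):
    """Return one boolean per box in `boxes`.

    A box is flagged when its aspect ratio deviates from the mean ratio of
    `reference_boxes` by more than `tolerance` (relative deviation).
    The tolerance value is a hypothetical choice, not taken from the paper.
    """
    normal_ratio = mean(aspect_ratio(b) for b in reference_boxes)
    return [abs(aspect_ratio(b) - normal_ratio) / normal_ratio > tolerance
            for b in boxes]

# Reference boxes: upright pedestrians, roughly 0.4 wide per unit of height.
reference = [(0, 0, 40, 100), (50, 0, 92, 100), (100, 0, 138, 100)]
# Candidates: one upright person and one wide, low box (e.g. a person lying down).
candidates = [(0, 0, 41, 100), (150, 60, 250, 100)]
flags = flag_abnormal(reference, candidates)
```

In a full pipeline the boxes would come from the detector's output (for example Mask R-CNN instance boxes), and the flags would be combined with the upstream anomaly scores.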


Implementation of CNN-based Masking Algorithm for Post Processing of Aerial Image

  • CHOI, Eunsoo;QUAN, Zhixuan;JUNG, Sangwoo
    • Korean Journal of Artificial Intelligence / v.9 no.2 / pp.7-14 / 2021
  • Purpose: Empirical research on smart cities built on various ICT technologies is being actively conducted to solve urban problems, and digital twin technology is essential to implementing a smart city effectively. A digital twin is a virtual environment that intuitively visualizes multidimensional real-world data in 3D. It is implemented on the premise that GIS and BIM converge, and building its data requires a great deal of time for pre-processing and labeling. Data quality is prioritized in a digital twin so that it stays consistent with reality, but inspecting the data with the naked eye has limits. Therefore, to reduce the time required for digital twin construction and improve its quality, this study attempted to detect buildings in aerial images using Mask R-CNN, a deep learning-based masking algorithm. If the results of this study are developed further and used to build digital twin data, they are expected to help realize a high-quality smart city.

Implementation of AI-based Object Recognition Model for Improving Driving Safety of Electric Mobility Aids (전동 이동 보조기기 주행 안전성 향상을 위한 AI기반 객체 인식 모델의 구현)

  • Je-Seung Woo;Sun-Gi Hong;Jun-Mo Park
    • Journal of the Institute of Convergence Signal Processing / v.23 no.3 / pp.166-172 / 2022
  • In this study, we photograph objects that obstruct driving, such as crosswalks, side spheres, manholes, braille blocks, partial ramps, temporary safety barriers, stairs, and inclined curbs, which hinder or inconvenience vulnerable people who use electric mobility aids. We develop an AI model that classifies and automatically recognizes the photographed objects, and implement an algorithm that efficiently identifies obstacles in front of an electric mobility aid. So that the model can learn to detect objects with high probability, objects are labeled as polygons when the dataset is built. The model was developed using Mask R-CNN in the Detectron2 framework, which can detect objects labeled as polygons. Image acquisition was divided between two groups, the general public and the mobility-disadvantaged, and images were collected in two areas of the test bed. Among the Mask R-CNN parameter settings, the model trained with IMAGES_PER_BATCH: 2, BASE_LEARNING_RATE: 0.001, MAX_ITERATION: 10,000 showed the highest performance, at 68.532, so that the user can quickly and accurately recognize driving risks and obstacles.
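
As a rough illustration, the hyperparameters quoted in this abstract can be written as Detectron2 config overrides. The mapping from the abstract's names to the `SOLVER.*` keys is an assumption (the abstract does not give the actual config), and `to_opts` is a hypothetical helper; only the three values come from the source.

```python
# Values quoted in the abstract, mapped (by assumption) to Detectron2 config keys.
reported = {
    "SOLVER.IMS_PER_BATCH": 2,   # "IMAGES_PER_BATCH: 2"
    "SOLVER.BASE_LR": 0.001,     # "BASE_LEARNING_RATE: 0.001"
    "SOLVER.MAX_ITER": 10000,    # "MAX_ITERATION: 10,000"
}

def to_opts(overrides):
    """Flatten {key: value} into the [key, value, key, value, ...] list
    form accepted by a config node's merge_from_list method."""
    opts = []
    for key, value in overrides.items():
        opts.extend([key, value])
    return opts

opts = to_opts(reported)

# With Detectron2 installed, the overrides could be applied roughly like this:
# from detectron2.config import get_cfg
# cfg = get_cfg()
# cfg.merge_from_list(opts)
```

The base Mask R-CNN config file, dataset registration, and class count are not stated in the abstract and would have to be supplied separately.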

Implementation of a Position Correction Algorithm for Multiple Objects in Images Using an Object Recognition Model and a Ground Projection Technique (객체 인식 모델과 지면 투영기법을 활용한 영상 내 다중 객체의 위치 보정 알고리즘 구현)

  • Dong-Seok Park;Sun-Gi Hong;Jun-Mo Park
    • Journal of the Institute of Convergence Signal Processing / v.24 no.2 / pp.119-125 / 2023

A Dataset of Ground Vehicle Targets from Satellite SAR Images and Its Application to Detection and Instance Segmentation (위성 SAR 영상의 지상차량 표적 데이터 셋 및 탐지와 객체분할로의 적용)

  • Park, Ji-Hoon;Choi, Yeo-Reum;Chae, Dae-Young;Lim, Ho;Yoo, Ji Hee
    • Journal of the Korea Institute of Military Science and Technology / v.25 no.1 / pp.30-44 / 2022
  • The advent of deep learning-based algorithms has facilitated research on target detection in synthetic aperture radar (SAR) imagery. Most of this work concentrates on detecting ships using open SAR ship datasets and aircraft in SAR scenes of airports. Research on detecting SAR ground vehicle targets is comparatively scarce, and in that setting several adverse factors, such as high false alarm rates, low signal-to-clutter ratios, and multiple targets in close proximity, are expected to degrade performance. In this paper, a dataset of ground vehicle targets acquired from TerraSAR-X (TSX) satellite SAR images is presented. Detection and instance segmentation are then carried out simultaneously on this dataset using the deep learning-based Mask R-CNN. Finally, the paper outlines future research directions for further improving the detection of SAR ground vehicle targets.

"Where can I buy this?" - Fashion Item Searcher using Instance Segmentation with Mask R-CNN ("이거 어디서 사?" - Mask R-CNN 기반 객체 분할을 활용한 패션 아이템 검색 시스템)

  • Jung, Kyunghee;Choi, Ha nl;Sammy, Y.X.B.;Kim, Hyunsung;Toan, N.D.;Choo, Hyunseung
    • Proceedings of the Korea Information Processing Society Conference / 2022.11a / pp.465-467 / 2022
  • Mobile phones have become essential because they provide fast and easy access to online platforms and services. Platforms such as social networking services (SNS) have become a go-to option for shopping for many people. However, searching for a specific fashion item seen in a picture is challenging, because users need to try multiple searches with different combinations of keywords. To tackle this problem, we propose a system that provides immediate access to websites related to fashion items. Within this framework, we also propose a deep learning model that automatically analyzes image content using instance segmentation. We use transfer learning with DeepFashion2 to maximize model accuracy. After all fashion items in the image are segmented, the related search information is retrieved when an item is clicked. Furthermore, we successfully deployed the system so that it is accessible from any web browser. We show that deep learning can be a promising tool not only for scientific purposes but also for commercial shopping.

Automatic Dataset Generation of Object Detection and Instance Segmentation using Mask R-CNN (Mask R-CNN을 이용한 물체인식 및 개체분할의 학습 데이터셋 자동 생성)

  • Jo, HyunJun;Kim, Dawit;Song, Jae-Bok
    • The Journal of Korea Robotics Society / v.14 no.1 / pp.31-39 / 2019
  • A robot usually adopts ANN (artificial neural network)-based object detection and instance segmentation algorithms to recognize objects, but creating datasets for these algorithms is costly because the data must be labeled manually. To lower the labeling cost, a new scheme is proposed that automatically generates training images and labels them for specific objects. The scheme uses an instance segmentation algorithm trained to produce masks of unknown objects, so that the masks can be obtained in a simple environment. The RGB images of the objects are extracted using these masks, and the object classes are labeled under human supervision. The object images are then synthesized with various background images to create new images. The synthesized images are labeled automatically using the masks and the previously entered object classes. In addition, human intervention is further reduced by using a robot arm to collect the object images. Experiments show that instance segmentation trained with the proposed method performs on par with training on a real dataset, and that the time required to generate the dataset is significantly reduced.
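
The synthesis step described in this abstract, pasting a masked object onto a background and deriving the label from the mask, might look like the following minimal Python sketch. Images are plain nested lists of pixel values, and `paste_object`, the placement arguments, and the class name `"cup"` are hypothetical; the paper's actual pipeline is not specified here.

```python
def paste_object(background, obj, mask, top, left):
    """Copy the masked pixels of `obj` onto a copy of `background` at
    offset (top, left). Returns the new image and the bounding box
    (x_min, y_min, x_max, y_max) of the pasted pixels."""
    image = [row[:] for row in background]  # leave the background untouched
    rows, cols = [], []
    for r, mask_row in enumerate(mask):
        for c, inside in enumerate(mask_row):
            if inside:
                image[top + r][left + c] = obj[r][c]
                rows.append(top + r)
                cols.append(left + c)
    bbox = (min(cols), min(rows), max(cols), max(rows))
    return image, bbox

background = [[0] * 6 for _ in range(5)]   # 5 x 6 empty background
obj = [[9, 9, 9], [9, 9, 9]]               # object pixels
mask = [[0, 1, 1], [1, 1, 0]]              # 1 marks pixels that belong to the object
image, bbox = paste_object(background, obj, mask, top=2, left=1)

# The class was entered once by a human; the box comes from the mask automatically.
label = {"class": "cup", "bbox": bbox}
```

Because the mask placement is known at synthesis time, the instance mask and box for every generated image follow directly, which is what removes the manual labeling step.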

Crack segmentation in high-resolution images using cascaded deep convolutional neural networks and Bayesian data fusion

  • Tang, Wen;Wu, Rih-Teng;Jahanshahi, Mohammad R.
    • Smart Structures and Systems / v.29 no.1 / pp.221-235 / 2022
  • Manual inspection of steel box girders on long span bridges is time-consuming and labor-intensive. The quality of inspection relies on the subjective judgements of the inspectors. This study proposes an automated approach to detect and segment cracks in high-resolution images. An end-to-end cascaded framework is proposed to first detect the existence of cracks using a deep convolutional neural network (CNN) and then segment the crack using a modified U-Net encoder-decoder architecture. A Naïve Bayes data fusion scheme is proposed to reduce the false positives and false negatives effectively. To generate the binary crack mask, first, the original images are divided into 448 × 448 overlapping image patches where these image patches are classified as cracks versus non-cracks using a deep CNN. Next, a modified U-Net is trained from scratch using only the crack patches for segmentation. A customized loss function that consists of binary cross entropy loss and the Dice loss is introduced to enhance the segmentation performance. Additionally, a Naïve Bayes fusion strategy is employed to integrate the crack score maps from different overlapping crack patches and to decide whether a pixel is crack or not. Comprehensive experiments have demonstrated that the proposed approach achieves an 81.71% mean intersection over union (mIoU) score across 5 different training/test splits, which is 7.29% higher than the baseline reference implemented with the original U-Net.
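
A loss that combines binary cross entropy with the Dice loss, as mentioned in this abstract, can be sketched in plain Python as follows. The equal weighting (`dice_weight=1.0`), the smoothing term `eps`, and the function name are assumptions; the abstract does not state how the two terms are combined.

```python
import math

def bce_dice_loss(pred, target, dice_weight=1.0, eps=1e-7):
    """Binary cross entropy plus weighted Dice loss over flat pixel lists.

    pred:   predicted crack probabilities in [0, 1]
    target: ground-truth labels, 0 (no crack) or 1 (crack)
    """
    n = len(pred)
    # Mean binary cross entropy; probabilities are clamped to avoid log(0).
    bce = -sum(
        t * math.log(max(p, eps)) + (1 - t) * math.log(max(1 - p, eps))
        for p, t in zip(pred, target)
    ) / n
    # Soft Dice coefficient: 2 * overlap / (sum of predictions + sum of labels).
    intersection = sum(p * t for p, t in zip(pred, target))
    dice = (2 * intersection + eps) / (sum(pred) + sum(target) + eps)
    return bce + dice_weight * (1 - dice)

target = [1, 0, 1]
perfect = bce_dice_loss([1.0, 0.0, 1.0], target)    # close to zero
uncertain = bce_dice_loss([0.5, 0.5, 0.5], target)  # clearly larger
```

The Dice term is commonly added for thin structures such as cracks because it depends on overlap rather than on per-pixel counts, which are dominated by background.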

Emergency Situation Recognition System Using CCTV and Deep Learning (CCTV와 딥러닝을 이용한 응급 상황 인식 시스템)

  • Park, SeJun;Jeong, Beom-jin;Lee, Jeong-joon
    • Proceedings of the Korea Information Processing Society Conference / 2020.11a / pp.807-809 / 2020
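  • Existing CCTV management systems cannot respond quickly to incidents and accidents, and mainly serve after-the-fact purposes such as reconstructing the circumstances or securing evidence. This paper presents a method that uses Mask R-CNN (Regions with CNN) to determine whether an object captured by CCTV is in an emergency situation. Regions recognized as a person are used to train a multi-layer perceptron (MLP) that recognizes the situation the person is in, and if a situation recognized as an emergency persists, the user is notified through the management monitor. Through this work, we expect to build a real-time, interactive CCTV management system so that the golden time for people who need help is not missed.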

Fall Situation Recognition by Body Centerline Detection using Deep Learning

  • Kim, Dong-hyeon;Lee, Dong-seok;Kwon, Soon-kak
    • Journal of Multimedia Information System / v.7 no.4 / pp.257-262 / 2020
  • This paper proposes a method for detecting emergency situations such as falls from color images. We detect body areas and key body parts in camera images using a pre-trained Mask R-CNN. We then find the centerline of the body from the joint points of both shoulders and both feet, calculate the angle of this centerline, and calculate how much that angle changes over time. If the angle change exceeds a certain value, the event is treated as a suspected fall. If the suspected fall state persists for more than a certain number of frames, it is determined to be a fall. Simulation results show that the proposed method detects falls accurately.
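
The centerline-angle rule in this abstract can be sketched in Python as below. This is one possible reading, not the authors' code: the angle is measured against the vertical image axis, the change is taken between consecutive frames, a suspected fall is assumed to persist while the angle stays far from its pre-fall value, and the thresholds `change_threshold` and `min_frames` are hypothetical.

```python
import math

def centerline_angle(l_shoulder, r_shoulder, l_foot, r_foot):
    """Angle in degrees between the body centerline and the vertical axis.
    The centerline joins the shoulder midpoint and the foot midpoint;
    points are (x, y) pixel coordinates."""
    top = ((l_shoulder[0] + r_shoulder[0]) / 2, (l_shoulder[1] + r_shoulder[1]) / 2)
    bottom = ((l_foot[0] + r_foot[0]) / 2, (l_foot[1] + r_foot[1]) / 2)
    dx, dy = top[0] - bottom[0], top[1] - bottom[1]
    return math.degrees(math.atan2(abs(dx), abs(dy)))

def detect_fall(angles, change_threshold=30.0, min_frames=3):
    """Return True if a suspected fall persists for `min_frames` frames.

    A suspected fall starts when the angle changes by more than
    `change_threshold` degrees between consecutive frames. It persists
    while the angle stays more than `change_threshold` away from the
    angle observed just before the change."""
    baseline = None  # angle just before the suspected fall
    persisted = 0
    for prev, cur in zip(angles, angles[1:]):
        if baseline is None:
            if abs(cur - prev) > change_threshold:
                baseline, persisted = prev, 1
        elif abs(cur - baseline) > change_threshold:
            persisted += 1
        else:
            baseline, persisted = None, 0  # posture recovered
        if persisted >= min_frames:
            return True
    return False

upright = centerline_angle((90, 50), (110, 50), (95, 200), (105, 200))
lying = centerline_angle((50, 190), (50, 210), (200, 195), (200, 205))
fell = detect_fall([2, 3, 80, 85, 88, 87])     # sudden tilt that persists
stumbled = detect_fall([2, 3, 50, 5, 4, 3])    # brief tilt, then recovery
```

In practice the joint coordinates would come from the keypoint output of the pre-trained detector, and both thresholds would need tuning against the camera's frame rate.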