• Title/Summary/Keyword: 객체검출 모델 (object detection model)

Search Result 240, Processing Time 0.026 seconds

Selective labeling using image super resolution for improving the efficiency of object detection in low-resolution oriental paintings

  • Moon, Hyeyoung;Kim, Namgyu
    • Journal of the Korea Society of Computer and Information / v.27 no.9 / pp.21-32 / 2022
  • Image labeling must precede object detection, and this task is considered a significant burden in building a deep learning model. Tens of thousands of images are needed to train a deep learning model, and human labelers face many limitations in labeling them manually. To overcome these difficulties, this study proposes a method that performs object detection without significant performance degradation while labeling only some of the images rather than all of them. Specifically, low-resolution oriental painting images are converted into high-quality images using a super-resolution algorithm, and the effect of the SSIM and PSNR derived in this process on the mAP of object detection is analyzed. We expect the results of this study to contribute significantly to building deep learning models for tasks such as image classification, object detection, and image segmentation, all of which require efficient image labeling.
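
PSNR, one of the two image-quality measures named in the abstract above, can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes grayscale images stored as nested lists of intensities and a peak value of 255, and the function name `psnr` is hypothetical.

```python
import math


def psnr(reference, restored, max_value=255.0):
    """Peak signal-to-noise ratio (in dB) between two equally sized images.

    Assumption: images are nested lists of grayscale intensities, which
    keeps the sketch free of third-party dependencies.
    """
    ref = [p for row in reference for p in row]
    out = [p for row in restored for p in row]
    if len(ref) != len(out) or not ref:
        raise ValueError("images must be non-empty and have the same size")
    # Mean squared error over all pixels.
    mse = sum((a - b) ** 2 for a, b in zip(ref, out)) / len(ref)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)
```

A higher PSNR means the super-resolved image is closer to the reference; the abstract relates such scores to detection mAP but does not state how the two were combined.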

Loitering Behavior Detection Using Shadow Removal and Chromaticity Histogram Matching (그림자 제거와 색도 히스토그램 비교를 이용한 배회행위 검출)

  • Park, Eun-Soo;Lee, Hyung-Ho;Yun, Myoung-Kyu;Kim, Min-Gyu;Kwak, Jong-Hoon;Kim, Hak-Il
    • Journal of the Korea Institute of Information Security & Cryptology / v.21 no.6 / pp.171-181 / 2011
  • This paper proposes an intelligent video surveillance system that effectively detects multiple loitering objects, even those that leave the camera's field of view and later return to a target zone. After the background and foreground are segmented using a Gaussian mixture model and shadows are removed, objects returning to the target zone are recognized using the chromaticity histogram, and their loitering duration is preserved. For more accurate measurement of loitering behavior, camera calibration is also applied to map the image plane to the real-world ground. Loitering behavior can therefore be detected by considering how long an object has been present in real-world space. The experiment was performed using loitering video, and all of the loitering behaviors were accurately detected.
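
The chromaticity histogram comparison described above can be sketched roughly as follows. The abstract does not specify the chromaticity space, bin count, or similarity measure, so the (r, g) normalization, the 8x8 binning, and histogram intersection used here are all assumptions, and both function names are hypothetical.

```python
def chromaticity_histogram(pixels, bins=8):
    """Normalized 2-D histogram over (r, g) chromaticity.

    `pixels` is an iterable of (R, G, B) tuples. Chromaticity divides out
    overall intensity: r = R / (R + G + B), g = G / (R + G + B).
    """
    hist = [0.0] * (bins * bins)
    count = 0
    for red, green, blue in pixels:
        total = red + green + blue
        if total == 0:
            continue  # pure black carries no chromaticity information
        i = min(int(red / total * bins), bins - 1)
        j = min(int(green / total * bins), bins - 1)
        hist[i * bins + j] += 1.0
        count += 1
    if count == 0:
        return hist
    return [v / count for v in hist]


def histogram_intersection(h1, h2):
    """Similarity in [0, 1] between two normalized histograms."""
    return sum(min(a, b) for a, b in zip(h1, h2))
```

Because chromaticity ignores brightness, a brightly lit and a dimly lit view of the same clothing fall into the same bins, which is one plausible reason to match returning objects this way.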

Video Analysis System for Action and Emotion Detection by Object with Hierarchical Clustering based Re-ID (계층적 군집화 기반 Re-ID를 활용한 객체별 행동 및 표정 검출용 영상 분석 시스템)

  • Lee, Sang-Hyun;Yang, Seong-Hun;Oh, Seung-Jin;Kang, Jinbeom
    • Journal of Intelligence and Information Systems / v.28 no.1 / pp.89-106 / 2022
  • Recently, the amount of video data collected from smartphones, CCTVs, black boxes, and high-definition cameras has increased rapidly, and with it the demand for analysis and utilization. Because many industries lack skilled manpower to analyze videos, machine learning and artificial intelligence are actively used to assist. As a result, demand for computer vision technologies such as object detection and tracking, action detection, emotion detection, and Re-ID has also increased rapidly. However, object detection and tracking suffers from problems that degrade performance, such as occlusion and an object's re-appearance after leaving the recording location. Accordingly, action and emotion detection models built on object detection and tracking models also have difficulty extracting data for each object. In addition, deep learning architectures composed of multiple models suffer from performance degradation due to bottlenecks and lack of optimization. In this study, we propose a video analysis system consisting of a YOLOv5-based DeepSORT object tracking model, a SlowFast-based action recognition model, a Torchreid-based Re-ID model, and AWS Rekognition, an emotion recognition service. The proposed model uses single-linkage hierarchical clustering-based Re-ID and processing methods that maximize hardware throughput. It achieves higher accuracy than a re-identification model using simple metrics, offers near real-time processing performance, and prevents tracking failures caused by object departure and re-emergence, occlusion, and similar events. By continuously linking the action and facial emotion detection results of each object to the same object, it makes efficient video analysis possible.
The re-identification model extracts a feature vector from the bounding box of each object image detected by the object tracking model in each frame, and applies single-linkage hierarchical clustering over the feature vectors extracted from past frames to identify the same object after a tracking failure. Through this process, an object whose track was lost through occlusion or re-appearance after leaving the video location can be re-tracked. As a result, the action and facial emotion detection results of an object newly recognized because of a tracking failure can be linked to those of the object that appeared in the past. To improve processing performance, we introduce the Bounding Box Queue by Object and Feature Queue methods, which reduce RAM requirements while maximizing GPU memory throughput. We also introduce the IoF (Intersection over Face) algorithm, which allows facial emotions recognized through AWS Rekognition to be linked with object tracking information. The academic significance of this study is that, thanks to these processing techniques, the two-stage re-identification model achieves real-time performance even in a high-cost environment that performs action and facial emotion detection, without sacrificing accuracy by resorting to simple metrics. The practical implication is that industrial fields that require action and facial emotion detection but struggle with object tracking failures can analyze videos effectively with the proposed model. With its high re-tracking accuracy and processing performance, the proposed model can be used in fields such as intelligent monitoring, observation services, and behavioral or psychological analysis services, where the integration of tracking information and extracted metadata creates great industrial and business value.
In the future, to measure object tracking performance more precisely, an experiment using the MOT Challenge dataset, which is used by many international conferences, is needed. We will investigate the problems that the IoF algorithm cannot solve in order to develop an additional complementary algorithm. In addition, we plan to conduct further research applying this model to datasets from various fields related to intelligent video analysis.
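
The single-linkage grouping of Re-ID feature vectors described in this abstract can be sketched as below. This is an illustrative assumption, not the authors' implementation: it uses cosine distance, a fixed distance threshold, and a union-find structure, relying on the fact that cutting a single-linkage dendrogram at a threshold yields the connected components of the graph that links every pair closer than that threshold. All names are hypothetical.

```python
import math


def cosine_distance(u, v):
    """1 - cosine similarity; assumes both vectors are non-zero."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / norm


def single_linkage_clusters(features, threshold):
    """Return one cluster label per feature vector.

    Two vectors end up in the same cluster when a chain of pairwise
    distances <= threshold connects them (single linkage).
    """
    parent = list(range(len(features)))

    def find(i):
        # Follow parent pointers to the root, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(features)):
        for j in range(i + 1, len(features)):
            if cosine_distance(features[i], features[j]) <= threshold:
                parent[find(j)] = find(i)

    # Relabel roots as 0, 1, 2, ... in order of first appearance.
    labels = {}
    return [labels.setdefault(find(i), len(labels)) for i in range(len(features))]
```

In a tracking setting, each vector would come from one detected bounding box, and boxes sharing a label would be treated as the same identity even if the tracker assigned them different track IDs.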

Decompose the Manifold Into Gaussian Densities : Face Detection (다양체 가우시안 분해 : 얼굴 검출)

  • 양준영;변혜란
    • Proceedings of the Korean Information Science Society Conference / 2004.10b / pp.682-684 / 2004
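  • The proposed method is an algorithm that decomposes a manifold using multiple Gaussians for objects with large variance. Although simple, it quickly generates multiple Gaussians that approximate the manifold. It is similar to a Gaussian mixture model but guarantees faster computation and improves reliability against outliers. On QQVGA images from a multi-ethnic (Asian, Black, White, Hispanic) face database that we collected, the proposed algorithm achieved a 100% detection rate with zero misclassifications.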


Advanced Gaussian Mixture Learning for Complex Environment (개선된 적응적 가우시안 혼합 모델을 이용한 객체 검출)

  • Park Dae-Yong;Kim Jae-Min;Cho Seong-Won
    • Proceedings of the Korean Institute of Intelligent Systems Conference / 2005.11a / pp.283-289 / 2005
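  • Background subtraction is one of the most widely used methods for detecting moving objects. When the background is complex and changes heavily, the accuracy of object detection depends on how accurately the background is learned in real time. The Gaussian mixture model is the most commonly used method for modeling such backgrounds. It relies on probabilistic learning, which fails to model the background accurately when objects pass by frequently or remain stationary. This paper proposes a learning method that combines probabilistic modeling of brightness values with processing based on changes in brightness values, so that the background can be modeled accurately in crowded environments.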
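
The weakness noted above, that a probabilistic background model gradually absorbs stationary objects, can be seen in a deliberately simplified sketch. The code below keeps a single running Gaussian per pixel rather than a full mixture, and it is not the method proposed in the paper; the class name, learning rate, initial variance, and 2.5-sigma test are all assumptions.

```python
class RunningGaussianPixel:
    """Single-Gaussian background model for one grayscale pixel."""

    def __init__(self, initial, variance=225.0, learning_rate=0.05, k=2.5):
        self.mean = float(initial)
        self.variance = float(variance)
        self.learning_rate = learning_rate
        self.k = k  # foreground if the value lies beyond k standard deviations

    def update(self, value):
        """Classify `value`, then blend it into the model.

        Returns True when the value is classified as foreground.
        """
        diff = value - self.mean
        foreground = diff * diff > (self.k ** 2) * self.variance
        rho = self.learning_rate
        # Exponential moving estimates of mean and variance.
        self.mean += rho * diff
        self.variance = (1.0 - rho) * self.variance + rho * diff * diff
        return foreground
```

Feeding a pixel a constant new value, as when an object stops in front of the camera, first yields foreground and later background, because every update pulls the model toward the object. This is the behavior that motivates a more careful learning rule.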


Anomaly Detection in printed patterns using U-Net (U-Net 모델을 이용한 비정상 인쇄물 검출 방법)

  • Hong, Soon-Hyun;Nam, Hyeon-Gil;Park, Jong-Il
    • Proceedings of the Korean Society of Broadcast Engineers Conference / 2020.07a / pp.686-688 / 2020
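  • This paper proposes a deep learning-based anomaly detection method that applies unsupervised learning with a U-Net model to printed matter with fine, repeating patterns. To detect abnormal patterns in printed matter (cards), we built a dataset of images in which the card region was separated from the captured footage, and trained a U-Net on a training dataset of normal image and mask image pairs so that it outputs an image identical to the normal image. The proposed method feeds test dataset images to the network and compares the generated mask with the original mask image to decide whether an anomaly is present; we confirmed that it distinguishes normal from abnormal printed matter well. We also compared a supervised CNN classification method, trained on both normal and abnormal images, against the proposed method, which judges whether an object is anomalous from the reconstruction error between the input image and the restored image. The results confirm that U-Net can detect anomalies without acquiring separate labels for the data.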
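
The final decision step described above, comparing a generated mask with the original mask, might look like the following sketch. The abstract does not give the error measure or the threshold, so the mean absolute difference and the 0.1 cutoff are assumptions, and the function names are hypothetical. Masks are nested lists with values in [0, 1].

```python
def mask_difference(predicted, reference):
    """Mean absolute per-pixel difference between two equally sized masks."""
    flat_pred = [p for row in predicted for p in row]
    flat_ref = [p for row in reference for p in row]
    if len(flat_pred) != len(flat_ref) or not flat_ref:
        raise ValueError("masks must be non-empty and have the same size")
    return sum(abs(a - b) for a, b in zip(flat_pred, flat_ref)) / len(flat_ref)


def is_anomalous(predicted, reference, threshold=0.1):
    """Flag the sample when the network's output strays from the reference."""
    return mask_difference(predicted, reference) > threshold
```

Since the network is trained only on normal samples, a defective print should be reproduced poorly, which pushes the difference above the threshold.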


Design of a deep learning model to determine fire occurrence in distribution switchboard using thermal imaging data (열화상 영상 데이터 기반 배전반 화재 발생 판별을 위한 딥러닝 모델 설계)

  • Dongjoon Park;Minyoung Kim
    • The Journal of the Convergence on Culture Technology / v.9 no.5 / pp.737-745 / 2023
  • This paper describes the development of an artificial intelligence model that detects fires in distribution switchboards from thermal images. The objective is to preprocess the collected thermal images into data suitable for object detection models and to design a model capable of determining whether a fire has occurred inside a distribution switchboard. The study uses thermal image data from AI-HUB's industrial complex for training. Two CNN-based deep learning object detection algorithms, Faster R-CNN and RetinaNet, are used to build the models. The paper compares and analyzes the two models and proposes the better one for the task.

Analysis System for Public Interest Report Video of Traffic Law Violation based on Deep Learning Algorithms (딥러닝 알고리즘 기반 교통법규 위반 공익신고 영상 분석 시스템)

  • Min-Seong Choi;Mi-Kyeong Moon
    • The Journal of the Korea institute of electronic communication sciences / v.18 no.1 / pp.63-70 / 2023
  • Due to the spread of high-definition black boxes and the introduction of mobile applications such as 'Smart Citizens Report' and 'Safety Report', the number of public interest reports of traffic law violations has increased rapidly, resulting in a shortage of police personnel to handle them. In this paper, we describe the development of a system that uses deep learning algorithms to automatically detect lane violations, which account for the largest proportion of public interest report videos of traffic law violations. We describe a method for recognizing vehicles and solid line objects using a YOLO model and a LaneNet model, a method for tracking each object individually using the DeepSORT algorithm, and a method for detecting lane change violations by recognizing the overlapping range of a vehicle object's bounding box and a solid line object. This system is expected to help relieve the shortage of police personnel in charge.
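
The overlap check between a vehicle's bounding box and a solid line object can be sketched as below. The abstract does not say how the solid line is represented or how much overlap counts as a violation, so this sketch assumes the line arrives as a list of (x, y) pixel coordinates and uses a hypothetical minimum of five overlapping pixels.

```python
def points_inside_box(points, box):
    """Count line pixels (x, y) that fall inside box = (x1, y1, x2, y2)."""
    x1, y1, x2, y2 = box
    return sum(1 for x, y in points if x1 <= x <= x2 and y1 <= y <= y2)


def crosses_solid_line(vehicle_box, solid_line_points, min_overlap=5):
    """True when enough solid-line pixels lie within the vehicle's box."""
    return points_inside_box(solid_line_points, vehicle_box) >= min_overlap
```

In a full pipeline this test would run per frame on each tracked vehicle, so that an overlap persisting along one track could be reported as a single lane change violation.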

Face Detection Method based Fusion RetinaNet using RGB-D Image (RGB-D 영상을 이용한 Fusion RetinaNet 기반 얼굴 검출 방법)

  • Nam, Eun-Jeong;Nam, Chung-Hyeon;Jang, Kyung-Sik
    • Journal of the Korea Institute of Information and Communication Engineering / v.26 no.4 / pp.519-525 / 2022
  • Face detection, the task of locating a person's face in an image, is used as a preprocessing step or core process in many image processing applications. Neural network models, which have performed well with the recent development of deep learning, depend on 2D images, so when the image is degraded, for example by poor camera quality or poor focus on the face, the face may not be detected properly. In this paper, we propose a face detection method that additionally uses depth information to reduce the dependence on 2D images. The proposed model was trained after generating and preprocessing depth information in advance from a face detection dataset. As a result, the FRN model scored 89.16%, about 1.2 percentage points higher than the RetinaNet model, which scored 87.95%.

Time-Stamp based Locking scheme for Update Spatial Data of Wireless Mobile Client (무선 이동 클라이언트에서 공간 데이터 변경을 위한 타임스탬프 기반 잠금 기법)

  • 이주형;김동현;홍봉희
    • Proceedings of the Korean Information Science Society Conference / 2001.04b / pp.37-39 / 2001
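  • With advances in mobile clients and wireless mobile data communication, more accurate spatial data modification can now be performed in the field. Considering this environment, this paper uses a 2-tier transaction model (2) for updating spatial data on wireless mobile clients. Because a mobile transaction does not need to reconnect to the server immediately upon completion, there is a gap between the point at which the mobile transaction completes and the point at which it is re-executed as a base transaction after reconnection. Moreover, spatial data update transactions are not commutative and must be fully serializable. For these reasons, the lost update problem arises. To solve it, this paper presents a method that uses the timestamp values of region locks and the overlap of the locked regions to detect the set of spatial objects that may suffer lost updates. It also presents a protocol that postpones the completion point of the detected set of spatial objects.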
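
One possible reading of the detection step above is sketched below. The abstract only says that region-lock timestamps and region overlap are combined, so the concrete rule used here is an assumption: an object is a lost-update candidate when it lies in the intersection of this client's lock region and another client's lock region whose timestamp is later. The `Rect`, `RegionLock`, and `lost_update_candidates` names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    xmin: float
    ymin: float
    xmax: float
    ymax: float

    def intersection(self, other):
        """Overlapping rectangle, or None when the regions do not overlap."""
        xmin, ymin = max(self.xmin, other.xmin), max(self.ymin, other.ymin)
        xmax, ymax = min(self.xmax, other.xmax), min(self.ymax, other.ymax)
        if xmin >= xmax or ymin >= ymax:
            return None
        return Rect(xmin, ymin, xmax, ymax)

    def contains(self, x, y):
        return self.xmin <= x <= self.xmax and self.ymin <= y <= self.ymax


@dataclass(frozen=True)
class RegionLock:
    owner: str
    region: Rect
    timestamp: int


def lost_update_candidates(lock, other_locks, objects):
    """IDs of objects (id -> (x, y)) that may suffer a lost update."""
    candidates = set()
    for other in other_locks:
        # Assumed rule: only locks from other clients taken later conflict.
        if other.owner == lock.owner or other.timestamp <= lock.timestamp:
            continue
        overlap = lock.region.intersection(other.region)
        if overlap is None:
            continue
        for object_id, (x, y) in objects.items():
            if overlap.contains(x, y):
                candidates.add(object_id)
    return candidates
```

The returned set corresponds to the objects whose completion the abstract's protocol would postpone.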
