• Title/Summary/Keyword: occlusion detection

Patch-Based Processing and Occlusion Area Recovery for True Orthoimage Generation (정밀정사영상 생성을 위한 패치기반 처리와 폐색지역 복원)

  • Yoo, Eun-Jin;Lee, Dong-Cheon
    • Journal of the Korean Society of Surveying, Geodesy, Photogrammetry and Cartography
    • v.28 no.1 / pp.83-92 / 2010
  • The emergence of high-resolution digital aerial cameras and airborne laser scanners has driven innovative progress in photogrammetry and spatial information technology. The purpose of this study is to generate true orthoimages by recovering occlusion areas. The orthoimages were generated using a patch-based transformation. The occlusion areas were mutually corrected using multiple aerial images. This study proposes a novel method of building-roof-based orthoimage generation and an effective method of occlusion area detection and recovery. The proposed methods could be efficient for generating true orthoimages in urban areas, where occlusion areas are problematic.

Comparative study of data augmentation methods for fake audio detection (음성위조 탐지에 있어서 데이터 증강 기법의 성능에 관한 비교 연구)

  • KwanYeol Park;Il-Youp Kwak
    • The Korean Journal of Applied Statistics
    • v.36 no.2 / pp.101-114 / 2023
  • Data augmentation is an effective way to mitigate model overfitting because it lets the model see the training dataset from various perspectives. In addition to image augmentation techniques such as rotation, cropping, horizontal flip, and vertical flip, occlusion-based data augmentation methods such as Cutmix and Cutout have been proposed. For models based on speech data, an occlusion-based augmentation technique can be applied after converting the 1D speech signal into a 2D spectrogram. In particular, SpecAugment is an occlusion-based augmentation technique for speech spectrograms. In this study, we compare data augmentation techniques that can be used for fake audio detection. Using data from the ASVspoof2017 and ASVspoof2019 competitions, which were held to detect fake audio, we trained an LCNN model on datasets augmented with Cutout, Cutmix, and SpecAugment, all of which are occlusion-based data augmentation methods. All three augmentation techniques generally improved the performance of the model. The best performance was achieved by Cutmix on ASVspoof2017, Mixup on ASVspoof2019 LA, and SpecAugment on ASVspoof2019 PA. In addition, increasing the number of masks for SpecAugment helps to improve performance. In conclusion, the appropriate augmentation technique differs depending on the situation and the data.
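
The occlusion-style masking described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function name `mask_spectrogram` and all of its parameters are hypothetical, the spectrogram is assumed to be a list of rows (frequency bins by time frames), masked cells are set to zero, and the time-warping step of SpecAugment is omitted.

```python
import random


def mask_spectrogram(spec, num_freq_masks=1, num_time_masks=1,
                     max_freq_width=2, max_time_width=2, rng=None):
    """Return a copy of `spec` with random frequency and time bands zeroed.

    `spec` is a list of rows: one row per frequency bin, one column per frame.
    All parameter names are illustrative assumptions.
    """
    rng = rng or random.Random()
    n_freq, n_time = len(spec), len(spec[0])
    out = [row[:] for row in spec]  # copy so the input is left untouched

    # Frequency masks: zero out a band of consecutive frequency bins.
    for _ in range(num_freq_masks):
        width = rng.randint(0, min(max_freq_width, n_freq))
        start = rng.randint(0, n_freq - width)
        for f in range(start, start + width):
            out[f] = [0.0] * n_time

    # Time masks: zero out a band of consecutive frames in every bin.
    for _ in range(num_time_masks):
        width = rng.randint(0, min(max_time_width, n_time))
        start = rng.randint(0, n_time - width)
        for row in out:
            for t in range(start, start + width):
                row[t] = 0.0
    return out
```

Raising `num_freq_masks` or `num_time_masks` corresponds to the "number of masks" that the abstract reports as helpful for SpecAugment.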

Apple Detection Algorithm based on an Improved SSD (개선 된 SSD 기반 사과 감지 알고리즘)

  • Ding, Xilong;Li, Qiutan;Wang, Xufei;Chen, Le;Son, Jinku;Song, Jeong-Young
    • The Journal of the Institute of Internet, Broadcasting and Communication
    • v.21 no.3 / pp.81-89 / 2021
  • Under natural conditions, apple detection suffers from occlusion and from the difficulty of detecting small objects. This paper proposes an improved model based on SSD. The SSD backbone network, VGG16, is replaced with the ResNet50 network model, and the receptive field (RFB) structure is introduced. The RFB model amplifies the feature information of small objects and improves their detection accuracy. An attention mechanism (SE) is added to select the information that needs to be retained, which enhances the semantic information of the detected object. The improved SSD algorithm is trained on the VOC2007 data set. Compared with SSD, the improved algorithm increases the accuracy of occluded and small object detection by 3.4% and 3.9%, respectively. The algorithm also improves the false detection rate and the missed detection rate. The improved algorithm proposed in this paper has higher efficiency.

Robust human tracking via key face information

  • Li, Weisheng;Li, Xinyi;Zhou, Lifang
    • KSII Transactions on Internet and Information Systems (TIIS)
    • v.10 no.10 / pp.5112-5128 / 2016
  • Tracking the human body is an important problem in computer vision. Tracking failures caused by occlusion can lead to wrong rectification of the target position. In this paper, a robust human tracking algorithm is proposed to address occlusion and rotation and to improve tracking accuracy. It is based on the Tracking-Learning-Detection (TLD) framework. Key auxiliary information is used in the framework, motivated by the fact that a tracking target is usually embedded in a context that provides useful information. First, a face localization method is used to find key face location information. Second, a relative position relationship is established between the auxiliary information and the target location. With this model, the key face information yields the current target position when the target has disappeared. Thus, the target can be stably tracked even when it is partially or fully occluded. Experiments are conducted on various challenging videos. The results demonstrate that, in conjunction with online update, the proposed method outperforms the traditional TLD algorithm and tracks relatively better than other state-of-the-art methods.

Temporal Stereo Matching Using Occlusion Handling (폐색 영역을 고려한 시간 축 스테레오 매칭)

  • Baek, Eu-Tteum;Ho, Yo-Sung
    • Journal of the Institute of Electronics and Information Engineers
    • v.54 no.2 / pp.99-105 / 2017
  • Generally, stereo matching methods estimate depth information based on color and spatial similarity. However, most depth estimation methods struggle in occlusion regions, which produce inaccurate depth information. Moreover, they do not consider the temporal dimension when estimating the disparity. In this paper, we propose a temporal stereo matching method that considers occlusion and disregards inaccurate temporal depth information. First, we apply a global stereo matching algorithm to estimate the depth information and segment the image into occlusion and non-occlusion regions. After occlusion detection, we fill the occluded region with reasonable disparity values obtained from the neighboring pixels of the current pixel. Then, we apply a temporal disparity estimation method using the reliable information. Experimental results show that our method detects occlusion regions more accurately than a conventional method. The proposed method increases the temporal consistency of estimated disparity maps and outperforms per-frame methods in noisy images.
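
The occlusion-filling step mentioned in this abstract can be illustrated with a small sketch. The abstract does not say how the "reasonable" value is chosen from the neighbors, so this example assumes a common heuristic: each occluded pixel takes the smaller of the nearest valid disparities to its left and right on the same scanline, on the assumption that occluded pixels usually belong to the background. The function name `fill_occlusions` is hypothetical.

```python
def fill_occlusions(disparity_row, occluded_row):
    """Fill occluded pixels on one scanline from their nearest valid neighbors.

    `disparity_row` holds per-pixel disparities; `occluded_row` holds booleans
    (True = occluded). The min-of-neighbors rule is an assumed heuristic.
    """
    n = len(disparity_row)

    # Nearest non-occluded disparity to the left of (or at) each pixel.
    left, last = [None] * n, None
    for x in range(n):
        if not occluded_row[x]:
            last = disparity_row[x]
        left[x] = last

    # Nearest non-occluded disparity to the right of (or at) each pixel.
    right, last = [None] * n, None
    for x in range(n - 1, -1, -1):
        if not occluded_row[x]:
            last = disparity_row[x]
        right[x] = last

    filled = list(disparity_row)
    for x in range(n):
        if occluded_row[x]:
            candidates = [d for d in (left[x], right[x]) if d is not None]
            if candidates:  # leave the pixel unchanged if no valid neighbor exists
                filled[x] = min(candidates)
    return filled
```

For example, a row `[5, 0, 0, 2, 2]` whose second and third pixels are flagged as occluded is filled with the background disparity 2 rather than the foreground disparity 5.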

A Robust Marker Detection Algorithm Using Hybrid Features in Augmented Reality (증강현실 환경에서 복합특징 기반의 강인한 마커 검출 알고리즘)

  • Park, Gyu-Ho;Lee, Heng-Suk;Han, Kyu-Phil
    • The KIPS Transactions:PartA
    • v.17A no.4 / pp.189-196 / 2010
  • This paper presents an improved marker detection algorithm using hybrid features such as corners, line segments, regions, and adaptive threshold values. Augmented reality environments often involve marker occlusion and poor illumination. The existing ARToolkit fails to recognize the marker in these situations, especially under partial concealment of the marker by the user, large changes in illumination, and dim conditions. To solve these problems, an adaptive threshold technique is adopted to extract the marker region, and a corner extraction method based on line segments is presented to handle marker occlusion. In addition, a compensation method that matches the marker size and center between the registered and extracted markers is proposed to increase template matching efficiency, because the inner marker size of warped images is slightly distorted by corner movement and warping. Experimental results showed that the proposed algorithm can robustly detect the marker under severe illumination change and occlusion, and can use similar markers, because the matching efficiency increased by almost 30%.

Pedestrian Counting System based on Average Filter Tracking for Measuring Advertisement Effectiveness of Digital Signage (디지털 사이니지의 광고효과 측정을 위한 평균 필터 추적 기반 유동인구 수 측정 시스템)

  • Kim, Kiyong;Yoon, Kyoungro
    • Journal of Broadcast Engineering
    • v.21 no.4 / pp.493-505 / 2016
  • Among modern computer vision and video surveillance systems, the pedestrian counting system is one of the important systems in terms of security, scheduling, and advertising. Pedestrian counting still faces a variety of challenges, such as changes in illumination, partial occlusion, overlap, and people detection. During pedestrian counting, the biggest problem is the occlusion effect in crowded environments. Occlusion and overlap must be resolved for accurate people counting. In this paper, we propose a novel pedestrian counting system that improves on an existing pedestrian tracking method. Unlike the existing method, the proposed method shows that average filter tracking can improve tracking performance. It further improves tracking performance through frame compensation and outlier removal. At the same time, we keep various information about the tracked objects. The proposed method improves counting accuracy and reduces the error rate on the S6 and S7 datasets. Our system also provides real-time detection at 80 fps.

A New True Ortho-photo Generation Algorithm for High Resolution Satellite Imagery

  • Bang, Ki-In;Kim, Chang-Jae
    • Korean Journal of Remote Sensing
    • v.26 no.3 / pp.347-359 / 2010
  • Ortho-photos provide valuable spatial and spectral information for various Geographic Information System (GIS) and mapping applications. The absence of relief displacement and the uniform scale in ortho-photos enable users to measure distances, compute areas, derive geographic locations, and quantify changes. Differential rectification has traditionally been used for ortho-photo generation. However, differential rectification produces serious problems (in the form of ghost images) when dealing with large-scale imagery over urban areas. To avoid these artifacts, true ortho-photo generation techniques have been devised to remove ghost images through visibility analysis and occlusion detection. So far, the Z-buffer method has been one of the most popular methods for true ortho-photo generation. However, it is quite sensitive to the relationship between the cell size of the Digital Surface Model (DSM) and the Ground Sampling Distance (GSD) of the imaging sensor. Another critical issue of true ortho-photo generation using high-resolution satellite imagery is the scan line search: because the sensor is a line camera, the perspective center corresponding to each ground point must be identified. This paper introduces an alternative methodology for true ortho-photo generation that circumvents the drawbacks of the Z-buffer technique and the existing scan line search methods. Experiments using real data compare the performance of the proposed and existing methods through qualitative and quantitative evaluations and computational efficiency. The experimental analysis showed that the proposed method provided the best success ratio of occlusion detection and had a reasonable processing time compared to all other true ortho-photo generation methods tested in this paper.
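
For orientation, the Z-buffer visibility test that this abstract refers to as the existing baseline can be sketched as below. This is a simplified illustration of the general Z-buffer idea, not the method proposed in the paper. It assumes each DSM cell has already been projected into the image, so the input is a hypothetical list of `(pixel, distance)` pairs, where `distance` is the distance from the cell to the perspective center.

```python
def zbuffer_visibility(cells):
    """Mark each DSM cell as visible (True) or occluded (False).

    `cells` is a list of (pixel, distance) pairs, one per DSM cell, where
    `pixel` is the image pixel the cell projects to. When several cells
    compete for one pixel, only the cell closest to the perspective
    center is kept as visible.
    """
    nearest = {}  # pixel -> index of the closest cell seen so far
    for idx, (pixel, dist) in enumerate(cells):
        if pixel not in nearest or dist < cells[nearest[pixel]][1]:
            nearest[pixel] = idx

    visible = [False] * len(cells)
    for idx in nearest.values():
        visible[idx] = True
    return visible
```

Because cells only compete when they land in the same pixel, the outcome depends on how the DSM cell size compares with the image GSD, which is consistent with the sensitivity noted in the abstract.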

Video Analysis System for Action and Emotion Detection by Object with Hierarchical Clustering based Re-ID (계층적 군집화 기반 Re-ID를 활용한 객체별 행동 및 표정 검출용 영상 분석 시스템)

  • Lee, Sang-Hyun;Yang, Seong-Hun;Oh, Seung-Jin;Kang, Jinbeom
    • Journal of Intelligence and Information Systems
    • v.28 no.1 / pp.89-106 / 2022
  • Recently, the amount of video data collected from smartphones, CCTVs, black boxes, and high-definition cameras has increased rapidly. As video data increases, so do the requirements for its analysis and utilization. Because many industries lack skilled manpower to analyze videos, machine learning and artificial intelligence are actively used to assist. In this situation, the demand for computer vision technologies such as object detection and tracking, action detection, emotion detection, and Re-ID has also increased rapidly. However, object detection and tracking technology faces many difficulties that degrade performance, such as occlusion and the re-appearance of an object after it has left the recording location. Accordingly, action and emotion detection models built on object detection and tracking models also have difficulty extracting data for each object. In addition, deep learning architectures consisting of various models suffer from performance degradation due to bottlenecks and lack of optimization. In this study, we propose a video analysis system consisting of a YOLOv5-based DeepSORT object tracking model, a SlowFast-based action recognition model, a Torchreid-based Re-ID model, and AWS Rekognition, an emotion recognition service. The proposed model uses single-linkage hierarchical clustering-based Re-ID and processing methods that maximize hardware throughput. It achieves higher accuracy than a re-identification model using simple metrics, offers near real-time processing performance, and prevents tracking failure due to object departure and re-emergence, occlusion, etc. By continuously linking the action and facial emotion detection results of each object to the same object, it is possible to analyze videos efficiently.
The re-identification model extracts a feature vector from the bounding box of the object image detected by the object tracking model in each frame, and applies single-linkage hierarchical clustering to the feature vectors extracted from past frames to re-identify an object whose tracking has failed. Through this process, it is possible to re-track an object whose tracking failed because of occlusion or because it left the video location and re-appeared. As a result, the action and facial emotion detection results of an object that was newly recognized after a tracking failure can be linked to those of the object that appeared in the past. To improve processing performance, we introduce the Bounding Box Queue by Object and Feature Queue methods, which can reduce RAM requirements while maximizing GPU memory throughput. We also introduce the IoF (Intersection over Face) algorithm, which allows facial emotions recognized through AWS Rekognition to be linked with object tracking information. The academic significance of this study is that, thanks to these processing techniques, the two-stage re-identification model can achieve real-time performance even in a high-cost environment that performs action and facial emotion detection, without sacrificing accuracy by resorting to simple metrics. The practical implication of this study is that industrial fields that require action and facial emotion detection, but face many difficulties due to object tracking failures, can analyze videos effectively with the proposed model. The proposed model, which offers high re-tracking accuracy and processing performance, can be used in fields such as intelligent monitoring, observation services, and behavioral or psychological analysis services, where the integration of tracking information and extracted metadata creates great industrial and business value.
In the future, to measure object tracking performance more precisely, we need to conduct experiments using the MOT Challenge dataset, which is used by many international conferences. We will also investigate the problems that the IoF algorithm cannot solve in order to develop a complementary algorithm. In addition, we plan to conduct further research to apply this model to datasets from various fields related to intelligent video analysis.
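
The abstract names the IoF (Intersection over Face) algorithm but does not define it. The sketch below shows one plausible reading, stated here as an assumption: IoF is the area of overlap between a face box and a tracked person box, divided by the area of the face box, and a face is linked to the track with the highest IoF above a threshold. The function names, the `(x1, y1, x2, y2)` box layout, and the `threshold` value are all hypothetical.

```python
def intersection_over_face(face_box, person_box):
    """Overlap area divided by face-box area (assumed definition of IoF).

    Boxes are (x1, y1, x2, y2) tuples with x1 < x2 and y1 < y2.
    """
    fx1, fy1, fx2, fy2 = face_box
    px1, py1, px2, py2 = person_box
    inter_w = max(0.0, min(fx2, px2) - max(fx1, px1))
    inter_h = max(0.0, min(fy2, py2) - max(fy1, py1))
    face_area = (fx2 - fx1) * (fy2 - fy1)
    if face_area <= 0:
        return 0.0  # degenerate face box
    return (inter_w * inter_h) / face_area


def assign_face(face_box, tracks, threshold=0.5):
    """Return the track id whose box best contains the face, or None.

    `tracks` maps a track id to its person bounding box. The threshold
    is an illustrative value, not one taken from the paper.
    """
    best_id, best_score = None, threshold
    for track_id, person_box in tracks.items():
        score = intersection_over_face(face_box, person_box)
        if score >= best_score:
            best_id, best_score = track_id, score
    return best_id
```

Normalizing by the face area rather than by the union, as IoU does, keeps the score high when a small face lies entirely inside a much larger person box.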