• Title/Summary/Keyword: RGB-T Tracking

Dynamic Tracking Aggregation with Transformers for RGB-T Tracking

  • Xiaohu, Liu;Zhiyong, Lei
    • Journal of Information Processing Systems / v.19 no.1 / pp.80-88 / 2023
  • RGB-thermal (RGB-T) tracking using unmanned aerial vehicles (UAVs) is challenging because of object similarity, occlusion, fast motion, and motion blur, among other issues. In this study, we propose dynamic tracking aggregation (DTA) as a unified framework to perform object detection and data association. The proposed approach obtains fused features based on a transformer model and an L1-norm strategy. To link the current frame with recent information, a dynamically updated embedding called dynamic tracking identification (DTID) is used to model the iterative tracking process. For object association, we designed a long short-term tracking aggregation module for dynamic feature propagation to match spatial and temporal embeddings. DTA achieved highly competitive performance in an experimental evaluation on public benchmark datasets.
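
The abstract does not say how the L1-norm strategy is applied. As a rough illustration only, the sketch below assumes one common reading from the image-fusion literature: each modality's feature vector is weighted by its share of the total L1 norm. The function names (`l1_norm`, `fuse_l1`) and the flat-list feature representation are hypothetical and are not taken from the paper.

```python
from typing import List

Vector = List[float]


def l1_norm(v: Vector) -> float:
    """Sum of absolute values, used here as an activity measure."""
    return sum(abs(x) for x in v)


def fuse_l1(rgb_feat: Vector, thermal_feat: Vector, eps: float = 1e-12) -> Vector:
    """Fuse two same-length feature vectors with L1-norm-proportional weights.

    Assumption: the modality with the larger L1 norm is treated as more
    informative and receives the larger weight. This is an illustrative
    choice, not the method confirmed by the paper.
    """
    if len(rgb_feat) != len(thermal_feat):
        raise ValueError("feature vectors must have the same length")
    a_rgb = l1_norm(rgb_feat)
    a_thermal = l1_norm(thermal_feat)
    total = a_rgb + a_thermal
    if total < eps:
        # Both inputs are (near) zero: fall back to a plain average.
        return [0.5 * (r + t) for r, t in zip(rgb_feat, thermal_feat)]
    w_rgb = a_rgb / total
    w_thermal = a_thermal / total
    return [w_rgb * r + w_thermal * t for r, t in zip(rgb_feat, thermal_feat)]


if __name__ == "__main__":
    print(fuse_l1([3.0, 0.0], [0.0, 1.0]))
```

With these inputs the RGB vector has three times the L1 norm of the thermal vector, so it receives a weight of 0.75 and the thermal vector 0.25.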

Wearless IoT Device Controller based on Deep Neural Network and Hand Tracking (딥 뉴럴 네트워크 및 손 추적 기반의 웨어리스 IoT 장치 컨트롤러)

  • Choi, Seung-June;Kim, Eun-Yeol;Kim, Jung-Hwa;Hwang, Chae-Eun;Choi, Tae-Young
    • Proceedings of the Korea Information Processing Society Conference / 2018.10a / pp.924-927 / 2018
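  • In this paper, we propose an IoT device control system based on deep learning and hand recognition with an RGB-D camera. It lets patients with limited mobility and people with disabilities conveniently control distant home appliances without moving and without wearing any additional equipment. Specifically, to determine the location of the device to be controlled, the device is recognized using the YOLO algorithm. At the same time, the user's hand is recognized using the RGB-D camera's library, and the device at the corresponding location is controlled according to the current position of the user's hand and the hand gesture the user makes.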

Stereo Vision Based 3-D Motion Tracking for Human Animation

  • Han, Seung-Il;Kang, Rae-Won;Lee, Sang-Jun;Ju, Woo-Suk;Lee, Joan-Jae
    • Journal of Korea Multimedia Society / v.10 no.6 / pp.716-725 / 2007
  • In this paper, we describe a motion tracking algorithm for 3D human animation using a stereo vision system. The algorithm extracts motion data for the end effectors of the human body by following their movement through a segmentation process in the HIS or RGB color model, and then applies blob analysis to detect shapes robustly. When two hands or two feet cross at any position and then separate, an adaptive algorithm determines which one is the left and which is the right. Real motion takes place in 3D coordinates, whereas a monocular image provides only 2D coordinates and carries no information about distance from the camera. With stereo vision, which works like human vision, we can acquire 3D motion data such as left and right motion from the bottom and the distance of objects from the camera. Transforming to 3D coordinates requires a depth value in addition to the x-axis and y-axis coordinates of the monocular image. This depth value (z axis) is calculated from the stereo disparity, using only the end effectors in the images. The positions of the inner joints are then calculated, and the 3D character can be visualized using inverse kinematics.
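
The depth-from-disparity step described in this abstract can be sketched with the standard pinhole model for a rectified stereo pair, where depth is Z = f · B / d. The abstract does not give the paper's camera model or calibration, so the function name `pixel_to_3d` and all parameters (focal length in pixels, baseline in meters, principal point) are assumptions for illustration.

```python
from typing import Tuple


def pixel_to_3d(
    u_left: float,
    v: float,
    u_right: float,
    focal_px: float,
    baseline_m: float,
    cx: float,
    cy: float,
) -> Tuple[float, float, float]:
    """Back-project one matched point from a rectified stereo pair.

    u_left, u_right: horizontal pixel coordinates of the same end effector
    in the left and right images; v: shared vertical pixel coordinate.
    Returns (x, y, z) in meters in the left camera frame.
    """
    disparity = u_left - u_right
    if disparity <= 0:
        raise ValueError("disparity must be positive for a point in front of the cameras")
    z = focal_px * baseline_m / disparity  # depth from disparity
    x = (u_left - cx) * z / focal_px       # lateral offset
    y = (v - cy) * z / focal_px            # vertical offset
    return (x, y, z)


if __name__ == "__main__":
    # Hypothetical values: 800 px focal length, 0.5 m baseline, 640x480 image.
    print(pixel_to_3d(420.0, 240.0, 400.0, 800.0, 0.5, 320.0, 240.0))
```

Because depth is inversely proportional to disparity, a small matching error on a distant end effector produces a much larger depth error than the same error on a nearby one.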

A Real-time People Counting Algorithm Using Background Modeling and CNN (배경모델링과 CNN을 이용한 실시간 피플 카운팅 알고리즘)

  • Yang, HunJun;Jang, Hyeok;Jeong, JaeHyup;Lee, Bowon;Jeong, DongSeok
    • Journal of the Institute of Electronics and Information Engineers / v.54 no.3 / pp.70-77 / 2017
  • Recently, Internet of Things (IoT) and deep learning techniques have affected video surveillance systems in various ways. The surveillance features that perform detection, tracking, and classification of specific objects in Closed Circuit Television (CCTV) video are becoming more intelligent. This paper presents a real-time algorithm that can run in a PC environment using only a low-power CPU. Traditional tracking algorithms combine background modeling using the Gaussian Mixture Model (GMM), the Hungarian algorithm, and a Kalman filter; they have relatively low complexity but high detection errors. To compensate for this, deep learning, which can be trained on large amounts of data, was used. In particular, an SRGB (Sequential RGB)-3 Layer CNN was applied to tracked objects to emphasize the features of moving people. A performance evaluation comparing the proposed algorithm with existing ones using HOG and SVM showed that the move-in and move-out error rates were reduced by 7.6% and 9.0%, respectively.