• Title/Summary/Keyword: Multiple Machine Vision Tasks

Evaluation of Video Codec for AI-based Multiple Tasks (인공지능 기반 멀티태스크를 위한 비디오 코덱의 성능평가 방법)

  • Kim, Shin;Lee, Yegi;Yoon, Kyoungro;Choo, Hyon-Gon;Lim, Hanshin;Seo, Jeongil
    • Journal of Broadcast Engineering / v.27 no.3 / pp.273-282 / 2022
  • MPEG-VCM (Video Coding for Machines) aims to standardize a video codec for machines. VCM provides data sets and anchors, which serve as reference data for comparison, for several machine vision tasks including object detection, object segmentation, and object tracking. The evaluation template can be used to compare compression and machine vision task performance between the anchor data and various proposed video codecs. However, performance comparison is currently carried out separately for each machine vision task, and no method is provided for evaluating the performance of multiple machine vision tasks on a single bitstream. In this paper, we propose a performance evaluation method of a video codec for AI-based multi-tasks. Based on bits per pixel (BPP), which measures the size of a single bitstream, and mean average precision (mAP), which measures the accuracy of each task, we define three criteria for multi-task performance evaluation, namely the arithmetic average, weighted average, and harmonic average, and calculate the multi-task performance results from the mAP values. In addition, since the dynamic range of mAP may differ greatly from task to task, the multi-task performance results are calculated and evaluated based on normalized mAP in order to prevent problems caused by these differences in dynamic range.
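
As a concrete illustration of the averaging criteria described in the abstract above, the following is a minimal sketch (not the authors' reference implementation) of how per-task mAP values might be normalized and combined into arithmetic, weighted, and harmonic multi-task scores. The task names, normalization ranges, and weights are illustrative assumptions.

```python
from typing import Dict, Tuple

def normalize_map(map_value: float, map_range: Tuple[float, float]) -> float:
    """Rescale a task's mAP into [0, 1] using its assumed dynamic range."""
    lo, hi = map_range
    return (map_value - lo) / (hi - lo)

def multi_task_scores(maps: Dict[str, float],
                      ranges: Dict[str, Tuple[float, float]],
                      weights: Dict[str, float]) -> Dict[str, float]:
    """Combine normalized per-task mAP values into three multi-task criteria."""
    norm = {t: normalize_map(m, ranges[t]) for t, m in maps.items()}
    n = len(norm)
    arithmetic = sum(norm.values()) / n
    weighted = sum(weights[t] * v for t, v in norm.items()) / sum(weights.values())
    harmonic = n / sum(1.0 / v for v in norm.values())  # assumes all values > 0
    return {"arithmetic": arithmetic, "weighted": weighted, "harmonic": harmonic}

# Hypothetical example: detection, segmentation, and tracking mAP measured on one bitstream.
maps = {"detection": 0.42, "segmentation": 0.55, "tracking": 0.38}
ranges = {t: (0.0, 0.8) for t in maps}          # assumed per-task dynamic ranges
weights = {"detection": 0.5, "segmentation": 0.3, "tracking": 0.2}
print(multi_task_scores(maps, ranges, weights))
```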

A Parallel Implementation of Multiple Non-overlapping Cameras for Robot Pose Estimation

  • Ragab, Mohammad Ehab;Elkabbany, Ghada Farouk
    • KSII Transactions on Internet and Information Systems (TIIS) / v.8 no.11 / pp.4103-4117 / 2014
  • Image processing and computer vision algorithms are gaining greater attention in a variety of application areas such as robotics and man-machine interaction. Vision allows the development of approaches that are more flexible, intelligent, and less intrusive than most other sensor systems. In this work, we determine the location and orientation of a mobile robot, which is crucial for performing its tasks. In order to operate in real time, the different vision routines need to be sped up. Therefore, we present and evaluate a method for introducing parallelism into the multiple non-overlapping camera pose estimation algorithm proposed in [1]. In that algorithm the problem is solved in real time using multiple non-overlapping cameras and the Extended Kalman Filter (EKF). Four cameras arranged in two back-to-back pairs are mounted on the platform of a moving robot. An important benefit of using multiple cameras for robot pose estimation is the capability of resolving vision uncertainties such as the bas-relief ambiguity. The proposed method is based on algorithmic skeletons for low, medium, and high levels of parallelization. The analysis shows that the use of a multiprocessor system enhances the system performance by about 87%. In addition, the proposed design is scalable, which is necessary in this application where the number of features changes repeatedly.
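
To make the kind of parallelism described above more tangible, here is a minimal sketch of one way per-camera measurement processing could be farmed out to worker processes before a joint EKF update. It is not the algorithmic-skeleton framework or the EKF formulation from the paper; the residual computation, camera count, and feature count are illustrative assumptions.

```python
import numpy as np
from concurrent.futures import ProcessPoolExecutor

def camera_residuals(args):
    """Per-camera work: compare measured image points with predicted projections."""
    measured, predicted = args          # two (n, 2) arrays of image points
    return (measured - predicted).ravel()

def fuse_cameras_parallel(per_camera_data, workers=4):
    """Process each camera independently, then stack residuals for a joint EKF update."""
    with ProcessPoolExecutor(max_workers=workers) as pool:
        residuals = list(pool.map(camera_residuals, per_camera_data))
    return np.concatenate(residuals)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Four cameras (two back-to-back pairs), each observing 10 hypothetical features.
    data = [(rng.normal(size=(10, 2)), rng.normal(size=(10, 2))) for _ in range(4)]
    innovation = fuse_cameras_parallel(data)
    print(innovation.shape)  # (80,) stacked residual vector that would feed the EKF update
```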

An Improved Multiple Interval Pixel Sampling based Background Subtraction Algorithm (개선된 다중 구간 샘플링 배경제거 알고리즘)

  • Mahmood, Muhammad Tariq;Choi, Young Kyu
    • Journal of the Semiconductor & Display Technology / v.18 no.3 / pp.1-6 / 2019
  • Foreground/background segmentation in video sequences is often one of the first tasks in machine vision applications, making it a critical part of the system. In this paper, we present an improved sample-based technique that provides a robust background image as well as a segmentation mask. The conventional multiple interval sampling (MIS) algorithm suffers from imbalanced computation time per frame and rapid changes in the confidence factor of background pixels. To balance the amount of computation, a random pixel update scheme is proposed, and a spatial and temporal smoothing technique is adopted to increase the reliability of the confidence factor. The proposed method allows the sampling queue to hold data that are more dispersed in time and space, and provides a more continuous and reliable confidence factor. Experimental results revealed that our method works well to estimate a stable background image and the foreground mask.
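
The abstract above describes a sample-based background model with a randomized pixel update. The sketch below illustrates the general idea of such models (closer to a generic ViBe-style scheme than to the authors' MIS variant); the sample count, matching radius, and update probability are assumed parameters.

```python
import numpy as np

N_SAMPLES, RADIUS, MATCH_MIN, UPDATE_PROB = 20, 20, 2, 1.0 / 16

def init_model(first_frame):
    """Background model: N_SAMPLES gray-level samples kept per pixel."""
    return np.repeat(first_frame[..., None], N_SAMPLES, axis=2).astype(np.int16)

def segment_and_update(frame, model, rng):
    """Classify pixels as foreground/background and randomly refresh background samples."""
    diff = np.abs(model - frame[..., None].astype(np.int16))
    matches = (diff < RADIUS).sum(axis=2)
    background = matches >= MATCH_MIN
    # Random-based update: replace one sample at a random subset of background pixels.
    update = background & (rng.random(frame.shape) < UPDATE_PROB)
    idx = rng.integers(0, N_SAMPLES, size=frame.shape)
    ys, xs = np.nonzero(update)
    model[ys, xs, idx[ys, xs]] = frame[ys, xs]
    return ~background  # foreground mask

rng = np.random.default_rng(0)
frame0 = rng.integers(0, 256, size=(120, 160), dtype=np.uint8)
model = init_model(frame0)
mask = segment_and_update(frame0, model, rng)
print(mask.mean())  # fraction of pixels labeled foreground in this synthetic frame
```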

Computer vision and deep learning-based post-earthquake intelligent assessment of engineering structures: Technological status and challenges

  • T. Jin;X.W. Ye;W.M. Que;S.Y. Ma
    • Smart Structures and Systems / v.31 no.4 / pp.311-323 / 2023
  • Ever since ancient times, earthquakes have been a major threat to civil infrastructures and the safety of human beings. The majority of casualties in earthquake disasters are caused by the damaged civil infrastructures rather than by the earthquake itself. Therefore, efficient and accurate post-earthquake assessment of the conditions of structural damage has been an urgent need for human society. Traditional approaches to post-earthquake structural assessment rely heavily on field investigation by experienced experts, yet they are inevitably subjective and inefficient. Structural response data are also applied to assess the damage; however, this requires sensor networks to be mounted in advance and is not intuitive. As many types of damaged states of structures are visible, computer vision-based post-earthquake structural assessment has attracted great attention among engineers and scholars. With the development of image acquisition sensors, computing resources, and deep learning algorithms, deep learning-based post-earthquake structural assessment has gradually shown potential in dealing with image acquisition and processing tasks. This paper comprehensively reviews the state-of-the-art studies of deep learning-based post-earthquake structural assessment in recent years. The conventional image processing and machine learning-based approaches to structural assessment are presented briefly. The workflow of the methodology for computer vision and deep learning-based post-earthquake structural assessment is introduced. Then, applications of assessment for multiple civil infrastructures are presented in detail. Finally, the challenges of current studies are summarized for reference in future works to improve the efficiency, robustness, and accuracy in this field.

Digital Maps and Automatic Narratives for the Interactive Global Histories

  • CHEONG, Siew Ann;NANETTI, Andrea;FHILIPPOV, Mikhail
    • Asian Review of World Histories / v.4 no.1 / pp.83-123 / 2016
  • We describe a vision of historical analysis at the world scale, through the digital assembly of historical sources into a cloud-based database, where machine-learning techniques can be used to summarize the database into a time-integrated actor-to-actor complex network. Using this time-integrated network as a template, we then apply the method of automatic narratives to discover key actors ('who'), key events ('what'), key periods ('when'), key locations ('where'), key motives ('why'), and key actions ('how') that can be presented as hypotheses to world historians. We show two test cases of how this method works. To accelerate the pace of knowledge discovery and verification, we describe how historians would interact with these automatic narratives through an online, map-based knowledge aggregator that learns how scholars filter information, and eventually takes over this function to free historians for the more important tasks of verification and stitching together coherent storylines. Ultimately, multiple coherent storylines that are not necessarily compatible with each other can be discovered through human-computer interactions with the map-based knowledge aggregator.

Pedestrian and Vehicle Distance Estimation Based on Hard Parameter Sharing (하드 파라미터 쉐어링 기반의 보행자 및 운송 수단 거리 추정)

  • Seo, Ji-Won;Cha, Eui-Young
    • Journal of the Korea Institute of Information and Communication Engineering / v.26 no.3 / pp.389-395 / 2022
  • With the improvement of deep learning techniques, deep learning-based computer vision tasks such as classification, detection, and segmentation have been widely used in many fields. In particular, autonomous driving is one of the major fields that applies computer vision systems, and there has been much work and research on combining multiple tasks in a single network. In this study, we propose a network that predicts the individual depth of pedestrians and vehicles. The proposed model is constructed based on YOLOv3 for object detection and Monodepth for depth estimation, and it performs object detection and depth estimation using an encoder and decoders based on hard parameter sharing. We also use an attention module to improve the accuracy of both object detection and depth estimation. Depth is predicted from a monocular image and trained using a self-supervised training method.
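
As background on the hard parameter sharing pattern mentioned in the abstract above, the following is a minimal PyTorch-style sketch of a shared encoder feeding separate detection and depth heads. The layer sizes and heads are illustrative assumptions, not the YOLOv3/Monodepth architecture or attention module used in the paper.

```python
import torch
import torch.nn as nn

class SharedEncoderMultiTask(nn.Module):
    """Hard parameter sharing: one shared encoder, task-specific heads for detection and depth."""
    def __init__(self, num_classes=2):
        super().__init__()
        # Shared feature extractor (its parameters are updated by both task losses).
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
        )
        # Task-specific head 1: per-cell class scores (stand-in for a detection head).
        self.detect_head = nn.Conv2d(64, num_classes, 1)
        # Task-specific head 2: dense depth map in (0, 1) (stand-in for a depth decoder).
        self.depth_head = nn.Sequential(nn.Conv2d(64, 1, 1), nn.Sigmoid())

    def forward(self, x):
        features = self.encoder(x)
        return self.detect_head(features), self.depth_head(features)

model = SharedEncoderMultiTask()
images = torch.randn(2, 3, 128, 256)
det_out, depth_out = model(images)
print(det_out.shape, depth_out.shape)  # torch.Size([2, 2, 32, 64]) torch.Size([2, 1, 32, 64])
```

In training, the two heads would each get their own loss (e.g., a detection loss and a self-supervised photometric depth loss), summed before backpropagation through the shared encoder.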