• Title/Abstract/Keywords: Multiple Machine Vision Tasks

6 search results; processing time 0.017 s

A Performance Evaluation Method for Video Codecs for AI-based Multi-Task (Evaluation of Video Codec AI-based Multiple tasks)

  • 김신;이예지;윤경로;추현곤;임한신;서정일
    • Journal of Broadcast Engineering / Vol. 27, No. 3 / pp. 273-282 / 2022
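  • The VCM group within MPEG aims to standardize a video codec for machines. The VCM group provides datasets covering three machine vision tasks (object detection, object segmentation, and object tracking) together with Anchors, the reference data for each dataset, and an evaluation template makes it possible to compare the machine vision performance relative to compression of candidate technologies against the Anchor. However, the comparison is carried out separately for each machine vision task, and no data is provided for generating a bitstream on which multiple machine vision tasks can be evaluated. This paper proposes a performance evaluation method for a video codec for AI-based multi-task. Based on bits per pixel (BPP), a measure of the size of a single bitstream, and Mean Average Precision (mAP), the accuracy result of each task, it proposes three multi-task performance evaluation metrics (arithmetic mean, weighted mean, and harmonic mean) and compares performance using the mAP results. Because the range of mAP values may differ from task to task in a multi-task setting, multi-task performance results based on normalized mAP are also computed and evaluated, to avoid evaluation problems caused by such differences.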
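The three aggregate metrics named in this abstract can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not give the normalization formula, so min-max scaling against a per-task range is assumed here, and every function name and numeric value is hypothetical rather than taken from the paper.

```python
def bits_per_pixel(num_bits, width, height, num_frames=1):
    """Bitstream size divided by the number of coded pixels."""
    return num_bits / (width * height * num_frames)


def normalize(m_ap, lo, hi):
    """Min-max scale one task's mAP into [0, 1] (assumed scheme, not from the paper)."""
    if hi <= lo:
        raise ValueError("hi must be greater than lo")
    return (m_ap - lo) / (hi - lo)


def arithmetic_mean(scores):
    """Plain average of the per-task scores."""
    return sum(scores) / len(scores)


def weighted_mean(scores, weights):
    """Average of the per-task scores, each scaled by its task weight."""
    if len(scores) != len(weights):
        raise ValueError("scores and weights must have the same length")
    return sum(s * w for s, w in zip(scores, weights)) / sum(weights)


def harmonic_mean(scores):
    """Harmonic mean; dominated by the weakest task. Requires positive scores."""
    if any(s <= 0 for s in scores):
        raise ValueError("harmonic mean needs strictly positive scores")
    return len(scores) / sum(1.0 / s for s in scores)


# Hypothetical mAP values for detection, segmentation, and tracking.
task_map = [0.40, 0.30, 0.60]
print(arithmetic_mean(task_map))           # about 0.4333
print(weighted_mean(task_map, [2, 1, 1]))  # about 0.425
print(harmonic_mean(task_map))             # about 0.4
```

The harmonic mean comes out lowest because it penalizes the weakest task most, which is one reason to report it alongside the arithmetic and weighted means.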

A Parallel Implementation of Multiple Non-overlapping Cameras for Robot Pose Estimation

  • Ragab, Mohammad Ehab;Elkabbany, Ghada Farouk
    • KSII Transactions on Internet and Information Systems (TIIS) / Vol. 8, No. 11 / pp. 4103-4117 / 2014
  • Image processing and computer vision algorithms are attracting growing interest in a variety of application areas such as robotics and man-machine interaction. Vision allows the development of flexible, intelligent, and less intrusive approaches than most other sensor systems. In this work, we determine the location and orientation of a mobile robot, which is crucial for performing its tasks. Operating in real time requires speeding up the vision routines. Therefore, we present and evaluate a method for introducing parallelism into the multiple non-overlapping camera pose estimation algorithm proposed in [1]. In this algorithm the problem is solved in real time using multiple non-overlapping cameras and the Extended Kalman Filter (EKF). Four cameras arranged in two back-to-back pairs are mounted on the platform of a moving robot. An important benefit of using multiple cameras for robot pose estimation is the capability of resolving vision uncertainties such as the bas-relief ambiguity. The proposed method is based on algorithmic skeletons for low, medium and high levels of parallelization. The analysis shows that the use of a multiprocessor system enhances the system performance by about 87%. In addition, the proposed design is scalable, which is necessary in this application, where the number of features changes repeatedly.

An Improved Multiple Interval Pixel Sampling based Background Subtraction Algorithm

  • 무하마드 타릭 마흐무드;최영규
    • Journal of the Semiconductor & Display Technology / Vol. 18, No. 3 / pp. 1-6 / 2019
  • Foreground/background segmentation in video sequences is often one of the first tasks in machine vision applications, making it a critical part of the system. In this paper, we present an improved sample-based technique that provides a robust background image as well as a segmentation mask. The conventional multiple interval sampling (MIS) algorithm suffers from unbalanced computation time per frame and from rapid changes in the confidence factor of background pixels. To balance the amount of computation, a random-based pixel update scheme is proposed, and a spatial and temporal smoothing technique is adopted to increase the reliability of the confidence factor. The proposed method allows the sampling queue to hold data that are more dispersed in time and space, and provides a more continuous and reliable confidence factor. Experimental results show that our method works well for estimating a stable background image and the foreground mask.

Computer vision and deep learning-based post-earthquake intelligent assessment of engineering structures: Technological status and challenges

  • T. Jin;X.W. Ye;W.M. Que;S.Y. Ma
    • Smart Structures and Systems / Vol. 31, No. 4 / pp. 311-323 / 2023
  • Since ancient times, earthquakes have been a major threat to civil infrastructure and to human safety. The majority of casualties in earthquake disasters are caused by damaged civil infrastructure rather than by the earthquake itself. Therefore, efficient and accurate post-earthquake assessment of structural damage is an urgent need for human society. Traditional post-earthquake structural assessment relies heavily on field investigation by experienced experts, which is inevitably subjective and inefficient. Structural response data are also used to assess damage; however, this requires sensor networks to be mounted in advance and is not intuitive. As many types of structural damage are visible, computer vision-based post-earthquake structural assessment has attracted great attention among engineers and scholars. With the development of image acquisition sensors, computing resources and deep learning algorithms, deep learning-based post-earthquake structural assessment has gradually shown potential in dealing with image acquisition and processing tasks. This paper comprehensively reviews state-of-the-art studies of deep learning-based post-earthquake structural assessment in recent years. Conventional image processing and machine learning-based structural assessment are presented briefly. The workflow of the methodology for computer vision and deep learning-based post-earthquake structural assessment is introduced. Then, applications of assessment to multiple kinds of civil infrastructure are presented in detail. Finally, the challenges of current studies are summarized as a reference for future work to improve efficiency, robustness and accuracy in this field.

Digital Maps and Automatic Narratives for the Interactive Global Histories

  • CHEONG, Siew Ann;NANETTI, Andrea;FHILIPPOV, Mikhail
    • Asian Review of World Histories / Vol. 4, No. 1 / pp. 83-123 / 2016
  • We describe a vision of historical analysis at the world scale, through the digital assembly of historical sources into a cloud-based database, where machine-learning techniques can be used to summarize the database into a time-integrated actor-to-actor complex network. Using this time-integrated network as a template, we then apply the method of automatic narratives to discover key actors ('who'), key events ('what'), key periods ('when'), key locations ('where'), key motives ('why'), and key actions ('how') that can be presented as hypotheses to world historians. We show two test cases of how this method works. To accelerate the pace of knowledge discovery and verification, we describe how historians would interact with these automatic narratives through an online, map-based knowledge aggregator that learns how scholars filter information, and eventually takes over this function to free historians for the more important tasks of verification and stitching together coherent storylines. Ultimately, multiple coherent storylines that are not necessarily compatible with each other can be discovered through human-computer interactions by the map-based knowledge aggregator.

Pedestrian and Vehicle Distance Estimation Based on Hard Parameter Sharing

  • 서지원;차의영
    • Journal of the Korea Institute of Information and Communication Engineering / Vol. 26, No. 3 / pp. 389-395 / 2022
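  • With advances in deep learning, deep learning on visual information, such as classification, object detection, and segmentation, is being used in a variety of fields. Autonomous driving is one of the representative fields that makes good use of visual data. This paper proposes a network that predicts an individual depth value for each person and vehicle object on the road. The proposed model is based on YOLOv3 and Monodepth, and performs object detection and depth estimation simultaneously through an encoder and decoder that use hard parameter sharing. An attention mechanism is also used to improve the accuracy of object detection and depth estimation. Depth is estimated from monocular images, and training is carried out with a self-supervised learning method.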
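Hard parameter sharing, as described in this abstract, means that both tasks read the same encoder features and differ only in their task-specific heads. The toy sketch below illustrates that structure alone. It is not the YOLOv3/Monodepth model from the paper: the dense layers, dimensions, class name, and random weights are all hypothetical, and the attention mechanism and training are omitted.

```python
import random


def linear(weights, x):
    """Dense layer without bias: one output value per weight row."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


def relu(x):
    """Element-wise rectifier."""
    return [max(0.0, v) for v in x]


def init_weights(rows, cols, rng):
    """Random weight matrix with `rows` outputs and `cols` inputs."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


class HardSharedModel:
    """Toy two-task model: one shared encoder, two task-specific heads."""

    def __init__(self, in_dim=8, hidden=4, det_out=5, seed=0):
        rng = random.Random(seed)
        self.encoder = init_weights(hidden, in_dim, rng)    # shared by both tasks
        self.det_head = init_weights(det_out, hidden, rng)  # detection only
        self.depth_head = init_weights(1, hidden, rng)      # depth only

    def forward(self, x):
        features = relu(linear(self.encoder, x))  # computed once, reused by both heads
        detection = linear(self.det_head, features)
        depth = linear(self.depth_head, features)[0]
        return detection, depth


model = HardSharedModel()
detection, depth = model.forward([0.1] * 8)
print(len(detection), type(depth).__name__)
```

Because the encoder weights are shared, a training signal from either task would update the same parameters, which is what distinguishes hard sharing from running two separate networks.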