• Title/Summary/Keyword: multi-view fusion


Multiple Color and ToF Camera System for 3D Contents Generation

  • Ho, Yo-Sung
    • IEIE Transactions on Smart Processing and Computing / v.6 no.3 / pp.175-182 / 2017
  • In this paper, we present a multi-view depth generation method using a time-of-flight (ToF) fusion camera system. Multi-view color cameras in a parallel arrangement and ToF depth sensors are used to capture the 3D scene. Although each ToF depth sensor can measure the depth of the scene in real time, it has several problems to overcome. Therefore, after capturing low-resolution depth images with the ToF depth sensors, we apply post-processing to resolve these problems. The depth information from the depth sensor is then warped to the color image positions and used as initial disparity values. In addition, the warped depth data are used to generate a depth-discontinuity map for efficient stereo matching. By applying stereo matching based on belief propagation with the depth-discontinuity map and the initial disparity information, we obtain more accurate and stable multi-view disparity maps in less time.
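
A minimal, illustrative sketch (not the authors' code) of the warping step described in the abstract: back-projecting ToF depth pixels to 3D, reprojecting them into the color view, and converting depth to initial disparities via the rectified-stereo relation disparity = f·B/Z. The function name, camera parameters, and the rectified assumption are all placeholders.

```python
import numpy as np

def warp_tof_depth_to_color(depth, K_tof, K_color, R, t, f_color, baseline, color_shape):
    """Warp a low-resolution ToF depth map into a color view and convert it to
    initial disparity values (disparity = f * B / Z for rectified cameras).
    All parameters and the rectified-stereo assumption are illustrative."""
    h, w = depth.shape
    hc, wc = color_shape
    # Back-project every valid ToF pixel to a 3D point in the ToF camera frame.
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    z = depth.ravel()
    valid = z > 0
    pix = np.stack([u.ravel(), v.ravel(), np.ones(h * w)])[:, valid]  # (3, N) homogeneous
    pts_tof = (np.linalg.inv(K_tof) @ pix) * z[valid]
    # Transform the points into the color camera frame and project them.
    pts_color = R @ pts_tof + t.reshape(3, 1)
    in_front = pts_color[2] > 0
    proj = K_color @ pts_color[:, in_front]
    uc = (proj[0] / proj[2]).astype(int)
    vc = (proj[1] / proj[2]).astype(int)
    zc = pts_color[2, in_front]
    # Scatter depths onto the color grid as disparities, keeping the nearest surface.
    disp = np.zeros((hc, wc))
    for x, y, z_i in zip(uc, vc, zc):
        if 0 <= y < hc and 0 <= x < wc:
            disp[y, x] = max(disp[y, x], f_color * baseline / z_i)
    return disp
```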

Lane Information Fusion Scheme using Multiple Lane Sensors (다중센서 기반 차선정보 시공간 융합기법)

  • Lee, Soomok; Park, Gikwang; Seo, Seung-woo
    • Journal of the Institute of Electronics and Information Engineers / v.52 no.12 / pp.142-149 / 2015
  • Most mono-camera-based lane detection systems are fragile under poor illumination conditions. To compensate for the limitations of using a single sensor, a lane information fusion system using multiple lane sensors is an alternative that stabilizes performance and guarantees high precision. However, conventional fusion schemes, which concern only object detection, are not appropriate for lane information fusion. The few studies that do consider lane information fusion have treated the additional sensor merely as a limited back-up, or have neglected the cases of asynchronous multi-rate sensing and differing coverage. In this paper, we propose a lane information fusion scheme that utilizes multiple lane sensors with different coverage and update cycles. Precise lane information fusion is achieved by the proposed fusion framework, which accounts for the individual ranging capability and processing time of diverse types of lane sensors. In addition, a novel lane estimation model is proposed to synchronize multi-rate sensors precisely by up-sampling sparse lane information signals. Through quantitative vehicle-level experiments with an around-view monitoring system and a frontal camera system, we demonstrate the robustness of the proposed lane fusion scheme.
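
A minimal sketch, not the paper's estimation model, of synchronizing two lane sensors with different update cycles: each sensor reports a cubic lane-model coefficient vector at its own rate, the slower stream is up-sampled by linear interpolation onto the faster stream's timestamps, and the two are combined with coverage-based weights. The sensor rates, lane parameterization, and weights are assumptions.

```python
import numpy as np

# Each sensor reports lane-boundary polynomial coefficients [c0, c1, c2, c3]
# (lateral offset, heading, curvature, curvature rate) at its own cycle.
front_t  = np.arange(0.0, 2.0, 0.05)                 # 20 Hz frontal camera (assumed)
around_t = np.arange(0.0, 2.0, 0.10)                 # 10 Hz around-view system (assumed)
front_c  = np.random.randn(len(front_t), 4) * 0.01   # placeholder measurements
around_c = np.random.randn(len(around_t), 4) * 0.01

def upsample(src_t, src_coeffs, dst_t):
    """Linearly interpolate each lane coefficient onto the target timestamps."""
    return np.stack([np.interp(dst_t, src_t, src_coeffs[:, k])
                     for k in range(src_coeffs.shape[1])], axis=1)

around_up = upsample(around_t, around_c, front_t)

# Coverage-weighted fusion: the frontal camera is trusted at long range,
# the around-view system near the vehicle (weights are illustrative).
w_front, w_around = 0.7, 0.3
fused = w_front * front_c + w_around * around_up
print(fused.shape)   # one fused lane-coefficient vector per frontal-camera frame
```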

A depth-based Multi-view Super-Resolution Method Using Image Fusion and Blind Deblurring

  • Fan, Jun; Zeng, Xiangrong; Huangpeng, Qizi; Liu, Yan; Long, Xin; Feng, Jing; Zhou, Jinglun
    • KSII Transactions on Internet and Information Systems (TIIS) / v.10 no.10 / pp.5129-5152 / 2016
  • Multi-view super-resolution (MVSR) aims to estimate a high-resolution (HR) image from a set of low-resolution (LR) images captured from different viewpoints (typically by different cameras). MVSR is usually applied in camera array imaging. Given that MVSR is an ill-posed problem and typically computationally costly, we super-resolve multi-view LR images of the original scene via image fusion (IF) and blind deblurring (BD). First, we reformulate the MVSR problem into two easier problems: an IF problem and a BD problem. We solve the IF problem after first computing the depth map of the desired image, and then solve the BD problem, in which the optimization problems with respect to the desired image and with respect to the unknown blur are efficiently addressed by the alternating direction method of multipliers (ADMM). Our approach bridges the gap between MVSR and BD, taking advantage of existing BD methods to address MVSR. This makes the approach well suited to camera array imaging, because the blur kernel is typically unknown in practice. Experimental results on real and synthetic images demonstrate the effectiveness of the proposed method.
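
The paper solves the BD subproblem with ADMM; the toy sketch below replaces that with a much simpler alternating minimization in the Fourier domain (Tikhonov-regularized updates for the image and the blur kernel), only to illustrate the image/kernel alternation. The function name, regularization weights, and iteration counts are assumptions, not the authors' algorithm.

```python
import numpy as np

def blind_deblur(y, kernel_size=15, iters=20, lam=1e-2, mu=1e-2):
    """Toy alternating minimization for y = k * x + noise (circular convolution).
    Each subproblem is a Tikhonov-regularized least squares solved with FFTs;
    this is a simplified stand-in for the ADMM solver described in the paper."""
    Y = np.fft.fft2(y)
    # Initialize the blur as a small uniform kernel padded to image size.
    k = np.zeros_like(y)
    k[:kernel_size, :kernel_size] = 1.0 / kernel_size ** 2
    x = y.copy()
    for _ in range(iters):
        K = np.fft.fft2(k)
        # x-update: argmin_x ||k * x - y||^2 + lam ||x||^2
        X = np.conj(K) * Y / (np.abs(K) ** 2 + lam)
        x = np.real(np.fft.ifft2(X))
        # k-update: argmin_k ||k * x - y||^2 + mu ||k||^2
        K = np.conj(X) * Y / (np.abs(X) ** 2 + mu)
        k = np.real(np.fft.ifft2(K))
        k = np.clip(k, 0, None)
        k /= k.sum() + 1e-12          # keep the kernel non-negative with unit mass
    return x, k
```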

Multi-target Tracking Filters and Data Association: A Survey (다중표적 추적필터와 자료연관 기법동향)

  • Song, Taek Lyul
    • Journal of Institute of Control, Robotics and Systems / v.20 no.3 / pp.313-322 / 2014
  • This paper surveys and puts in perspective the working methods of multi-target tracking in clutter. It covers theories and practices for data association and the related filter structures, and is motivated by increasing interest in the areas of target tracking, security, surveillance, and multi-sensor data fusion. It is hoped that the survey will help readers gain a full understanding of existing techniques before using them in practice.
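
As a generic illustration of the data association step the survey discusses (not any particular method from the paper), the sketch below performs global nearest-neighbor assignment of measurements to predicted tracks with a chi-square gate, using SciPy's Hungarian solver; the gate threshold and the shared innovation covariance are assumptions.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def associate(tracks, measurements, S, gate=9.21):
    """Global nearest-neighbor association with a chi-square gate (2 dof, ~99%).
    tracks: (T, 2) predicted positions; measurements: (M, 2); S: 2x2 innovation cov."""
    S_inv = np.linalg.inv(S)
    cost = np.full((len(tracks), len(measurements)), 1e6)
    for i, t in enumerate(tracks):
        for j, z in enumerate(measurements):
            d = z - t
            m2 = d @ S_inv @ d                  # squared Mahalanobis distance
            if m2 < gate:
                cost[i, j] = m2
    rows, cols = linear_sum_assignment(cost)    # Hungarian assignment
    # Keep only pairs that actually passed the gate; the rest are clutter or new tracks.
    return [(i, j) for i, j in zip(rows, cols) if cost[i, j] < gate]
```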

Effective Multi-Modal Feature Fusion for 3D Semantic Segmentation with Multi-View Images (멀티-뷰 영상들을 활용하는 3차원 의미적 분할을 위한 효과적인 멀티-모달 특징 융합)

  • Hye-Lim Bae; Incheol Kim
    • KIPS Transactions on Software and Data Engineering / v.12 no.12 / pp.505-518 / 2023
  • 3D point cloud semantic segmentation is a computer vision task that divides a point cloud into different objects and regions by predicting the class label of each point. Existing 3D semantic segmentation models are limited in their ability to fuse multi-modal features sufficiently while preserving the characteristics of both the 2D visual features extracted from RGB images and the 3D geometric features extracted from the point cloud. Therefore, in this paper, we propose MMCA-Net, a novel 3D semantic segmentation model that uses 2D-3D multi-modal features. The proposed model effectively fuses the heterogeneous 2D visual features and 3D geometric features by using an intermediate fusion strategy and a multi-modal cross attention-based fusion operation. The proposed model also extracts context-rich 3D geometric features from an input point cloud of irregularly distributed points by adopting PTv2 as the 3D geometric encoder. We conducted both quantitative and qualitative experiments on the ScanNetv2 benchmark dataset to analyze the performance of the proposed model. In terms of mIoU, the proposed model showed a 9.2% improvement over the PTv2 model, which uses only 3D geometric features, and a 12.12% improvement over the MVPNet model, which uses 2D-3D multi-modal features. These results demonstrate the effectiveness and usefulness of the proposed model.
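
A minimal sketch of multi-modal cross-attention fusion in the spirit of the fusion operation described above, not the authors' MMCA-Net implementation: per-point 3D geometric features attend to 2D visual features with torch.nn.MultiheadAttention. The feature dimensions and the residual design are assumptions.

```python
import torch
import torch.nn as nn

class CrossModalFusion(nn.Module):
    """Fuse per-point 3D geometric features with 2D visual features using
    cross attention (3D features as queries, 2D features as keys/values)."""
    def __init__(self, dim=256, heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, feats_3d, feats_2d):
        # feats_3d: (B, N_points, dim), feats_2d: (B, N_pixels, dim)
        attended, _ = self.attn(query=feats_3d, key=feats_2d, value=feats_2d)
        return self.norm(feats_3d + attended)   # residual fusion of the two modalities

fusion = CrossModalFusion()
out = fusion(torch.randn(2, 1024, 256), torch.randn(2, 4096, 256))
print(out.shape)   # torch.Size([2, 1024, 256])
```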

A Study on a Multi-sensor Information Fusion Architecture for Avionics (항공전자 멀티센서 정보 융합 구조 연구)

  • Kang, Shin-Woo; Lee, Seoung-Pil; Park, Jun-Hyeon
    • Journal of Advanced Navigation Technology / v.17 no.6 / pp.777-784 / 2013
  • The process of synthesizing data produced by different types of sensors into a single body of information, known as multi-sensor data fusion, is being studied and used on a variety of platforms. Heterogeneous sensors have been integrated into various aircraft, and modern avionic systems manage them. As the performance of aircraft sensors improves, the integration of sensor information is increasingly required from the avionics point of view. Information fusion has not been widely studied from the viewpoint of the software that provides a pilot with fused sensor information as symbology on a display device. The purpose of information fusion is to assist pilots in making mission decisions by presenting an accurate picture of the combat situation from the aircraft avionics and, consequently, to minimize their workload. For aircraft avionics equipped with different types of sensors, this paper presents a software architecture that produces comprehensive information for the user from the sensor data through a multi-sensor data fusion process.
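
The paper describes a software architecture rather than a specific algorithm; as a generic illustration of the kind of fusion step such an architecture would host, the sketch below combines two sensors' position estimates of the same target by inverse-covariance weighting, assuming independent estimates. The sensor names, values, and covariances are purely illustrative.

```python
import numpy as np

def fuse_estimates(x1, P1, x2, P2):
    """Combine two independent estimates of the same state by
    inverse-covariance (information) weighting."""
    P1_inv, P2_inv = np.linalg.inv(P1), np.linalg.inv(P2)
    P_fused = np.linalg.inv(P1_inv + P2_inv)
    x_fused = P_fused @ (P1_inv @ x1 + P2_inv @ x2)
    return x_fused, P_fused

# e.g. a radar track and an EO/IR track of the same target (values illustrative)
radar_pos, radar_cov = np.array([1000.0, 250.0]), np.diag([50.0, 50.0])
eoir_pos,  eoir_cov  = np.array([1010.0, 245.0]), np.diag([10.0, 80.0])
print(fuse_estimates(radar_pos, radar_cov, eoir_pos, eoir_cov)[0])
```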

Feature Extraction and Fusion for land-Cover Discrimination with Multi-Temporal SAR Data (다중 시기 SAR 자료를 이용한 토지 피복 구분을 위한 특징 추출과 융합)

  • Park No-Wook; Lee Hoonyol; Chi Kwang-Hoon
    • Korean Journal of Remote Sensing / v.21 no.2 / pp.145-162 / 2005
  • To improve the accuracy of land-cover discrimination in SAR data classification, this paper presents a methodology that includes feature extraction and fusion steps with multi-temporal SAR data. Three features, the average backscattering coefficient, temporal variability, and coherence, are extracted from multi-temporal SAR data by considering the temporal behavior of the backscattering characteristics of SAR sensors. The Dempster-Shafer theory of evidence (D-S theory) and fuzzy logic are applied to integrate these features effectively. In particular, a feature-driven heuristic approach to mass function assignment in D-S theory is applied, and various fuzzy combination operators are tested for fuzzy logic fusion. Experimental results on a multi-temporal Radarsat-1 data set show that the features considered in this paper provide complementary information and thus effectively discriminate water, paddy, and urban areas; however, it was difficult to discriminate forest from dry fields. From an information fusion methodology standpoint, D-S theory and the fuzzy combination operators, except the fuzzy Max and Algebraic Sum operators, showed similar land-cover accuracy statistics.
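
A minimal sketch of Dempster's rule of combination for two mass functions over a small frame of discernment, purely to illustrate the evidence-fusion step; the class set and mass values are illustrative and do not reproduce the paper's feature-driven mass assignment.

```python
from itertools import product

def dempster_combine(m1, m2):
    """Combine two mass functions (dicts mapping frozenset hypotheses to mass)
    with Dempster's rule: accumulate non-conflicting intersections, then
    normalize by one minus the total conflict."""
    combined, conflict = {}, 0.0
    for (a, ma), (b, mb) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            combined[inter] = combined.get(inter, 0.0) + ma * mb
        else:
            conflict += ma * mb
    return {h: v / (1.0 - conflict) for h, v in combined.items()}

# Illustrative masses from two features over {water, paddy, urban}
m_backscatter = {frozenset({"water"}): 0.6,
                 frozenset({"paddy", "urban"}): 0.3,
                 frozenset({"water", "paddy", "urban"}): 0.1}
m_coherence   = {frozenset({"water", "paddy"}): 0.5,
                 frozenset({"urban"}): 0.4,
                 frozenset({"water", "paddy", "urban"}): 0.1}
print(dempster_combine(m_backscatter, m_coherence))
```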

Production of fusion-type realistic contents using 3D motion control technology (3D모션 컨트롤 기술을 이용한 융합형 실감 콘텐츠 제작)

  • Jeong, Sun-Ri; Chang, Seok-Joo
    • Journal of Convergence for Information Technology / v.9 no.4 / pp.146-151 / 2019
  • In this paper, we developed multi-view video content based on realistic-media technology, along with a pilot production using this technology, and provided a realistic content production technique that lets users select a desired viewing direction from their own viewpoint by offering images from multiple viewpoints. We also created multi-view video content through which users can indirectly experience local cultural tourism resources, and produced cyber-tour content based on multi-view video (realistic-media technology). This technology can be used to create 3D interactive realistic content for public education venues such as libraries, kindergartens, elementary schools, middle schools, universities for the elderly, housewives' classrooms, and lifelong education centers. The domestic VR market is still in its infancy and is expected to develop in combination with the 3D market related to games and shopping malls. As domestic educational trends and the demand for a social public education system grow, demand for such content is expected to increase gradually.

Hot Spot Detection of Thermal Infrared Image of Photovoltaic Power Station Based on Multi-Task Fusion

  • Xu Han; Xianhao Wang; Chong Chen; Gong Li; Changhao Piao
    • Journal of Information Processing Systems / v.19 no.6 / pp.791-802 / 2023
  • Manual inspection of photovoltaic (PV) panels struggles to meet the requirements of inspection work for large-scale PV power plants. We present a hot spot detection and positioning method that detects hot spots in batches and locates their latitudes and longitudes. First, a network based on the YOLOv3 architecture is used to identify hot spots. The innovation is to modify the RU_1 unit of the YOLOv3 model for hot spot detection in the far field of view and to add a neural network residual unit for fusion. In addition, because bright spots on the ground cause misidentification in the infrared images of solar PV panels, the DeepLab v3+ model is adopted to segment the PV panels and filter out these misidentifications. Finally, the latitude and longitude of each hot spot are calculated by a geometric positioning method using known information such as the drone's yaw angle, shooting height, and lens field of view. The experimental results indicate that the hot spot recognition accuracy is above 98%. When the drone is kept 25 m off the ground, the hot spot positioning error is at the decimeter level.
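
A minimal sketch of the kind of geometric positioning described above, assuming a nadir-pointing camera, flat ground, a pinhole model whose focal length in pixels is derived from the field of view, and a heading measured clockwise from north; all parameter values and the exact formula used by the authors are assumptions.

```python
import math

def hotspot_latlon(px, py, img_w, img_h, fov_deg, height_m, yaw_deg, drone_lat, drone_lon):
    """Estimate the lat/lon of a hot spot detected at pixel (px, py),
    assuming a nadir-looking camera and flat ground (toy approximation)."""
    # Focal length in pixels from the horizontal field of view.
    f = (img_w / 2.0) / math.tan(math.radians(fov_deg) / 2.0)
    # Ground offsets of the pixel from the image center, in metres (camera frame).
    forward = -(py - img_h / 2.0) / f * height_m   # ahead of the drone (image y grows down)
    right   =  (px - img_w / 2.0) / f * height_m   # to the drone's right
    # Rotate into north/east using the heading (clockwise from north, assumed).
    yaw = math.radians(yaw_deg)
    east  = forward * math.sin(yaw) + right * math.cos(yaw)
    north = forward * math.cos(yaw) - right * math.sin(yaw)
    # Convert metre offsets to degrees (small-offset approximation).
    dlat = north / 111_320.0
    dlon = east / (111_320.0 * math.cos(math.radians(drone_lat)))
    return drone_lat + dlat, drone_lon + dlon

print(hotspot_latlon(400, 300, 640, 512, 45.0, 25.0, 30.0, 29.5, 106.5))
```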

Multi-Object Goal Visual Navigation Based on Multimodal Context Fusion (멀티모달 맥락정보 융합에 기초한 다중 물체 목표 시각적 탐색 이동)

  • Jeong Hyun Choi; In Cheol Kim
    • KIPS Transactions on Software and Data Engineering / v.12 no.9 / pp.407-418 / 2023
  • Multi-Object Goal Visual Navigation (MultiOn) is a visual navigation task in which an agent must visit multiple object goals in an unknown indoor environment in a given order. Existing models for the MultiOn task suffer from the limitation that they cannot exploit an integrated view of multimodal context, because they use only a unimodal context map. To overcome this limitation, we propose a novel deep neural network-based agent model for the MultiOn task. The proposed model, MCFMO, uses a multimodal context map containing visual appearance features, semantic features of environmental objects, and goal object features. Moreover, the proposed model effectively fuses these three heterogeneous features into a global multimodal context map using a point-wise convolutional neural network module. Lastly, the proposed model adopts an auxiliary task learning module that predicts the observation status, goal direction, and goal distance, which guides the model to learn the navigation policy efficiently. Through various quantitative and qualitative experiments using the Habitat-Matterport3D simulation environment and scene dataset, we demonstrate the superiority of the proposed model.
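
A minimal sketch of fusing heterogeneous context maps with a point-wise (1x1) convolution, in the spirit of the fusion module described above but not the authors' MCFMO code; the channel counts and map sizes are assumptions.

```python
import torch
import torch.nn as nn

class PointwiseContextFusion(nn.Module):
    """Fuse visual, semantic, and goal context maps into one multimodal map
    with a 1x1 (point-wise) convolution over the concatenated channels."""
    def __init__(self, c_vis=32, c_sem=16, c_goal=8, c_out=64):
        super().__init__()
        self.fuse = nn.Sequential(
            nn.Conv2d(c_vis + c_sem + c_goal, c_out, kernel_size=1),
            nn.ReLU(inplace=True),
        )

    def forward(self, vis_map, sem_map, goal_map):
        # All maps share the same spatial grid, e.g. (B, C, H, W) egocentric maps.
        return self.fuse(torch.cat([vis_map, sem_map, goal_map], dim=1))

fusion = PointwiseContextFusion()
out = fusion(torch.randn(1, 32, 128, 128),
             torch.randn(1, 16, 128, 128),
             torch.randn(1, 8, 128, 128))
print(out.shape)   # torch.Size([1, 64, 128, 128])
```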