• Title/Summary/Keyword: Visual Mapping

Towards UAV-based bridge inspection systems: a review and an application perspective

  • Chan, Brodie;Guan, Hong;Jo, Jun;Blumenstein, Michael
    • Structural Monitoring and Maintenance
    • /
    • v.2 no.3
    • /
    • pp.283-300
    • /
    • 2015
  • Visual condition inspections remain paramount to assessing the current deterioration status of a bridge and assigning remediation or maintenance tasks so as to ensure the ongoing serviceability of the structure. In recent years, however, there has been an increasing backlog of maintenance activities. Existing research attributes this to the labour-intensive, subjective and disruptive nature of the current bridge inspection method, whose processes ultimately require lane closures, traffic guidance schemes and inspection equipment. This not only increases the whole-of-life costs of the bridge but also increases the risk to the travelling public, as issues affecting structural integrity may go unaddressed. As a tool for bridge condition inspections, Unmanned Aerial Vehicles (UAVs), or drones, offer considerable potential, allowing a bridge to be visually assessed without the need for inspectors to walk across the deck or use under-bridge inspection units. With current inspection processes placing additional strain on existing bridge maintenance resources, the technology has the potential to significantly reduce overall inspection costs and the disruption caused to the travelling public. In addition, automated aerial image capture enables engineers to better understand a situation through the 3D spatial context offered by UAV systems. However, using UAVs for bridge inspection raises a number of critical issues that must be resolved, including stability and accuracy of control, and safety to people. SLAM (Simultaneous Localisation and Mapping) is a technique a UAV could use to build a map of the bridge while simultaneously determining its own location on the map being constructed. While entirely new ways of inspecting bridges and visualising information create considerable economic and risk-related benefits, hindrances to the wider deployment of UAVs remain. This study provides context for the use of UAVs in conducting visual bridge inspections and addresses the obstacles that must be overcome for the technology to be integrated into current practice.
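
The SLAM idea mentioned in this abstract can be illustrated with a toy example. The sketch below is a minimal 2D EKF-SLAM step in Python, not anything from the paper: a vehicle with noisy odometry jointly estimates its own position and one landmark position, which is the essence of localising while mapping. All names and numbers are illustrative assumptions.

```python
# Minimal 2D EKF-SLAM sketch (illustrative only; not from the paper).
# State: [x, y, lx, ly] -- vehicle position plus one landmark position.
import numpy as np

x = np.array([0.0, 0.0, 5.0, 2.0])        # initial guess for pose + landmark
P = np.diag([0.01, 0.01, 100.0, 100.0])   # large uncertainty on the landmark
Q = np.diag([0.05, 0.05, 0.0, 0.0])       # motion noise (landmark is static)
R = np.diag([0.1, 0.1])                   # measurement noise

def predict(x, P, u):
    """Motion update: the vehicle translates by odometry u = (dx, dy)."""
    x = x + np.array([u[0], u[1], 0.0, 0.0])
    P = P + Q                              # linear model, F = identity
    return x, P

def update(x, P, z):
    """Measurement update: z is the landmark offset seen from the vehicle."""
    H = np.array([[-1.0, 0.0, 1.0, 0.0],
                  [0.0, -1.0, 0.0, 1.0]])  # z = landmark - pose
    y = z - H @ x                          # innovation
    S = H @ P @ H.T + R
    K = P @ H.T @ np.linalg.inv(S)         # Kalman gain
    x = x + K @ y
    P = (np.eye(4) - K @ H) @ P
    return x, P

# One simulated step: move 1 m east, then observe the landmark.
x, P = predict(x, P, (1.0, 0.0))
x, P = update(x, P, np.array([4.1, 2.05]))
print(x)  # pose and landmark estimates tighten together
```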

Crack Inspection and Mapping of Concrete Bridges using Integrated Image Processing Techniques (통합 이미지 처리 기술을 이용한 콘크리트 교량 균열 탐지 및 매핑)

  • Kim, Byunghyun;Cho, Soojin
    • Journal of the Korean Society of Safety
    • /
    • v.36 no.1
    • /
    • pp.18-25
    • /
    • 2021
  • In many developed countries, such as South Korea, efficiently maintaining aging infrastructure is an important issue. Currently, inspectors visually inspect infrastructure for maintenance needs, but this method is inefficient due to its high cost, long logistics time, and hazards to the inspectors. In this paper, a novel crack inspection approach for concrete bridges is therefore proposed using integrated image processing techniques. The proposed approach consists of four steps: (1) training a deep learning model to automatically detect cracks on concrete bridges, (2) acquiring in-situ images using a drone, (3) generating orthomosaic images based on 3D modeling, and (4) detecting cracks on the orthomosaic images using the trained deep learning model. Cascade Mask R-CNN, a state-of-the-art instance segmentation deep learning model, was trained with 3235 crack images that included 2415 hard negative images. We selected the Tancheon overpass, located in Seoul, South Korea, as a testbed for the proposed approach, and captured images of piers 34-37 and slabs 34-36 using a commercial drone. Agisoft Metashape was used as the 3D model generation program to produce an orthomosaic from the captured images. We applied the proposed approach to four orthomosaic images showing the front, back, left, and right sides of pier 37, and evaluated the trained Cascade Mask R-CNN's crack detection performance using pixel-level precision, with visual inspection of the captured images as the reference. At the coping of the front side of pier 37, the model obtained its best precision, 94.34%, and it achieved an average precision of 72.93% over the orthomosaics of the four sides of the pier. The test results show that the proposed crack detection approach can be a suitable alternative to the conventional visual inspection method.
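
The pixel-level precision used for evaluation in this abstract is straightforward to compute. A minimal sketch, assuming boolean crack masks; the function and variable names are illustrative, not from the paper:

```python
# Pixel-level precision = TP / (TP + FP) over crack pixels of a tile.
import numpy as np

def pixel_precision(pred_mask: np.ndarray, gt_mask: np.ndarray) -> float:
    """Both masks are boolean arrays of the same shape: gt_mask comes
    from visual inspection of the captured images, pred_mask from the
    trained segmentation model applied to the orthomosaic."""
    tp = np.logical_and(pred_mask, gt_mask).sum()   # correctly flagged pixels
    fp = np.logical_and(pred_mask, ~gt_mask).sum()  # false alarms
    return tp / (tp + fp) if (tp + fp) > 0 else float("nan")
```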

Visual Explanation of a Deep Learning Solar Flare Forecast Model and Its Relationship to Physical Parameters

  • Yi, Kangwoo;Moon, Yong-Jae;Lim, Daye;Park, Eunsu;Lee, Harim
    • The Bulletin of The Korean Astronomical Society
    • /
    • v.46 no.1
    • /
    • pp.42.1-42.1
    • /
    • 2021
  • In this study, we present a visual explanation of a deep learning solar flare forecast model and its relationship to physical parameters of solar active regions (ARs). For this, we use full-disk magnetograms at 00:00 UT from the Solar and Heliospheric Observatory/Michelson Doppler Imager and the Solar Dynamics Observatory/Helioseismic and Magnetic Imager, physical parameters from the Space-weather HMI Active Region Patch (SHARP), and Geostationary Operational Environmental Satellite X-ray flare data. Our deep learning flare forecast model, based on a Convolutional Neural Network (CNN), predicts "Yes" or "No" for the daily occurrence of C-, M-, and X-class flares. We interpret the model using two CNN attribution methods (guided backpropagation and Gradient-weighted Class Activation Mapping [Grad-CAM]) that provide quantitative information for explaining the model. We find that our deep learning flare forecasting model is intimately related to AR physical properties that previous studies have also identified as holding significant predictive ability. The major results of this study are as follows. First, we successfully apply our deep learning model to the forecast of daily solar flare occurrence with TSS = 0.65, without any preprocessing to extract features from the data. Second, using the attribution methods, we find that the polarity inversion line is an important feature for the deep learning flare forecasting model. Third, ARs with high Grad-CAM values produce more flares than those with low Grad-CAM values. Fourth, nine SHARP parameters, such as total unsigned vertical current, total unsigned current helicity, total unsigned flux, and total photospheric magnetic free energy density, are well correlated with Grad-CAM values.
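
Grad-CAM, the attribution method named above, is standard and easy to sketch. The PyTorch snippet below computes a Grad-CAM heat map for one class score using a stock ResNet as a stand-in for the paper's flare CNN; the layer choice, input tensor, and class index are assumptions for illustration only.

```python
import torch
import torch.nn.functional as F
from torchvision.models import resnet18

model = resnet18(weights=None).eval()     # stand-in for the flare CNN
feats, grads = {}, {}

# Capture activations and gradients of the last convolutional block.
model.layer4.register_forward_hook(lambda m, i, o: feats.update(a=o))
model.layer4.register_full_backward_hook(lambda m, gi, go: grads.update(g=go[0]))

x = torch.randn(1, 3, 224, 224)           # placeholder "magnetogram" input
score = model(x)[0, 1]                    # logit of the "flare: Yes" class
score.backward()                          # gradients flow back to layer4

# Grad-CAM: weight each feature channel by its spatially averaged
# gradient, sum over channels, rectify, and upsample to input size.
w = grads["g"].mean(dim=(2, 3), keepdim=True)
cam = F.relu((w * feats["a"]).sum(dim=1, keepdim=True))
cam = F.interpolate(cam, size=x.shape[2:], mode="bilinear", align_corners=False)
cam = cam / (cam.max() + 1e-8)            # normalized [0, 1] heat map
```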

Photorealistic Building Modelling and Visualization in 3D GIS (3차원 GIS의 현실감 부여 빌딩 모델링 및 시각화에 관한 연구)

  • Song, Yong Hak;Sohn, Hong Gyoo;Yun, Kong Hyun
    • KSCE Journal of Civil and Environmental Engineering Research
    • /
    • v.26 no.2D
    • /
    • pp.311-316
    • /
    • 2006
  • Although geospatial information systems are widely used in many different fields as a powerful tool for spatial analysis and decision-making, their capabilities to handle realistic 3-D urban environments are very limited. The objective of this work is to integrate recent developments in 3-D modeling and visualization into GIS to enhance its 3-D capabilities. To achieve a photorealistic view, building models are collected from a pair of aerial stereo images. Roof and wall textures are obtained from ortho-rectified aerial images and ground photographs, respectively. The study is implemented using ArcGIS as the work platform and ArcObjects and Visual Basic as development tools. Presented in this paper are the 3-D geometric modeling and its data structure, and the creation of textures and their association with the geometric model. As a result, photorealistic views of the Purdue University campus are created and rendered with ArcScene.
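
The association between textures and geometry described here reduces to a simple data structure: each face of a building model carries a reference to its source photograph plus texture coordinates. A minimal Python sketch of that structure (names and values are assumptions; the paper's implementation was Visual Basic over ArcObjects):

```python
from dataclasses import dataclass, field
from typing import List, Tuple

@dataclass
class Face:
    """One wall or roof polygon of a building model."""
    vertices: List[Tuple[float, float, float]]  # 3-D corners from stereo images
    texture_path: str                           # ortho-rectified aerial image
                                                # (roof) or ground photo (wall)
    uv: List[Tuple[float, float]] = field(default_factory=list)
                                                # per-vertex texture coordinates

@dataclass
class Building:
    name: str
    faces: List[Face]                           # geometry plus its textures

# Each face carries its own image, so a renderer (ArcScene in the paper)
# can drape the photographs over the geometry for a photorealistic view.
wall = Face([(0, 0, 0), (10, 0, 0), (10, 0, 8), (0, 0, 8)],
            "photos/wall_north.jpg", [(0, 0), (1, 0), (1, 1), (0, 1)])
campus_hall = Building("Hall", [wall])
```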

Generating Radiology Reports via Multi-feature Optimization Transformer

  • Rui Wang;Rong Hua
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.17 no.10
    • /
    • pp.2768-2787
    • /
    • 2023
  • As an important research direction for the application of computer science in the medical field, the automatic generation of radiology reports has attracted wide attention in the academic community. Because the proportion of normal regions in radiology images is much larger than that of abnormal regions, words describing diseases are often masked by other words, resulting in significant feature loss during the calculation process, which degrades the quality of the generated reports. In addition, the large difference between visual features and semantic features causes traditional multi-modal fusion methods to fail to generate the long narrative structures, consisting of multiple sentences, that medical reports require. To address these challenges, we propose a multi-feature optimization Transformer (MFOT) for generating radiology reports. In detail, a multi-dimensional mapping attention (MDMA) module is designed to encode the visual grid features from different dimensions to reduce the loss of primary features in the encoding process; a feature pre-fusion (FP) module is constructed to enhance the interaction between multi-modal features, so as to generate a reasonably structured radiology report; and a detail enhanced attention (DEA) module is proposed to enhance the extraction and utilization of key features and reduce their loss. We evaluate the performance of the proposed model against prevailing mainstream models on the widely recognized radiology report datasets IU X-Ray and MIMIC-CXR. The experimental outcomes demonstrate that our model achieves SOTA performance on both datasets; compared with the base model, the average improvement across six key indicators is 19.9% and 18.0%, respectively. These findings substantiate the efficacy of our model in the domain of automated radiology report generation.
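
The feature pre-fusion idea, letting text features interact with visual grid features before decoding, can be sketched generically. The snippet below is a plain cross-attention layer in PyTorch offered only as an illustration of multi-modal fusion; the actual MDMA/FP/DEA internals of MFOT are not reproduced here, and all dimensions are assumptions.

```python
import torch
import torch.nn as nn

class CrossModalFusion(nn.Module):
    """Generic cross-attention fusion: text queries attend over visual
    grid features so word embeddings are pre-mixed with image evidence
    before report decoding."""
    def __init__(self, dim: int = 512, heads: int = 8):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, text_tokens, visual_grid):
        fused, _ = self.attn(text_tokens, visual_grid, visual_grid)
        return self.norm(text_tokens + fused)   # residual connection

# Example: 60 report tokens attend over a 7x7 visual grid (49 features).
fusion = CrossModalFusion()
out = fusion(torch.randn(2, 60, 512), torch.randn(2, 49, 512))
print(out.shape)  # torch.Size([2, 60, 512])
```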

Utilizing AI Foundation Models for Language-Driven Zero-Shot Object Navigation Tasks (언어-기반 제로-샷 물체 목표 탐색 이동 작업들을 위한 인공지능 기저 모델들의 활용)

  • Jeong-Hyun Choi;Ho-Jun Baek;Chan-Sol Park;Incheol Kim
    • The Journal of Korea Robotics Society
    • /
    • v.19 no.3
    • /
    • pp.293-310
    • /
    • 2024
  • In this paper, we propose an agent model for Language-Driven Zero-Shot Object Navigation (L-ZSON) tasks, which takes a freeform language description of an unseen target object and navigates an unexplored environment to find that object. In general, an L-ZSON agent should be able to visually ground the target object by understanding its freeform language description and recognizing the corresponding visual object in camera images. Moreover, the agent should also be able to build a rich spatial context map of the unknown environment and decide on efficient exploration actions based on that map until the target object is present in the field of view. To address these challenging issues, we propose AML (Agent Model for L-ZSON), a novel agent model that makes effective use of AI foundation models such as Large Language Models (LLM) and Vision-Language Models (VLM). To tackle the visual grounding of the target object description, our agent model employs GLEE, a VLM pretrained for locating and identifying arbitrary objects in images and videos in open-world scenarios. To address the exploration policy, the proposed agent model leverages the commonsense knowledge of an LLM to make sequential navigational decisions. Through various quantitative and qualitative experiments with the RoboTHOR 3D simulation platform and the PASTURE L-ZSON benchmark dataset, we show the superior performance of the proposed agent model.
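
One way to use LLM commonsense for exploration, as the abstract describes, is to ask the model which map frontier most plausibly leads to the target. The sketch below is a hypothetical prompt builder with a stubbed LLM call; nothing here is the paper's actual code or prompt.

```python
# Illustrative LLM-guided frontier selection (all names are assumptions).
def build_prompt(target: str, frontier_objects: dict) -> str:
    """Ask which map frontier is most likely to contain the target,
    given the objects already observed near each frontier."""
    lines = [f"You are guiding a robot to find: {target}."]
    for fid, objs in frontier_objects.items():
        lines.append(f"Frontier {fid}: nearby objects {', '.join(objs)}")
    lines.append("Answer with the single best frontier id.")
    return "\n".join(lines)

def query_llm(prompt: str) -> str:
    # Placeholder stub: in practice this would call an LLM service.
    return "2"

frontiers = {"1": ["sofa", "tv"], "2": ["sink", "fridge"], "3": ["bed"]}
choice = query_llm(build_prompt("a carton of milk", frontiers))
# Commonsense link: milk is kept near the fridge, so frontier 2 is chosen.
```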

Functional Mapping of the Neural Basis for the Encoding and Retrieval of Human Episodic Memory Using H₂¹⁵O PET (H₂¹⁵O PET을 이용한 정상인의 삽화기억 부호화 및 인출 중추 뇌기능지도화)

  • Lee, Jae-Sung;Nam, Hyun-Woo;Lee, Dong-Soo;Lee, Sang-Kun;Jang, Myoung-Jin;Ahn, Ji-Young;Park, Kwang-Suk;Chung, June-Key;Lee, Myung-Chul
    • The Korean Journal of Nuclear Medicine
    • /
    • v.34 no.1
    • /
    • pp.10-21
    • /
    • 2000
  • Purpose: Episodic memory is described as an 'autobiographical' memory responsible for storing a record of the events in our lives. We performed a functional brain activation study using H₂¹⁵O PET to reveal the neural basis of the encoding and retrieval of episodic memory in normal human volunteers. Materials and Methods: Four repeated H₂¹⁵O PET scans, with two reference and two activation tasks, were performed on 6 normal volunteers to activate brain areas engaged in the encoding and retrieval of verbal materials. Images from the same subject were spatially registered and normalized using linear and nonlinear transformations. Using the means and variances for each condition, adjusted with analysis of covariance, t-statistic analyses were performed voxel-wise. Results: Encoding of episodic memory activated the opercular and triangular parts of the left inferior frontal gyrus, the right prefrontal cortex, the medial frontal area, the cingulate gyrus, the posterior middle and inferior temporal gyri, the cerebellum, and both primary visual and visual association areas. Retrieval of episodic memory activated the triangular part of the left inferior frontal gyrus and the inferior temporal gyrus, the right prefrontal cortex and medial temporal area, the cerebellum, and the primary visual and visual association areas. The activations in the opercular part of the left inferior frontal gyrus and the right prefrontal cortex indicate the essential role of these areas in the encoding and retrieval of episodic memory. Conclusion: We could localize the neural basis of the encoding and retrieval of episodic memory using H₂¹⁵O PET, and the result was partly consistent with the hypothesis of hemispheric encoding/retrieval asymmetry.
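
The voxel-wise t-statistic analysis described in Materials and Methods can be illustrated in a few lines. This sketch computes a paired t map over synthetic data and skips the covariance adjustment; shapes, values, and names are assumptions, not the study's actual pipeline.

```python
import numpy as np

# scans: (n_subjects, X, Y, Z) spatially normalized images per condition.
rng = np.random.default_rng(0)
activation = rng.normal(1.0, 0.1, (6, 16, 16, 16))
reference  = rng.normal(0.9, 0.1, (6, 16, 16, 16))

diff = activation - reference                 # paired differences per subject
mean = diff.mean(axis=0)
sem  = diff.std(axis=0, ddof=1) / np.sqrt(diff.shape[0])
t_map = mean / sem                            # one t value per voxel

# Voxels with large t values are the candidate activated regions.
print(t_map.max())
```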

Color Correction Method for High Dynamic Range Image Using Dynamic Cone Response Function (동적 원추 세포 응답을 이용한 높은 동적 폭을 갖는 영상 색상 보정 방법)

  • Choi, Ho-Hyoung;Yun, Byoung-Ju
    • Journal of the Institute of Electronics and Information Engineers
    • /
    • v.49 no.9
    • /
    • pp.104-112
    • /
    • 2012
  • Recently, HDR imaging techniques that mimic the human eye have been incorporated into LCD/LED display devices to deal with the mismatch between the real-world scene and the displayed image. However, HDR images suffer from veiling glare as well as problems with the scale of local contrast. To overcome these problems, several color correction methods have been proposed, such as CSR (center/surround Retinex), MSR (multi-scale Retinex), tone-mapping methods, and iCAM06. However, these methods leave a dominant color cast throughout the resulting image after color correction. Accordingly, this paper presents a new color correction method using a dynamic cone response function. The proposed method consists of tone-mapping and dynamic cone response. The tone-mapping is obtained by linear interpolation between chromatic and achromatic components. The resulting image is then processed through the dynamic cone response function, which estimates the dynamic responses of the human visual system and handles the mismatch between the real scene and the rendered image. Experimental results show that the proposed method yields better color correction performance than conventional methods.
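
For readers unfamiliar with tone mapping, the classic global Reinhard operator below illustrates the kind of dynamic range compression the paper's tone-mapping stage performs. This is a well-known stand-in, not the paper's interpolation-based mapping or its dynamic cone response function.

```python
import numpy as np

def reinhard_tonemap(hdr: np.ndarray, key: float = 0.18) -> np.ndarray:
    """hdr: (H, W, 3) linear radiance map; returns values in [0, 1)."""
    # Luminance from linear RGB (Rec. 709 weights).
    lum = 0.2126 * hdr[..., 0] + 0.7152 * hdr[..., 1] + 0.0722 * hdr[..., 2]
    log_avg = np.exp(np.mean(np.log(lum + 1e-6)))   # scene "key" (log average)
    scaled = key * lum / log_avg                    # exposure scaling
    mapped = scaled / (1.0 + scaled)                # compress highlights
    ratio = mapped / np.maximum(lum, 1e-6)          # per-pixel luminance ratio
    return hdr * ratio[..., None]                   # preserve chromaticity

hdr = np.random.rand(4, 4, 3) * 100.0   # toy HDR image
ldr = reinhard_tonemap(hdr)             # values now displayable in [0, 1)
```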

Photon Mapping-Based Rendering Technique for Smoke Particles (연기 파티클에 대한 포톤 매핑 기반의 렌더링 기법)

  • Song, Ki-Dong;Ihm, In-Sung
    • Journal of the Korea Computer Graphics Society
    • /
    • v.14 no.4
    • /
    • pp.7-18
    • /
    • 2008
  • To realistically produce fluids such as smoke for visual effects in films or animations, two main processes are needed: physics-based modeling of the smoke, and rendering of the simulation data based on light transport theory. In the computer graphics community, physics-based fluid simulation is generally adopted for smoke modeling. Recently, interest in particle-based Lagrangian simulation methods has been increasing, owing to their advantages in simulation time over the grid-based Eulerian methods that were previously widely used. Because rendering techniques depend heavily on the modeling method, research on rendering particle-based smoke data remains challenging, while research on rendering grid-based smoke data is actively in progress. This paper focuses on a realistic rendering technique for smoke particles produced by Lagrangian simulation. It introduces the particle map, an expansion and modification of the photon mapping technique for particle data, then proposes a novel particle map technique and shows its differences and improvements compared to previous work. In addition, the paper presents an irradiance map technique that precomputes the multiple scattering term of the volume rendering equation to improve efficiency at rendering time.
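
The core operation of any photon-map style renderer, including the particle map described above, is a density estimate over stored particles near a shading point. A minimal sketch, assuming uniform photon powers and using a k-d tree; this is illustrative only, and the paper's particle map and irradiance map structures are more elaborate.

```python
import numpy as np
from scipy.spatial import cKDTree

rng = np.random.default_rng(1)
photon_pos = rng.uniform(0, 1, (10_000, 3))   # stored photon positions
photon_pow = np.full(10_000, 1e-4)            # per-photon power (flux)

tree = cKDTree(photon_pos)                    # spatial index over the map

def radiance_estimate(p: np.ndarray, k: int = 50) -> float:
    """Sum the k nearest photons' power over the sphere that encloses
    them -- the standard photon-map density estimate, here in its 3-D
    volume form as used for participating media such as smoke."""
    dist, idx = tree.query(p, k=k)
    r = dist.max()                            # radius of the enclosing sphere
    volume = (4.0 / 3.0) * np.pi * r ** 3
    return photon_pow[idx].sum() / volume

print(radiance_estimate(np.array([0.5, 0.5, 0.5])))
```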

Gamut Mapping and Extension Method in the xy Chromaticity Diagram for Various Display Devices (다양한 디스플레이 장치를 위한 xy 색도도상에서의 색역 사상 및 확장 기법)

  • Cho Yang-Ho;Kwon Oh-Seol;Son Chang-Hwan;Park Tae-Yong;Ha Yeong-Ho
    • Journal of the Institute of Electronics Engineers of Korea SP
    • /
    • v.43 no.1 s.307
    • /
    • pp.45-54
    • /
    • 2006
  • This paper proposed color matching technique, including display characterization, chromatic adaptation model, and gamut mapping and extension, to generate consistent colors for the same input signal in each display device. It is necessary to characterize the relationship between input and output colors for display device, to apply chromatic adaptation model considering the difference of reference white, and to compensate for the gamut which display devices can represent for reproducing consistent colors on DTV display devices. In this paper, 9 channel-independent GOG model, which is improved from conventional 3 channel GOG(gain, offset gamma) model, is used to consider channel interaction and enhance the modeling accuracy. Then, the input images have to be adjusted to compensate for the limited gamut of each display device. We proposed the gamut mapping and extension method, preserving lightness and hue of an original image and enhancing the saturation of an original image in xy chromaticity diagram. Since the hmm visual system is more sensitive to lightness and hue, these values are maintained as the values of input signal, and the enhancement of saturation is changed to the ratio of input and output gamut. Also the xy chromaticity diagram is effective to reduce the complexity of establishing gamut boundary and the process of reproducing moving-pictures in DTV display devices. As a result, reproducing accurate colors can be implemented when the proposed method is applied to LCD and PDP display devices