• Title/Summary/Keyword: vision model

Search Results: 1,349

Markerless Image-to-Patient Registration Using Stereo Vision: Comparison of Registration Accuracy by Feature Selection Method and Location of Stereo Vision System (스테레오 비전을 이용한 마커리스 정합 : 특징점 추출 방법과 스테레오 비전의 위치에 따른 정합 정확도 평가)

  • Joo, Subin;Mun, Joung-Hwan;Shin, Ki-Young
    • Journal of the Institute of Electronics and Information Engineers / v.53 no.1 / pp.118-125 / 2016
  • This study evaluates the performance of an image-to-patient registration algorithm that uses stereo vision and CT images for surgical navigation of the facial region. The registration process consists of feature extraction, 3D coordinate calculation, and registration of the 3D coordinates to the 3D CT image. Of the five combinations that can be generated from three facial feature extraction methods and three registration methods applied to stereo vision images, this study identifies the one with the highest registration accuracy. In addition, image-to-patient registration accuracy was compared while varying the facial rotation angle. The experiments showed that when the facial rotation angle is within 20 degrees, registration using the Active Appearance Model and Pseudo Inverse Matching has the highest accuracy, whereas beyond 20 degrees, registration using Speeded Up Robust Features and Iterative Closest Point has the highest accuracy. These results indicate that the Active Appearance Model and Pseudo Inverse Matching methods should be used to reduce registration error when the facial rotation angle is within 20 degrees, and the Speeded Up Robust Features and Iterative Closest Point methods should be used when it exceeds 20 degrees.
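
As a rough, illustrative sketch of the registration step described in this abstract, the snippet below aligns two 3D point sets with an SVD-based rigid transform inside a basic ICP loop. It is a minimal stand-in under assumed inputs (matched stereo and CT points), not the authors' implementation:

```python
import numpy as np

def rigid_transform(src, dst):
    """Estimate rotation R and translation t aligning src to dst (Kabsch/SVD).

    src, dst: (N, 3) arrays of matched 3D points, e.g. stereo features
    and their CT counterparts (an assumption for this sketch)."""
    c_src, c_dst = src.mean(axis=0), dst.mean(axis=0)
    H = (src - c_src).T @ (dst - c_dst)          # cross-covariance matrix
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:                     # guard against reflections
        Vt[-1] *= -1
        R = Vt.T @ U.T
    t = c_dst - R @ c_src
    return R, t

def icp(src, dst, iters=50):
    """One simple ICP loop: nearest-neighbour matching + rigid update."""
    cur = src.copy()
    for _ in range(iters):
        # brute-force nearest neighbour in dst for each point in cur
        d2 = ((cur[:, None, :] - dst[None, :, :]) ** 2).sum(-1)
        matched = dst[d2.argmin(axis=1)]
        R, t = rigid_transform(cur, matched)
        cur = cur @ R.T + t                      # apply current estimate
    return cur
```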

Vision-based Target Tracking for UAV and Relative Depth Estimation using Optical Flow (무인 항공기의 영상기반 목표물 추적과 광류를 이용한 상대깊이 추정)

  • Jo, Seon-Yeong;Kim, Jong-Hun;Kim, Jung-Ho;Lee, Dae-Woo;Cho, Kyeum-Rae
    • Journal of the Korean Society for Aeronautical & Space Sciences / v.37 no.3 / pp.267-274 / 2009
  • Recently, UAVs (Unmanned Aerial Vehicles) have drawn much attention as unmanned systems for various missions, many of which rely on a vision system. In particular, missions such as surveillance and pursuit are carried out using vision data transmitted from the UAV. Small UAVs often use monocular vision to limit weight and cost. Research on mission performance with monocular vision continues, but because the ground and the target lie at different distances from the UAV, 3D distance measurement remains inaccurate. In this study, the Mean-Shift algorithm, optical flow, and the subspace method are employed to estimate relative depth. The Mean-Shift algorithm is used for target tracking and for determining the Region of Interest (ROI). Optical flow captures image motion information from pixel intensities. The subspace method then computes the translation and rotation of the image and estimates the relative depth. Finally, we present results obtained from images acquired in UAV experiments.
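
For illustration, the sketch below combines OpenCV mean-shift tracking of a target window with pyramidal Lucas-Kanade optical flow inside it, roughly the tracking-plus-flow front end the abstract describes; the subspace-method depth step is omitted, and the file name and initial window are placeholders:

```python
import cv2
import numpy as np

cap = cv2.VideoCapture("uav_video.mp4")         # hypothetical input file
ok, frame = cap.read()
x, y, w, h = 300, 200, 80, 80                   # assumed initial target window
hsv_roi = cv2.cvtColor(frame[y:y + h, x:x + w], cv2.COLOR_BGR2HSV)
hist = cv2.calcHist([hsv_roi], [0], None, [180], [0, 180])
cv2.normalize(hist, hist, 0, 255, cv2.NORM_MINMAX)
crit = (cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_COUNT, 10, 1)

prev_gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
pts = cv2.goodFeaturesToTrack(prev_gray[y:y + h, x:x + w], 50, 0.01, 5)
pts = pts + np.float32([x, y])                  # shift to full-frame coords

while True:
    ok, frame = cap.read()
    if not ok:
        break
    hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
    back = cv2.calcBackProject([hsv], [0], hist, [0, 180], 1)
    _, (x, y, w, h) = cv2.meanShift(back, (x, y, w, h), crit)  # track ROI
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    new_pts, st, _ = cv2.calcOpticalFlowPyrLK(prev_gray, gray, pts, None)
    flow = new_pts[st == 1] - pts[st == 1]      # per-feature image motion,
                                                # input to a depth estimator
    prev_gray, pts = gray, new_pts[st == 1].reshape(-1, 1, 2)
```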

Automated Vision-based Construction Object Detection Using Active Learning (액티브 러닝을 활용한 영상기반 건설현장 물체 자동 인식 프레임워크)

  • Kim, Jinwoo;Chi, Seokho;Seo, JoonOh
    • KSCE Journal of Civil and Environmental Engineering Research / v.39 no.5 / pp.631-636 / 2019
  • Over the last decade, many researchers have investigated vision-based construction object detection algorithms for construction site monitoring. However, previous methods require ground-truth labeling, the process of manually marking the types and locations of target objects in training images, which wastes a large amount of time and effort. To address this drawback, this paper proposes a vision-based construction object detection framework that employs an active learning technique to reduce manual labeling effort. For validation, the research team performed experiments using an open construction benchmark dataset. The results showed that the method successfully detected construction objects with diverse visual characteristics, and indicated that a high-performance object detection model can be developed with less training data and fewer training iterations than previous approaches. The findings of this study can be used to reduce manual labeling and minimize the time and cost required to build a training database.
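
The core active learning idea can be sketched as an uncertainty-sampling loop; the snippet below uses a least-confidence criterion with a hypothetical `model.predict` API, which may differ from the selection strategy actually used in the paper:

```python
import numpy as np

def select_for_labeling(model, unlabeled_images, budget=100):
    """Uncertainty sampling: pick the images the detector is least sure
    about and send only those to a human annotator.

    `model.predict` is a hypothetical detector call returning per-image
    detection confidence scores; swap in your framework's API."""
    uncertainties = []
    for img in unlabeled_images:
        scores = model.predict(img)              # e.g. box confidences
        # least-confidence strategy: a low top score marks an informative image
        u = 1.0 - (max(scores) if len(scores) else 0.0)
        uncertainties.append(u)
    order = np.argsort(uncertainties)[::-1]      # most uncertain first
    return [unlabeled_images[i] for i in order[:budget]]

# Typical loop: train on a small seed set, select, label, retrain,
# and repeat until detection accuracy stops improving.
```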

Assembly Performance Evaluation for Prefabricated Steel Structures Using k-nearest Neighbor and Vision Sensor (k-근접 이웃 및 비전센서를 활용한 프리팹 강구조물 조립 성능 평가 기술)

  • Bang, Hyuntae;Yu, Byeongjun;Jeon, Haemin
    • Journal of the Computational Structural Engineering Institute of Korea / v.35 no.5 / pp.259-266 / 2022
  • In this study, we developed a deep learning and vision sensor-based assembly performance evaluation method for prefabricated steel structures. The assembly parts were segmented using a modified version of the receptive field block convolution module, inspired by the eccentric function of the human visual system. Assembly quality was evaluated by detecting the bolt holes in the segmented assembly part and calculating their positions. To validate the method, models of standard and defective assembly parts were produced using a 3D printer. The assembly part segmentation network was trained on images of the 3D models captured with a vision sensor. The bolt hole positions in the segmented assembly image were calculated using image processing techniques, and the assembly performance evaluation using the k-nearest neighbor algorithm was verified. The experimental results show that the assembly parts were segmented with high precision and that assembly performance, based on the positions of the bolt holes in the detected assembly part, was evaluated with a classification error of less than 5%.
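
The final classification step can be illustrated with scikit-learn's k-nearest neighbor classifier; the feature layout (concatenated bolt-hole coordinates), the sample values, and the k value below are assumptions for the sketch, not the paper's specification:

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# Classify an assembly as normal/defective from a fixed-length feature
# vector of detected bolt-hole positions (illustrative numbers).
X_train = np.array([
    [10.1, 20.0, 50.2, 20.1],    # bolt-hole (x, y) pairs, normal part
    [10.0, 19.9, 50.0, 20.0],    # normal part
    [12.5, 22.3, 47.8, 18.2],    # misaligned holes, defective part
])
y_train = ["normal", "normal", "defective"]

knn = KNeighborsClassifier(n_neighbors=1)
knn.fit(X_train, y_train)

measured = np.array([[10.2, 20.1, 50.1, 20.0]])  # from the vision sensor
print(knn.predict(measured))                      # -> ['normal']
```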

Correlation Extraction from KOSHA to enable the Development of Computer Vision based Risks Recognition System

  • Khan, Numan;Kim, Youjin;Lee, Doyeop;Tran, Si Van-Tien;Park, Chansik
    • International conference on construction engineering and project management / 2020.12a / pp.87-95 / 2020
  • Generally, occupational safety, and construction safety in particular, is an intricate phenomenon. Industry professionals have devoted vital attention to enforcing Occupational Safety and Health (OSH) over the last three decades to enhance safety management in construction. Despite the efforts of safety professionals and government agencies, current safety management still relies on manual inspections, which are infrequent, time-consuming, and prone to error. Extensive research has been carried out to deal with the high fatality rates confronting the construction industry, and sensor systems, visualization-based technologies, and tracking techniques have been deployed over the last decade. Recently, computer vision has attracted significant attention in the construction industry worldwide. However, the literature reveals that computer vision has so far been applied to safety management only within a narrow scope; broader research on safety monitoring is needed to attain fully automatic job site monitoring. In this regard, the development of a broader-scope computer vision-based risk recognition system that detects correlations between construction entities is essential. For this purpose, a detailed analysis was conducted and rules depicting the correlations (positive and negative) between construction entities were extracted from KOSHA. The deep learning-based Mask R-CNN algorithm is applied to train the model. As proof of concept, a prototype was developed based on real scenarios. The proposed approach is expected to enhance the effectiveness of safety inspection and reduce the burden on safety managers. It is anticipated that this approach may reduce injuries and fatalities by enforcing the exact relevant safety rules, and will contribute to enhancing overall safety management and monitoring performance.
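
As a loose illustration of the detection backbone, the sketch below runs torchvision's off-the-shelf Mask R-CNN and keeps confident detections; the COCO weights, the file name, and the hinted rule check are placeholders rather than the authors' trained site-entity model:

```python
import torch
import torchvision
from torchvision.transforms.functional import to_tensor
from PIL import Image

model = torchvision.models.detection.maskrcnn_resnet50_fpn(weights="DEFAULT")
model.eval()

img = to_tensor(Image.open("site_photo.jpg").convert("RGB"))  # placeholder
with torch.no_grad():
    out = model([img])[0]                # dict: boxes, labels, scores, masks

keep = out["scores"] > 0.7               # keep confident detections only
boxes, labels = out["boxes"][keep], out["labels"][keep]
# A rule such as "worker must not stand under a suspended load" would be
# checked here by testing spatial relations between the kept boxes.
```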


Diagnosis of the Rice Lodging for the UAV Image using Vision Transformer (Vision Transformer를 이용한 UAV 영상의 벼 도복 영역 진단)

  • Hyunjung Myung;Seojeong Kim;Kangin Choi;Donghoon Kim;Gwanghyeong Lee;Hyunggeun Ahn;Sunghwan Jeong;Byoungjun Kim
    • Smart Media Journal / v.12 no.9 / pp.28-37 / 2023
  • The main factor behind declines in rice yield is damage caused by localized heavy rain or typhoons. Analyzing rice lodging areas by visual inspection and judgment during field surveys of the affected area makes it difficult to obtain objective results and requires considerable time and money. In this paper, we propose a method for estimating and diagnosing rice lodging areas in RGB images captured by unmanned aerial vehicles, using the Vision Transformer-based SegFormer. The proposed method estimates the lodging, normal, and background areas with the SegFormer model, and the lodging rate is diagnosed using the rice field inspection criteria in the Seed Industry Act. The diagnosis results can be used to map the distribution of rice lodging areas, show lodging trends, and support government quality management of certified seed. The proposed rice lodging area estimation achieves a mean accuracy of 98.33% and an mIoU of 96.79%.
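
A minimal SegFormer inference sketch follows, using the Hugging Face `transformers` API; the checkpoint, the three class ids, and the lodging-rate formula are illustrative assumptions, since the paper fine-tunes its own model and applies the Seed Industry Act criteria:

```python
import torch
from transformers import SegformerImageProcessor, SegformerForSemanticSegmentation
from PIL import Image

processor = SegformerImageProcessor()
model = SegformerForSemanticSegmentation.from_pretrained(
    "nvidia/mit-b0", num_labels=3)       # decode head would be fine-tuned
                                          # on UAV imagery before use

img = Image.open("uav_rice_field.png").convert("RGB")   # placeholder file
inputs = processor(images=img, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits      # (1, 3, H/4, W/4) class scores
pred = logits.argmax(dim=1)

# assumed class ids: 0 = background, 1 = normal rice, 2 = lodged rice
lodged = (pred == 2).sum().item()
normal = (pred == 1).sum().item()
lodging_rate = lodged / max(lodged + normal, 1)   # fraction of rice lodged
print(f"lodging rate: {lodging_rate:.1%}")
```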

Computational Retinal Model by emphasizing region contrast (영역대비강조에 의한 계산론적 망막모델)

  • Je Sung-kwan;Kim Kwang-back;Cho Jae-hyun;Cha Eui-young
    • Journal of the Korea Institute of Information and Communication Engineering / v.9 no.7 / pp.1594-1600 / 2005
  • Recently, many studies of the human vision model have been conducted to solve problems in machine vision. Starting from research on the human visual system, we first investigate the mechanisms of the retina through physiological and biological evidence. In the retina, input data undergoes information processing consisting of data reduction, edge detection, and region emphasis, and the processed image is recognized by region. In this paper, we propose a retinal algorithm that performs data reduction and edge detection with the wavelet transform and emphasizes region contrast. In experiments, the proposed model simulates the retina's output processing at each level and compares the results with retinal outputs.
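
The two retinal stages named above, data reduction and edge detection via the wavelet transform plus a region-contrast boost, can be sketched as follows; the Haar wavelet and the contrast gain are illustrative choices, not the paper's parameters:

```python
import numpy as np
import pywt

def retina_like(image):
    # single-level 2D DWT: the LL band approximates the image at half
    # resolution (data reduction); LH/HL/HH carry horizontal, vertical,
    # and diagonal edge detail (edge detection)
    LL, (LH, HL, HH) = pywt.dwt2(image.astype(float), "haar")
    edges = np.abs(LH) + np.abs(HL) + np.abs(HH)
    # crude region-contrast emphasis: stretch LL about its mean
    contrast = (LL - LL.mean()) * 1.5 + LL.mean()
    return contrast, edges

img = np.random.rand(128, 128)   # placeholder for an input image
reduced, edge_map = retina_like(img)
print(reduced.shape)             # (64, 64): a quarter of the data
```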

3D Building Reconstruction Using Building Model and Segment Measure Function (건물모델 및 선소측정함수를 이용한 건물의 3차원 복원)

  • Ye, Chul-Soo;Lee, Kwae-Hi
    • Journal of the Institute of Electronics Engineers of Korea SP / v.37 no.4 / pp.46-55 / 2000
  • This paper presents an algorithm for 3D building reconstruction from a pair of stereo aerial images using a 3D building model and the linear segments of the building. Linear segments are extracted directly from the original building images using a parametric building model, instead of employing conventional procedures such as edge detection, linear approximation, and line linking. A segment measure function is simultaneously applied to each extracted line segment to improve the accuracy of building detection compared with individual linear segment detection. The algorithm has been applied to pairs of stereo aerial images, and the results showed accurate detection and reconstruction of buildings.
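
For context, a conventional line-segment extraction baseline (Canny edges plus probabilistic Hough) is sketched below; note that the paper deliberately avoids this pipeline, extracting segments directly with a parametric building model and weighting them with a segment measure function:

```python
import cv2
import numpy as np

img = cv2.imread("aerial_patch.png", cv2.IMREAD_GRAYSCALE)  # placeholder
edges = cv2.Canny(img, 50, 150)                 # edge detection
segments = cv2.HoughLinesP(edges, 1, np.pi / 180, threshold=40,
                           minLineLength=20, maxLineGap=3)
for x1, y1, x2, y2 in segments[:, 0]:
    length = np.hypot(x2 - x1, y2 - y1)   # one cue a segment measure
    print((x1, y1), (x2, y2), length)     # function could weigh per segment
```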


Information Processing in Primate Retinal Ganglion

  • Je, Sung-Kwan;Cho, Jae-Hyun;Kim, Gwang-Baek
    • Journal of information and communication convergence engineering / v.2 no.2 / pp.132-137 / 2004
  • Most current computer vision theories are based on hypotheses that are difficult to apply to the real world, and they merely imitate a coarse form of the human visual system; as a result, they have not shown satisfying results. In the human visual system, there is a mechanism that processes information under memory degradation over time and limited storage space. Starting from research on the human visual system, this study analyzes the mechanism that processes input information as it is transferred from the retina to the ganglion cells. A model of the characteristics of retinal ganglion cells is proposed after considering the structure of the retina and the efficiency of storage space. The MNIST database of handwritten digits is used as the data for this research, with ART2 and SOM as recognizers. The results show that the proposed recognition model differs little from a general recognition model in recognition rate, but the efficiency of storage space can be improved by a mechanism that processes the input information.
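
A minimal self-organizing map (SOM), one of the two recognizers used in the study, can be sketched in a few lines of NumPy; the grid size, decay schedule, and random stand-in for MNIST vectors are illustrative assumptions (ART2 is omitted):

```python
import numpy as np

rng = np.random.default_rng(0)
grid, dim = (10, 10), 784                 # 10x10 map, 28x28 pixel inputs
W = rng.random((grid[0], grid[1], dim))   # codebook vectors

def train(X, epochs=5, lr=0.5, sigma=2.0):
    global lr_, sigma_
    for _ in range(epochs):
        for x in X:
            d = ((W - x) ** 2).sum(axis=2)                # distance to units
            bi, bj = np.unravel_index(d.argmin(), grid)   # best-matching unit
            ii, jj = np.indices(grid)
            h = np.exp(-((ii - bi) ** 2 + (jj - bj) ** 2) / (2 * sigma**2))
            W += lr * h[..., None] * (x - W)              # pull neighbourhood
        lr *= 0.9
        sigma *= 0.9                                      # shrink over time

X = rng.random((100, dim))                # placeholder for MNIST vectors
train(X)
```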

Deconvolution Pixel Layer Based Semantic Segmentation for Street View Images (디컨볼루션 픽셀층 기반의 도로 이미지의 의미론적 분할)

  • Wahid, Abdul;Lee, Hyo Jong
    • Proceedings of the Korea Information Processing Society Conference / 2019.05a / pp.515-518 / 2019
  • Semantic segmentation has remained a challenging problem in the field of computer vision. Given the immense power of Convolutional Neural Network (CNN) models, many complex problems in computer vision have been solved, and semantic segmentation, the task of assigning each pixel of an image to a category, has seen prolific results over time. We propose a convolutional neural network model that combines a fully convolutional network with deconvolutional pixel layers. The goal is to create a hierarchy of features: the fully convolutional model does the primary learning, and the deconvolutional model then visually segments the target image. The proposed approach creates direct links among adjacent pixels in the resulting feature maps and preserves spatial features such as corners and edges, adding accuracy to the resulting outputs. We test our algorithm on the Karlsruhe Institute of Technology and Toyota Technological Institute (KITTI) street view dataset. Our method achieves an mIoU of 92.04%.
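
The encoder-plus-deconvolution idea can be sketched as a tiny fully convolutional model with transposed-convolution upsampling; the channel counts, depth, and class count below are illustrative, far smaller than the network in the paper:

```python
import torch
import torch.nn as nn

class TinyFCN(nn.Module):
    def __init__(self, n_classes=11):
        super().__init__()
        self.encode = nn.Sequential(      # fully convolutional feature learner
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.decode = nn.Sequential(      # deconvolutional pixel layers:
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, n_classes, 4, stride=2, padding=1),
        )                                 # upsample back to input resolution

    def forward(self, x):
        return self.decode(self.encode(x))   # (N, n_classes, H, W)

x = torch.randn(1, 3, 128, 384)           # a KITTI-like crop
print(TinyFCN()(x).shape)                 # torch.Size([1, 11, 128, 384])
```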