• Title/Summary/Keyword: Computer vision technology


Efficient Image Warping Mechanism Using Template Matching and Partial Warping (템플릿 매칭과 부분 워핑을 이용한 효율적인 원근 영상 워핑 기법)

  • Jeong, Dae-Heon;Cho, Tai-Hoon
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2017.05a / pp.339-342 / 2017
  • Geometric transforms are used for image correction, and computer vision offers many correction methods, such as rigid-body and similarity transforms. Image warping is used to correct images with perspective distortion. To warp an image, four feature points are extracted at the warping position. However, it is difficult to extract the four points accurately, and warping with inaccurate points produces an error of more than 3 or 4 pixels at the warping position. We therefore use template matching to extract the four points and, to verify the result, repeatedly select two of the four points, shift their positions within a 3-by-3 pixel neighborhood, and warp the image for each shift. In this way we select the optimal four points, with an error of less than 1 pixel, and finally warp the image using these points. This approach makes it possible to obtain an optimal result.

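
The point-refinement step described in the abstract above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the template-matching stage is omitted, the four initial points are taken as given, and the error is measured at a single hypothetical check point whose target position is assumed to be known. The names `homography`, `warp_point`, `refine`, `check_src`, and `check_dst` are all assumptions introduced for this sketch.

```python
from itertools import product

def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(n):
            if r != c:
                f = m[r][c] / m[c][c]
                m[r] = [x - f * y for x, y in zip(m[r], m[c])]
    return [m[i][n] / m[i][i] for i in range(n)]

def homography(src, dst):
    """3x3 perspective transform (h33 fixed to 1) mapping four src points to dst."""
    a, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        a.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        b.append(u)
        a.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.append(v)
    h = solve(a, b)
    return [h[0:3], h[3:6], [h[6], h[7], 1.0]]

def warp_point(h, p):
    """Apply the perspective transform h to a single point p."""
    x, y = p
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return ((h[0][0] * x + h[0][1] * y + h[0][2]) / w,
            (h[1][0] * x + h[1][1] * y + h[1][2]) / w)

def refine(src, dst, check_src, check_dst, movable=(0, 1)):
    """Try every 3x3 shift of the movable points; keep the lowest check error."""
    best = None
    for shifts in product(product((-1, 0, 1), repeat=2), repeat=len(movable)):
        pts = list(src)
        for i, (dx, dy) in zip(movable, shifts):
            pts[i] = (src[i][0] + dx, src[i][1] + dy)
        u, v = warp_point(homography(pts, dst), check_src)
        err = ((u - check_dst[0]) ** 2 + (v - check_dst[1]) ** 2) ** 0.5
        if best is None or err < best[0]:
            best = (err, pts)
    return best
```

Calling `refine` once per pair of points (by changing `movable`) would mimic the repeated two-point selection; how the abstract's method actually measures the warping error is not stated, so the check point is only a stand-in.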

Active Facial Tracking for Fatigue Detection (피로 검출을 위한 능동적 얼굴 추적)

  • 박호식;정연숙;손동주;나상동;배철수
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2004.05b / pp.603-607 / 2004
  • Vision-based driver fatigue detection is one of the most promising commercial applications of facial expression recognition technology, and facial feature tracking is its primary technical issue. Current facial tracking technology faces three challenges: (1) failure to detect some or all features under varying lighting conditions and head motions; (2) tracking of multiple, non-rigid objects; and (3) feature occlusion when the head is at an oblique angle. In this paper, we propose a new active approach. First, an active IR sensor is used to robustly detect pupils under variable lighting conditions. The detected pupils are then used to predict the head motion. Furthermore, face movement is assumed to be locally smooth, so that a facial feature can be tracked with a Kalman filter. The simultaneous use of the pupil constraint and Kalman filtering greatly increases the prediction accuracy for each feature position. Feature detection is performed in Gabor space in the vicinity of the predicted location. Local graphs consisting of identified features are extracted and used to capture the spatial relationships among detected features. Finally, graph-based reliability propagation is proposed to tackle the occlusion problem and verify the tracking results. The experimental results show the validity of our active approach for real-life facial tracking under variable lighting conditions, head orientations, and facial expressions.

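
The Kalman-filter tracking mentioned in the abstract above can be illustrated with a minimal sketch. It assumes a constant-velocity model for one image coordinate of a single feature, which is one common way to encode "locally smooth" motion; the pupil constraint, Gabor-space detection, and graph-based verification are not modeled. The function name `kalman_track` and the noise parameters `q` and `r` are assumptions for this sketch.

```python
def kalman_track(measurements, dt=1.0, q=1e-3, r=1.0):
    """Constant-velocity Kalman filter for one coordinate of a tracked feature."""
    x, v = measurements[0], 0.0          # state: position and velocity
    p = [[1.0, 0.0], [0.0, 1.0]]         # state covariance
    estimates = []
    for z in measurements:
        # Predict: x' = x + v*dt and P' = F P F^T + Q, with Q = q * I (assumed)
        x = x + v * dt
        p00 = p[0][0] + dt * (p[1][0] + p[0][1]) + dt * dt * p[1][1] + q
        p01 = p[0][1] + dt * p[1][1]
        p10 = p[1][0] + dt * p[1][1]
        p11 = p[1][1] + q
        # Update with the measured position z (measurement matrix H = [1, 0])
        s = p00 + r                      # innovation variance
        k0, k1 = p00 / s, p10 / s        # Kalman gain
        y = z - x                        # innovation
        x, v = x + k0 * y, v + k1 * y
        p = [[(1 - k0) * p00, (1 - k0) * p01],
             [p10 - k1 * p00, p11 - k1 * p01]]
        estimates.append(x)
    return estimates
```

In a tracker of this kind, the predicted position before each update would define the search window in which the feature detector looks for the feature in the next frame.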

Deep Learning-based Real-Time Super-Resolution Architecture Design (경량화된 딥러닝 구조를 이용한 실시간 초고해상도 영상 생성 기술)

  • Ahn, Saehyun;Kang, Suk-Ju
    • Journal of Broadcast Engineering / v.26 no.2 / pp.167-174 / 2021
  • Recently, deep learning has been widely used in computer vision applications such as object recognition, classification, and image generation. In particular, deep learning-based super-resolution has achieved significant performance improvements. The fast super-resolution convolutional neural network (FSRCNN) is a well-known deep learning-based super-resolution model in which the output image is generated by a deconvolutional layer. In this paper, we propose an FPGA-based convolutional neural network accelerator designed for parallel computing efficiency. We also propose Optimal-FSRCNN, a modified version of the FSRCNN structure. The number of multipliers is reduced by a factor of 3.47 compared to FSRCNN, while the PSNR remains similar to that of FSRCNN. We developed a real-time image processing technology implemented on an FPGA.
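
Since the abstract above compares models by PSNR, the following minimal sketch shows the standard PSNR definition for 8-bit images, PSNR = 10 log10(MAX^2 / MSE). It is a generic illustration, not the paper's evaluation code; images are assumed to be nested lists of pixel values, and the name `psnr` is chosen for this sketch.

```python
import math

def psnr(reference, test, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equal-size images."""
    flat_ref = [p for row in reference for p in row]
    flat_test = [p for row in test for p in row]
    # Mean squared error over all pixels
    mse = sum((a - b) ** 2 for a, b in zip(flat_ref, flat_test)) / len(flat_ref)
    if mse == 0:
        return math.inf              # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)
```

A higher PSNR means the reconstructed image is closer to the reference, so "similar PSNR" indicates that the reduced multiplier count did not noticeably degrade reconstruction quality.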

A System for Determining the Growth Stage of Fruit Tree Using a Deep Learning-Based Object Detection Model (딥러닝 기반의 객체 탐지 모델을 활용한 과수 생육 단계 판별 시스템)

  • Bang, Ji-Hyeon;Park, Jun;Park, Sung-Wook;Kim, Jun-Yung;Jung, Se-Hoon;Sim, Chun-Bo
    • Smart Media Journal / v.11 no.4 / pp.9-18 / 2022
  • Recently, research on AI and systems that use it have been increasing rapidly in various fields. Smart farms that use artificial intelligence and information and communication technology are also being studied in agriculture. In addition, data-based precision agriculture is being commercialized through the convergence of advanced technologies such as autonomous driving, satellites, and big data. In Korea, the number of commercialization cases in facility agriculture is increasing within smart agriculture. However, research and investment are biased toward facility agriculture, and the gap in research and investment between facility agriculture and open-air agriculture continues to widen. Fruit trees and plant factories receive little research and investment, and systems for collecting and using big data are insufficient. In this paper, we propose a system for determining the growth stage of fruit trees using a deep learning-based object detection model. The system is designed as a hybrid app for use at agricultural sites. In addition, we implement an object detection function for determining the fruit tree growth stage.

Non-contact mobile inspection system for tunnels: a review (터널의 비접촉 이동식 상태점검 장비: 리뷰)

  • Chulhee Lee;Donggyou Kim
    • Journal of Korean Tunnelling and Underground Space Association / v.25 no.3 / pp.245-259 / 2023
  • The purpose of this paper is to examine the most recent tunnel scanning systems to obtain insights for the development of a non-contact mobile inspection system. Tunnel scanning systems are mostly developed by adapting one of two main technologies: laser scanning or image scanning. Laser scanning systems have the advantage of accurately recreating the geometric characteristics of tunnel linings from point clouds. Image scanning systems, on the other hand, employ computer vision to readily identify damage such as fine cracks and leaks on the tunnel lining surface. The analysis suggests that image scanning systems are more suitable for detecting damage on tunnel linings. A camera-based tunnel scanning system under development should include components such as lighting, data storage, a power supply, and an image-capturing controller synchronized with vehicle speed.

Real-time Online Study and Exam Attitude Dataset Design and Implementation (실시간 온라인 수업 및 시험 태도 데이터 세트 설계 및 구현)

  • Kim, Junsik;Lee, Chanhwi;Song, Hyok;Kwon, Soonchul
    • Journal of Broadcast Engineering / v.27 no.1 / pp.124-132 / 2022
  • Recently, due to COVID-19, the shift to online remote classes and non-face-to-face exams has made it difficult to monitor class attitudes and exam cheating. Therefore, there is a need for a system that automatically recognizes and detects the behavior of students online. Action recognition, which recognizes human actions, is one of the most studied technologies in computer vision. Developing such a technology requires data that include human arm movement information and information about surrounding objects, which can be key information in online classes and exams. Existing datasets are difficult to apply to this system because they are classified into various fields or consist of daily-life actions. In this paper, we propose a dataset that can be used to classify attitudes in real-time online tests and classes. In addition, we verify whether the proposed dataset is correctly constructed by comparing it with existing action recognition datasets.

Person Identification based on Clothing Feature (의상 특징 기반의 동일인 식별)

  • Choi, Yoo-Joo;Park, Sun-Mi;Cho, We-Duke;Kim, Ku-Jin
    • Journal of the Korea Computer Graphics Society / v.16 no.1 / pp.1-7 / 2010
  • With the widespread use of vision-based surveillance systems, the capability for person identification is now an essential component. However, the CCTV cameras used in surveillance systems tend to produce relatively low-resolution images, making it difficult to use face recognition techniques for person identification. Therefore, an algorithm is proposed for person identification in CCTV camera images based on the clothing. Whenever a person is authenticated at the main entrance of a building, the clothing feature of that person is extracted and added to the database. Using a given image, the clothing area is detected using background subtraction and skin color detection techniques. The clothing feature vector is then composed of textural and color features of the clothing region, where the textural feature is extracted based on a local edge histogram, while the color feature is extracted using octree-based quantization of a color map. When given a query image, the person can then be identified by finding the most similar clothing feature from the database, where the Euclidean distance is used as the similarity measure. Experimental results show an 80% success rate for person identification with the proposed algorithm, and only a 43% success rate when using face recognition.
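
The matching step described in the abstract above (finding the most similar clothing feature by Euclidean distance) can be sketched as a nearest-neighbor lookup. This is a hypothetical illustration: the feature vectors are assumed to be already extracted as fixed-length lists combining the textural and color parts, and the names `identify` and `database` are introduced only for this sketch.

```python
import math

def identify(query, database):
    """Return the person ID whose stored clothing feature is closest to the query.

    database maps person IDs to feature vectors of the same length as query.
    """
    return min(database, key=lambda pid: math.dist(query, database[pid]))
```

For example, with `database = {"A": [0.9, 0.1, 0.3], "B": [0.2, 0.8, 0.5]}`, the query `[0.85, 0.15, 0.25]` lies closer to the entry for "A" and would be identified as that person.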

Optical Design of a Reflecting Omnidirectional Vision System for Long-wavelength Infrared Light (원적외선용 반사식 전방위 비전 시스템의 광학 설계)

  • Ju, Yun Jae;Jo, Jae Heung;Ryu, Jae Myung
    • Korean Journal of Optics and Photonics / v.30 no.2 / pp.37-47 / 2019
  • A reflecting omnidirectional optical system with four spherical and aspherical mirrors, for use with long-wavelength infrared light (LWIR) for night surveillance, is proposed. It is designed to include a collecting pseudo-Cassegrain reflector and an imaging inverse pseudo-Cassegrain reflector, and the design process and performance analysis are reported in detail. The half-field of view (HFOV) and F-number of this optical system are $40-110^{\circ}$ and 1.56, respectively. For LWIR imaging, the size of the image must be similar to that of the microbolometer sensor for LWIR. As a result, the size of the image must be $5.9mm{\times}5.9mm$ if possible. The image size ratio for an HFOV range of $40^{\circ}$ to $110^{\circ}$ after optimizing the design is 48.86%. At a spatial frequency of 20 lp/mm when the HFOV is $110^{\circ}$, the modulation transfer function (MTF) for LWIR is 0.381. Additionally, the cumulative probability of tolerance for the LWIR at a spatial frequency of 20 lp/mm is 99.75%. As a result of athermalization analysis in the temperature range of $-32^{\circ}C$ to $+55^{\circ}C$, we find that the secondary mirror of the inverse pseudo-Cassegrain reflector can function as a compensator, to alleviate MTF degradation with rising temperature.

A Study on the Correlation of Factors in 3-D Stereoscopic Visual-perception (3차원 입체영상에서 시지각(時知覺) 요인의 상관관계)

  • Cho, Yong-Keun
    • Cartoon and Animation Studies / s.19 / pp.161-181 / 2010
  • Human beings experience the outside world through their senses and have developed various means of representation to preserve what they have experienced. The rapid progress of digital technology has opened a new era of representation technology and, furthermore, offers new experiences. Sight, on which humans depend for more than 70% of their perception of the outside world, has become central to representing reality as 3-D graphics technology has grown and been combined with other areas of study. Various technologies for expressing a sense of reality, such as technologies that augment virtual reality and represent it in the real world, computer graphics, TUI technology, and five-sense technologies that apply the human senses, are advancing on the basis of human visual features and sensory elements. In particular, 3-D technology for displaying a three-dimensional effect provides not only representation but also new sensory experiences, and is emerging as a key technology for image content. However, compared to the development of 3-D graphics technology, there have been few basic studies on the principles of visual perception. Therefore, this study examines the principles and elements of perceiving video. It investigates the sensory features of 3-D images that convey a sense of reality, focusing on the experiential and physiological elements of perceiving 3-D structure and the physical and psychological elements of perceiving shape, in the hope of serving as a basic study for producing 3-D content.


Inexpensive Visual Motion Data Glove for Human-Computer Interface Via Hand Gesture Recognition (손 동작 인식을 통한 인간 - 컴퓨터 인터페이스용 저가형 비주얼 모션 데이터 글러브)

  • Han, Young-Mo
    • The KIPS Transactions:PartB / v.16B no.5 / pp.341-346 / 2009
  • The motion data glove is a representative human-computer interaction tool that inputs human hand gestures to computers by measuring hand motion. It is essential equipment for new computer technologies including home automation, virtual reality, biometrics, and motion capture. To promote its widespread use, this paper develops an inexpensive visual-type motion data glove that can be used without any special equipment. The proposed glove is low-cost because it does not use the expensive motion-sensing fibers of conventional approaches, which makes it easy to produce and to adopt widely. Instead of a mechanical method based on motion-sensing fibers, the approach adopts a visual method obtained by improving conventional optical motion capture technology. Compared to conventional visual methods, the proposed method has the following advantages and original contributions. First, conventional visual methods use many cameras and much equipment to reconstruct 3D pose while eliminating occlusions, whereas the proposed method adopts a monocular vision approach that allows simple, low-cost equipment. Second, conventional monocular vision methods have difficulty reconstructing the 3D pose of occluded parts of an image, whereas the proposed approach can reconstruct occluded parts by using specially designed thin-bar-shaped optical indicators. Third, many conventional methods use nonlinear numerical image analysis algorithms, which are inconvenient in their initialization and computation time; the proposed method avoids these problems by using a closed-form image analysis algorithm obtained from an original formulation. Fourth, many conventional closed-form algorithms use approximations in their formulation, which leads to low accuracy and to limited applicability due to singularities; the proposed method overcomes these disadvantages with an original formulation technique in which the closed-form algorithm is derived using exponential-form twist coordinates instead of approximations or local parameterizations such as Euler angles.
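
As background for the exponential-form coordinates mentioned in the abstract above, the sketch below shows the rotational part of the exponential map: Rodrigues' formula, R = I + sin(θ) K + (1 − cos(θ)) K², where K is the skew-symmetric matrix of the unit rotation axis. This is a generic illustration of the parameterization and not the paper's closed-form algorithm; a full twist also carries a translational part, which is omitted here. The name `exp_so3` is an assumption for this sketch.

```python
import math

def exp_so3(w):
    """Rotation matrix from an axis-angle vector w via Rodrigues' formula."""
    theta = math.sqrt(sum(c * c for c in w))     # rotation angle
    if theta < 1e-12:
        return [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    x, y, z = (c / theta for c in w)             # unit rotation axis
    # Skew-symmetric matrix K of the unit axis, and its square
    k = [[0.0, -z, y], [z, 0.0, -x], [-y, x, 0.0]]
    k2 = [[sum(k[i][m] * k[m][j] for m in range(3)) for j in range(3)]
          for i in range(3)]
    s, c = math.sin(theta), 1.0 - math.cos(theta)
    return [[(1.0 if i == j else 0.0) + s * k[i][j] + c * k2[i][j]
             for j in range(3)] for i in range(3)]
```

Unlike Euler angles, this parameterization needs no choice of axis order and has no gimbal-lock singularity, which is consistent with the motivation the abstract gives for avoiding local parameterizations.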