• Title/Summary/Keyword: Vision processing


A Study on Design and Interpretation of Pattern Laser Coordinate Tracking Method for Curved Screen Using Multiple Cameras (다중카메라를 이용한 곡면 스크린의 패턴 레이저 좌표 추적 방법 설계와 해석 연구)

  • Jo, Jinpyo;Kim, Jeongho;Jeong, Yongbae
    • Journal of Platform Technology
    • /
    • v.9 no.4
    • /
    • pp.60-70
    • /
    • 2021
  • This paper proposes a method for stably tracking the coordinates of a patterned laser image in a curved-screen shooting system that uses two or more cameras. Applied to a multi-screen shooting setup that can replace HMD-based shooting, the method tracks and acquires target points very effectively. The severely distorted curved-screen images obtained from the individual cameras are corrected through image normalization, binarization, and noise removal. Each corrected image is then converted into a Euclidean space map, built from matching points, on which the firing point is easy to track. In experiments, the image coordinates of the pattern laser were stably extracted in the curved-screen shooting system, and the error between the real-world target-point position and its position on the wide-area Euclidean map was minimized, confirming the reliability of the proposed method.
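The preprocessing chain described above (normalization, binarization, coordinate extraction) can be sketched with NumPy. This is a minimal illustration, not the paper's implementation: the noise-removal step and the Euclidean map construction are omitted, and the threshold value is an illustrative assumption.

```python
import numpy as np

def normalize(img):
    """Stretch intensities to the full 0-255 range."""
    img = img.astype(np.float64)
    lo, hi = img.min(), img.max()
    if hi == lo:
        return np.zeros_like(img, dtype=np.uint8)
    return ((img - lo) / (hi - lo) * 255).astype(np.uint8)

def binarize(img, thresh=200):
    """Keep only the brightest pixels (the laser pattern)."""
    return (img >= thresh).astype(np.uint8)

def spot_centroid(mask):
    """Centroid of the binary laser spot as (row, col)."""
    ys, xs = np.nonzero(mask)
    if ys.size == 0:
        return None
    return float(ys.mean()), float(xs.mean())

# Synthetic frame: dim background with one bright laser spot.
frame = np.full((120, 160), 30, dtype=np.uint8)
frame[40:44, 100:104] = 250
coord = spot_centroid(binarize(normalize(frame)))
```

In the paper, the extracted coordinate would then be looked up in the Euclidean map built from camera matching points.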

Assessment and Comparison of Three Dimensional Exoscopes for Near-Infrared Fluorescence-Guided Surgery Using Second-Window Indocyanine-Green

  • Cho, Steve S.;Teng, Clare W.;Ravin, Emma De;Singh, Yash B.;Lee, John Y.K.
    • Journal of Korean Neurosurgical Society
    • /
    • v.65 no.4
    • /
    • pp.572-581
    • /
    • 2022
  • Objective : Compared to microscopes, exoscopes have advantages in field depth, ergonomics, and educational value. Exoscopes are especially well poised for adaptation into fluorescence-guided surgery (FGS) because of their excitation source, light path, and image-processing capabilities. We evaluated the feasibility of near-infrared FGS using a 3-dimensional (3D), 4K exoscope with near-infrared fluorescence imaging capability, then compared it to the most sensitive commercially available near-infrared exoscope system (3D, 960p). In-vitro and intraoperative comparisons were performed. Methods : Serial dilutions of indocyanine green (1-2000 µg/mL) were imaged with the 3D, 4K Olympus Orbeye (system 1) and the 3D, 960p VisionSense Iridium (system 2). Near-infrared sensitivity was calculated using signal-to-background ratios (SBRs). In addition, three patients with brain tumors were administered indocyanine green and imaged with system 1; two were also imaged with system 2 for comparison. Results : Systems 1 and 2 detected near-infrared fluorescence from indocyanine green concentrations of >250 µg/L and >31.3 µg/L, respectively. Intraoperatively, system 1 visualized strong near-infrared fluorescence from two strongly gadolinium-enhancing meningiomas (SBR=2.4, 1.7). The high-resolution, bright images were sufficient for the surgeon to appreciate the underlying anatomy in the near-infrared mode. However, system 1 could not visualize fluorescence from a weakly enhancing intraparenchymal metastasis. In contrast, system 2 successfully visualized both the meningioma and the metastasis but lacked high-resolution stereopsis. Conclusion : Three-dimensional exoscope systems provide an alternative visualization platform for both standard microsurgery and near-infrared fluorescence-guided surgery. However, when tumor fluorescence is weak (i.e., low fluorophore uptake, deep tumors), highly sensitive near-infrared visualization systems may be required.
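The signal-to-background ratio used above to quantify near-infrared sensitivity is a ratio of mean intensities inside and outside the fluorescent target. A minimal sketch; the ROI masks here are toy placeholders, not the paper's measurement regions:

```python
import numpy as np

def signal_to_background_ratio(image, signal_mask, background_mask):
    """SBR = mean fluorescence inside the target / mean outside it."""
    signal = image[signal_mask].mean()
    background = image[background_mask].mean()
    return signal / background

# Toy near-infrared frame: uniform background with one fluorescent patch.
img = np.full((10, 10), 50.0)
img[2:5, 2:5] = 120.0                      # fluorescent region
sig = np.zeros((10, 10), dtype=bool)
sig[2:5, 2:5] = True
bg = ~sig
sbr = signal_to_background_ratio(img, sig, bg)
```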

Development of an intelligent camera for multiple body temperature detection (다중 체온 감지용 지능형 카메라 개발)

  • Lee, Su-In;Kim, Yun-Su;Seok, Jong-Won
    • Journal of IKEEE
    • /
    • v.26 no.3
    • /
    • pp.430-436
    • /
    • 2022
  • In this paper, we propose an intelligent camera for multiple body-temperature detection. The proposed camera combines an optical sensor (4056×3040) and a thermal sensor (640×480) and detects abnormal symptoms by analyzing a person's facial expression and body temperature in the acquired images. The optical and thermal cameras operate simultaneously: an object is detected in the optical image, and the facial region and expression are analyzed from that object. The coordinates of the facial region in the optical image are then mapped onto the thermal image, where the maximum temperature within the region is measured and displayed on screen. Abnormal symptoms are determined from the three analyzed facial expressions (neutral, happy, sad) and the body-temperature values. To evaluate the performance of the proposed camera, the optical image-processing stages were tested on the Caltech, WIDER FACE, and CK+ datasets for three algorithms (object detection, facial-region detection, and expression analysis), achieving accuracies of 91%, 91%, and 84%, respectively.
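The coordinate hand-off from the optical face region to the thermal image reduces to scaling the bounding box by the resolution ratio and taking the maximum temperature inside it. A minimal sketch, assuming the two cameras are aligned with matching fields of view (the paper does not detail its calibration), with an illustrative bounding box:

```python
import numpy as np

OPTICAL = (4056, 3040)   # (width, height) of the optical sensor
THERMAL = (640, 480)     # (width, height) of the thermal sensor

def map_bbox_to_thermal(bbox):
    """Scale a face bbox (x, y, w, h) from optical to thermal pixels."""
    sx = THERMAL[0] / OPTICAL[0]
    sy = THERMAL[1] / OPTICAL[1]
    x, y, w, h = bbox
    return (round(x * sx), round(y * sy), round(w * sx), round(h * sy))

def max_temperature(thermal_img, bbox):
    """Maximum temperature reading inside the mapped facial region."""
    x, y, w, h = bbox
    return thermal_img[y:y+h, x:x+w].max()

thermal = np.full((480, 640), 30.0)
thermal[100:120, 200:220] = 37.2          # warm facial region
face_opt = (1267, 633, 127, 127)          # bbox detected in the optical image
face_th = map_bbox_to_thermal(face_opt)
temp = max_temperature(thermal, face_th)
```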

Single Shot Detector for Detecting Clickable Object in Mobile Device Screen (모바일 디바이스 화면의 클릭 가능한 객체 탐지를 위한 싱글 샷 디텍터)

  • Jo, Min-Seok;Chun, Hye-won;Han, Seong-Soo;Jeong, Chang-Sung
    • KIPS Transactions on Software and Data Engineering
    • /
    • v.11 no.1
    • /
    • pp.29-34
    • /
    • 2022
  • We propose a novel network architecture and build a dataset for recognizing clickable objects on mobile device screens. The data were collected from clickable objects on mobile screens of numerous resolutions, and a total of 24,937 annotations were subdivided into seven categories: text, edit text, image, button, region, status bar, and navigation bar. We use the Deconvolutional Single Shot Detector as a baseline, with a backbone network using Squeeze-and-Excitation blocks, the Single Shot Detector layer structure for deriving inference results, and a Feature Pyramid Network structure. We also extract features efficiently by changing the network's input resolution from the conventional 1:1 ratio to a 1:2 ratio similar to a mobile device screen. In experiments on the dataset we built, mean average precision improved by up to 101% over the baseline.
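The Squeeze-and-Excitation block used in the backbone can be sketched in NumPy: global-average-pool the channels, pass through two fully connected layers, and rescale each channel by the resulting sigmoid weight. The weights below are random stand-ins, and the 8×16 feature map only mirrors the paper's 1:2 input ratio:

```python
import numpy as np

def squeeze_excitation(feat, w1, w2):
    """Channel attention: squeeze (global average pool), excite
    (FC + ReLU, FC + sigmoid), then rescale the feature map.
    feat: (C, H, W); w1: (C//r, C); w2: (C, C//r)."""
    s = feat.mean(axis=(1, 2))                 # squeeze -> (C,)
    z = np.maximum(w1 @ s, 0.0)                # FC + ReLU -> (C//r,)
    a = 1.0 / (1.0 + np.exp(-(w2 @ z)))        # FC + sigmoid -> (C,)
    return feat * a[:, None, None]             # channel-wise rescale

C, r = 4, 2
rng = np.random.default_rng(0)
feat = rng.standard_normal((C, 8, 16))         # toy 1:2 feature map
out = squeeze_excitation(feat,
                         rng.standard_normal((C // r, C)),
                         rng.standard_normal((C, C // r)))
```

Because the attention weights lie in (0, 1), the block can only attenuate channels, never amplify them.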

CNN-based Online Sign Language Translation Counseling System (CNN기반의 온라인 수어통역 상담 시스템에 관한 연구)

  • Park, Won-Cheol;Park, Koo-Rack
    • Journal of Convergence for Information Technology
    • /
    • v.11 no.5
    • /
    • pp.17-22
    • /
    • 2021
  • It is difficult for the hearing impaired to use counseling services without sign language interpretation. Because sign language interpreters are in short supply, connecting to one takes a long time, or a connection is often unavailable. Therefore, in this paper we propose a system that captures sign language on video using OpenCV and a CNN (Convolutional Neural Network), recognizes the signing motion, converts its meaning into text, and provides the text to users. A counselor can then conduct counseling by reading the stored sign-language translation. Consultation becomes possible without a professional sign language interpreter, reducing the burden of waiting for one. If the proposed system is applied to counseling services for the hearing impaired, it is expected to improve the effectiveness of counseling and to promote future academic research on counseling for the hearing impaired.

2-Stage Detection and Classification Network for Kiosk User Analysis (디스플레이형 자판기 사용자 분석을 위한 이중 단계 검출 및 분류 망)

  • Seo, Ji-Won;Kim, Mi-Kyung
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.26 no.5
    • /
    • pp.668-674
    • /
    • 2022
  • Machine learning techniques using visual data have high usability in industry and service fields such as scene recognition, fault detection, security, and user analysis. Among these, user analysis through CCTV video is one practical use of vision data. Many studies on lightweight artificial neural networks have also been published to improve usability in mobile and embedded environments. In this study, we propose a network that combines object detection and classification for a mobile graphics processing unit. The network detects pedestrians and faces, then classifies age and gender from each detected face. It is built on MobileNet, YOLOv2, and skip connections. The detection and classification models are trained individually and combined in a 2-stage structure, and an attention mechanism is used to improve detection and classification performance. An Nvidia Jetson Nano is used to run and evaluate the proposed system.
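The 2-stage wiring above (detect, then classify each detection) can be sketched with placeholder stages; both stubs stand in for the paper's MobileNet/YOLOv2-based networks and return fixed toy outputs:

```python
def detect_faces(frame):
    """Stage 1 stub: return face bounding boxes as (x, y, w, h).
    In the paper this is a MobileNet/YOLOv2-based detector."""
    return [(40, 30, 64, 64)]

def classify_age_gender(face_crop):
    """Stage 2 stub: return (age_group, gender) for one face crop.
    In the paper this is a separate lightweight classifier."""
    return ("20s", "female")

def analyze_frame(frame):
    """Run detection, crop each face, and classify it (2-stage)."""
    results = []
    for (x, y, w, h) in detect_faces(frame):
        crop = [row[x:x+w] for row in frame[y:y+h]]
        results.append(((x, y, w, h), classify_age_gender(crop)))
    return results

frame = [[0] * 160 for _ in range(120)]   # dummy grayscale frame
users = analyze_frame(frame)
```

Training the two stages separately, as the paper does, lets each be optimized and swapped independently before they are chained at inference time.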

Twin models for high-resolution visual inspections

  • Seyedomid Sajedi;Kareem A. Eltouny;Xiao Liang
    • Smart Structures and Systems
    • /
    • v.31 no.4
    • /
    • pp.351-363
    • /
    • 2023
  • Visual structural inspections are an inseparable part of post-earthquake damage assessments. With unmanned aerial vehicles (UAVs) establishing a new frontier in visual inspections, there are major computational challenges in processing the massive amounts of high-resolution visual data collected. We propose twin deep learning models that can efficiently provide accurate high-resolution structural-component and damage segmentation masks. The traditional way to cope with high memory demands is either to uniformly downsample the raw images, at the price of losing fine local detail, or to crop smaller parts of the images, losing global contextual information. Our twin models, comprising the Trainable Resizing for high-resolution Segmentation Network (TRS-Net) and DmgFormer, therefore approach the global and local semantics from different perspectives. TRS-Net is a compound high-resolution segmentation architecture equipped with learnable downsampler and upsampler modules to minimize information loss for optimal performance and efficiency. DmgFormer uses a transformer backbone and a convolutional decoder head with skip connections on a grid of crops, aiming for high-precision learning without downsizing. An augmented inference technique further boosts performance and reduces the possible loss of context due to grid cropping. Comprehensive experiments were performed on the 3D physics-based graphics model (PBGM) synthetic environments in the QuakeCity dataset. The proposed framework is evaluated with several metrics on three segmentation tasks: component type, component damage state, and global damage (crack, rebar, spalling). The models were developed as part of the 2nd International Competition for Structural Health Monitoring.
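The grid-cropping strategy behind DmgFormer (segment fixed-size crops, then reassemble the per-crop masks) can be sketched as follows. The per-crop "segmentation" here is a dummy intensity labeling, and the augmented-inference step is omitted:

```python
import numpy as np

def grid_crops(img, size):
    """Split an image into a grid of non-overlapping size x size crops,
    returning (row, col, crop) so each mask can be placed back later."""
    h, w = img.shape[:2]
    return [(y, x, img[y:y+size, x:x+size])
            for y in range(0, h, size) for x in range(0, w, size)]

def stitch(crop_masks, shape, size):
    """Reassemble per-crop segmentation masks into a full-size mask."""
    full = np.zeros(shape, dtype=np.int64)
    for (y, x, m) in crop_masks:
        full[y:y+size, x:x+size] = m
    return full

img = np.arange(8 * 8).reshape(8, 8)
crops = grid_crops(img, 4)
# Dummy "segmentation": label each crop by its mean-intensity band.
masks = [(y, x, np.full(c.shape, int(c.mean() // 16))) for y, x, c in crops]
mask = stitch(masks, img.shape, 4)
```

Because each crop keeps full resolution, fine detail survives; the paper's augmented inference then compensates for the context lost at crop boundaries.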

A Study on Extraction of Skin Region and Lip Using Skin Color of Eye Zone (눈 주위의 피부색을 이용한 피부영역검출과 입술검출에 관한 연구)

  • Park, Young-Jae;Jang, Seok-Woo;Kim, Gye-Young
    • Journal of the Korea Society of Computer and Information
    • /
    • v.14 no.4
    • /
    • pp.19-30
    • /
    • 2009
  • In this paper, we propose a method for detecting a face and its components in an input image. We use an eye map and a mouth map to detect the facial components. First, we locate the eye zone; second, we obtain the color-value distribution of the skin region from the colors around the eye zone. Skin regions have a characteristic distribution in the YCbCr color space, which we use to separate the skin region from the background. We then compute the color-value distribution of the extracted skin region, extract the surrounding region, and finally detect the mouth within the extracted skin region using the mouth map. The proposed method outperforms the traditional method because it yields an accurate mouth region.
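Skin segmentation in YCbCr space reduces to thresholding the chrominance channels. A minimal sketch using the BT.601 conversion; the fixed Cb/Cr ranges below are common literature values, whereas the paper derives the distribution adaptively from the skin sampled around the eyes:

```python
import numpy as np

def rgb_to_ycbcr(rgb):
    """ITU-R BT.601 full-range RGB -> YCbCr conversion."""
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    y  =  0.299    * r + 0.587    * g + 0.114    * b
    cb = -0.168736 * r - 0.331264 * g + 0.5      * b + 128
    cr =  0.5      * r - 0.418688 * g - 0.081312 * b + 128
    return np.stack([y, cb, cr], axis=-1)

def skin_mask(rgb, cb_range=(77, 127), cr_range=(133, 173)):
    """True where chrominance falls in the skin-tone box."""
    ycbcr = rgb_to_ycbcr(rgb.astype(np.float64))
    cb, cr = ycbcr[..., 1], ycbcr[..., 2]
    return ((cb >= cb_range[0]) & (cb <= cb_range[1]) &
            (cr >= cr_range[0]) & (cr <= cr_range[1]))

pixels = np.array([[[220, 170, 150],    # skin-like tone
                    [ 40, 160,  60]]])  # green background
mask = skin_mask(pixels)
```

Note that luminance (Y) is ignored entirely, which is what makes the chrominance distribution relatively stable across lighting conditions.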

Research Trends and Datasets Review using Satellite Image (위성영상 이미지를 활용한 연구 동향 및 데이터셋 리뷰)

  • Kim, Se Hyoung;Chae, Jung Woo;Kang, Ju Young
    • Smart Media Journal
    • /
    • v.11 no.1
    • /
    • pp.17-30
    • /
    • 2022
  • Like other computer vision research areas, research using satellite images has grown rapidly with the development of GPU-based computing and deep learning methodologies for image processing. As a result, satellite images are used in various fields, and the number of studies on how to use them is increasing. In this paper, we therefore introduce research and application areas for satellite imagery, along with datasets that can be used for such research. First, we collected studies using satellite images and classified them by research method, largely into regression-based and classification-based approaches, and summarized papers using other methods. Next, we summarized the datasets used in these studies, providing information on the datasets and how they are used in research. We also introduce how to organize and use the domestic satellite image datasets recently released by AI Hub, and briefly examine the limitations of satellite-image research and future trends.

Predicting Unseen Object Pose with an Adaptive Depth Estimator (적응형 깊이 추정기를 이용한 미지 물체의 자세 예측)

  • Sungho, Song;Incheol, Kim
    • KIPS Transactions on Software and Data Engineering
    • /
    • v.11 no.12
    • /
    • pp.509-516
    • /
    • 2022
  • Accurate pose prediction of objects in 3D space is an important visual recognition technique widely used in applications such as scene understanding in indoor and outdoor environments, robotic object manipulation, autonomous driving, and augmented reality. Most previous works on object pose estimation have the limitation that they require an exact 3D CAD model of each object. Unlike such works, this paper proposes a novel neural network model that can predict the poses of unknown objects from their RGB color images alone, without corresponding 3D CAD models. The proposed model obtains the depth maps required for unknown-object pose prediction using an adaptive depth estimator, AdaBins. We evaluate the usefulness and performance of the proposed model through experiments on benchmark datasets.
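AdaBins' core idea, predicting adaptive depth-bin widths per image and taking a probability-weighted sum of the bin centers per pixel, can be sketched as follows. The bin widths and probabilities here are toy values standing in for network outputs:

```python
import numpy as np

def depth_from_bins(bin_widths, probs, d_min=0.1, d_max=10.0):
    """AdaBins-style depth decoding: normalized bin widths define
    adaptive bin edges over [d_min, d_max]; per-pixel depth is the
    probability-weighted sum of bin centers.
    bin_widths: (N,) summing to 1; probs: (N, H, W)."""
    widths = (d_max - d_min) * bin_widths
    edges = d_min + np.concatenate([[0.0], np.cumsum(widths)])
    centers = 0.5 * (edges[:-1] + edges[1:])              # (N,)
    return np.tensordot(centers, probs, axes=(0, 0))      # (H, W)

# Toy example: 4 adaptive bins and a single 2x2 "image".
bin_widths = np.array([0.4, 0.3, 0.2, 0.1])
probs = np.zeros((4, 2, 2))
probs[0] = 1.0          # every pixel falls in the first (nearest) bin
depth = depth_from_bins(bin_widths, probs)
```

Because the bin widths are predicted per image, the network can allocate finer depth resolution wherever that scene's depth actually concentrates.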