• Title/Summary/Keyword: Multi-Vision

Understanding the Effect of Different Scale Information Fusion in Deep Convolutional Neural Networks (딥 CNN에서의 Different Scale Information Fusion (DSIF)의 영향에 대한 이해)

  • Liu, Kai;Cheema, Usman;Moon, Seungbin
    • Annual Conference of KIPS
    • /
    • 2019.10a
    • /
    • pp.1004-1006
    • /
    • 2019
  • Different scales of information are an important component of computer vision systems. Recently, there has been considerable research on utilizing multi-scale information to address scale-invariance problems, as in GoogLeNet and FPN. In this paper, we introduce the notion of different scale information fusion (DSIF) and show that it has a significant effect on the performance of object recognition systems. We analyze DSIF in several architecture designs, together with the effect of nonlinear activations, dropout, sub-sampling, and skip connections on it. This analysis leads to clear suggestions for how to choose a DSIF strategy.
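
A minimal sketch of one possible DSIF block in PyTorch, combining a fine and a sub-sampled (coarse) feature map with a skip connection, a nonlinear activation, and dropout. The module name, channel counts, and fusion rule are illustrative assumptions, not the paper's exact design.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DSIFBlock(nn.Module):
    """Illustrative fusion of two feature scales (fine and coarse)."""
    def __init__(self, channels, dropout=0.1):
        super().__init__()
        self.reduce = nn.Conv2d(2 * channels, channels, kernel_size=1)
        self.conv = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.drop = nn.Dropout2d(dropout)

    def forward(self, fine, coarse):
        # Upsample the coarse (sub-sampled) map back to the fine resolution.
        coarse_up = F.interpolate(coarse, size=fine.shape[-2:], mode="nearest")
        # Fuse by channel concatenation, then mix with a 1x1 convolution.
        fused = self.reduce(torch.cat([fine, coarse_up], dim=1))
        # Nonlinearity and dropout, with a skip connection from the fine branch.
        out = self.drop(F.relu(self.conv(fused)))
        return out + fine

# Example: fuse a 56x56 feature map with its 28x28 counterpart.
fine = torch.randn(1, 64, 56, 56)
coarse = torch.randn(1, 64, 28, 28)
print(DSIFBlock(64)(fine, coarse).shape)  # torch.Size([1, 64, 56, 56])
```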

Implementation of Moving Object Recognition based on Deep Learning (딥러닝을 통한 움직이는 객체 검출 알고리즘 구현)

  • Lee, YuKyong;Lee, Yong-Hwan
    • Journal of the Semiconductor & Display Technology
    • /
    • v.17 no.2
    • /
    • pp.67-70
    • /
    • 2018
  • Object detection and tracking is an active and interesting research area in the field of computer vision, and its technologies have been widely used in various application systems such as surveillance, military, and augmented reality. This paper proposes and implements a novel and more robust object recognition and tracking system that localizes and tracks multiple objects in input images, estimating the target state using the likelihoods obtained from multiple CNNs. Experimental results show that the proposed algorithm effectively handles multi-modal target appearances and other exceptional cases.
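
The abstract states that the target state is estimated from likelihoods produced by multiple CNNs but does not spell out the fusion rule; the sketch below assumes a simple log-likelihood sum over candidate locations and is purely illustrative.

```python
import numpy as np

def estimate_state(likelihood_maps, eps=1e-12):
    """Pick the candidate target location that maximizes the combined likelihood.

    likelihood_maps: list of 2-D arrays, one per CNN, each giving the
    likelihood of the target being at each candidate location.
    """
    # Combine by summing log-likelihoods (equivalent to a product of likelihoods).
    combined = np.zeros_like(likelihood_maps[0])
    for lmap in likelihood_maps:
        combined += np.log(lmap + eps)
    return np.unravel_index(np.argmax(combined), combined.shape)

# Toy example with two 4x4 likelihood maps.
maps = [np.random.rand(4, 4), np.random.rand(4, 4)]
print(estimate_state(maps))  # (row, col) of the most likely target position
```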

Tracking by Detection of Multiple Faces using SSD and CNN Features

  • Tai, Do Nhu;Kim, Soo-Hyung;Lee, Guee-Sang;Yang, Hyung-Jeong;Na, In-Seop;Oh, A-Ran
    • Smart Media Journal
    • /
    • v.7 no.4
    • /
    • pp.61-69
    • /
    • 2018
  • Multi-tracking of general objects and specific faces is an important topic in the field of computer vision, applicable to many branches of industry such as biometrics and security. The rapid development of deep neural networks has resulted in a dramatic improvement in face recognition and object detection, which helps improve multiple-face tracking techniques that exploit the tracking-by-detection approach. Our proposed method uses a face detector trained with a head dataset to resolve the face-deformation problem in the tracking process. Further, we use robust face features extracted from a deep face recognition network to match tracklets with tracked faces using the Hungarian matching method. We achieved promising results with deep face features and head detection on a face tracking benchmark.
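
A minimal sketch of the tracklet-to-detection assignment step using cosine distance between deep face embeddings and SciPy's Hungarian solver; the gating threshold and the embedding dimensionality are assumptions, not values from the paper.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def match_tracklets(track_feats, det_feats, max_dist=0.4):
    """Assign detected faces to existing tracklets by embedding similarity.

    track_feats: (T, D) array of L2-normalized tracklet face embeddings.
    det_feats:   (N, D) array of L2-normalized detection face embeddings.
    Returns a list of (tracklet_index, detection_index) pairs.
    """
    # Cosine distance matrix between every tracklet and every detection.
    cost = 1.0 - track_feats @ det_feats.T
    rows, cols = linear_sum_assignment(cost)
    # Reject assignments whose distance exceeds the gating threshold.
    return [(r, c) for r, c in zip(rows, cols) if cost[r, c] <= max_dist]

# Toy example with 3 tracklets and 2 detections in a 128-D embedding space.
tracks = np.random.randn(3, 128); tracks /= np.linalg.norm(tracks, axis=1, keepdims=True)
dets = np.random.randn(2, 128); dets /= np.linalg.norm(dets, axis=1, keepdims=True)
print(match_tracklets(tracks, dets))
```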

The Detection of Multi-class Vehicles using Swin Transformer (Swin Transformer를 이용한 항공사진에서 다중클래스 차량 검출)

  • Lee, Ki-chun;Jeong, Yu-seok;Lee, Chang-woo
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference
    • /
    • 2021.10a
    • /
    • pp.112-114
    • /
    • 2021
  • To assess urban conditions, the number of means of transportation and the traffic flow are essential factors to identify. This paper improves the detection capability shown in previous studies, which learned various vehicle types with Mask R-CNN, by introducing the Swin Transformer model, a widely used transformer architecture that shows higher performance than existing convolutional neural networks, to detect specific types of vehicles in urban aerial images.
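
The paper replaces a Mask R-CNN detector with a Swin Transformer backbone; as a rough stand-in, the sketch below runs torchvision's off-the-shelf Mask R-CNN and counts detections per class, which illustrates the per-class counting step but not the Swin backbone itself.

```python
import torch
from collections import Counter
from torchvision.models.detection import maskrcnn_resnet50_fpn

# Stand-in detector; the paper's model uses a Swin Transformer backbone instead.
# (Older torchvision versions take pretrained=True instead of weights="DEFAULT".)
model = maskrcnn_resnet50_fpn(weights="DEFAULT").eval()

def count_vehicles(image, score_thresh=0.5):
    """Count detected objects per class label in a single aerial image tensor."""
    with torch.no_grad():
        pred = model([image])[0]            # dict with 'boxes', 'labels', 'scores'
    keep = pred["scores"] >= score_thresh   # drop low-confidence detections
    return Counter(pred["labels"][keep].tolist())

# Example with a dummy 3x512x512 image tensor scaled to [0, 1].
print(count_vehicles(torch.rand(3, 512, 512)))
```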

Control of Multi-Home Devices Using AI Vision and Generative AI (AI 비전과 생성형 AI 를 이용한 멀티 홈 디바이스 제어)

  • Su-Min Hong;Su-Min Kim;Su-Hee Song;Chae-Yeon Ahn
    • Annual Conference of KIPS
    • /
    • 2023.11a
    • /
    • pp.1037-1038
    • /
    • 2023
  • With advances in technology, the number of smart home appliances has increased and smart home technology is attracting attention. However, the complexity of the setup process makes this technology hard for users to approach, and in particular it leaves users who are unfamiliar with digital devices excluded from smart home technology. This paper proposes a user-friendly smart home system. The user's gaze direction is tracked to select a device, and a controller with a simple interface allows the device to be operated easily. In addition, generative AI is combined with RAG to provide an interface through which users can obtain information by conversing naturally with their home appliances.
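
A minimal sketch of the gaze-based device selection idea: the device whose direction from the user makes the smallest angle with the estimated gaze vector is selected. The geometry, angle threshold, and device list are illustrative assumptions; the paper's gaze-tracking and RAG components are not reproduced here.

```python
import numpy as np

def select_device(gaze_dir, user_pos, devices, max_angle_deg=15.0):
    """Return the name of the device the user is most likely looking at.

    gaze_dir:  3-vector, estimated gaze direction.
    user_pos:  3-vector, user head position.
    devices:   dict mapping device name -> 3-D position in the room.
    """
    gaze = np.asarray(gaze_dir, float)
    gaze /= np.linalg.norm(gaze)
    best, best_angle = None, max_angle_deg
    for name, pos in devices.items():
        to_dev = np.asarray(pos, float) - np.asarray(user_pos, float)
        to_dev /= np.linalg.norm(to_dev)
        # Angle between the gaze ray and the direction toward the device.
        angle = np.degrees(np.arccos(np.clip(gaze @ to_dev, -1.0, 1.0)))
        if angle < best_angle:
            best, best_angle = name, angle
    return best

devices = {"tv": [3.0, 0.0, 1.2], "air_conditioner": [0.0, 3.0, 2.0]}
print(select_device([1.0, 0.1, 0.0], [0.0, 0.0, 1.2], devices))  # -> "tv"
```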

Color Pattern Recognition and Tracking for Multi-Object Tracking in Artificial Intelligence Space (인공지능 공간상의 다중객체 구분을 위한 컬러 패턴 인식과 추적)

  • Tae-Seok Jin
    • Journal of the Korean Society of Industry Convergence
    • /
    • v.27 no.2_2
    • /
    • pp.319-324
    • /
    • 2024
  • In this paper, the Artificial Intelligence Space (AI-Space) for human-robot interfacing is presented, which enables human-computer interfacing, networked camera conferencing, industrial monitoring, and service and training applications. We present a method for representing, tracking, and following objects (human, robot, chair) by fusing distributed multiple vision systems in AI-Space. The article presents the integration of color distributions into particle filtering; particle filters provide a robust tracking framework under ambiguous conditions. We propose to track the moving objects (human, robot, chair) by generating hypotheses not in the image plane but on the top-view reconstruction of the scene.
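
A compact sketch of color-distribution particle filtering on the top-view plane, in the spirit of the abstract: particles are propagated on the ground plane and re-weighted by the Bhattacharyya similarity between a reference color histogram and the histogram observed at each hypothesis. The measurement function is a placeholder the reader must supply from the camera views; all parameter values are assumptions.

```python
import numpy as np

def bhattacharyya(p, q):
    """Similarity between two normalized color histograms."""
    return float(np.sum(np.sqrt(p * q)))

def particle_filter_step(particles, ref_hist, observe_hist, motion_std=0.05):
    """One predict/update/resample cycle on the top-view ground plane.

    particles:    (N, 2) array of hypothesized (x, y) positions in metres.
    observe_hist: callable mapping an (x, y) hypothesis to a normalized
                  color histogram measured from the camera views (user-supplied).
    """
    # Predict: random-walk motion model on the ground plane.
    particles = particles + np.random.normal(0.0, motion_std, particles.shape)
    # Update: weight each hypothesis by color similarity to the reference model.
    weights = np.array([bhattacharyya(ref_hist, observe_hist(p)) for p in particles])
    weights = weights / (weights.sum() + 1e-12)
    # Resample proportionally to the weights and report the posterior mean.
    idx = np.random.choice(len(particles), size=len(particles), p=weights)
    estimate = particles[idx].mean(axis=0)
    return particles[idx], estimate
```

In practice `observe_hist` would back-project each ground-plane hypothesis into the distributed camera images and compute a color histogram there, which is where the multi-camera fusion happens.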

Deep Reference-based Dynamic Scene Deblurring

  • Cunzhe Liu;Zhen Hua;Jinjiang Li
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.18 no.3
    • /
    • pp.653-669
    • /
    • 2024
  • Dynamic scene deblurring is a complex computer vision problem owing to the difficulty of modeling it mathematically. In this paper, we present a novel approach to image deblurring that uses a sharp reference image to recover high-quality, high-frequency detail. To better utilize the clear reference image, we develop an encoder-decoder network with two novel modules designed to guide the network toward better image restoration. The proposed Reference Extraction and Aggregation Module effectively establishes the correspondence between the blurry image and the reference image and explores the most relevant features for better blur removal, and the proposed Spatial Feature Fusion Module enables the encoder to perceive blur information at different spatial scales. Finally, the multi-scale feature maps from the encoder and the cascaded Reference Extraction and Aggregation Modules are integrated into the decoder for global fusion and representation. Extensive quantitative and qualitative experimental results on different benchmarks show the effectiveness of our proposed method.
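
A simplified PyTorch sketch of the reference-aggregation idea described in the abstract: features of the blurry image attend to spatially flattened reference features, and the most similar reference content is aggregated back into the blurry branch. The layer sizes and the attention formulation are assumptions, not the paper's exact modules.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ReferenceAggregation(nn.Module):
    """Aggregate sharp-reference features into blurry-image features via attention."""
    def __init__(self, channels):
        super().__init__()
        self.query = nn.Conv2d(channels, channels, 1)
        self.key = nn.Conv2d(channels, channels, 1)
        self.value = nn.Conv2d(channels, channels, 1)

    def forward(self, blur_feat, ref_feat):
        b, c, h, w = blur_feat.shape
        q = self.query(blur_feat).flatten(2).transpose(1, 2)   # (B, HW, C)
        k = self.key(ref_feat).flatten(2)                      # (B, C, HW)
        v = self.value(ref_feat).flatten(2).transpose(1, 2)    # (B, HW, C)
        attn = F.softmax(q @ k / c ** 0.5, dim=-1)             # correspondence map
        agg = (attn @ v).transpose(1, 2).reshape(b, c, h, w)   # aggregated reference
        return blur_feat + agg                                  # fuse into blurry branch

blur = torch.randn(1, 32, 16, 16)
ref = torch.randn(1, 32, 16, 16)
print(ReferenceAggregation(32)(blur, ref).shape)  # torch.Size([1, 32, 16, 16])
```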

Assessment of UV Blocking Performance for Development of Converged Technologies of Vision Correcting Spectacle Lenses (시력교정용 안경렌즈의 융복합적 기술개발을 위한 UV차단 성능 평가)

  • Kim, Heung-Soo
    • Journal of the Korea Convergence Society
    • /
    • v.9 no.4
    • /
    • pp.93-98
    • /
    • 2018
  • This study aimed to confirm the UV-blocking ability of spectacle lenses according to their material. The lens materials were acrylic, CR-39, NK-55, and MR-8. The lenses were grouped as follows: Group A consisted of anti-scratch hard-coated lenses and anti-reflective multi-coated lenses, Group B added a UV-blocking coating to Group A, and Group C consisted of dedicated UV-blocking lenses only. UV transmittance was measured; in the UV-A band, Group A showed transmittances of 7.726%, 0.043%, 0.007%, and 0.007%, respectively, and Group B showed 0.038%, 0.037%, 0.007%, and 0.007%, respectively, so the UV-blocking performance of CR-39 was greatly improved. Group C showed the best UV-blocking performance, with transmittances of only 0.005% and 0.004% (refractive indices of 1.60 and 1.67, respectively). For low-power lenses and sunglasses, the CR-39 lens is the most widely used. Therefore, to block UV at the lens, new materials, UV absorbers, or UV-coating technologies and the development of converged technologies are required.
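
As a trivial worked example of how the reported transmittance figures translate into blocking percentages, the snippet below converts the UV-A transmittances quoted in the abstract; the material order follows the listing in the abstract.

```python
# UV-A transmittance (%) reported in the abstract, per material, for Groups A and B.
transmittance = {
    "Group A": {"Acryl": 7.726, "CR-39": 0.043, "NK-55": 0.007, "MR-8": 0.007},
    "Group B": {"Acryl": 0.038, "CR-39": 0.037, "NK-55": 0.007, "MR-8": 0.007},
}

for group, materials in transmittance.items():
    for material, t in materials.items():
        # UV blocking is simply the complement of transmittance.
        print(f"{group} {material}: blocks {100.0 - t:.3f}% of UV-A")
```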

Development of Automatic Grading and Sorting System for Dry Oak Mushrooms -2nd Prototype- (건표고 자동 등급선별 시스템 개발 -시작 2호기-)

  • Hwang, H.;Kim, S. C.;Im, D. H.;Song, K. S.;Choi, T. H.
    • Journal of Biosystems Engineering
    • /
    • v.26 no.2
    • /
    • pp.147-154
    • /
    • 2001
  • In Korea and Japan, dried oak mushrooms are classified into 12 to 16 different categories based on their external visual quality, and grading has typically been done manually by human experts on randomly sampled mushrooms. The visual features of dried oak mushrooms dominate their quality and are distributed over both the gill and the cap sides. The second prototype of a computer-vision-based automatic grading and sorting system for dried oak mushrooms was developed based on the first prototype. The sorting function was improved, and the overall grading procedure was simplified from two-stage to one-stage grading by inspecting both the front and back sides of the mushrooms. A neural-network-based algorithm for recognizing which side (gill or cap) of the fed mushroom is visible was adopted, and grading was performed on both gill and cap images using a neural network. A real-time simultaneous discharge algorithm, suited to objects fed randomly one at a time and to multiple objects located along a series of discharge buckets, was developed, implemented in the controller, and verified. Two hundred samples, 10 per each of 20 grade categories, were used to verify the performance of the feeding, reversing, grading, and discharging units. Test results showed success rates of 93%, 95%, 94%, and 99% for the one-line feeding, reversing, grading, and discharging functions, respectively. The developed prototype achieved a sorting capability of approximately 3,600 mushrooms/hr per line, i.e., about 1 second per mushroom on average. Considering the grading time of approximately 0.2 seconds, it is desirable to reduce the time required to reverse a mushroom and acquire the image of its reversed surface.
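
The real-time simultaneous-discharge algorithm is only described at a high level; the sketch below assumes a conveyor moving at constant speed past a series of grade-specific buckets and fires each bucket when the graded mushroom is predicted to reach it. All geometry, speed, and timing values are illustrative assumptions.

```python
import heapq

BELT_SPEED = 0.5                        # metres per second (assumed)
BUCKET_POS = {1: 0.6, 2: 1.2, 3: 1.8}   # distance of each grade bucket from the camera (m)

def schedule_discharge(events):
    """Turn (grading_time, grade) events into a time-ordered discharge schedule.

    Each graded mushroom keeps moving on the belt; its bucket must fire when the
    mushroom arrives, i.e. distance / speed seconds after it is graded.
    """
    schedule = []
    for graded_at, grade in events:
        fire_at = graded_at + BUCKET_POS[grade] / BELT_SPEED
        heapq.heappush(schedule, (fire_at, grade))
    return [heapq.heappop(schedule) for _ in range(len(schedule))]

# Three mushrooms graded one second apart (about the 1 s/mushroom rate in the abstract).
print(schedule_discharge([(0.0, 2), (1.0, 1), (2.0, 3)]))
# -> [(2.2, 1), (2.4, 2), (5.6, 3)]
```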

RealBook: A Tangible Electronic Book Based on the Interface of TouchFace-V (RealBook: TouchFace-V 인터페이스 기반 실감형 전자책)

  • Song, Dae-Hyeon;Bae, Ki-Tae;Lee, Chil-Woo
    • The Journal of the Korea Contents Association
    • /
    • v.13 no.12
    • /
    • pp.551-559
    • /
    • 2013
  • In this paper, we propose RealBook, a tangible electronic book based on the TouchFace-V interface, which can recognize multi-touch input and hand gestures. TouchFace-V applies projection technology to a flat surface such as a table, without spatial constraints. The system's configuration addresses the installation, calibration, and portability issues of most existing front-projected vision-based tabletop displays. It supports hand touch and gestures through computer vision, adopting tracking technology that requires neither sensors nor traditional input devices. RealBook combines the analog sensibility of printed text with the multimedia effects of an e-book, and it provides digitally created stories whose experiences and environments differ according to the choices users make through the book interface. We propose RealBook as a new concept of electronic book that differs from existing e-books and, with the TouchFace-V interface, provides more direct viewing and natural, intuitive interaction through hand touch and gestures.
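
TouchFace-V's touch-detection pipeline is not detailed in the abstract; the sketch below shows one generic, sensor-free way to locate fingertip candidates on a front-projected surface with OpenCV (background differencing plus convex-hull analysis), purely as an illustration under those assumptions.

```python
import cv2
import numpy as np

def fingertip_candidates(frame_gray, background_gray, diff_thresh=40, min_area=800):
    """Return fingertip candidate points (x, y) from a grayscale camera frame.

    Assumes a static background image of the empty projected surface was captured.
    """
    # Foreground mask: pixels that changed notably against the empty surface.
    diff = cv2.absdiff(frame_gray, background_gray)
    _, mask = cv2.threshold(diff, diff_thresh, 255, cv2.THRESH_BINARY)
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)

    tips = []
    for cnt in contours:
        if cv2.contourArea(cnt) < min_area:     # ignore small noise blobs
            continue
        hull = cv2.convexHull(cnt)
        # Use the hull point farthest from the blob centroid as a crude fingertip.
        m = cv2.moments(cnt)
        cx, cy = m["m10"] / m["m00"], m["m01"] / m["m00"]
        pts = hull.reshape(-1, 2).astype(float)
        far = pts[np.argmax(np.hypot(pts[:, 0] - cx, pts[:, 1] - cy))]
        tips.append(tuple(far.astype(int)))
    return tips
```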