• Title/Summary/Keyword: visual model


Improved Viewing Quality of 3-D Images in Computational Integral Imaging Reconstruction Based on Round Mapping Model

  • Shin, Dong-Hak;Kim, Nam-Woo;Yoo, Hoon;Lee, Joon-Jae;Lee, Byoung-Ho;Kim, Eun-Soo
    • ETRI Journal / v.29 no.5 / pp.649-654 / 2007
  • In this paper, we propose a computational integral imaging reconstruction (CIIR) method that uses a round mapping model to improve the viewing quality of 3-D images. The proposed CIIR method can overcome the problem of non-uniformly reconstructed images caused by the conventional method. To show the usefulness of the proposed method, some experiments are carried out and the results are presented.


Video Captioning with Visual and Semantic Features

  • Lee, Sujin;Kim, Incheol
    • Journal of Information Processing Systems / v.14 no.6 / pp.1318-1330 / 2018
  • Video captioning refers to the process of extracting features from a video and generating captions from the extracted features. This paper introduces a deep neural network model and its learning method for effective video captioning. The model uses both visual features and semantic features that effectively express the video. The visual features are extracted using convolutional neural networks such as C3D and ResNet, while the semantic features are extracted using a semantic feature extraction network proposed in this paper. Further, an attention-based caption generation network is proposed to generate video captions from the extracted features. The performance and effectiveness of the proposed model are verified through various experiments on two large-scale video benchmarks: the Microsoft Video Description (MSVD) and the Microsoft Research Video-To-Text (MSR-VTT) datasets.

Benchmark for Deep Learning based Visual Odometry and Monocular Depth Estimation (딥러닝 기반 영상 주행기록계와 단안 깊이 추정 및 기술을 위한 벤치마크)

  • Choi, Hyukdoo
    • The Journal of Korea Robotics Society / v.14 no.2 / pp.114-121 / 2019
  • This paper presents a new benchmark system for visual odometry (VO) and monocular depth estimation (MDE). As deep learning has become a key technology in computer vision, many researchers are trying to apply it to VO and MDE. Just a couple of years ago, the two tasks were studied independently in a supervised way, but now they are coupled and trained together in an unsupervised way. However, before designing elaborate models and losses, researchers have to customize datasets for training and testing. After training, the model has to be compared with existing models, which is also a huge burden. The benchmark provides a ready-to-use input dataset for VO and MDE research in 'tfrecords' format and an output dataset that includes model checkpoints and inference results of existing models. It also provides various tools for data formatting, training, and evaluation. In the experiments, the existing models were evaluated to verify the performance presented in the corresponding papers, and we found that the evaluation results are inferior to the presented performance.

Real-Time Control of a SCARA Robot by Visual Servoing with the Stereo Vision

  • S. H. Han;Lee, M. H.;K. Son;Lee, M. C.;Park, J. W.;Lee, J. M.
    • 제어로봇시스템학회:학술대회논문집 / 1998.10a / pp.238-243 / 1998
  • This paper presents a new approach to visual servoing with stereo vision. To control the position and orientation of a robot with respect to an object, a new technique is proposed that uses binocular stereo vision. Stereo vision makes it possible to calculate an exact image Jacobian not only around the desired location but also at other locations. The suggested technique can guide a robot manipulator to the desired location without a priori knowledge such as the relative distance to the desired location or a model of the object, even if the initial positioning error is large. This paper describes a model of stereo vision and how to generate feedback commands. The performance of the proposed visual servoing system is illustrated by simulation and experimental results and compared with that of the conventional method for a SCARA robot.


Information Requirements for Model-based Monitoring of Construction via Emerging Big Visual Data and BIM

  • Han, Kevin K.;Golparvar-Fard, Mani
    • International conference on construction engineering and project management / 2015.10a / pp.317-320 / 2015
  • Documenting work-in-progress on construction sites using images captured with smartphones, point-and-shoot cameras, and Unmanned Aerial Vehicles (UAVs) has gained significant popularity among practitioners. The spatial and temporal density of these large-scale site image collections and the availability of 4D Building Information Models (BIM) provide a unique opportunity to develop BIM-driven visual analytics that can quickly and easily detect and visualize construction progress deviations. Building on these emerging sources of information, this paper presents a pipeline for model-driven visual analytics of construction progress. It focuses on the following key steps: 1) capturing, transferring, and storing images; 2) BIM-driven analytics to identify performance deviations; and 3) visualizations that enable root-cause assessments of performance deviations. Using several real-world case studies, the paper discusses the information requirements, along with the challenges and opportunities for improvement in data collection, plan preparation, progress deviation analysis (particularly under limited visibility), and the transformation of identified deviations into performance metrics that enable root-cause assessments.


Visual servoing based on neuro-fuzzy model

  • Jun, Hyo-Byung;Sim, Kwee-Bo
    • 제어로봇시스템학회:학술대회논문집 / 1997.10a / pp.712-715 / 1997
  • In image Jacobian based visual servoing, the inverse Jacobian generally has to be calculated through complicated coordinate transformations. These require excessive computation, and the singularity of the image Jacobian has to be considered. This paper presents a visual servoing method that controls the pose of a robotic manipulator for tracking and grasping a 3-D moving object whose pose and motion parameters are unknown. Because the object is in motion, tracking and grasping must be done on-line and the controller must be able to learn continuously. To estimate the parameters of the moving object, we use the Kalman filter. For tracking and grasping the moving object, we use a fuzzy inference based reinforcement learning algorithm of dynamic recurrent neural networks. Computer simulation results are presented to demonstrate the performance of this visual servoing method.
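
The abstract states only that a Kalman filter estimates the moving object's parameters; it does not give the state model. As a minimal sketch of the idea, assuming a 1-D constant-velocity model with position-only measurements, one predict/update cycle might look like the following. The function name `kalman_cv_step` and the noise parameters `q` and `r` are illustrative assumptions, not taken from the paper.

```python
def kalman_cv_step(x, P, z, dt=1.0, q=1e-3, r=0.25):
    """One predict/update cycle of a linear Kalman filter.

    x  : [position, velocity] estimate (assumed constant-velocity model)
    P  : 2x2 covariance of x, as nested lists
    z  : new position measurement
    q,r: assumed process and measurement noise variances (illustrative)
    """
    pos, vel = x

    # Predict: x <- F x with F = [[1, dt], [0, 1]]
    pos_p = pos + dt * vel
    vel_p = vel

    # Predict covariance: P <- F P F^T + Q, with Q = q * I (a simplification)
    p00 = P[0][0] + dt * (P[0][1] + P[1][0]) + dt * dt * P[1][1] + q
    p01 = P[0][1] + dt * P[1][1]
    p10 = P[1][0] + dt * P[1][1]
    p11 = P[1][1] + q

    # Update with measurement matrix H = [1, 0] (only position is observed)
    s = p00 + r              # innovation variance
    k0, k1 = p00 / s, p10 / s  # Kalman gain
    y = z - pos_p            # innovation

    x_new = [pos_p + k0 * y, vel_p + k1 * y]
    # P <- (I - K H) P
    P_new = [[(1 - k0) * p00, (1 - k0) * p01],
             [p10 - k1 * p00, p11 - k1 * p01]]
    return x_new, P_new


# Usage: feed position measurements of an object moving at constant speed.
state, cov = [0.0, 0.0], [[10.0, 0.0], [0.0, 10.0]]
for t in range(1, 21):
    state, cov = kalman_cv_step(state, cov, z=2.0 * t)
```

The velocity is never measured directly; the filter recovers it from successive position measurements, which is the kind of motion-parameter estimation the abstract refers to.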


Development of Mobile Phone Menu Structure based on Visual Concept Map (Visual Concept Map 에 기초한 핸드폰 메뉴 구조 개발)

  • Lee, Suk-Won;Myung, Ro-Hae;Kim, In-Soo
    • 한국HCI학회:학술대회논문집 / 2008.02b / pp.399-404 / 2008
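  • To design a user-centered, menu-based interface, it is important to understand the structure of human knowledge. Understanding this structure reveals how the concepts formed by stimuli delivered through the interface relate to one another to form a mental model. The human knowledge structure can be represented as a Visual Concept Map using MDS (Multidimensional Scaling) and Trajectory Mapping, which makes it possible to understand the structure visually. MDS gives the relative positions of the concepts held in a person's mind, and Trajectory Mapping shows how the concepts are connected. In other words, Trajectory Mapping reveals the cognitive information between concepts. In this study, MDS and Trajectory Mapping were used to visualize, as a Visual Concept Map, the human knowledge structure for the concepts formed by the visual stimuli received from a mobile phone menu. A menu structure was then developed based on this visualized knowledge structure. The results suggest that visualizing the human knowledge structure with MDS and Trajectory Mapping can be useful for designing user-centered, menu-based interfaces.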


Visual Touch Recognition for NUI Using Voronoi-Tessellation Algorithm (보로노이-테셀레이션 알고리즘을 이용한 NUI를 위한 비주얼 터치 인식)

  • Kim, Sung Kwan;Joo, Young Hoon
    • The Transactions of The Korean Institute of Electrical Engineers / v.64 no.3 / pp.465-472 / 2015
  • This paper presents a visual touch recognition method for a NUI (Natural User Interface) using the Voronoi-tessellation algorithm. The proposed method consists of three parts: hand region extraction, hand feature point extraction, and visual touch recognition. To improve the robustness of hand region extraction, we propose using the RGB/HSI color model, the Canny edge detection algorithm, and spatial frequency information. In addition, to improve the accuracy of hand feature point extraction, we propose the use of the Douglas-Peucker algorithm. Also, to recognize the visual touch, we propose the use of the Voronoi-tessellation algorithm. Finally, we demonstrate the feasibility and applicability of the proposed algorithms through some experiments.
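
The abstract names the Douglas-Peucker algorithm for hand feature point extraction but does not say how it is applied. As a rough illustration of the algorithm itself, assuming a contour given as an ordered list of 2-D points, a plain-Python version could be sketched as below. The function names and the tolerance parameter `epsilon` are illustrative, and this is the textbook recursive form rather than the authors' implementation.

```python
import math


def perpendicular_distance(point, start, end):
    """Distance from `point` to the infinite line through `start` and `end`."""
    (x, y), (x1, y1), (x2, y2) = point, start, end
    dx, dy = x2 - x1, y2 - y1
    length = math.hypot(dx, dy)
    if length == 0.0:
        # Degenerate chord: fall back to point-to-point distance.
        return math.hypot(x - x1, y - y1)
    return abs(dy * (x - x1) - dx * (y - y1)) / length


def douglas_peucker(points, epsilon):
    """Simplify a polyline, keeping points farther than `epsilon` from the chord."""
    if len(points) < 3:
        return list(points)

    start, end = points[0], points[-1]
    max_dist, index = 0.0, 0
    for i in range(1, len(points) - 1):
        d = perpendicular_distance(points[i], start, end)
        if d > max_dist:
            max_dist, index = d, i

    if max_dist > epsilon:
        # Keep the farthest point and simplify both halves recursively.
        left = douglas_peucker(points[:index + 1], epsilon)
        right = douglas_peucker(points[index:], epsilon)
        return left[:-1] + right  # drop the duplicated split point
    # Every interior point is within tolerance: keep only the endpoints.
    return [start, end]


# Usage: a noisy outline with one sharp peak (for example, a fingertip).
contour = [(0, 0), (1, 0.05), (2, 3), (3, 0.05), (4, 0)]
simplified = douglas_peucker(contour, epsilon=1.0)
```

With a suitable tolerance, small wobbles along the contour are discarded while sharp corners survive, which is why the algorithm is a plausible fit for locating candidate feature points on a hand outline.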

Spatio-temporal Characteristics Analysis of Visual System (시각계통의 시.공간적 특성 해석)

  • 한만춘;박상희;김강서
    • 전기의세계 / v.21 no.5 / pp.7-12 / 1972
  • Applying the theory of physiology and control systems, the visual system was studied as a regulator of impinging light. The characteristic function of the visual system is analysed mainly in terms of its spatio-temporal characteristics, based on Enroth's model, the Broca-Sulzer phenomenon, and the Mach effect. Some aims of this paper are as follows. (1) To obtain the excitatory and inhibitory potential of the intermediate cell layer in the retina, the exponential value $\{\exp(FM/kT) - I_{mn}\}$ is calculated based on the physiological theory of neuro-phenomena. (2) To show the visual characteristics by analog simulation for generating stimulus waveforms and analysis, the visual adaptation was recorded as electrical stimulation in the form of step functions. Furthermore, it is shown that the above experimental data agree satisfactorily with the theoretical (psychophysiological) values. This study is expected to lead to further studies concerned with the human observer and human operator in control systems, and especially in pattern recognition systems.


Intelligentization of Landscape Bamboo Buildings Based on Visual Data Transmission and 5G Communication

  • ke Yu Kai
    • International Journal of Advanced Culture Technology / v.11 no.1 / pp.389-394 / 2023
  • Based on intelligent visual information and 5G, this paper studies the intelligent visual communication of landscape bamboo buildings and provides a new method of intelligent perception and interactive computing for the real world that supports representation, modeling, perception, and cognition. Through the integration of the virtual and the real, it addresses situational understanding of the human-machine-material fusion environment and interaction with nature. The 5G network can meet the combined requirements of high-bandwidth uplink transmission and low-latency downlink control. It also supports 5G-based AR intelligent inspection, remote operation and maintenance guidance, and machine vision inspection. Taking the bamboo building as an example, the paper uses field inspections to analyze tourism bamboo buildings before and after development and examines the intelligentization of bamboo buildings based on 5G and visual modeling.