• Title/Summary/Keyword: Visual Scene

Extended Support Vector Machines for Object Detection and Localization

  • Feyereisl, Jan;Han, Bo-Hyung
    • The Magazine of the IEIE / v.39 no.2 / pp.45-54 / 2012
  • Object detection is a fundamental task for many high-level computer vision applications such as image retrieval, scene understanding, activity recognition, and visual surveillance. Although object detection is one of the most popular problems in computer vision and various algorithms have been proposed thus far, it remains notoriously difficult, mainly due to the lack of object representation models that handle large variations in object structure and appearance. In this article, we review a branch of object detection algorithms based on Support Vector Machines (SVMs), a well-known max-margin technique for minimizing classification error. We introduce two variants of SVMs, Structural SVMs and Latent SVMs, and discuss their applications to object detection and localization.
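
As a rough illustration of the pipeline such SVM-based detectors share, the sketch below scores sliding windows with a learned linear SVM. The descriptor, window size, and stride are placeholder assumptions, and real detectors add multi-scale search and non-maximum suppression.

```python
# Minimal sketch of an SVM sliding-window detector (illustrative only).
# The descriptor, window size, and stride are placeholder assumptions;
# real detectors use HOG-like features, multi-scale search, and NMS.
import numpy as np

def extract_feature(patch):
    """Placeholder descriptor: flattened, mean/variance-normalized pixels."""
    v = patch.astype(np.float64).ravel()
    return (v - v.mean()) / (v.std() + 1e-8)

def detect(image, w, b, win=(64, 64), stride=16, threshold=0.0):
    """Score each window with the linear SVM f(x) = <w, x> + b and keep
    boxes whose margin exceeds the threshold. `w` has win[0]*win[1] entries."""
    H, W = image.shape[:2]
    wh, ww = win
    boxes = []
    for y in range(0, H - wh + 1, stride):
        for x in range(0, W - ww + 1, stride):
            feat = extract_feature(image[y:y + wh, x:x + ww])
            score = float(np.dot(w, feat) + b)
            if score > threshold:
                boxes.append((x, y, x + ww, y + wh, score))
    return boxes
```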

A Study on the Scene Change Detection on the Content-Based Domain (내용기반 영역에서의 효과적인 장면전환에 관한 연구)

  • Lee, Hae-Mun;O, Il-Gyun;Lee, Jae-Yeon;Bae, Yeong-Rae;Jang, Jong-Hwan
    • The Transactions of the Korea Information Processing Society / v.6 no.9 / pp.2305-2310 / 1999
  • Histogram-based methods have generally been used to retrieve a target image from a video database. In this paper, we present an image retrieval algorithm that incorporates the human visual system (HVS). A performance comparison shows that the proposed algorithm outperforms the previous histogram-based approach.
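
For context, a minimal sketch of the baseline histogram-difference cut detector that such methods start from is given below; the bin count and threshold are assumed values, and the paper's HVS-based weighting is not reproduced.

```python
# Baseline histogram-difference shot-cut detector (illustrative sketch).
# Bin count and threshold are assumed values; the paper's HVS-based
# weighting of the comparison is not reproduced here.
import numpy as np

def gray_histogram(frame, bins=64):
    hist, _ = np.histogram(frame, bins=bins, range=(0, 256))
    return hist / max(hist.sum(), 1)          # normalize to a distribution

def detect_cuts(frames, bins=64, threshold=0.4):
    """Flag frame i as a scene change when the L1 distance between the
    histograms of frames i-1 and i exceeds the threshold."""
    cuts = []
    prev = gray_histogram(frames[0], bins)
    for i in range(1, len(frames)):
        cur = gray_histogram(frames[i], bins)
        if np.abs(cur - prev).sum() > threshold:
            cuts.append(i)
        prev = cur
    return cuts
```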

An Adaptive Weighted Regression and Guided Filter Hybrid Method for Hyperspectral Pansharpening

  • Dong, Wenqian;Xiao, Song
    • KSII Transactions on Internet and Information Systems (TIIS) / v.13 no.1 / pp.327-346 / 2019
  • The goal of hyperspectral pansharpening is to combine a hyperspectral image (HSI) with a panchromatic image (PANI) of the same scene into a single fused image. In this paper, a new hyperspectral pansharpening approach using adaptive weighted regression and guided filtering is proposed. First, the intensity component (INT) of the HSI is obtained by the adaptive weighted regression algorithm; in particular, a closed-form solution of the optimization problem is derived to reduce the computational cost. Then, guided filtering extracts sufficient spatial information from the PANI and INT. Finally, the fused HSI is obtained by adding the extracted spatial information to the interpolated HSI. Experimental results demonstrate that the proposed approach preserves spectral information and enhances spatial detail better than other state-of-the-art approaches, in both visual interpretation and objective fusion metrics.
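
The sketch below illustrates the generic detail-injection skeleton this abstract describes: an intensity component obtained by regressing the PAN image onto the upsampled bands, spatial detail extracted with a guided filter, and injection into the interpolated HSI. It is not the authors' exact adaptive weighted regression; the filter radius and regularization are assumptions.

```python
# Generic pansharpening detail-injection sketch (not the authors' exact
# formulation). Intensity comes from a least-squares band regression;
# spatial detail is separated with a standard guided filter.
import numpy as np
from scipy.ndimage import uniform_filter, zoom

def guided_filter(guide, src, radius=8, eps=1e-4):
    """Standard guided filter (He et al.); box window of side 2*radius+1."""
    size = 2 * radius + 1
    mean_g = uniform_filter(guide, size)
    mean_s = uniform_filter(src, size)
    cov = uniform_filter(guide * src, size) - mean_g * mean_s
    var = uniform_filter(guide * guide, size) - mean_g ** 2
    a = cov / (var + eps)
    b = mean_s - a * mean_g
    return uniform_filter(a, size) * guide + uniform_filter(b, size)

def pansharpen(hsi, pan):
    """hsi: (bands, h, w) low-res cube; pan: (H, W) with H = s*h."""
    bands, h, w = hsi.shape
    s = pan.shape[0] // h
    up = np.stack([zoom(band, s, order=1) for band in hsi])   # interpolated HSI
    X = up.reshape(bands, -1).T
    alpha, *_ = np.linalg.lstsq(X, pan.ravel(), rcond=None)   # regression weights
    intensity = (alpha[:, None, None] * up).sum(axis=0)
    detail = pan - guided_filter(pan, intensity)              # spatial detail
    return up + detail[None, :, :]                            # inject into each band
```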

3D Res-Inception Network Transfer Learning for Multiple Label Crowd Behavior Recognition

  • Nan, Hao;Li, Min;Fan, Lvyuan;Tong, Minglei
    • KSII Transactions on Internet and Information Systems (TIIS) / v.13 no.3 / pp.1450-1463 / 2019
  • Crowd behavior recognition in densely clustered scenes is extremely challenging because of variable, non-uniform scales. This paper proposes a crowd behavior classification framework based on a transfer-learning hybrid network that blends a 3D ResNet with Inception-v3. First, the 3D res-inception network is presented to learn augmented visual features from UCF-101. Then the target dataset is used to fine-tune the network parameters to classify the behavior of densely crowded scenes. Finally, a transferred entropy function computes the probability of multiple labels from these features. Experimental results show that the proposed method greatly improves the accuracy of crowd behavior recognition and of multiple-label classification.
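
A minimal fine-tuning sketch of this recipe follows, with a stock pretrained 3D video backbone (torchvision's r3d_18) standing in for the paper's 3D res-inception network; the number of behavior labels and the optimizer settings are assumptions.

```python
# Transfer-learning sketch for multi-label crowd behavior (illustrative).
# A pretrained 3D backbone stands in for the paper's 3D res-inception
# network; the multi-label head uses independent per-class sigmoids.
import torch
import torch.nn as nn
from torchvision.models.video import r3d_18, R3D_18_Weights

NUM_BEHAVIORS = 5                      # assumed number of behavior labels

model = r3d_18(weights=R3D_18_Weights.KINETICS400_V1)   # pretrained on video
for p in model.parameters():           # freeze the backbone, train the head
    p.requires_grad = False
model.fc = nn.Linear(model.fc.in_features, NUM_BEHAVIORS)

criterion = nn.BCEWithLogitsLoss()     # multi-label objective
optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)

def train_step(clips, labels):
    """clips: (N, 3, T, H, W) video tensor; labels: (N, NUM_BEHAVIORS) in {0,1}."""
    logits = model(clips)
    loss = criterion(logits, labels.float())
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return torch.sigmoid(logits)       # per-label probabilities
```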

Performance Analysis on View Synthesis of 360 Video for Omnidirectional 6DoF

  • Kim, Hyun-Ho;Lee, Ye-Jin;Kim, Jae-Gon
    • Proceedings of the Korean Society of Broadcast Engineers Conference / 2018.11a / pp.22-24 / 2018
  • The MPEG-I Visual group is actively working on enhancing immersive experiences with up to six degrees of freedom (6DoF). In the virtual space of omnidirectional 6DoF, which is defined as providing 6DoF within a restricted area, looking at the scene from another viewpoint (another position in space) requires rendering additional viewpoints, called virtual omnidirectional viewpoints. This paper presents a performance analysis of view synthesis, carried out as an exploration experiment (EE) in MPEG-I, from sets of 360 videos providing omnidirectional 6DoF in various configurations of distance, direction, and number of input views. In addition, we compare the subjective quality of images synthesized from one input view and from two input views.
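
As a toy illustration of one ingredient of such synthesis, the sketch below blends candidate input views with weights inversely proportional to their distance from the target viewpoint. It is not the MPEG-I reference software, and each view is assumed to have been reprojected to the target pose upstream.

```python
# Distance-weighted blending of reprojected views (illustrative only; the
# actual depth-based warping to the target pose is assumed done upstream).
import numpy as np

def blend_views(warped_views, view_positions, target_position):
    """warped_views: list of (H, W, 3) images already reprojected to the
    target pose; each is weighted inversely by camera-to-target distance."""
    dists = [np.linalg.norm(np.asarray(p) - np.asarray(target_position))
             for p in view_positions]
    weights = np.array([1.0 / (d + 1e-6) for d in dists])
    weights /= weights.sum()
    out = np.zeros_like(warped_views[0], dtype=np.float64)
    for w, view in zip(weights, warped_views):
        out += w * view
    return out.astype(warped_views[0].dtype)
```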

Case study of Creating CG Handheld Steadicam using maya nParticle

  • Choi, Chul Young
    • International journal of advanced smart convergence / v.10 no.3 / pp.157-162 / 2021
  • With the recent increase in YouTube content, many YouTubers shoot with handheld cameras, and audiences have grown accustomed to handheld camera movement. As cameras now move faster than in older films and move expressively to the music in music videos, camera movement in CG animation also needs to change. A handheld Steadicam produces natural camera movement by compensating so that the screen does not shake noticeably even under large vibrations, and by minimizing rotation. To implement such camera movement, we built a handheld Steadicam using the nParticle simulation of Maya and applied it to a scene to verify whether the necessary natural and varied movement can be achieved.
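
Outside Maya, the compensation idea can be sketched as a spring-damper follow of a shaky camera path, as below; the stiffness and damping constants are illustrative assumptions, not the author's nParticle settings.

```python
# Spring-damper smoothing of a noisy handheld camera path, approximating
# the goal-weight behavior the nParticle rig provides inside Maya.
# Stiffness and damping values are assumptions, not the author's settings.
import numpy as np

def stabilize_path(raw_positions, stiffness=0.15, damping=0.7):
    """The smoothed camera is pulled toward each raw (shaky) position,
    while inertia and damping suppress high-frequency jitter."""
    pos = np.array(raw_positions[0], dtype=np.float64)
    vel = np.zeros_like(pos)
    smoothed = [pos.copy()]
    for target in raw_positions[1:]:
        vel = damping * vel + stiffness * (np.asarray(target) - pos)
        pos = pos + vel
        smoothed.append(pos.copy())
    return np.array(smoothed)
```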

Enhancing Single Thermal Image Depth Estimation via Multi-Channel Remapping for Thermal Images (열화상 이미지 다중 채널 재매핑을 통한 단일 열화상 이미지 깊이 추정 향상)

  • Kim, Jeongyun;Jeon, Myung-Hwan;Kim, Ayoung
    • The Journal of Korea Robotics Society / v.17 no.3 / pp.314-321 / 2022
  • Depth information, used in SLAM and visual odometry, is essential in robotics. It is often obtained from sensors or learned by networks. While learning-based methods have gained popularity, they are mostly limited to RGB images, which fail in visually degraded environments. Thermal cameras are in the spotlight as a way to solve this problem. Unlike RGB images, thermal images perceive the environment reliably regardless of illumination variance, but they lack contrast and texture. This low contrast prevents an algorithm from effectively learning the underlying scene details. To tackle these challenges, we propose multi-channel remapping for contrast. Our method allows a learning-based depth prediction model to predict depth accurately even in low-light conditions. We validate the feasibility of the approach and show that our multi-channel remapping outperforms existing methods both visually and quantitatively on our dataset.
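
One plausible reading of such remapping is sketched below: several contrast transforms of a single raw thermal frame are stacked into complementary channels for the depth network. The specific transforms are assumptions, not the paper's exact mapping.

```python
# Illustrative multi-channel remapping of a single thermal frame: three
# complementary contrast transforms are stacked as input channels.
# These particular transforms are assumptions, not the paper's mapping.
import numpy as np

def remap_thermal(raw):
    """raw: (H, W) 14-bit thermal counts -> (H, W, 3) float32 in [0, 1]."""
    x = raw.astype(np.float64)
    # Channel 1: global min-max normalization.
    c1 = (x - x.min()) / (x.max() - x.min() + 1e-8)
    # Channel 2: histogram equalization to spread the narrow value range.
    hist, bins = np.histogram(x.ravel(), bins=16384)
    cdf = hist.cumsum() / hist.sum()
    c2 = np.interp(x.ravel(), bins[:-1], cdf).reshape(x.shape)
    # Channel 3: percentile clipping to emphasize mid-range contrast.
    lo, hi = np.percentile(x, (1, 99))
    c3 = np.clip((x - lo) / (hi - lo + 1e-8), 0.0, 1.0)
    return np.stack([c1, c2, c3], axis=-1).astype(np.float32)
```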

Study of Scene Directing with Cinemachine

  • Park, Sung-Suk;Kim, Jae-Ho
    • International Journal of Contents / v.18 no.1 / pp.98-104 / 2022
  • With Unity, footage can be created using 3D motion, 2D motion, particles, and sound, and even post-production video editing is possible by combining the footage. In particular, Cinemachine, a suite of camera tools for Unity that greatly affects screen layout and the flow of video images, can implement most of the functions of a physical camera, and visual aesthetics can be achieved through it. However, because it is part of a game engine, an understanding of the game engine must come first, and doubts may arise as to how closely it resembles a physical camera. Accordingly, the purpose of this study is to examine the advantages of, and cautions for, virtual cameras in Cinemachine, and to explore their potential for development by implementing storytelling directly.

Scene Graph Generation with Graph Neural Network and Multimodal Context (그래프 신경망과 멀티 모달 맥락 정보를 이용한 장면 그래프 생성)

  • Jung, Ga-Young;Kim, In-cheol
    • Proceedings of the Korea Information Processing Society Conference / 2020.05a / pp.555-558 / 2020
  • In this paper, we propose a new deep neural network model that effectively detects the various objects in an input image and the relationships between them, and expresses them as a single scene graph. To detect objects and relationships effectively, the proposed model exploits diverse multimodal context information, including linguistic context features as well as convolutional-neural-network-based visual context features. In addition, the model embeds the context information with a graph neural network so that the interdependency between two related objects is sufficiently reflected in the graph node features. We demonstrate the effectiveness and performance of the proposed model through comparative experiments on the Visual Genome benchmark dataset.
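
A minimal sketch of this idea follows (illustrative, not the authors' architecture): object nodes carry fused visual and linguistic features, one round of message passing mixes context between related nodes, and a pairwise head scores relation labels; all dimensions and the aggregation rule are assumed.

```python
# Illustrative scene-graph layer: multimodal node fusion, one round of
# mean-aggregated message passing, and pairwise relation scoring.
# Dimensions and aggregation are assumptions, not the authors' design.
import torch
import torch.nn as nn

class SceneGraphLayer(nn.Module):
    def __init__(self, vis_dim=512, lang_dim=300, hid=256, num_rels=50):
        super().__init__()
        self.encode = nn.Linear(vis_dim + lang_dim, hid)   # multimodal fusion
        self.message = nn.Linear(hid, hid)                 # neighbor messages
        self.rel_head = nn.Linear(2 * hid, num_rels)       # pairwise relation scores

    def forward(self, vis_feats, lang_feats, adj):
        """vis_feats: (N, vis_dim); lang_feats: (N, lang_dim);
        adj: (N, N) 0/1 adjacency over candidate object pairs."""
        h = torch.relu(self.encode(torch.cat([vis_feats, lang_feats], dim=-1)))
        deg = adj.sum(dim=1, keepdim=True).clamp(min=1)
        h = h + torch.relu(self.message(adj @ h / deg))    # mix neighbor context
        subj = h.unsqueeze(1).expand(-1, h.size(0), -1)    # (N, N, hid)
        obj = h.unsqueeze(0).expand(h.size(0), -1, -1)
        return self.rel_head(torch.cat([subj, obj], dim=-1))  # (N, N, num_rels)
```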

Compact near-eye display for firefighter's self-contained breathing apparatus

  • Ungyeon Yang
    • ETRI Journal / v.45 no.6 / pp.1046-1055 / 2023
  • We introduce a display for virtual-reality (VR) fire training. Firefighters prefer to wear and operate a real breathing apparatus while experiencing full visual immersion in a VR fire space. Thus, we used a thin head-mounted display (HMD) with a light field and a folded optical system, aiming both to minimize the volume in front of the face for integration into a breathing apparatus and to maintain adequate visibility, including a wide viewing angle and a resolution similar to that of commercial displays. We developed optical system test modules and prototypes of the integrated breathing apparatus. Through iterative testing, the thickness of the output optical module in front of the eyes was reduced from 50-60 mm to less than 20 mm while maintaining a viewing angle of 103°. In addition, the resolution and image-quality degradation of the light field display was mitigated. Hence, we obtained a display whose structure is consistent with the needs of firefighters in the field. In future work, we will conduct a user evaluation of fire-scene reproducibility by combining immersive VR fire training with real firefighting equipment.