• Title/Summary/Keyword: scene image

Object Recognition for Markerless Augmented Reality Embodiment (마커 없는 증강 현실 구현을 위한 물체인식)

  • Paul, Anjan Kumar;Lee, Hyung-Jin;Kim, Young-Bum;Islam, Mohammad Khairul;Baek, Joong-Hwan
    • Journal of Advanced Navigation Technology / v.13 no.1 / pp.126-133 / 2009
  • In this paper, we propose an object recognition technique for implementing markerless augmented reality. The Scale Invariant Feature Transform (SIFT) is used to find local features in object images. These features are invariant to scale, rotation, and translation, and partially invariant to illumination changes. The extracted features are distinctive and are matched against image features in the scene. If the trained image is properly matched, the object is expected to be found in the scene. In this paper, an object is found in a scene by matching template images, which can be generated from the first frame of the scene. Experimental results for four kinds of objects show that the proposed technique performs well.
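
The matching step described in this abstract can be illustrated with a minimal sketch. It assumes descriptors are already extracted and uses a nearest-neighbour ratio test, which the abstract does not name; the function names, the `ratio` and `min_matches` values, and the toy 2-D descriptors (real SIFT descriptors have 128 dimensions) are all hypothetical.

```python
import math

def match_descriptors(template, scene, ratio=0.75):
    """Return (template_index, scene_index) pairs that pass the ratio test."""
    matches = []
    for i, d in enumerate(template):
        # Distance from this template descriptor to every scene descriptor.
        dists = sorted((math.dist(d, s), j) for j, s in enumerate(scene))
        if len(dists) < 2:
            continue
        (best, j), (second, _) = dists[0], dists[1]
        # Accept only if the best match is clearly better than the runner-up.
        if best < ratio * second:
            matches.append((i, j))
    return matches

def object_found(template, scene, min_matches=2):
    """Declare the object present when enough descriptors match (assumed rule)."""
    return len(match_descriptors(template, scene)) >= min_matches

# Toy example: two template descriptors, three scene descriptors.
template = [(0.0, 0.0), (10.0, 10.0)]
scene = [(0.1, 0.0), (10.0, 10.1), (5.0, 5.0)]
print(match_descriptors(template, scene))
print(object_found(template, scene))
```

The ratio test discards ambiguous descriptors whose two closest scene candidates are almost equally distant, which is one common way to keep only distinctive matches.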

Co-saliency Detection Based on Superpixel Matching and Cellular Automata

  • Zhang, Zhaofeng;Wu, Zemin;Jiang, Qingzhu;Du, Lin;Hu, Lei
    • KSII Transactions on Internet and Information Systems (TIIS) / v.11 no.5 / pp.2576-2589 / 2017
  • Co-saliency detection is the task of detecting the same or similar objects across multiple scenes, and it has become an important preprocessing step for multi-scene image processing. However, existing methods are inefficient at matching similar areas across different images. In addition, they are confined to single-image detection and lack a unified framework for calculating co-saliency. In this paper, we propose a novel model called Superpixel Matching-Cellular Automata (SMCA). We match sets of adjacent superpixels using the Hausdorff distance instead of matching single superpixels, since the feature matching accuracy of a single superpixel is poor. We further introduce Cellular Automata to exploit the intrinsic relevance of similar regions through interactions with neighbors across scenes. Extensive evaluations show that the SMCA model achieves leading performance compared to state-of-the-art methods in both efficiency and accuracy.
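
As a rough illustration of the Hausdorff distance mentioned in this abstract, the sketch below compares two sets of feature vectors. How SMCA builds superpixel features is not stated here, so the tuples standing in for superpixel features and the function names are assumptions.

```python
import math

def directed_hausdorff(a, b):
    """Largest distance from any point of a to its nearest point in b."""
    return max(min(math.dist(p, q) for q in b) for p in a)

def hausdorff(a, b):
    """Symmetric Hausdorff distance between two non-empty point sets."""
    return max(directed_hausdorff(a, b), directed_hausdorff(b, a))

# Each set stands for the features of one group of adjacent superpixels.
set_a = [(0.0, 0.0), (1.0, 0.0)]
set_b = [(0.0, 0.0), (4.0, 0.0)]
print(hausdorff(set_a, set_b))
```

Because the distance is taken over whole sets, one poorly matched member raises the score but a single noisy superpixel cannot produce a spurious good match on its own.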

Semantic Scenes Classification of Sports News Video for Sports Genre Analysis (스포츠 장르 분석을 위한 스포츠 뉴스 비디오의 의미적 장면 분류)

  • Song, Mi-Young
    • Journal of Korea Multimedia Society / v.10 no.5 / pp.559-568 / 2007
  • Anchor-person scene detection is significant for semantic parsing of video shots and for extracting indexing clues in content-based news video indexing and retrieval systems. This paper proposes an efficient algorithm that extracts the anchor ranges in sports news video in order to structure sports news into units. To detect anchor-person scenes, candidate anchor-person scenes are first selected using DCT coefficients and motion vector information in the MPEG4 compressed video. Then, image processing is applied to the candidate scenes to classify the news video into anchor-person scenes and non-anchor (sports) scenes. The proposed scheme achieves a mean precision and recall of 98% in the anchor-person scene detection experiment.

Multimodal Attention-Based Fusion Model for Context-Aware Emotion Recognition

  • Vo, Minh-Cong;Lee, Guee-Sang
    • International Journal of Contents / v.18 no.3 / pp.11-20 / 2022
  • Human emotion recognition is an active topic that has attracted many researchers for a long time. In recent years, there has been increasing interest in exploiting contextual information for emotion recognition. Previous studies in psychology show that emotional perception is affected by facial expressions as well as by contextual information from the scene, such as human activities, interactions, and body poses. Those studies started a trend in computer vision of exploring the critical role of context by treating contexts as modalities that are used, along with facial expressions, to infer the predicted emotion. However, contextual information has not been fully exploited. The scene emotion created by the surrounding environment can shape how people perceive emotion. Moreover, additive fusion in multimodal training is not practical, because the modalities do not contribute equally to the final prediction. The purpose of this paper is to contribute to this growing area of research by exploring the effectiveness of the emotional scene gist of the input image for inferring the emotional state of the primary target. The emotional scene gist includes emotion, emotional feelings, and actions or events that directly trigger emotional reactions in the input image. We also present an attention-based fusion network that combines multimodal features based on their impact on the target emotional state. We demonstrate the effectiveness of the method through a significant improvement on the EMOTIC dataset.
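
The contrast between additive fusion and attention-based fusion can be sketched as follows. This is not the network from the paper: in a trained model the relevance scores would come from learned layers, whereas here they are passed in by hand, and all names are hypothetical.

```python
import math

def softmax(scores):
    """Turn arbitrary scores into positive weights that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_fuse(features, scores):
    """Weighted sum of per-modality feature vectors (all of equal length)."""
    weights = softmax(scores)
    dim = len(features[0])
    fused = [sum(w * f[k] for w, f in zip(weights, features)) for k in range(dim)]
    return fused, weights

# Two modalities, e.g. a face feature and a scene feature (toy values).
face, scene = [1.0, 0.0], [0.0, 1.0]
print(attention_fuse([face, scene], scores=[0.0, 0.0]))   # equal weighting
print(attention_fuse([face, scene], scores=[2.0, 0.0]))   # face dominates
```

With equal scores the result reduces to a plain average, which is the additive case; unequal scores let one modality contribute more to the fused feature.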

Generalized Panoramic Scene Reconstruction from Video Sequences Based on Outlier Rejection (아웃라이어 배제에 기초한 일반화된 파노라마 영상 재구성)

  • 서종열;박종현;강문기
    • Journal of Broadcast Engineering / v.6 no.2 / pp.160-168 / 2001
  • In this paper, we propose a new practical motion model that can exploit the general properties of camera motion in constructing a panorama, accounting for panning, tilting, and even the change in focal length of the camera. We also present an efficient algorithm based on outlier rejection to handle moving objects or noise in the scene. Spatial and temporal statistical properties of the motion field are exploited to detect the outliers. The proposed algorithm removes moving objects or noise from the panoramic image so that a clearer and more complete view of the background image can be obtained. This method does not require assumptions or a priori knowledge of the scene. The entire process is fully automatic, as the method does not require any manual correction while constructing a panorama. The proposed algorithm is tested on broadcast images of soccer games. Our simulation results show that this method is superior to conventional image mosaicing algorithms.
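
The idea of rejecting outliers in a motion field can be sketched with a simple robust statistic. The abstract does not give the paper's actual spatial and temporal criteria, so the median and median-absolute-deviation rule below is a stand-in, and the function name and the threshold factor `k` are assumptions.

```python
import math
from statistics import median

def reject_outliers(vectors, k=3.0):
    """Split 2-D motion vectors into (inliers, outliers) by distance from the median vector."""
    mx = median(v[0] for v in vectors)
    my = median(v[1] for v in vectors)
    # Deviation of each vector from the dominant (camera) motion.
    dev = [math.hypot(v[0] - mx, v[1] - my) for v in vectors]
    mad = median(dev)
    # Guard against a zero spread when most vectors are identical.
    threshold = k * max(mad, 1e-6)
    inliers = [v for v, d in zip(vectors, dev) if d <= threshold]
    outliers = [v for v, d in zip(vectors, dev) if d > threshold]
    return inliers, outliers

# Four blocks follow the camera pan; one belongs to a moving player.
field = [(2.0, 0.0), (2.1, 0.1), (1.9, -0.1), (2.0, 0.1), (9.0, 5.0)]
inliers, outliers = reject_outliers(field)
print(inliers)
print(outliers)
```

Only the inlier vectors would then be used to estimate the camera motion, so moving objects do not distort the panorama alignment.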

Effects of Depth Map Quantization for Computer-Generated Multiview Images using Depth Image-Based Rendering

  • Kim, Min-Young;Cho, Yong-Joo;Choo, Hyon-Gon;Kim, Jin-Woong;Park, Kyoung-Shin
    • KSII Transactions on Internet and Information Systems (TIIS) / v.5 no.11 / pp.2175-2190 / 2011
  • This paper presents the effects of depth map quantization on multiview intermediate image generation using depth image-based rendering (DIBR). DIBR synthesizes multiple virtual views of a 3D scene from a 2D image and its associated depth map. However, it needs precise depth information in order to generate reliable and accurate intermediate view images for use in multiview 3D display systems. Previous work has extensively studied the pre-processing of the depth map, but little is known about depth map quantization. In this paper, we conduct an experiment to estimate the depth map quantization that affords acceptable image quality for generating DIBR-based multiview intermediate images. The experiment uses computer-generated 3D scenes, in which the multiview images captured directly from the scene are compared to the multiview intermediate images constructed by DIBR with a number of quantized depth maps. The results showed no significant effect of depth map quantization on DIBR from 16-bit down to 7-bit (and more specifically 96-scale). Hence, a depth map above 7-bit is needed to maintain sufficient image quality for a DIBR-based multiview 3D system.
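
To make the notion of depth map quantization concrete, the sketch below maps 16-bit depth values onto a chosen number of uniform levels and measures the resulting error. The uniform scheme and the function names are assumptions; the abstract does not say how the paper quantized its depth maps.

```python
def quantize_depth(depth16, levels):
    """Map 16-bit depth values (0..65535) onto `levels` uniform steps and back."""
    max_in = 65535
    out = []
    for d in depth16:
        # Index of the nearest quantization level, 0 .. levels-1.
        q = round(d / max_in * (levels - 1))
        # Reconstruct the 16-bit value a renderer would see.
        out.append(round(q / (levels - 1) * max_in))
    return out

def max_error(depth16, levels):
    """Largest absolute depth error introduced by the quantization."""
    return max(abs(a - b) for a, b in zip(depth16, quantize_depth(depth16, levels)))

samples = list(range(0, 65536, 257))
for levels in (256, 128, 96, 64):   # 8-bit, 7-bit, 96-scale, 6-bit
    print(levels, max_error(samples, levels))
```

The error bound grows as the number of levels falls, which is the quantity a DIBR pipeline would have to tolerate when warping pixels to new views.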

Comparisons of Object Recognition Performance with 3D Photon Counting & Gray Scale Images

  • Lee, Chung-Ghiu;Moon, In-Kyu
    • Journal of the Optical Society of Korea / v.14 no.4 / pp.388-394 / 2010
  • In this paper, the object recognition performance of a photon counting integral imaging system is quantitatively compared with that of a conventional gray scale imaging system. For 3D imaging of objects with a small number of photons, the elemental image set of a 3D scene is obtained using the integral imaging setup. We assume that the elemental image detection follows a Poisson distribution. A computational geometrical ray back-propagation algorithm and a parametric maximum likelihood estimator are applied to the photon counting elemental image set in order to reconstruct the original 3D scene. To evaluate the photon counting object recognition performance, the normalized correlation peaks between the reconstructed 3D scenes are calculated for varied and fixed total numbers of photons in the reconstructed sectional image while changing the total number of image channels in the integral imaging system. It is quantitatively illustrated that the recognition performance of the photon counting integral imaging (PCII) system can be similar to that of a conventional gray scale imaging system as the number of image viewing channels in the PCII system is increased up to the threshold point. We also present experiments to find the threshold point on the total number of image channels in the PCII system that can guarantee recognition performance comparable to that of a gray scale imaging system. To the best of our knowledge, this is the first report on comparisons of object recognition performance with 3D photon counting and gray scale images.
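
The Poisson detection assumption in this abstract can be illustrated with a small simulation. The sketch assumes the common model in which the expected count at a pixel is the total expected photon number times the normalized irradiance; that model, the function names, and the seed handling are assumptions not taken from the abstract.

```python
import math
import random

def poisson_sample(lam, rng):
    """Draw one Poisson(lam) variate with Knuth's multiplication method (small lam only)."""
    limit = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1

def photon_count_image(irradiance, n_photons, seed=0):
    """Simulate photon counts for a flattened image with non-negative irradiance."""
    rng = random.Random(seed)
    total = sum(irradiance)
    # Expected count per pixel is proportional to normalized irradiance.
    return [poisson_sample(n_photons * x / total, rng) for x in irradiance]

gray = [0.0, 1.0, 3.0, 0.0]          # toy gray scale elemental image
counts = photon_count_image(gray, n_photons=20)
print(counts)
```

Pixels with zero irradiance never register photons, while brighter pixels collect more on average, so the sparse count image still carries the scene structure used for correlation-based recognition.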

High-resolution Depth Generation using Multi-view Camera and Time-of-Flight Depth Camera (다시점 카메라와 깊이 카메라를 이용한 고화질 깊이 맵 제작 기술)

  • Kang, Yun-Suk;Ho, Yo-Sung
    • Journal of the Institute of Electronics Engineers of Korea SP / v.48 no.6 / pp.1-7 / 2011
  • The depth camera measures range information of the scene in real time using Time-of-Flight (TOF) technology. The measured depth data is then regularized and provided as a depth image. This depth image is used together with stereo or multi-view images to generate a high-resolution depth map of the scene. However, noise and distortion in the TOF depth image, which arise from the technical limitations of the TOF depth camera, must be corrected. The corrected depth image is combined with the color image by various methods to obtain the high-resolution depth of the scene. In this paper, we introduce the principle and various techniques of sensor fusion for high-quality depth generation that uses multiple cameras together with depth cameras.

Research about a game image 3D versification (3D 게임영상 작성법에 관한 연구)

  • Lee Dong-Lyeor
    • Journal of Game and Entertainment / v.1 no.1 / pp.31-38 / 2005
  • Correct flow of various game manufacture among the justice which is used at the game development, and the understanding about the manufacture regards we making rather correct game. We justice understanding which we are correct in the image manufacture to become the reason air control of the game and we put the center in a 3D game image manufacture understanding. We are marked in maneuvered the game in actual game good. The image of the back of Cut Scene which is inserted at an opening incomparableness event time, we have been produced in this method. The thing which a 3D game image is utilized in a special effectiveness image though it is different from the game in the theater movie, we are the graphic which a game manufacture o'clock must be considered. The reason air control which the game player is rather correct, we are regarded we offer the reason to immerse with his game.

3D Motion of Objects in an Image Using Vanishing Points (소실점을 이용한 2차원 영상의 물체 변환)

  • 김대원;이동훈;정순기
    • Journal of KIISE:Computer Systems and Theory / v.30 no.11 / pp.621-628 / 2003
  • This paper addresses a method of enabling objects in an image to have apparent 3D motion. Many researchers have solved this issue by reconstructing a 3D model from several images using image-based modeling techniques, or by building a cube-modeled scene from camera calibration using vanishing points. This paper, however, presents the possibility of image-based motion without exact 3D information of scene geometry and camera calibration. The proposed system considers the image plane as a projective plane with respect to a view point and models a 2D frame of a projected 3D object using only lines and points. A modeled frame refers to its vanishing points as local coordinates when it is transformed.
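
As background for the vanishing points this abstract relies on, the sketch below finds the vanishing point of two image line segments by intersecting them in homogeneous coordinates. It illustrates only that standard construction, not the paper's transformation system; the function names and the `eps` tolerance are assumptions.

```python
def cross(a, b):
    """Cross product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def line_through(p, q):
    """Homogeneous line through two image points (x, y)."""
    return cross((p[0], p[1], 1.0), (q[0], q[1], 1.0))

def vanishing_point(seg1, seg2, eps=1e-12):
    """Intersection of the lines through two segments, or None if they stay parallel."""
    x, y, w = cross(line_through(*seg1), line_through(*seg2))
    if abs(w) < eps:
        return None  # lines are parallel in the image: vanishing point at infinity
    return (x / w, y / w)

# Two edges of a box that converge in the image.
edge_a = ((0.0, 0.0), (1.0, 1.0))
edge_b = ((0.0, 4.0), (1.0, 3.0))
print(vanishing_point(edge_a, edge_b))
```

Edges that are parallel in the 3D scene meet at such a point in the image, which is why a 2D frame built from lines and points can use it as a local reference when it is moved.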