• Title/Summary/Keyword: scene image

Salient Region Extraction based on Global Contrast Enhancement and Saliency Cut for Image Information Recognition of the Visually Impaired

  • Yoon, Hongchan;Kim, Baek-Hyun;Mukhriddin, Mukhiddinov;Cho, Jinsoo
    • KSII Transactions on Internet and Information Systems (TIIS) / v.12 no.5 / pp.2287-2312 / 2018
  • Extracting key visual information from natural scene images is a challenging task and an important step in enabling the visually impaired to recognize information through tactile graphics. In this study, a novel method is proposed for extracting salient regions based on global contrast enhancement and saliency cuts, in order to improve image recognition for the visually impaired. To accomplish this, an image enhancement technique is applied to natural scene images, and a saliency map is acquired to measure the color contrast of homogeneous regions against other areas of the image. The saliency map also supports automatic salient region extraction, referred to as a saliency cut, and helps obtain a high-quality binary mask. Finally, outer boundaries and inner edges are detected in natural scene images to identify visually significant edges. Experimental results indicate that the proposed method extracts salient objects effectively and achieves remarkable performance compared to conventional methods. Our method is beneficial for extracting salient objects, generating simple but important edges from natural scene images, and providing information to the visually impaired.
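
As a rough illustration of the pipeline this abstract describes (contrast enhancement, saliency map, saliency cut, edge extraction), a minimal Python sketch follows; OpenCV's fine-grained saliency and GrabCut stand in for the paper's own enhancement and cut steps, and the file name is a placeholder.

```python
import cv2
import numpy as np

img = cv2.imread("scene.jpg")  # placeholder input image

# Global contrast enhancement: equalize the luminance channel.
lab = cv2.cvtColor(img, cv2.COLOR_BGR2LAB)
lab[:, :, 0] = cv2.equalizeHist(lab[:, :, 0])
enhanced = cv2.cvtColor(lab, cv2.COLOR_LAB2BGR)

# Saliency map (cv2.saliency requires opencv-contrib-python).
sal = cv2.saliency.StaticSaliencyFineGrained_create()
_, sal_map = sal.computeSaliency(enhanced)
sal_map = (sal_map * 255).astype(np.uint8)

# "Saliency cut": seed GrabCut with the thresholded saliency map
# to obtain a binary object mask.
_, seed = cv2.threshold(sal_map, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
mask = np.where(seed > 0, cv2.GC_PR_FGD, cv2.GC_PR_BGD).astype(np.uint8)
bgd = np.zeros((1, 65), np.float64)
fgd = np.zeros((1, 65), np.float64)
cv2.grabCut(enhanced, mask, None, bgd, fgd, 5, cv2.GC_INIT_WITH_MASK)
binary = np.where((mask == cv2.GC_FGD) | (mask == cv2.GC_PR_FGD),
                  255, 0).astype(np.uint8)

# Outer boundaries / significant edges from the binary mask.
edges = cv2.Canny(binary, 100, 200)
```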

A Remote Sensing Scene Classification Model Based on EfficientNetV2L Deep Neural Networks

  • Aljabri, Atif A.;Alshanqiti, Abdullah;Alkhodre, Ahmad B.;Alzahem, Ayyub;Hagag, Ahmed
    • International Journal of Computer Science & Network Security / v.22 no.10 / pp.406-412 / 2022
  • Scene classification of very high-resolution (VHR) imagery can attribute semantics to land cover in a variety of domains. Conventional techniques for remote sensing image classification have not addressed real-world application requirements. Recent research has demonstrated that deep convolutional neural networks (CNNs) are effective at extracting features due to their strong feature extraction capabilities. To improve classification performance, these approaches rely primarily on semantic information. However, because abstract and global semantic information makes it difficult for a network to correctly classify scene images with similar structures and high inter-class similarity, they achieve low classification accuracy. We propose a VHR remote sensing image classification model that extracts global features from the original VHR image using a pre-trained EfficientNetV2-L CNN in order to distinguish similar classes. The image is then classified using a multilayer perceptron (MLP). The method was evaluated on two benchmark remote sensing datasets: the 21-class UC Merced and the 38-class PatternNet. Compared to other state-of-the-art models, the proposed model significantly improves performance.
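
The two-stage design described here (a pretrained CNN as global feature extractor, then an MLP classifier) can be sketched as follows; torchvision's EfficientNetV2-L weights, the frozen backbone, and the MLP sizes are illustrative assumptions, with the 21-class UC Merced setting used for the head.

```python
import torch
import torch.nn as nn
from torchvision.models import efficientnet_v2_l, EfficientNet_V2_L_Weights

class SceneClassifier(nn.Module):
    def __init__(self, num_classes: int = 21):
        super().__init__()
        backbone = efficientnet_v2_l(weights=EfficientNet_V2_L_Weights.DEFAULT)
        backbone.classifier = nn.Identity()   # keep the 1280-d global feature
        for p in backbone.parameters():       # freeze the feature extractor
            p.requires_grad = False
        self.backbone = backbone
        self.mlp = nn.Sequential(             # MLP head trained on VHR scenes
            nn.Linear(1280, 512), nn.ReLU(), nn.Dropout(0.3),
            nn.Linear(512, num_classes),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.mlp(self.backbone(x))

model = SceneClassifier(num_classes=21)
logits = model(torch.randn(2, 3, 480, 480))   # V2-L's native eval resolution
print(logits.shape)                           # torch.Size([2, 21])
```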

Real Scene Text Image Super-Resolution Based on Multi-Scale and Attention Fusion

  • Xinhua Lu;Haihai Wei;Li Ma;Qingji Xue;Yonghui Fu
    • Journal of Information Processing Systems / v.19 no.4 / pp.427-438 / 2023
  • Many works have indicated that single image super-resolution (SISR) models trained on synthetic datasets are difficult to apply to real scene text image super-resolution (STISR) because of its more complex degradation. The most up-to-date dataset for realistic STISR is TextZoom, but current methods trained on it have not considered the effect of multi-scale features of text images. In this paper, a multi-scale and attention fusion model for realistic STISR is proposed. A multi-scale learning mechanism is introduced to acquire sophisticated feature representations of text images; spatial and channel attention are introduced to capture the local information and inter-channel interactions of text images; finally, a multi-scale residual attention module is designed by fusing the multi-scale learning and attention mechanisms. Experiments on TextZoom demonstrate that the proposed model increases the average recognition accuracy of a scene text recognizer (ASTER) by 1.2% compared to the text super-resolution network baseline.
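
A minimal sketch of a fused channel and spatial attention block of the kind the abstract combines with multi-scale residual learning; the exact module layout, reduction ratio, and fusion order in the paper may differ.

```python
import torch
import torch.nn as nn

class ChannelSpatialAttention(nn.Module):
    def __init__(self, channels: int, reduction: int = 8):
        super().__init__()
        # Channel attention: squeeze spatial dims, excite channels.
        self.channel = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // reduction, 1), nn.ReLU(),
            nn.Conv2d(channels // reduction, channels, 1), nn.Sigmoid(),
        )
        # Spatial attention: one map over H x W from pooled channels.
        self.spatial = nn.Sequential(
            nn.Conv2d(2, 1, kernel_size=7, padding=3), nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = x * self.channel(x)                  # inter-channel interaction
        pooled = torch.cat([x.mean(1, keepdim=True),
                            x.amax(1, keepdim=True)], dim=1)
        return x + x * self.spatial(pooled)      # residual fusion

feat = torch.randn(1, 64, 16, 64)                # e.g. a text-image feature map
print(ChannelSpatialAttention(64)(feat).shape)   # torch.Size([1, 64, 16, 64])
```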

A Study on Localization of Text in Natural Scene Images (자연 영상에서의 정확한 문자 검출에 관한 연구)

  • Choi, Mi-Young;Kim, Gye-Young;Choi, Hyung-Il
    • Journal of the Korea Society of Computer and Information / v.13 no.5 / pp.77-84 / 2008
  • This paper proposes a new approach that eliminates the reflectance component for the localization of text in natural scene images. Natural scene images normally contain an illumination component as well as a reflectance component. It is well known that the reflectance component usually hinders the task of detecting and recognizing objects such as text in the scene, since it blurs the overall image. We have developed an approach that efficiently removes the reflectance component while preserving the illumination component. To determine the lighting environment, we decide whether an input image is normal or polarized using a histogram of its red component. For a normal image, we acquire the text region without additional processing; for a polarized image, we remove light reflected from objects using homomorphic filtering. Candidate text regions are then determined based on a color merging technique and a saliency map. Finally, we localize the text region from these two sets of candidate regions.
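
Homomorphic filtering, the step this abstract applies to polarized images, can be sketched as follows: the log transform makes illumination and reflectance additive, after which high frequencies are attenuated in the Fourier domain to suppress reflection while preserving illumination. The cutoff and gain values here are illustrative assumptions.

```python
import numpy as np
import cv2

def homomorphic_filter(gray: np.ndarray, d0: float = 30.0,
                       gamma_low: float = 1.0,
                       gamma_high: float = 0.4) -> np.ndarray:
    # Log domain: illumination + reflectance become additive.
    log_img = np.log1p(gray.astype(np.float64))
    spec = np.fft.fftshift(np.fft.fft2(log_img))

    # Gaussian transfer function: gamma_low at DC (illumination kept),
    # gamma_high at high frequencies (reflectance attenuated).
    rows, cols = gray.shape
    u, v = np.meshgrid(np.arange(cols) - cols // 2,
                       np.arange(rows) - rows // 2)
    h = (gamma_high - gamma_low) * (1 - np.exp(-(u**2 + v**2) / (2 * d0**2))) \
        + gamma_low

    filtered = np.fft.ifft2(np.fft.ifftshift(spec * h)).real
    out = np.expm1(filtered)
    return cv2.normalize(out, None, 0, 255, cv2.NORM_MINMAX).astype(np.uint8)

gray = cv2.imread("polarized_scene.jpg", cv2.IMREAD_GRAYSCALE)  # placeholder
deglared = homomorphic_filter(gray)
```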

A Study on the code and design elements as a way of transition (애니메이션 화면 전환 수단으로서의 조형 요소 변화에 대한 연구)

  • Kim, Jean-Young
    • Cartoon and Animation Studies / s.14 / pp.83-99 / 2008
  • In film, a change of scene is generally represented by a collective changeover such as a cut or dissolve. Animation, by producing frame images one by one, can imbue various parts of a scene with intended sensibility and narrative factors and transfer them into a different symbolic, dimensional expression. Today, sequential scene composition is no longer a treatment unique to 2D animation, as image-manipulation techniques such as morphing and metamorphosis have become diverse and elaborate. Yet 2D hand-drawn animation can absorb the spectator into different visual dimensions continuously and strongly, beyond character and background, that is, beyond object and space; this is its strong attraction. These characteristics enable a literary function: through full-scene composition, animation can express delicate metaphor and communicate an implicative meaning system. Analysis of the scene has broken the boundary between the symbolic perspective world and the flat formative world, and has become more diverse and complicated. Accordingly, analyzing the composition of formative elements in animation scenes and their effects in use is helpful for analyzing and applying the new absorbing methods of the modern image scene.

3D Analysis of Scene and Light Environment Reconstruction for Image Synthesis (영상합성을 위한 3D 공간 해석 및 조명환경의 재구성)

  • Hwang, Yong-Ho;Hong, Hyun-Ki
    • Journal of Korea Game Society / v.6 no.2 / pp.45-50 / 2006
  • In order to generate a photo-realistic synthesized image, the light environment must be reconstructed through 3D analysis of the scene. This paper presents a novel method for identifying the positions and characteristics of the light sources (both global and local) in a real image, which are used to illuminate synthetic objects. First, we generate a High Dynamic Range (HDR) radiance map from omni-directional images taken by a digital camera with a fisheye lens. Then, the positions of the camera and light sources in the scene are identified automatically from correspondences between images, without a priori camera calibration. Light sources are classified according to whether they illuminate the whole scene, and the 3D illumination environment is then reconstructed. Experimental results show that the proposed method, combined with distributed ray tracing, achieves photo-realistic image synthesis. Animators and lighting experts in the film and animation industries are expected to benefit from it.
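
The first step of this pipeline, building an HDR radiance map from multi-exposure images, can be sketched with OpenCV's Debevec calibration and merge; the file names, exposure times, and the crude brightness threshold for light detection are made-up illustrations.

```python
import cv2
import numpy as np

files = ["fisheye_1_60.jpg", "fisheye_1_8.jpg", "fisheye_1s.jpg"]  # placeholders
times = np.array([1 / 60, 1 / 8, 1.0], dtype=np.float32)  # exposure seconds
imgs = [cv2.imread(f) for f in files]

calibrate = cv2.createCalibrateDebevec()     # recover camera response curve
response = calibrate.process(imgs, times)

merge = cv2.createMergeDebevec()             # fuse into a radiance map
hdr = merge.process(imgs, times, response)   # float32, linear radiance
cv2.imwrite("radiance_map.hdr", hdr)

# Bright pixels in the radiance map are candidate light-source positions.
gray = cv2.cvtColor(hdr, cv2.COLOR_BGR2GRAY)
lights = np.argwhere(gray > 0.9 * gray.max())  # crude light detection
```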

Effectual Method for 3D Rebuilding from Diverse Images

  • Leung, Carlos Wai Yin, B.E. (Hons)
    • Korea Information Convergence Society: Conference Proceedings / 2008.06a / pp.145-150 / 2008
  • This thesis explores the problem of reconstructing a three-dimensional (3D) scene from a set of images or image sequences of the scene. It describes efficient methods for the 3D reconstruction of static and dynamic scenes from stereo images, stereo image sequences, and images captured from multiple viewpoints. Novel image-based and volumetric modelling approaches to 3D reconstruction are presented, with an emphasis on efficient algorithms that produce high-quality, accurate reconstructions. For image-based 3D reconstruction, a novel energy minimisation scheme, Iterated Dynamic Programming, is presented for the efficient computation of strong local minima of discontinuity-preserving energy functions. Coupled with a novel morphological decomposition method and subregioning schemes for the efficient computation of a narrowband matching cost volume, the minimisation framework is applied to problems in stereo matching, stereo-temporal reconstruction, motion estimation, 2D image registration, and 3D image registration. The thesis establishes Iterated Dynamic Programming as an efficient and effective energy minimisation scheme for computer vision problems that involve finding correspondences across images. For 3D reconstruction from multiple-view images with arbitrary camera placement, a novel volumetric modelling technique, Embedded Voxel Colouring, is presented that efficiently embeds all reconstructions of a 3D scene into a single output in a single scan of the volumetric space under exact visibility. An adaptive thresholding framework is also introduced to compute the optimal set of thresholds for high-quality 3D reconstructions. The thesis establishes Embedded Voxel Colouring as a fast, efficient, and effective method for 3D reconstruction from multiple-view images.
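
The scanline subproblem at the heart of a dynamic-programming stereo matcher, the kind of step Iterated Dynamic Programming solves repeatedly, can be sketched as follows; the truncated-linear transition cost is one common discontinuity-preserving choice, not necessarily the thesis's exact formulation.

```python
import numpy as np

def dp_scanline(cost: np.ndarray, smooth: float = 0.1) -> np.ndarray:
    """DP over one scanline. cost: (width, ndisp) matching costs."""
    w, nd = cost.shape
    d = np.arange(nd)
    # Discontinuity-preserving truncated-linear transition cost.
    trans = smooth * np.minimum(np.abs(d[:, None] - d[None, :]), 3)
    acc, back = cost.copy(), np.zeros((w, nd), dtype=np.int64)
    for x in range(1, w):
        prev = acc[x - 1][:, None] + trans   # (d_prev, d_cur)
        back[x] = prev.argmin(axis=0)
        acc[x] += prev.min(axis=0)
    disp = np.empty(w, dtype=np.int64)
    disp[-1] = acc[-1].argmin()
    for x in range(w - 2, -1, -1):           # backtrack the optimal path
        disp[x] = back[x + 1, disp[x + 1]]
    return disp

# Toy usage: random cost volume for a 100-pixel scanline, 16 disparities.
print(dp_scanline(np.random.rand(100, 16)))
```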

Scene Recognition based Autonomous Robot Navigation robust to Dynamic Environments (동적 환경에 강인한 장면 인식 기반의 로봇 자율 주행)

  • Kim, Jung-Ho;Kweon, In-So
    • The Journal of Korea Robotics Society / v.3 no.3 / pp.245-254 / 2008
  • Recently, many vision-based navigation methods have been introduced as intelligent robot applications. However, many of these methods mainly focus on finding the database image that corresponds to a query image. Thus, if the environment changes, for example, when objects move, the robot is unlikely to find consistent corresponding points with any of the database images. To solve this problem, we propose a novel navigation strategy that uses fast motion estimation and a practical scene recognition scheme to handle the kidnapping problem, defined as re-localizing a mobile robot after it has undergone an unknown motion or visual occlusion. The algorithm is based on camera-based motion estimation to plan the robot's next movement and an efficient outlier rejection algorithm for scene recognition. Experimental results demonstrate the robustness of the vision-based autonomous navigation in dynamic environments.
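
The two ingredients named here, camera motion estimation and outlier rejection, are commonly combined as feature matching plus RANSAC on the essential matrix; a sketch under that assumption follows, with made-up camera intrinsics.

```python
import cv2
import numpy as np

K = np.array([[700.0, 0, 320],               # assumed camera intrinsics
              [0, 700.0, 240],
              [0, 0, 1]])

def estimate_motion(img1, img2):
    orb = cv2.ORB_create(2000)
    k1, d1 = orb.detectAndCompute(img1, None)
    k2, d2 = orb.detectAndCompute(img2, None)
    matches = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True).match(d1, d2)
    p1 = np.float32([k1[m.queryIdx].pt for m in matches])
    p2 = np.float32([k2[m.trainIdx].pt for m in matches])
    # RANSAC rejects correspondences on moving objects as outliers.
    E, inliers = cv2.findEssentialMat(p1, p2, K, method=cv2.RANSAC,
                                      prob=0.999, threshold=1.0)
    good = inliers.ravel() == 1
    _, R, t, _ = cv2.recoverPose(E, p1[good], p2[good], K)
    return R, t   # relative rotation and unit-scale translation
```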

Video Segmentation and Key frame Extraction using Multi-resolution Analysis and Statistical Characteristic

  • Cho, Wan-Hyun;Park, Soon-Young;Park, Jong-Hyun
    • Communications for Statistical Applications and Methods / v.10 no.2 / pp.457-469 / 2003
  • In this paper, we propose an efficient algorithm that segments video scene changes using various statistical characteristics obtained by applying the wavelet transform to each frame. Our method first extracts histogram features from the low-frequency subband of the wavelet-transformed image and uses them to detect abrupt scene changes. Second, it extracts edge information by applying a mesh method to the high-frequency subband of the transformed image. The extracted edge information is quantified as per-pixel variance values, which are used to detect gradual scene changes. We also propose an algorithm for extracting a proper key frame from each segmented video scene. Experimental results show that the proposed method is very efficient at segmenting video frames and is an appropriate key frame extraction method.
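
The abrupt-cut detector described above can be sketched as follows: histogram features from the low-frequency (LL) wavelet subband, compared frame to frame. PyWavelets stands in for the paper's transform, and the bin count and threshold are illustrative assumptions.

```python
import numpy as np
import pywt

def ll_histogram(frame_gray: np.ndarray, bins: int = 64) -> np.ndarray:
    # One-level 2D Haar DWT; keep only the low-frequency LL subband.
    ll, _ = pywt.dwt2(frame_gray.astype(np.float64), "haar")
    hist, _ = np.histogram(ll, bins=bins, range=(0, 512))
    return hist / hist.sum()

def detect_cuts(frames, threshold: float = 0.4):
    cuts, prev = [], None
    for i, f in enumerate(frames):
        h = ll_histogram(f)
        # Half the L1 histogram distance; large jumps mark abrupt cuts.
        if prev is not None and 0.5 * np.abs(h - prev).sum() > threshold:
            cuts.append(i)   # abrupt scene change between frames i-1 and i
        prev = h
    return cuts
```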

Self-Positioning of a Mobile Robot using a Vision System and Image Overlay with VRML (비전 시스템을 이용한 이동로봇 Self-positioning과 VRML과의 영상오버레이)

  • Hyun, Kwon-Bang;To, Chong-Kil
    • Proceedings of the KIEE Conference / 2005.05a / pp.258-260 / 2005
  • We describe a method for localizing a mobile robot in its working environment using a vision system and VRML. The robot identifies landmarks in the environment and carries out self-positioning. Image-processing and neural network pattern matching techniques are employed to recognize landmarks placed in the robot's working environment. The self-positioning based on the vision system uses a well-known localization algorithm. After self-positioning, the 2D scene is overlaid with the VRML scene. This paper describes how to realize the self-positioning, shows the result of overlaying the 2D scene with the VRML scene, and describes the advantages expected from overlapping both scenes.
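
As a toy illustration of the kind of "well-known localization algorithm" such systems build on, the sketch below recovers a robot's 2D position from absolute bearings to two landmarks at known map coordinates; the landmark positions and bearings are made-up values, and the paper's actual algorithm may differ.

```python
import numpy as np

def locate(l1, l2, b1, b2):
    """Robot position from absolute bearings b1, b2 (radians) to
    landmarks at known positions l1, l2 (heading assumed known)."""
    d1 = -np.array([np.cos(b1), np.sin(b1)])   # ray back from landmark 1
    d2 = -np.array([np.cos(b2), np.sin(b2)])   # ray back from landmark 2
    # Intersect the two rays: l1 + t1*d1 = l2 + t2*d2.
    A = np.column_stack([d1, -d2])
    t = np.linalg.solve(A, np.asarray(l2, float) - np.asarray(l1, float))
    return np.asarray(l1, float) + t[0] * d1

# Toy usage: robot at (1, 1) observing landmarks at (4, 5) and (6, 1).
b1 = np.arctan2(5 - 1, 4 - 1)
b2 = np.arctan2(1 - 1, 6 - 1)
print(locate((4, 5), (6, 1), b1, b2))          # ~ [1. 1.]
```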
