• Title/Summary/Keyword: Pixel representation

Autism Spectrum Disorder Recognition with Deep Learning

  • Shin, Jongmin;Choi, Jinwoo
    • Proceedings of the Korean Society of Broadcast Engineers Conference / 2022.06a / pp.1268-1271 / 2022
  • Because touch-screen devices are now common, it is easy to draw sketches anywhere and save them in vector form. Current research on sketch understanding works with coordinate-sequence data and adopts sequential models to learn sketch representations, and it has become customary for sketch datasets to be in vector coordinate format. Moreover, the popular datasets do not cover real-life sketches, that is, sketches drawn with pencil or pen on paper. Art psychology uses real-life sketches to analyze patients. ETRI presents a unique sketch dataset in pixel format for sketch-based recognition of autism spectrum disorder. We present a method to formulate the dataset for better generalization of sketch data. Through experiments, we show that pixel-based models can achieve good performance.

Noise-free Distributions Comparison of Bayesian Wavelet Threshold for Image Denoise

  • Choi, Ilsu;Rhee, Sung-Suk;Ahn, Yunkee
    • Communications for Statistical Applications and Methods / v.8 no.2 / pp.573-579 / 2001
  • Wavelet thresholding is a method for the reduction of noise in images. The wavelet coefficients of an image are correlated in local characterization. These correlations also appear in the original pixel representation of the image, and they do not follow from the characterizations of the wavelet transform. In this paper, we compare noise-free distributions of the Bayes approach to improve the classical threshold algorithm.

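
As a rough illustration of the classical wavelet thresholding that the abstract above takes as its starting point (not the Bayesian variant the paper compares), the following sketch applies one level of an orthonormal Haar transform to a row of pixel values, soft-thresholds the detail coefficients, and inverts the transform. The choice of the Haar wavelet, the function names, and the threshold value are assumptions made for illustration only.

```python
import math

SQRT2 = math.sqrt(2.0)

def haar_forward(signal):
    """One-level orthonormal Haar transform of an even-length sequence."""
    approx = [(signal[i] + signal[i + 1]) / SQRT2 for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / SQRT2 for i in range(0, len(signal), 2)]
    return approx, detail

def haar_inverse(approx, detail):
    """Invert haar_forward, interleaving the reconstructed sample pairs."""
    out = []
    for a, d in zip(approx, detail):
        out.append((a + d) / SQRT2)
        out.append((a - d) / SQRT2)
    return out

def soft_threshold(coeffs, t):
    """Shrink each coefficient toward zero by t; small ones become zero."""
    return [math.copysign(max(abs(c) - t, 0.0), c) for c in coeffs]

def denoise_row(row, t):
    """Threshold only the detail coefficients, then reconstruct the row."""
    approx, detail = haar_forward(row)
    return haar_inverse(approx, soft_threshold(detail, t))
```

With a threshold of zero the row is reconstructed unchanged; a larger threshold smooths small differences between neighboring pixels while leaving large jumps (edges) mostly intact.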

Automatic Lipreading Based on Image Transform and HMM (이미지 변환과 HMM에 기반한 자동 립리딩)

  • 김진범;김진영
    • Proceedings of the IEEK Conference / 1999.11a / pp.585-588 / 1999
  • This paper presents experimental results on visual-only recognition tasks using an image transform approach and an HMM-based recognition system. There are two approaches to extracting features for lipreading: a lip contour based approach and an image transform based one. The latter obtains a compressed representation of the image pixel values that contain the speaker's mouth, and results in superior lipreading performance. In addition, PCA (principal component analysis) is used for a fast algorithm. Finally, the HMM recognition tasks are compared with one another.

Exploring Optimal Threshold of RGB Pixel Values to Extract Road Features from Google Earth (Google Earth에서 도로 추출을 위한 RGB 화소값 최적구간 추적)

  • Park, Jae-Young;Um, Jung-Sup
    • Journal of Korea Spatial Information System Society / v.12 no.1 / pp.66-75 / 2010
  • The authors argue that the current road updating system, based on traditional aerial photographs or multi-spectral satellite images, appears to be user-unfriendly owing to the lack of frequent cartographic representation of new construction sites. Google Earth is emerging as an important source for extracting road features, since RGB satellite images with high multi-temporal resolution can be accessed freely over large areas. This paper is primarily intended to evaluate the optimal threshold of RGB pixel values for extracting road features from Google Earth. An empirical study of five experimental sites was conducted to confirm how an RGB picture provided by Google Earth can be used to extract road features. The results indicate that the optimal threshold of RGB pixel values for extracting road features was identified as 126, 125, 127 for manual operation, which corresponds to 25%, 30%, 19%. It was also found that differences in the display scale of Google Earth had little influence on tracking the required RGB pixel values. As a result, the 61 cm resolution Quickbird RGB data showed the potential to realistically identify the major types of road features with large-scale spatial precision, while the typical algorithm successfully revealed the area-wide optimal threshold of RGB pixel values for the roads in the study area.
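
A per-channel threshold of the kind reported above can be sketched as follows. The abstract does not state the channel order or the direction of the comparison, so this sketch assumes the values 126, 125, 127 are in (R, G, B) order and that a pixel counts as a road candidate when every channel is at or above its threshold. The function names are hypothetical.

```python
# Thresholds reported in the abstract, ASSUMED here to be in (R, G, B) order.
THRESHOLDS = (126, 125, 127)

def is_road_candidate(pixel, thresholds=THRESHOLDS):
    """ASSUMED rule: every channel must be at or above its threshold."""
    return all(value >= t for value, t in zip(pixel, thresholds))

def road_mask(image, thresholds=THRESHOLDS):
    """Map rows of (R, G, B) tuples to rows of 0/1 road-candidate flags."""
    return [
        [1 if is_road_candidate(pixel, thresholds) else 0 for pixel in row]
        for row in image
    ]
```

For example, `road_mask([[(130, 130, 130), (100, 200, 200)]])` flags only the first pixel, because the red channel of the second falls below 126.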

CRF-Based Figure/Ground Segmentation with Pixel-Level Sparse Coding and Neighborhood Interactions

  • Zhang, Lihe;Piao, Yongri
    • Journal of information and communication convergence engineering / v.13 no.3 / pp.205-214 / 2015
  • In this paper, we propose a new approach to learning a discriminative model for figure/ground segmentation by incorporating the bag-of-features and conditional random field (CRF) techniques. We advocate the use of image patches instead of superpixels as the basic processing unit. The latter has a homogeneous appearance and adheres to object boundaries, while an image patch often contains more discriminative information (e.g., local image structure) to distinguish its categories. We use pixel-level sparse coding to represent an image patch. With the proposed feature representation, the unary classifier achieves a considerable binary segmentation performance. Further, we integrate unary and pairwise potentials into the CRF model to refine the segmentation results. The pairwise potentials include color and texture potentials with neighborhood interactions, and an edge potential. High segmentation accuracy is demonstrated on three benchmark datasets: the Weizmann horse dataset, the VOC2006 cow dataset, and the MSRC multiclass dataset. Extensive experiments show that the proposed approach performs favorably against the state-of-the-art approaches.

Survey on Deep Learning-based Panoptic Segmentation Methods (딥 러닝 기반의 팬옵틱 분할 기법 분석)

  • Kwon, Jung Eun;Cho, Sung In
    • IEMEK Journal of Embedded Systems and Applications / v.16 no.5 / pp.209-214 / 2021
  • Panoptic segmentation, which is now widely used in computer vision applications such as medical image analysis and autonomous driving, helps in understanding an image from a holistic view. It identifies each pixel by assigning a unique class ID and an instance ID. Specifically, it can distinguish 'thing' from 'stuff' and provide pixel-wise results of semantic prediction and object detection. As a result, it can solve both semantic segmentation and instance segmentation tasks through a single unified model, producing two different contexts for the two segmentation tasks. The semantic segmentation task focuses on how to obtain multi-scale features from a large receptive field without losing low-level features. The instance segmentation task, on the other hand, focuses on how to separate 'thing' from 'stuff' and how to produce the representation of detected objects. With the advances in both segmentation techniques, several panoptic segmentation models have been proposed. Many researchers try to resolve discrepancies between the results of the two segmentation branches, which can arise at object boundaries. In this survey paper, we introduce the concept of panoptic segmentation, categorize the existing methods into two representative groups, top-down and bottom-up, and explain how each operates. Then, we analyze the performance of various methods with experimental results.
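
The per-pixel pairing of a class ID and an instance ID described above can be illustrated with a small sketch that packs both into one integer per pixel and splits a panoptic map back into semantic and instance maps. The packing rule `class_id * 1000 + instance_id`, the convention that 'stuff' pixels carry instance ID 0, and all names below are assumptions for illustration, not something taken from the survey.

```python
LABEL_DIVISOR = 1000  # assumed; must exceed the largest instance ID per class

def encode(class_id, instance_id=0):
    """Pack a class ID and an instance ID (0 for 'stuff') into one integer."""
    if not 0 <= instance_id < LABEL_DIVISOR:
        raise ValueError("instance_id out of range")
    return class_id * LABEL_DIVISOR + instance_id

def decode(panoptic_id):
    """Return (class_id, instance_id) for a packed panoptic ID."""
    return divmod(panoptic_id, LABEL_DIVISOR)

def split_maps(panoptic_map):
    """Split a 2-D panoptic map into a semantic map and an instance map."""
    semantic = [[decode(p)[0] for p in row] for row in panoptic_map]
    instance = [[decode(p)[1] for p in row] for row in panoptic_map]
    return semantic, instance
```

Two pixels with the same class ID but different instance IDs belong to different objects of the same 'thing' category, which is the distinction semantic segmentation alone cannot express.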

A Study on the Hierarchical Representation of Images: An Efficient Representation of Quadtrees BF Linear Quadtree (화상의 구조적 표현에 관한 연구- 4진트리의 효율적인 표현법:BF선형 4진트)

  • Kim, Min-Hwan;Han, Sang-Ho;Hwang, Hee-Yeung
    • The Transactions of the Korean Institute of Electrical Engineers / v.37 no.7 / pp.498-509 / 1988
  • A BF (breadth-first) linear quadtree is suggested as a new data structure for image data; it enables the image data to be compressed efficiently and the compressed data to be operated on easily. Like the linear quadtree, it is a list of path names for black nodes. The path name for each black node of a BF linear quadtree is represented as a sequence of path codes from the root node to the node itself, whereas that of the linear quadtree is a sequence of path codes from the root node to the node itself plus fill characters for the cut-off path from it to any n-level node, which corresponds to a pixel of the image. The BF linear quadtree provides a more efficient compression ratio than the linear quadtree, because it does not require the redundant fill characters for the cut-off paths. Several image processing operations can also be implemented efficiently on this hierarchical structure because, like the linear quadtree, it is composed of only the black nodes. In this paper, algorithms for several operations on the BF linear quadtree are defined and analyzed. Experimental results for four images are also given and discussed.
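
The idea of listing only black nodes by their path names, without fill characters, can be sketched as below. This is not the paper's exact encoding: the sketch simply walks a region quadtree breadth-first over a 2^n x 2^n binary image and records the path code sequence of every all-black block. The quadrant numbering (0 = NW, 1 = NE, 2 = SW, 3 = SE) and the function name are assumptions.

```python
from collections import deque

def black_node_paths(image):
    """Return path names of all-black quadtree nodes in breadth-first order.

    image: square list of rows of 0/1 whose side is a power of two (1 = black).
    """
    paths = []
    queue = deque([("", 0, 0, len(image))])  # (path, top row, left col, side)
    while queue:
        path, r0, c0, n = queue.popleft()
        cells = [image[r][c] for r in range(r0, r0 + n) for c in range(c0, c0 + n)]
        if all(cells):
            # Uniform black block: store its path as-is, with no fill characters.
            paths.append(path)
        elif any(cells) and n > 1:
            # Mixed block: subdivide. Quadrants 0..3 = NW, NE, SW, SE (assumed).
            h = n // 2
            for code, (dr, dc) in enumerate([(0, 0), (0, h), (h, 0), (h, h)]):
                queue.append((path + str(code), r0 + dr, c0 + dc, h))
    return paths
```

Because the traversal is breadth-first, shorter path names (larger black blocks) always precede longer ones, and a path's length directly gives the level of its node.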

New Cellular Neural Networks Template for Image Halftoning based on Bayesian Rough Sets

  • Elsayed Radwan;Basem Y. Alkazemi;Ahmed I. Sharaf
    • International Journal of Computer Science & Network Security / v.23 no.4 / pp.85-94 / 2023
  • Image halftoning is a technique for converting grayscale images into two-tone binary images. Unfortunately, the static representation of image halftoning, in which each pixel intensity is combined with its local neighbors only, causes the missing subjective problem. In addition, existing noise causes an instability criterion. In this paper, image halftoning is represented as a dynamical system in order to recognize the global representation, and noise is reduced based on a probabilistic model. Since image halftoning is considered as a 2-D matrix with a fully connected pass, this structure is recognized by the dynamical system of Cellular Neural Networks (CNNs), which is defined by its template. Bayesian rough sets are used to find the ideal CNN construction that synthesizes its dynamics. Bayesian rough sets also help to enhance the quality of the halftone image by removing noise and discovering the effective parameters in the CNN template. The novelty of this method lies in finding a probabilistic technique to discover the terms of the CNN template and to define new learning rules for the internal workings of CNNs. A numerical experiment is conducted on image halftoning corrupted by Gaussian noise.

Displacement Mapping for the Precise Representation of Protrusion (정확한 돌출 형상의 표현을 위한 변위매핑)

  • Yoo, Byoung-Hyun;Han, Soon-Hung
    • Journal of KIISE:Computer Systems and Theory / v.33 no.10 / pp.777-788 / 2006
  • This paper describes a displacement mapping technique that represents protruded shapes on the surface of an object. Previous approaches to image-based displacement mapping can represent only shapes depressed from the polygon surface. The proposed technique can represent shapes protruding from the underlying surface in real time. Two auxiliary surfaces perpendicular to the underlying surface are added along the boundary of the polygon surface in order to represent the pixels that overflow the boundary of the polygon surface. The proposed approach can represent an accurate silhouette of the protruded shape. It can represent not only smooth displacement of the protruded shape but also abrupt displacement, such as perpendicular protrusion, by adding supplementary texture information to the steep surface of the protruded shape. Using per-pixel instructions on the programmable GPU, this approach can be executed in real time. It provides an effective solution for representing protruded shapes such as high-rise buildings on the ground.

Super Resolution by Learning Sparse-Neighbor Image Representation (Sparse-Neighbor 영상 표현 학습에 의한 초해상도)

  • Eum, Kyoung-Bae;Choi, Young-Hee;Lee, Jong-Chan
    • Journal of the Korea Institute of Information and Communication Engineering / v.18 no.12 / pp.2946-2952 / 2014
  • Among example-based super resolution (SR) techniques, neighbor embedding (NE) was inspired by manifold learning methods, particularly locally linear embedding. However, the poor generalization of NE degrades the performance of the algorithm, and the local training sets are always too small to improve it. To solve this problem, we propose learning a sparse-neighbor image representation based on SVR, which has excellent generalization ability. Given a low resolution image, we first use bicubic interpolation to synthesize its high resolution version. We extract patches from this synthesized image and determine whether each patch corresponds to a region with high or low spatial frequencies. After the weight of each patch is obtained by our method, we use it to learn separate SVR models. Finally, we update the pixel values using the previously learned SVRs. Through experimental results, we confirm quantitatively and qualitatively that the proposed algorithm improves on conventional interpolation methods and NE.