• Title/Summary/Keyword: Spatial convolution

Combining 2D CNN and Bidirectional LSTM to Consider Spatio-Temporal Features in Crop Classification (작물 분류에서 시공간 특징을 고려하기 위한 2D CNN과 양방향 LSTM의 결합)

  • Kwak, Geun-Ho;Park, Min-Gyu;Park, Chan-Won;Lee, Kyung-Do;Na, Sang-Il;Ahn, Ho-Yong;Park, No-Wook
    • Korean Journal of Remote Sensing / v.35 no.5_1 / pp.681-692 / 2019
  • In this paper, a hybrid deep learning model, called 2D convolution with bidirectional long short-term memory (2DCBLSTM), is presented that can effectively combine both spatial and temporal features for crop classification. In the proposed model, 2D convolution operators are first applied to extract spatial features of crops, and the extracted spatial features are then used as inputs for a bidirectional LSTM model that can effectively process temporal features. To evaluate the classification performance of the proposed model, a case study of crop classification was carried out using multi-temporal unmanned aerial vehicle images acquired in Anbandegi, Korea. For comparison purposes, we applied conventional deep learning models including a two-dimensional convolutional neural network (CNN) using spatial features, an LSTM using temporal features, and a three-dimensional CNN using spatio-temporal features. An impact analysis of hyper-parameters on the classification performance showed that the use of both spatial and temporal features greatly reduced misclassification patterns of crops, and that the proposed hybrid model achieved the best classification accuracy compared to the conventional deep learning models that considered either spatial features or temporal features. Therefore, it is expected that the proposed model can be effectively applied to crop classification owing to its ability to consider spatio-temporal features of crops.

Reconstruction Method of Spatially Filtered 3D images in Integral Imaging based on Parallel Lens Array (병렬렌즈배열 기반의 집적영상에서 공간필터링된 3차원 영상 복원)

  • Jang, Jae-Young;Cho, Myungjin
    • Journal of the Korea Institute of Information and Communication Engineering / v.19 no.3 / pp.659-666 / 2015
  • In this paper, we propose a novel method for reconstructing spatially filtered 3D images in integral imaging based on a parallel lens array. The parallel lens array is composed of two lens arrays positioned side by side along the longitudinal direction. The conventional spatial filtering method, which uses the convolution property between periodic functions, has the drawback that the position of the target object is limited: the target object must be located in the low-depth-resolution region. The available spatial filtering region of the method depends on the focal length and the number of elemental lenses in the integral imaging pickup system. We therefore propose the parallel lens array system to enhance the available spatial filtering region and the depth resolution. The experimental results indicate that the proposed method outperforms the conventional method.

A Multi-Stage Convolution Machine with Scaling and Dilation for Human Pose Estimation

  • Nie, Yali;Lee, Jaehwan;Yoon, Sook;Park, Dong Sun
    • KSII Transactions on Internet and Information Systems (TIIS) / v.13 no.6 / pp.3182-3198 / 2019
  • Vision-based human pose estimation is considered a challenging research subject because of problems including confounding background clutter, the diversity of human appearances, and illumination changes in scenes. To tackle these problems, we propose a new multi-stage convolution machine for estimating human pose. To provide better heatmap prediction of body joints, the proposed machine repeatedly produces multiple predictions across stages, with a receptive field large enough to learn long-range spatial relationships. The stages are composed of various modules according to their strategic purposes. A pyramid stacking module and a dilation module are used to handle human poses at multiple scales. Their multi-scale information from different receptive fields is fused by concatenation, which captures more contextual information from different features. Spatial and channel information of a given input is converted to gating factors by squeezing each feature map to a single numeric value based on its importance, so that each network channel receives a different weight. In experiments on the standard LSP and MPII pose benchmarks, the proposed architecture achieved higher accuracy than other ConvNet-based architectures.
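
The channel gating described in this abstract can be illustrated with a small sketch. It is not the authors' implementation: the function name `channel_gate` is hypothetical, and the gate is assumed to be a plain sigmoid of each channel's global average, whereas squeeze-and-excitation style blocks normally pass the squeezed values through small learned layers first.

```python
import math


def channel_gate(feature_maps):
    """Scale each channel by a gate derived from its global average.

    feature_maps: list of channels, each a 2D list (rows of floats).
    Assumption for illustration: gate = sigmoid(channel mean), with no
    learned weights between the squeeze step and the gate.
    """
    gated, gates = [], []
    for channel in feature_maps:
        values = [v for row in channel for v in row]
        squeezed = sum(values) / len(values)      # squeeze: one number per channel
        gate = 1.0 / (1.0 + math.exp(-squeezed))  # gating factor in (0, 1)
        gates.append(gate)
        # Re-weight every position of the channel by its gate.
        gated.append([[v * gate for v in row] for row in channel])
    return gated, gates


if __name__ == "__main__":
    maps = [[[0.0, 0.0], [0.0, 0.0]], [[2.0, 2.0], [2.0, 2.0]]]
    _, gates = channel_gate(maps)
    print(gates)
```

Channels with a larger average activation receive a gate closer to 1, so they contribute more to later layers than channels with a small or negative average.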

DP-LinkNet: A convolutional network for historical document image binarization

  • Xiong, Wei;Jia, Xiuhong;Yang, Dichun;Ai, Meihui;Li, Lirong;Wang, Song
    • KSII Transactions on Internet and Information Systems (TIIS) / v.15 no.5 / pp.1778-1797 / 2021
  • Document image binarization is an important pre-processing step in document analysis and archiving. The state-of-the-art models for document image binarization are variants of encoder-decoder architectures, such as FCN (fully convolutional network) and U-Net. Despite their success, they still suffer from three limitations: (1) reduced feature map resolution due to consecutive strided pooling or convolutions, (2) multiple scales of target objects, and (3) reduced localization accuracy due to the built-in invariance of deep convolutional neural networks (DCNNs). To overcome these three challenges, we propose an improved semantic segmentation model, referred to as DP-LinkNet, which adopts the D-LinkNet architecture as its backbone, with the proposed hybrid dilated convolution (HDC) and spatial pyramid pooling (SPP) modules between the encoder and the decoder. Extensive experiments are conducted on recent document image binarization competition (DIBCO) and handwritten document image binarization competition (H-DIBCO) benchmark datasets. Results show that our proposed DP-LinkNet outperforms other state-of-the-art techniques by a large margin. Our implementation and the pre-trained models are available at https://github.com/beargolden/DP-LinkNet.
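
To make the role of dilated convolution in this abstract concrete, the sketch below shows how dilation enlarges the receptive field without pooling or striding. The function names and the dilation rates (1, 2, 4) are illustrative assumptions; the abstract does not state the rates used in the HDC module.

```python
def dilated_conv1d(signal, kernel, dilation):
    """Valid-mode 1D dilated convolution (cross-correlation, as in deep learning).

    Kernel taps are placed `dilation` samples apart, so a 3-tap kernel
    with dilation 2 spans 5 input samples.
    """
    span = dilation * (len(kernel) - 1)
    return [
        sum(kernel[j] * signal[i + j * dilation] for j in range(len(kernel)))
        for i in range(len(signal) - span)
    ]


def receptive_field(kernel_size, dilations):
    """Per-axis receptive field of stacked stride-1 dilated convolutions."""
    rf = 1
    for d in dilations:
        rf += d * (kernel_size - 1)  # each layer adds its dilated span
    return rf


if __name__ == "__main__":
    print(dilated_conv1d([1, 2, 3, 4, 5], [1, 0, -1], 2))
    print(receptive_field(3, [1, 1, 1]), receptive_field(3, [1, 2, 4]))
```

Three stacked 3-tap layers cover 7 samples per axis without dilation and 15 with the assumed rates, while the feature map resolution is unchanged. Mixing different rates, rather than repeating one rate, is the usual motivation for a hybrid scheme.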

Convolution and Deconvolution Algorithms for Large-Volume Cosmological Surveys

  • Park, KeunWoo;Rossi, Graziano
    • The Bulletin of The Korean Astronomical Society / v.40 no.2 / pp.50.4-51 / 2015
  • Current and planned deep multicolor wide-area cosmological surveys will map in detail the spatial distribution of galaxies and quasars over unprecedented volumes, and provide a number of objects with photometric redshifts more than an order of magnitude bigger than that of spectroscopic redshifts. Photometric information is statistically more significant for studying cosmological evolution, dark energy, and the expansion history of the universe at a fraction of the cost of a full spectroscopic survey, but intrinsically carries a bias due to noise in the distance estimates. We provide convolution- and deconvolution-based algorithms capable of removing this bias -- thus able to exploit the full cosmological information -- in order to reconstruct intrinsic distributions and correlations between distance-dependent quantities. We then show some direct applications of our techniques to the VIMOS Public Extragalactic Redshift Survey (VIPERS) and the Sloan Digital Sky Survey (SDSS) datasets. Our methods impact a broader range of studies, when at least one distance-dependent quantity is involved; hence, they will be useful for upcoming large-volume surveys, some of which will only have photometric information.

Scattering Characteristics of the Infinite Strip Conductor for TE Waves (무환히 긴 도체 스트립의 TE파 산란 특성)

  • Chang, Jae-Sung;Lee, Sang-Seol
    • Journal of the Korean Institute of Telematics and Electronics / v.26 no.5 / pp.18-22 / 1989
  • We calculate the distribution of the current induced on an infinite conducting strip by TE waves. The boundary equations expressed as spatial-domain functions become very complicated equations that include a convolution integral. Transforming them to the spectral domain yields a very simple equation expressed as an algebraic multiplication of the current density function and Green's function. It is shown that the computed induced current distribution gives the optimum value when the iteration stop condition presented in this paper is satisfied.
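
The step from a spatial-domain convolution to a spectral-domain multiplication is the convolution theorem. The sketch below checks it numerically on a small periodic grid. It is a generic illustration, not the paper's formulation: the lists `current` and `green` are made-up stand-ins for sampled current density and Green's function values, and circular (periodic) convolution is assumed in place of the continuous integral.

```python
import cmath


def dft(x, inverse=False):
    """Naive O(n^2) discrete Fourier transform (inverse includes the 1/n factor)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        s = sum(x[m] * cmath.exp(sign * 2j * cmath.pi * k * m / n) for m in range(n))
        out.append(s / n if inverse else s)
    return out


def circular_convolve(f, g):
    """Direct spatial-domain circular convolution of two equal-length lists."""
    n = len(f)
    return [sum(f[m] * g[(i - m) % n] for m in range(n)) for i in range(n)]


def convolve_via_spectrum(f, g):
    """Same result via pointwise multiplication of the two spectra."""
    product = [a * b for a, b in zip(dft(f), dft(g))]
    return [v.real for v in dft(product, inverse=True)]


if __name__ == "__main__":
    current = [1.0, 2.0, 3.0, 4.0]   # hypothetical sampled current density
    green = [0.5, 0.25, 0.0, 0.25]   # hypothetical sampled Green's function
    print(circular_convolve(current, green))
    print(convolve_via_spectrum(current, green))
```

Both routes give the same values up to rounding, which is why an equation containing a convolution integral reduces to an algebraic product after the transform.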

Scattering Characteristics of The Infinite Strip Conductor for TM Waves (무한히 긴 도체 스트립의 TM파 산란 특성)

  • Chang, Jae-Sung;Lee, Sang-Seol
    • The Journal of Korean Institute of Communications and Information Sciences / v.13 no.5 / pp.437-443 / 1988
  • We calculate the distribution of the current induced on an infinite conducting strip line by incident waves. The boundary equations expressed as spatial-domain functions become very complicated equations that include a convolution integral. Transforming them to the spectral domain yields a very simple equation composed of an algebraic multiplication of the current density function and Green's function. The iteration procedure is accelerated by Kastner's method. The result of the iteration gives the optimum value when it satisfies the iteration stop condition presented in this paper. We confirmed that the induced current density distribution on the strip line changes as the width varies.

Remaining Useful Life Prediction for Litium-Ion Batteries Using EMD-CNN-LSTM Hybrid Method (EMD-CNN-LSTM을 이용한 하이브리드 방식의 리튬 이온 배터리 잔여 수명 예측)

  • Lim, Je-Yeong;Kim, Dong-Hwan;Noh, Tae-Won;Lee, Byoung-Kuk
    • The Transactions of the Korean Institute of Power Electronics / v.27 no.1 / pp.48-55 / 2022
  • This paper proposes a battery remaining useful life (RUL) prediction method using a deep learning-based EMD-CNN-LSTM hybrid method. The proposed method pre-processes capacity data by applying empirical mode decomposition (EMD) and predicts the remaining useful life using CNN-LSTM. CNN-LSTM is a hybrid method that combines a convolutional neural network (CNN), which analyzes spatial features, with long short-term memory (LSTM), a deep learning technique for analyzing time series data. The performance of the proposed remaining useful life prediction method is verified using the battery aging experiment data provided by the NASA Ames Prognostics Center of Excellence, and the method shows higher accuracy than the conventional method.

Compression Artifact Reduction for 360-degree Images using Reference-based Deformable Convolutional Neural Network

  • Kim, Hee-Jae;Kang, Je-Won;Lee, Byung-Uk
    • Proceedings of the Korean Society of Broadcast Engineers Conference / fall / pp.41-44 / 2021
  • In this paper, we propose an efficient reference-based compression artifact reduction network for 360-degree images in the equi-rectangular projection (ERP) domain. In our view, conventional image restoration methods cannot be applied directly to 360-degree images because of the spherical distortion. To address this problem, we propose an adaptive disparity estimator that uses a deformable convolution to exploit the correlation among 360-degree images. With the help of the proposed convolution, the disparity estimator successfully establishes the spatial correspondence between the ERPs and extracts matched textures to be used for image restoration. The experimental results demonstrate that the proposed algorithm provides reliable high-quality textures from the reference and improves the quality of the restored image compared to state-of-the-art single image restoration methods.

A Novel RGB Channel Assimilation for Hyperspectral Image Classification using 3D-Convolutional Neural Network with Bi-Long Short-Term Memory

  • M. Preethi;C. Velayutham;S. Arumugaperumal
    • International Journal of Computer Science & Network Security / v.23 no.3 / pp.177-186 / 2023
  • Hyperspectral imaging is one of the most efficient and fastest-growing technologies of recent years. A hyperspectral image (HSI) comprises contiguous spectral bands for every pixel, which are used to detect objects with significant accuracy and detail. Because an HSI contains high-dimensional spectral information, classifying every pixel is not easy. To address this problem, we propose a novel RGB channel assimilation for classification methods. The color features are extracted using chromaticity computation. In addition, this work discusses the classification of hyperspectral images based on a Domain Transform Interpolated Convolution Filter (DTICF) and a 3D-CNN with Bi-directional Long Short-Term Memory (Bi-LSTM). The proposed technique has three steps. First, the HSI data are converted to RGB images with spatial features. Before the DTICF is applied, the RGB images of the HSI and a patch of the input image from the raw HSI are integrated. Afterward, the paired spectral and spatial features are extracted from the integrated HSI using the DTICF. The obtained spatial and spectral features are finally fed into the designed 3D-CNN with Bi-LSTM framework. In the second step, the extracted color features are classified by a 2D-CNN. The probabilistic classification maps of the 3D-CNN-Bi-LSTM and the 2D-CNN are fused. In the last step, a Markov Random Field (MRF) is used to refine the fused probabilistic classification map. Experimental results on two different hyperspectral images show that the novel RGB channel assimilation of the DTICF-3D-CNN-Bi-LSTM approach is effective and provides good classification results compared to other classification approaches.