• Title/Summary/Keyword: Convolutional encoder

DP-LinkNet: A convolutional network for historical document image binarization

  • Xiong, Wei;Jia, Xiuhong;Yang, Dichun;Ai, Meihui;Li, Lirong;Wang, Song
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.15 no.5
    • /
    • pp.1778-1797
    • /
    • 2021
  • Document image binarization is an important pre-processing step in document analysis and archiving. The state-of-the-art models for document image binarization are variants of encoder-decoder architectures, such as FCN (fully convolutional network) and U-Net. Despite their success, they still suffer from three limitations: (1) reduced feature map resolution due to consecutive strided pooling or convolutions, (2) multiple scales of target objects, and (3) reduced localization accuracy due to the built-in invariance of deep convolutional neural networks (DCNNs). To overcome these three challenges, we propose an improved semantic segmentation model, referred to as DP-LinkNet, which adopts the D-LinkNet architecture as its backbone, with the proposed hybrid dilated convolution (HDC) and spatial pyramid pooling (SPP) modules between the encoder and the decoder. Extensive experiments are conducted on recent document image binarization competition (DIBCO) and handwritten document image binarization competition (H-DIBCO) benchmark datasets. Results show that our proposed DP-LinkNet outperforms other state-of-the-art techniques by a large margin. Our implementation and the pre-trained models are available at https://github.com/beargolden/DP-LinkNet.
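
The first limitation named in the abstract, lost resolution from repeated striding, is the usual motivation for dilated convolution. The sketch below is not the paper's HDC module; it only illustrates the general property that stacking stride-1 convolutions with growing dilation widens the receptive field without any downsampling. The function names and the dilation rates (1, 2, 4, 8) are assumptions chosen for illustration, since the abstract does not state them.

```python
def receptive_field(kernel_size, dilations):
    """Receptive field, along one axis, of stride-1 convolutions stacked in series."""
    rf = 1
    for d in dilations:
        # Each layer adds (kernel_size - 1) * dilation samples of context.
        rf += (kernel_size - 1) * d
    return rf


def dilated_conv1d(signal, kernel, dilation):
    """'Valid' 1-D dilated convolution in cross-correlation form (no kernel flip)."""
    span = (len(kernel) - 1) * dilation
    return [
        sum(kernel[j] * signal[i + j * dilation] for j in range(len(kernel)))
        for i in range(len(signal) - span)
    ]


# Four plain 3-tap layers versus four layers with assumed dilations 1, 2, 4, 8.
plain = receptive_field(3, [1, 1, 1, 1])
dilated = receptive_field(3, [1, 2, 4, 8])
print(plain, dilated)
```

Both stacks have the same depth and parameter count per layer; only the spacing between kernel taps differs.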

Convolutional auto-encoder based multiple description coding network

  • Meng, Lili;Li, Hongfei;Zhang, Jia;Tan, Yanyan;Ren, Yuwei;Zhang, Huaxiang
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.14 no.4
    • /
    • pp.1689-1703
    • /
    • 2020
  • When data is transmitted over an unreliable channel, packet errors may cause serious degradation. Multiple description coding (MDC) can address this problem and save transmission costs. In this paper, we propose a deep multiple description coding network (MDCN) to realize efficient image compression. First, our network framework is based on a convolutional auto-encoder (CAE), which includes a multiple description encoder network (MDEN) and a multiple description decoder network (MDDN). Second, in order to obtain high-quality reconstructed images at low bit rates, the encoding network and decoding network are integrated into an end-to-end compression framework. Third, the multiple description decoder network includes a side decoder network and a central decoder network. When the decoder receives only one of the two multiple description code streams, the side decoder network is used to obtain a side reconstructed image of acceptable quality. When both descriptions are received, a high-quality reconstructed image is obtained. In addition, additive uniform noise is used in place of quantization, and SSIM loss and distance loss are combined to train the multiple description encoder networks so that they share structural information. Experimental results show that the proposed framework performs better than traditional multiple description coding methods.

Performance Improvement of Multi-Code CDMA Systems Using Bi-Orthogonal Modulation (Bi-Orthogonal 변조를 이용한 Multi-Code CDMA 시스템의 성능 개선)

  • 한재광;신요안
    • Proceedings of the IEEK Conference
    • /
    • 2000.06a
    • /
    • pp.29-32
    • /
    • 2000
  • In this paper, we present an extension of multi-code CDMA (code division multiple access) systems based on bi-orthogonal modulation that employs a convolutional encoder and an interleaver before serial-to-parallel conversion in the modulator. Bandwidth expansion by the convolutional encoder can be compensated for by the bi-orthogonal modulation, and the interleaver scrambles the convolutionally encoded data bits so that, after serial-to-parallel conversion, each code channel conveys bits that are far apart in time. As a result, the proposed system, with several orders of magnitude less implementation complexity, achieves performance quite close to that of conventional systems composed of Walsh modulation and multiple convolutional encoders and interleavers in all the code channels.
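
The interleaving step described above can be sketched with a simple row-column block interleaver followed by serial-to-parallel conversion. This is a minimal illustration, not the paper's design: the interleaver type, its 3 x 4 dimensions, and the choice to hand each code channel one consecutive group of bits are all assumptions made here for clarity.

```python
def block_interleave(bits, rows, cols):
    """Write bits row by row into a rows x cols array, then read column by column."""
    if len(bits) != rows * cols:
        raise ValueError("input length must equal rows * cols")
    return [bits[r * cols + c] for c in range(cols) for r in range(rows)]


def block_deinterleave(bits, rows, cols):
    """Inverse of block_interleave."""
    if len(bits) != rows * cols:
        raise ValueError("input length must equal rows * cols")
    out = [None] * (rows * cols)
    i = 0
    for c in range(cols):
        for r in range(rows):
            out[r * cols + c] = bits[i]
            i += 1
    return out


def serial_to_parallel(bits, group):
    """Split the serial stream into consecutive groups, one group per code channel (assumed mapping)."""
    return [bits[i:i + group] for i in range(0, len(bits), group)]


# Use positions instead of bit values so the reordering is visible.
positions = list(range(12))
interleaved = block_interleave(positions, rows=3, cols=4)
channels = serial_to_parallel(interleaved, group=3)
print(channels)
```

With these assumed dimensions, the bits that share a code channel were originally four positions apart, so a burst that corrupts one channel is spread out again after deinterleaving.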

Agglomerative Hierarchical Clustering Analysis with Deep Convolutional Autoencoders (합성곱 오토인코더 기반의 응집형 계층적 군집 분석)

  • Park, Nojin;Ko, Hanseok
    • Journal of Korea Multimedia Society
    • /
    • v.23 no.1
    • /
    • pp.1-7
    • /
    • 2020
  • Clustering methods essentially take a two-step approach: extracting feature vectors for dimensionality reduction and then applying a clustering algorithm to the extracted feature vectors. However, for clustering images, traditional methods such as stacked auto-encoder based k-means are not effective because they tend to ignore local information. In this paper, we propose a method that first reduces data dimensionality effectively using a convolutional auto-encoder, which captures and reflects local information, and then accurately clusters similar data samples using a hierarchical clustering approach. The experimental results confirm that the proposed model improves the clustering results in terms of clustering accuracy and normalized mutual information.

The Construction and Viterbi Decoding of New (2k, k, l) Convolutional Codes

  • Peng, Wanquan;Zhang, Chengchang
    • Journal of Information Processing Systems
    • /
    • v.10 no.1
    • /
    • pp.69-80
    • /
    • 2014
  • The free distance of (n, k, l) convolutional codes has some connection with the memory length, which depends not only on l but also on k. To efficiently obtain a large memory length, we have constructed a new class of (2k, k, l) convolutional codes from (2k, k) block codes and (2, 1, l) convolutional codes, and its encoder and generation function are also given in this paper. With the help of some matrix modules, we designed a single-structure Viterbi decoder with a parallel capability, obtained a unified and efficient decoding model for (2k, k, l) convolutional codes, and then describe the decoding process in detail. By observing the survivor path memory in a matrix viewer, and testing the role of the max module, we implemented a simulation with (2k, k, l) convolutional codes. The results show that many of them are better than conventional (2, 1, l) convolutional codes.
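
For orientation, the conventional (2, 1, l) baseline mentioned in the abstract can be sketched as a small encoder with a hard-decision Viterbi decoder. This is not the paper's (2k, k, l) construction or its matrix-module decoder; the generator pair (7, 5) in octal with memory 2 is a textbook choice assumed here, and the function names are illustrative.

```python
def parity(x):
    """Return the XOR of all bits of x."""
    return bin(x).count("1") & 1


def encode(bits, gens=(0b111, 0b101), m=2):
    """Rate-1/2 convolutional encoder; appends m zero tail bits to return to state 0."""
    state, out = 0, []
    for b in list(bits) + [0] * m:
        reg = (b << m) | state          # current bit in the MSB, older bits below it
        out.extend(parity(reg & g) for g in gens)
        state = reg >> 1
    return out


def viterbi_decode(received, gens=(0b111, 0b101), m=2):
    """Hard-decision Viterbi decoder using Hamming distance as the branch metric."""
    n, n_states, inf = len(gens), 1 << m, float("inf")
    metrics = [0] + [inf] * (n_states - 1)   # the encoder starts in state 0
    paths = [[] for _ in range(n_states)]
    for t in range(0, len(received), n):
        symbol = received[t:t + n]
        new_metrics = [inf] * n_states
        new_paths = [None] * n_states
        for state in range(n_states):
            if metrics[state] == inf:
                continue
            for b in (0, 1):
                reg = (b << m) | state
                expected = [parity(reg & g) for g in gens]
                cost = metrics[state] + sum(e != r for e, r in zip(expected, symbol))
                nxt = reg >> 1
                if cost < new_metrics[nxt]:      # keep only the survivor path
                    new_metrics[nxt] = cost
                    new_paths[nxt] = paths[state] + [b]
        metrics, paths = new_metrics, new_paths
    return paths[0][:-m]                         # terminated in state 0; drop the tail bits


message = [1, 0, 1, 1, 0, 0, 1]
coded = encode(message)
coded[3] ^= 1                                    # inject a single channel error
print(viterbi_decode(coded) == message)
```

Each survivor is stored as a full bit list for readability; a practical decoder would use a traceback memory instead.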

Anomaly Detection In Real Power Plant Vibration Data by MSCRED Base Model Improved By Subset Sampling Validation (Subset 샘플링 검증 기법을 활용한 MSCRED 모델 기반 발전소 진동 데이터의 이상 진단)

  • Hong, Su-Woong;Kwon, Jang-Woo
    • Journal of Convergence for Information Technology
    • /
    • v.12 no.1
    • /
    • pp.31-38
    • /
    • 2022
  • This paper applies MSCRED (Multi-Scale Convolutional Recurrent Encoder-Decoder), an expert-independent, unsupervised neural network model for multivariate time series analysis. Because MSCRED is based on an auto-encoder model, it requires that the training data not be contaminated; to overcome this limitation, the paper uses a training-data sampling technique called Subset Sampling Validation. Using labeled vibration data from power plant equipment, the classification performance of MSCRED is evaluated with the anomaly score in several cases, including (1) when abnormal data is mixed into the training data and (2) when the abnormal data is removed from the training data of case 1. On this basis, the paper presents an expert-independent anomaly diagnosis framework that is robust to erroneous data, and offers a concise and accurate solution for various fields that use multivariate time series data.

Efficient CT Image Denoising Using Deformable Convolutional AutoEncoder Model

  • Eon Seung, Seong;Seong Hyun, Han;Ji Hye, Heo;Dong Hoon, Lim
    • Journal of the Korea Society of Computer and Information
    • /
    • v.28 no.3
    • /
    • pp.25-33
    • /
    • 2023
  • Noise generated during the acquisition and transmission of CT images degrades image quality, so noise removal is an important preprocessing step in image processing. In this paper, we remove noise using a deformable convolutional autoencoder (DeCAE) model, in which the deformable convolution operation replaces the conventional convolution operation of the convolutional autoencoder (CAE) model. The deformable convolution operation can extract image features from a more flexible area than the conventional convolution operation. The proposed DeCAE model has the same encoder-decoder structure as the existing CAE model, but the encoder is composed of deformable convolutional layers and the decoder is composed of conventional convolutional layers for efficient noise removal. To evaluate the performance of the proposed DeCAE model, experiments were conducted on CT images corrupted by Gaussian noise, impulse noise, and Poisson noise. The experiments show that the DeCAE model outperforms the traditional filters (the mean filter, median filter, bilateral filter, and NL-means method) as well as the existing CAE model, both qualitatively and in the quantitative measures MAE (Mean Absolute Error), PSNR (Peak Signal-to-Noise Ratio), and SSIM (Structural Similarity Index Measure).
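
Two of the quantitative measures named in the abstract have short closed forms: MAE is the mean of |x_i - y_i|, and PSNR is 10 log10(MAX^2 / MSE). The sketch below computes both for images given as flat lists of pixel values. The 8-bit peak value of 255 is an assumption made here; CT data often uses a different dynamic range, and the abstract does not state which one the authors used.

```python
import math


def mae(reference, test):
    """Mean absolute error between two equally sized pixel sequences."""
    return sum(abs(a - b) for a, b in zip(reference, test)) / len(reference)


def psnr(reference, test, max_value=255.0):
    """Peak signal-to-noise ratio in dB; max_value is the assumed peak pixel value."""
    mse = sum((a - b) ** 2 for a, b in zip(reference, test)) / len(reference)
    if mse == 0:
        return math.inf          # identical images
    return 10 * math.log10(max_value ** 2 / mse)


clean = [52, 55, 61, 59]
denoised = [54, 55, 60, 57]
print(mae(clean, denoised), psnr(clean, denoised))
```

Lower MAE and higher PSNR both indicate a denoised image closer to the reference.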

Performance Evaluation of WLAN IEEE 802.11a Convolutional Encoder according to Puncturing Pattern (WLAN IEEE 802.11a Convolutional 부호기의 Puncturing Pattern에 따른 성능 분석)

  • 조영규;정차근
    • Proceedings of the Korean Society of Broadcast Engineers Conference
    • /
    • 2002.11a
    • /
    • pp.21-24
    • /
    • 2002
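  • This paper analyzes the performance of the WLAN IEEE 802.11a convolutional encoder, standardized for broadband wireless data services, according to its puncturing pattern, and presents a simplified implementation method. To raise the information rate within a limited frequency band without changing the structure of the convolutional encoder, puncturing, which omits one bit from the encoder output stream at regular bit intervals, is commonly used. IEEE 802.11a WLAN channel coding likewise adopts puncturing to obtain higher code rates such as 2/3 and 3/4 from a convolutional encoder with code rate 1/2 and constraint length 7. To simplify the encoder structure, this paper examines the coding performance for different puncturing pattern structures at the punctured code rates of 2/3 and 3/4, which WLAN IEEE 802.11a derives from the base code rate of 1/2 depending on the transmission rate, so that systems at these code rates can be implemented with a single hardware structure. The simulations model an additive white Gaussian noise channel and present the coding performance for each WLAN IEEE 802.11a puncturing pattern under BPSK and QPSK modulation.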
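
The puncturing idea in this abstract can be sketched as a rate-1/2, constraint-length-7 encoder whose output passes through a keep/drop mask. The generator polynomials 133 and 171 (octal) and the masks below are the values commonly quoted for IEEE 802.11a; they are not given in the abstract, so treat them as assumptions and check them against the standard before relying on them. The function names are illustrative.

```python
# Keep masks per output stream (A, B), repeated periodically over the input bits.
# Assumed to follow the commonly quoted IEEE 802.11a patterns.
PUNCTURE = {
    "1/2": ((1,), (1,)),
    "2/3": ((1, 1), (1, 0)),
    "3/4": ((1, 1, 0), (1, 0, 1)),
}


def conv_encode(bits, gens=(0o133, 0o171), constraint_length=7):
    """Rate-1/2 convolutional encoder; returns one (A, B) output pair per input bit."""
    m = constraint_length - 1
    state, pairs = 0, []
    for b in bits:
        reg = (b << m) | state           # current bit in the MSB, oldest bit in the LSB
        pairs.append(tuple(bin(reg & g).count("1") & 1 for g in gens))
        state = reg >> 1
    return pairs


def puncture(pairs, rate):
    """Drop coded bits according to the periodic mask for the requested code rate."""
    keep_a, keep_b = PUNCTURE[rate]
    period = len(keep_a)
    out = []
    for i, (a, b) in enumerate(pairs):
        if keep_a[i % period]:
            out.append(a)
        if keep_b[i % period]:
            out.append(b)
    return out


data = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 1, 0]
pairs = conv_encode(data)
for rate in ("1/2", "2/3", "3/4"):
    print(rate, len(puncture(pairs, rate)))
```

The same encoder core serves all three rates; only the mask changes, which is what makes a single hardware structure feasible. A receiver would reinsert neutral values at the dropped positions before Viterbi decoding.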

U-net with vision transformer encoder for polyp segmentation in colonoscopy images (비전 트랜스포머 인코더가 포함된 U-net을 이용한 대장 내시경 이미지의 폴립 분할)

  • Ayana, Gelan;Choe, Se-woon
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference
    • /
    • 2022.10a
    • /
    • pp.97-99
    • /
    • 2022
  • For the early identification and treatment of colorectal cancer, accurate polyp segmentation is crucial. However, polyp segmentation is a challenging task, and the majority of current approaches struggle with two issues. First, the position, size, and shape of individual polyps vary greatly (intra-class inconsistency). Second, there is a significant degree of similarity between polyps and their surroundings under certain circumstances, such as motion blur and light reflection (inter-class indistinction). U-net, which uses convolutional neural networks as its encoder and decoder, is considered a standard for tackling this task. We propose an updated U-net architecture that replaces the encoder with a vision transformer network for polyp segmentation. The proposed architecture performed better than the standard U-net architecture on the polyp segmentation task.
