• Title/Summary/Keyword: Convolutional encoding

CNN-based Fast Split Mode Decision Algorithm for Versatile Video Coding (VVC) Inter Prediction

  • Yeo, Woon-Ha;Kim, Byung-Gyu
    • Journal of Multimedia Information System / v.8 no.3 / pp.147-158 / 2021
  • Versatile Video Coding (VVC) is the latest video coding standard, developed by the Joint Video Experts Team (JVET). In VVC, the quadtree plus multi-type tree (QT+MTT) structure of coding unit (CU) partitioning is adopted, and its computational complexity is considerably high due to the brute-force search for recursive rate-distortion (RD) optimization. In this paper, we aim to reduce the time complexity of inter-picture prediction, since inter prediction accounts for a large portion of the total encoding time. The problem can be defined as classifying the split mode of each CU. To classify the split mode effectively, a novel convolutional neural network (CNN) called the multi-level tree CNN (MLT-CNN) architecture is introduced. To boost classification performance, additional information, including inter-picture information, is used while training the CNN. The overall algorithm, including the MLT-CNN inference process, is implemented on VVC Test Model (VTM) 11.0. CUs of size 128×128 serve as the inputs of the CNN. The sequences are encoded in the random access (RA) configuration with five QP values {22, 27, 32, 37, 42}. The experimental results show that the proposed algorithm reduces computational complexity by 11.53% on average and by up to 26.14%, with an average increase of 1.01% in Bjøntegaard delta bit rate (BDBR). In particular, the proposed method performs better on class A and B sequences, reducing encoding time by 9.81%~26.14% with a BDBR increase of 0.95%~3.28%.
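
The split-mode decision described in this abstract is essentially a multi-class classification over CU luma samples. Below is a minimal PyTorch sketch of such a classifier; the layer sizes, the six QT+MTT split-mode classes, and the use of the QP value as side information are illustrative assumptions and do not reproduce the paper's MLT-CNN architecture.

```python
import torch
import torch.nn as nn

class SplitModeClassifier(nn.Module):
    """Toy CNN that maps a 128x128 luma CU (plus its QP) to one of the six
    QT+MTT split modes: no-split, QT, BT-H, BT-V, TT-H, TT-V.
    Layer sizes are illustrative and are not the paper's MLT-CNN."""
    def __init__(self, num_modes: int = 6):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, 3, stride=2, padding=1),   # 128 -> 64
            nn.ReLU(inplace=True),
            nn.Conv2d(16, 32, 3, stride=2, padding=1),  # 64 -> 32
            nn.ReLU(inplace=True),
            nn.Conv2d(32, 64, 3, stride=2, padding=1),  # 32 -> 16
            nn.ReLU(inplace=True),
            nn.AdaptiveAvgPool2d(1),
        )
        self.classifier = nn.Linear(64 + 1, num_modes)  # +1 for the QP side input

    def forward(self, cu_luma: torch.Tensor, qp: torch.Tensor) -> torch.Tensor:
        x = self.features(cu_luma).flatten(1)           # (N, 64)
        x = torch.cat([x, qp.view(-1, 1)], dim=1)       # append QP as an extra feature
        return self.classifier(x)                       # (N, num_modes) logits

# Example: one 128x128 CU encoded at QP 32
logits = SplitModeClassifier()(torch.randn(1, 1, 128, 128), torch.tensor([32.0]))
```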

A Study of Big Time Series Data Compression based on CNN Algorithm (CNN 기반 대용량 시계열 데이터 압축 기법연구)

  • Sang-Ho Hwang;Sungho Kim;Sung Jae Kim;Tae Geun Kim
    • IEMEK Journal of Embedded Systems and Applications / v.18 no.1 / pp.1-7 / 2023
  • In this paper, we implement a lossless compression technique for time-series data generated by IoT (Internet of Things) devices in order to reduce disk space. The proposed technique reduces the size of the encoded data by selectively applying CNN (Convolutional Neural Network) or delta encoding, depending on the situation, within a forecasting algorithm that predicts the time-series data. In addition, the proposed technique sequentially performs zigzag encoding, splitting, and bit packing to increase the compression ratio. Experiments show that the proposed compression method achieves a compression ratio of up to 1.60 relative to the original data.
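
The delta encoding, zigzag encoding, and bit-packing steps mentioned in this abstract follow a common pattern for integer time series: residuals are mapped to small unsigned values and then packed with just enough bits. A minimal Python sketch, assuming 32-bit integer samples, a fixed bit width per block, and omitting the paper's CNN predictor:

```python
from typing import List

def delta_encode(samples: List[int]) -> List[int]:
    """Keep the first sample, then store differences between neighbours."""
    return [samples[0]] + [b - a for a, b in zip(samples, samples[1:])]

def zigzag(v: int) -> int:
    """Map signed residuals to unsigned ints: 0, -1, 1, -2, 2 -> 0, 1, 2, 3, 4."""
    return (v << 1) if v >= 0 else ((-v) << 1) - 1

def pack_bits(values: List[int]) -> bytes:
    """Pack each value with the minimum bit width that fits the largest one
    in the block; the chosen width is stored in the first byte."""
    width = max(1, max(values).bit_length())
    bitstring = "".join(format(v, f"0{width}b") for v in values)
    bitstring += "0" * (-len(bitstring) % 8)             # pad to a byte boundary
    packed = bytes(int(bitstring[i:i + 8], 2) for i in range(0, len(bitstring), 8))
    return bytes([width]) + packed

samples = [1000, 1002, 1001, 1005, 1004]
residuals = [zigzag(v) for v in delta_encode(samples)[1:]]  # first sample kept separately
blob = pack_bits(residuals)                                 # 1 width byte + 2 data bytes here
```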

Real-Time Applications of Video Compression in the Field of Medical Environments

  • K. Siva Kumar;P. Bindhu Madhavi;K. Janaki
    • International Journal of Computer Science & Network Security / v.23 no.11 / pp.73-76 / 2023
  • We introduce DCNN and DRAE approaches for the compression of medical videos. There is an increasing need for medical video compression to decrease file size and storage requirements. Using a lossy compression technique, a higher compression ratio can be attained, but information is lost and diagnostic mistakes may follow. This creates the requirement to store medical video in a lossless format. The aim of utilizing a lossless compression tool is to maximize compression, because traditional lossless compression techniques yield a poor compression ratio. The temporal and spatial redundancy present in video sequences can be successfully exploited by the proposed DCNN and DRAE encoding. This paper describes the lossless encoding mode and shows how a compression ratio greater than 2 (2:1) can be achieved.

Comparison Analysis of Deep Learning-based Image Compression Approaches (딥 러닝 기반 이미지 압축 기법의 성능 비교 분석)

  • Yong-Hwan Lee;Heung-Jun Kim
    • Journal of the Semiconductor & Display Technology / v.22 no.1 / pp.129-133 / 2023
  • Image compression is a fundamental technique in the field of digital image processing that helps decrease storage space and transmit files efficiently. Recently, many deep learning techniques have been proposed that show promising results in image compression. Since many image compression techniques suffer from artifact problems, this paper compares two deep learning approaches and verifies their performance experimentally. One of the approaches is a deep autoencoder technique, and the other is a deep convolutional neural network (CNN). Based on the results in terms of peak signal-to-noise ratio and root mean square error, the paper shows that the deep autoencoder method has advantages over the deep CNN approach.
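
The two figures of merit used in this comparison, peak signal-to-noise ratio (PSNR) and root mean square error (RMSE), can be computed directly from the original and reconstructed images. A small NumPy sketch, assuming 8-bit images:

```python
import numpy as np

def rmse(original: np.ndarray, reconstructed: np.ndarray) -> float:
    """Root mean square error between two images of equal shape."""
    diff = original.astype(np.float64) - reconstructed.astype(np.float64)
    return float(np.sqrt(np.mean(diff ** 2)))

def psnr(original: np.ndarray, reconstructed: np.ndarray, peak: float = 255.0) -> float:
    """Peak signal-to-noise ratio in dB; higher means a closer reconstruction."""
    err = rmse(original, reconstructed)
    return float("inf") if err == 0 else 20.0 * np.log10(peak / err)

# Example with random 8-bit images standing in for an original and its decoded output
rng = np.random.default_rng(0)
a = rng.integers(0, 256, size=(64, 64), dtype=np.uint8)
b = np.clip(a.astype(int) + rng.integers(-3, 4, size=a.shape), 0, 255).astype(np.uint8)
print(rmse(a, b), psnr(a, b))
```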

A Study on the Digital Design for Voice Modem Using the Multicarrier DS-CDMA in Powerline Channels (전력선 채널에서 멀티캐리어 DS-CDMA를 이용한 전력선 음성모뎀의 디지털부 구현에 관한 연구)

  • 이상준;김민걸;이종성;구시경;박광철;오정현;김기두
    • Proceedings of the IEEK Conference / 2000.06a / pp.77-80 / 2000
  • In this paper, we implement a voice modem using multicarrier DS-CDMA over powerline channels. A Texas Instruments TMS320C5402 DSP and an Altera FLEX 10K EPF10K100ARC240 FPGA are used to realize the proposed system. For robustness in the powerline channel, we use multicarrier DS-CDMA modulation, convolutional encoding/Viterbi decoding, and interleaving. Finally, satisfactory performance is shown in laboratory experiments.
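
Convolutional encoding with Viterbi decoding, listed among the modem's robustness measures, is most often implemented with the industry-standard rate-1/2, constraint-length-7 code with generator polynomials 171 and 133 (octal). The paper does not state which code it uses, so the following Python sketch is a generic illustration:

```python
def conv_encode(bits, g1=0o171, g2=0o133, k=7):
    """Rate-1/2 convolutional encoder with constraint length k.
    The default generators 171/133 (octal) are the widely used standard
    pair; the paper's exact code parameters are not stated."""
    state = 0
    out = []
    for b in bits + [0] * (k - 1):                  # zero tail bits flush the encoder
        state = ((state << 1) | b) & ((1 << k) - 1)
        out.append(bin(state & g1).count("1") & 1)  # parity of taps selected by g1
        out.append(bin(state & g2).count("1") & 1)  # parity of taps selected by g2
    return out

encoded = conv_encode([1, 0, 1, 1, 0, 0, 1])        # 7 data bits -> 2*(7+6) = 26 coded bits
```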

Novel Reconfigurable Coprocessor for Communication Systems (통신 시스템을 위한 고성능 재구성 가능 코프로세서의 설계)

  • Jung Chul Yoon;Sunwoo Myung Hoon
    • Journal of the Institute of Electronics Engineers of Korea SD / v.42 no.6 s.336 / pp.39-48 / 2005
  • This paper proposes a reconfigurable coprocessor for communication systems that can perform high-speed computations and various functions. The proposed reconfigurable coprocessor can easily implement communication operations such as scrambling, interleaving, convolutional encoding, Viterbi decoding, FFT, etc. The proposed architecture has been modeled in VHDL and synthesized using the SEC 0.18 μm standard cell library. The gate count is about 35,000 gates and the critical path is 3.84 ns. Compared with existing DSPs, the proposed coprocessor achieves reductions of about 33% for FFT operations and complex MAC, 37% for Viterbi operations, and 48%~84% for scrambling and convolutional encoding for the IEEE 802.11a WLAN standard. The proposed coprocessor shows performance improvements over existing DSP chips for communication algorithms.
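
Of the operations listed, scrambling for the IEEE 802.11a WLAN standard is defined by a 7-bit LFSR with polynomial x^7 + x^4 + 1. A short software reference of that scrambler (in Python, whereas the paper maps the operation onto reconfigurable hardware):

```python
def scramble_80211a(bits, seed=0b1111111):
    """IEEE 802.11a data scrambler: 7-bit LFSR with polynomial x^7 + x^4 + 1.
    Each data bit is XORed with the LFSR output; with the same nonzero seed,
    the identical routine also descrambles."""
    state = seed & 0x7F
    out = []
    for b in bits:
        feedback = ((state >> 6) ^ (state >> 3)) & 1   # taps at x^7 and x^4
        state = ((state << 1) | feedback) & 0x7F
        out.append(b ^ feedback)
    return out

scrambled = scramble_80211a([1, 0, 1, 1, 0, 1, 0, 0])
assert scramble_80211a(scrambled) == [1, 0, 1, 1, 0, 1, 0, 0]   # scrambler is self-inverse
```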

Design and implementation of a base station modulator ASIC for CDMA cellular system (CDMA 이동통신 시스템용 기지국 변조기 ASIC 설계 및 구현)

  • Kang, In;Hyun, Jin-Il;Cha, Jin-Jong;Kim, Kyung-Soo
    • Journal of the Korean Institute of Telematics and Electronics C / v.34C no.2 / pp.1-11 / 1997
  • We developed a base station modulator ASIC for a CDMA digital cellular system. In a CDMA digital cellular system, modulation is performed by convolutional encoding and QPSK with spread spectrum. The functional blocks of the base station modulator are CRC, convolutional encoder, interleaver, pseudo-noise scrambler, power control bit puncturing, Walsh cover, QPSK, gain controller, combiner, and multiplexer. Each functional block was designed by logic synthesis of VHDL code described at the register transfer level; the code is about 8,000 lines. Circuit simulation and logic simulation were performed with COMPASS tools. The chip (ES-C2212B CMB) contains 25,205 gates and 3 Kbit of SRAM, and its die size is 5.25 mm × 5.45 mm in 0.8 μm CMOS cell-based design technology. It is packaged in a 68-pin PLCC, and the power dissipation at 10 MHz is 300 mW at 5 V. The ASIC has been fully tested and works successfully in the CDMA base station system.
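
The Walsh cover and QPSK spreading steps named in this abstract can be illustrated in the style of an IS-95 forward link, where each channel's symbols are spread by one row of a 64×64 Walsh-Hadamard matrix and then scrambled by I/Q PN sequences. The sketch below is a rough illustration only; the channel index, PN sequences, gain control, and filtering of the actual ASIC are not reproduced:

```python
import numpy as np

def walsh_matrix(order: int) -> np.ndarray:
    """Walsh-Hadamard matrix of size 2^order x 2^order with +/-1 entries."""
    h = np.array([[1]])
    for _ in range(order):
        h = np.block([[h, h], [h, -h]])
    return h

def spread(symbols: np.ndarray, walsh_row: np.ndarray) -> np.ndarray:
    """Spread +/-1 channel symbols with a Walsh code: one chip run per symbol."""
    return np.repeat(symbols, len(walsh_row)) * np.tile(walsh_row, len(symbols))

walsh = walsh_matrix(6)                        # 64 x 64 Walsh cover
symbols = np.array([1, -1, -1, 1])             # already encoded/interleaved symbols
chips = spread(symbols, walsh[12])             # channel index 12 chosen arbitrarily

# QPSK mapping with illustrative (random) I/Q PN scrambling sequences
pn_i = np.random.default_rng(1).choice([1, -1], size=chips.size)
pn_q = np.random.default_rng(2).choice([1, -1], size=chips.size)
baseband = (chips * pn_i) + 1j * (chips * pn_q)
```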

Convolutional auto-encoder based multiple description coding network

  • Meng, Lili;Li, Hongfei;Zhang, Jia;Tan, Yanyan;Ren, Yuwei;Zhang, Huaxiang
    • KSII Transactions on Internet and Information Systems (TIIS) / v.14 no.4 / pp.1689-1703 / 2020
  • When data is transmitted over an unreliable channel, packet errors may result in serious degradation. Multiple description coding (MDC) can solve this problem and save transmission costs. In this paper, we propose a deep multiple description coding network (MDCN) to realize efficient image compression. Firstly, our network framework is based on a convolutional auto-encoder (CAE), which includes a multiple description encoder network (MDEN) and a multiple description decoder network (MDDN). Secondly, in order to obtain high-quality reconstructed images at low bit rates, the encoding network and decoding network are integrated into an end-to-end compression framework. Thirdly, the multiple description decoder network includes a side decoder network and a central decoder network. When the decoder receives only one of the two multiple description code streams, the side decoder network is used to obtain a side reconstructed image of acceptable quality. When both descriptions are received, a high-quality reconstructed image is obtained. In addition, additive uniform noise is used in place of hard quantization during training, and SSIM loss and distance loss are combined to train the multiple description encoder networks to ensure that they share structural information. Experimental results show that the proposed framework performs better than traditional multiple description coding methods.
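
The convolutional auto-encoder backbone that the MDCN builds on amounts, in its simplest form, to a strided-convolution encoder mirrored by a transposed-convolution decoder, with additive uniform noise standing in for quantization during training. The PyTorch sketch below is a single-description simplification; the channel counts are assumptions and the paper's side/central decoders are omitted:

```python
import torch
import torch.nn as nn

class ConvAutoEncoder(nn.Module):
    """Minimal convolutional auto-encoder for image compression: strided
    convolutions downsample by 4x, transposed convolutions mirror the path.
    Channel counts are illustrative; the MDCN side/central decoders are omitted."""
    def __init__(self, latent_channels: int = 32):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 64, 5, stride=2, padding=2), nn.ReLU(inplace=True),
            nn.Conv2d(64, latent_channels, 5, stride=2, padding=2),
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(latent_channels, 64, 5, stride=2, padding=2, output_padding=1),
            nn.ReLU(inplace=True),
            nn.ConvTranspose2d(64, 3, 5, stride=2, padding=2, output_padding=1),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        y = self.encoder(x)
        if self.training:
            y = y + torch.empty_like(y).uniform_(-0.5, 0.5)  # noise proxy for quantization
        else:
            y = torch.round(y)                               # hard quantization at test time
        return self.decoder(y)

recon = ConvAutoEncoder().eval()(torch.rand(1, 3, 64, 64))   # same 64x64 output size
```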

Comparison of Fine Grained Classification of Pet Images Using Image Processing and CNN (영상 처리와 CNN을 이용한 애완동물 영상 세부 분류 비교)

  • Kim, Jihae;Go, Jeonghwan;Kwon, Cheolhee
    • Journal of Broadcast Engineering / v.26 no.2 / pp.175-183 / 2021
  • Research on fine-grained image classification continues to develop, but research on object recognition for animals with polymorphic appearance is progressing slowly. Using only pet images of dogs and cats, this paper compares an image-processing method and a deep learning method for the fine-grained classification of animal breeds. For the image-processing method, the Grab-cut algorithm is used for object segmentation, and Fisher Vectors are used for image encoding. The deep learning method uses a convolutional neural network (CNN), which has shown outstanding performance in image recognition, implemented with TensorFlow, an open-source deep learning framework provided by Google. For each method, 37 breeds of pet images, 7,390 images in total, were tested to verify and compare their effectiveness.
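
For the image-processing branch, the Grab-cut segmentation step can be reproduced with OpenCV's built-in implementation. A short sketch follows; the input file name, bounding rectangle, and iteration count are illustrative, and the subsequent Fisher Vector encoding is not shown:

```python
import cv2
import numpy as np

def segment_pet(image_path: str, rect: tuple) -> np.ndarray:
    """Run Grab-cut inside a bounding box and return the image with the
    background zeroed out, keeping only (probable) foreground pixels."""
    img = cv2.imread(image_path)
    mask = np.zeros(img.shape[:2], np.uint8)
    bgd_model = np.zeros((1, 65), np.float64)    # internal GMM state required by OpenCV
    fgd_model = np.zeros((1, 65), np.float64)
    cv2.grabCut(img, mask, rect, bgd_model, fgd_model, 5, cv2.GC_INIT_WITH_RECT)
    fg = np.where((mask == cv2.GC_FGD) | (mask == cv2.GC_PR_FGD), 1, 0).astype(np.uint8)
    return img * fg[:, :, np.newaxis]

# Hypothetical example: rectangle (x, y, width, height) roughly enclosing the animal
segmented = segment_pet("pet.jpg", (30, 30, 200, 200))
```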

Deep Learning based Raw Audio Signal Bandwidth Extension System (딥러닝 기반 음향 신호 대역 확장 시스템)

  • Kim, Yun-Su;Seok, Jong-Won
    • Journal of IKEEE / v.24 no.4 / pp.1122-1128 / 2020
  • Bandwidth extension refers to restoring and expanding a narrowband (NB) signal that has been degraded in the encoding and decoding process, due to limited channel capacity or the characteristics of the codec installed in a mobile communication device, by converting it into a wideband (WB) signal. Bandwidth extension research has mainly focused on speech signals and on frequency-domain methods such as SBR (Spectral Band Replication) and IGF (Intelligent Gap Filling), which restore missing or damaged high bands based on complex feature-extraction processes. In this paper, we propose an autoencoder-based deep learning model that outputs a bandwidth-extended signal. Using residual connections of one-dimensional convolutional neural networks (CNNs), the bandwidth is extended from a fixed-length time-domain input signal without complicated pre-processing. In addition, it was confirmed that the damaged high band can be restored even when training on a dataset containing various types of sound sources, including music, not limited to speech.
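
The residual 1D-CNN structure described here can be sketched as a stack of convolutional blocks whose input is added back to their output, operating directly on the raw waveform. The layer sizes and block count below are illustrative assumptions, not the authors' exact network:

```python
import torch
import torch.nn as nn

class ResidualBlock1d(nn.Module):
    """Two 1-D convolutions with a skip connection; signal length is preserved."""
    def __init__(self, channels: int = 64, kernel_size: int = 9):
        super().__init__()
        pad = kernel_size // 2
        self.body = nn.Sequential(
            nn.Conv1d(channels, channels, kernel_size, padding=pad),
            nn.ReLU(inplace=True),
            nn.Conv1d(channels, channels, kernel_size, padding=pad),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x + self.body(x)                 # residual connection

class BandwidthExtender(nn.Module):
    """Maps a raw narrowband waveform to a wideband estimate of the same length."""
    def __init__(self, channels: int = 64, num_blocks: int = 4):
        super().__init__()
        self.head = nn.Conv1d(1, channels, 9, padding=4)
        self.blocks = nn.Sequential(*[ResidualBlock1d(channels) for _ in range(num_blocks)])
        self.tail = nn.Conv1d(channels, 1, 9, padding=4)

    def forward(self, narrowband: torch.Tensor) -> torch.Tensor:
        return self.tail(self.blocks(self.head(narrowband)))

wideband = BandwidthExtender()(torch.randn(1, 1, 8000))   # fixed-length time-domain frame
```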