Comparative Analysis of Deep Learning Researches for Compressed Video Quality Improvement

  • Lee, Young-Woon (Department of IT Engineering, Sookmyung Women's University)
  • Kim, Byung-Gyu (Department of IT Engineering, Sookmyung Women's University)
  • Received : 2019.04.04
  • Accepted : 2019.05.10
  • Published : 2019.05.30

Abstract

Recently, Convolutional Neural Network (CNN)-based approaches have been actively studied as a way to restore the quality lost when video is compressed with block-based video coding standards such as H.265/HEVC. This paper summarizes and analyzes the network models used in these quality enhancement studies. First, we review the components of a CNN for quality enhancement and summarize prior studies in the image domain. Next, we summarize related studies in terms of three aspects (network structure, dataset, and training method) and present implementations of representative models along with experimental results for performance comparison.

Fig. 1. Comparison of visual quality difference

Table 1. Comparison of CNN structures

Table 2. Comparison of training conditions

Table 3. Implementation environment

Table 4. PSNR comparison between video compressed using HM and video improved with a CNN
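The comparison in Table 4 is reported in PSNR. This page does not state how the authors computed it, so the following is only a minimal sketch of the standard PSNR definition, assuming 8-bit samples (peak value 255) and frames flattened into sequences of pixel values. The function names `psnr` and `psnr_gain` are illustrative and do not come from the paper.

```python
import math
from typing import Sequence


def psnr(reference: Sequence[int], test: Sequence[int], peak: int = 255) -> float:
    """Return the PSNR in dB between two equal-length sequences of pixel values.

    Assumption: 8-bit samples, so the default peak value is 255.
    """
    if len(reference) != len(test) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error over all pixels.
    mse = sum((r - t) ** 2 for r, t in zip(reference, test)) / len(reference)
    if mse == 0:
        return math.inf  # identical signals have unbounded PSNR
    return 10.0 * math.log10(peak ** 2 / mse)


def psnr_gain(original: Sequence[int], compressed: Sequence[int],
              enhanced: Sequence[int], peak: int = 255) -> float:
    """PSNR improvement in dB of the enhanced frame over the compressed frame,
    both measured against the uncompressed original."""
    return psnr(original, enhanced, peak) - psnr(original, compressed, peak)
```

A positive `psnr_gain` would indicate that the enhanced frame is closer to the uncompressed original than the decoded frame is. Video results are commonly averaged over frames and often reported for the luma channel only; whether the paper does so is not stated here.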
