http://dx.doi.org/10.5909/JBE.2018.23.3.383

A Technical Analysis on Deep Learning based Image and Video Compression  

Cho, Seunghyun (Realistic AV Research Group, Electronics and Telecommunications Research Institute)
Kim, Younhee (Realistic AV Research Group, Electronics and Telecommunications Research Institute)
Lim, Woong (Realistic AV Research Group, Electronics and Telecommunications Research Institute)
Kim, Hui Yong (Realistic AV Research Group, Electronics and Telecommunications Research Institute)
Choi, Jin Soo (Realistic AV Research Group, Electronics and Telecommunications Research Institute)
Publication Information
Journal of Broadcast Engineering / v.23, no.3, 2018, pp. 383-394
Abstract
In this paper, we survey deep learning based image and video compression techniques, which have been actively studied in recent years. Deep learning based image compression feeds the image to be compressed into a deep neural network, extracts a latent vector either recurrently or all at once, and encodes it. To increase compression efficiency, the neural network is trained so that the encoded latent vector can be represented with fewer bits while the quality of the reconstructed image improves. Compared with conventional image compression techniques, these methods can produce images of superior quality, especially at low bit rates. Deep learning based video compression, on the other hand, improves the performance of the coding tools used in existing video codecs rather than feeding the video to be compressed directly into a network. The deep neural network techniques introduced in this paper either replace the in-loop filter of the latest video codec or serve as an additional post-processing filter, improving compression efficiency by raising the quality of the reconstructed image. Likewise, deep neural network techniques applied to intra prediction and coding are used together with the existing intra prediction tools, improving compression efficiency by increasing prediction accuracy or by adding a new intra coding process.
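The training goal described in the abstract (fewer bits for the encoded latent vector, better reconstruction quality) is often expressed as a rate-distortion trade-off of the form rate + lambda * distortion. The sketch below is only a toy illustration of that trade-off and is not taken from the surveyed works: the function names, the uniform scalar quantizer, the empirical-entropy bit estimate, the identity "decoder", and the weight `lam` are all assumptions made for the example.

```python
import math
from collections import Counter


def quantize(latent, step):
    """Uniform scalar quantization: map each latent value to an integer index."""
    return [round(v / step) for v in latent]


def dequantize(indices, step):
    """Map integer indices back to reconstructed latent values."""
    return [i * step for i in indices]


def entropy_bits(indices):
    """Total empirical entropy of the symbols, in bits.

    Assumption: this stands in for the bit cost that an entropy coder
    would approach; a real codec would use a learned or adaptive model.
    """
    counts = Counter(indices)
    n = len(indices)
    return -sum(c * math.log2(c / n) for c in counts.values())


def mse(a, b):
    """Mean squared error between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def rate_distortion_loss(latent, step, lam):
    """Return (loss, rate, distortion) for a hypothetical toy codec.

    Assumption: the decoder is the identity, so distortion is measured
    directly on the latent values instead of on a reconstructed image.
    """
    indices = quantize(latent, step)
    recon = dequantize(indices, step)
    rate = entropy_bits(indices)
    distortion = mse(latent, recon)
    return rate + lam * distortion, rate, distortion


if __name__ == "__main__":
    latent = [0.1, 0.4, 0.35, 0.8, 0.75, 0.2, 0.9, 0.05]
    for step in (0.1, 0.5):
        loss, rate, dist = rate_distortion_loss(latent, step, lam=100.0)
        print(f"step={step}: rate={rate:.2f} bits, mse={dist:.4f}, loss={loss:.2f}")
```

A coarser quantization step lowers the estimated bit cost but raises the distortion. In a learned codec, the weight on the distortion term would typically select the operating point on this curve, with the encoder and decoder networks trained jointly rather than fixed as they are here.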
Keywords
Machine learning; Deep learning; Image; Video; Compression;