http://dx.doi.org/10.5909/JBE.2019.24.5.870

Video Compression Standard Prediction using Attention-based Bidirectional LSTM  

Kim, Sangmin (Department of Electronics and Computer Engineering, Hanyang University)
Park, Bumjun (Department of Electronics and Computer Engineering, Hanyang University)
Jeong, Jechang (Department of Electronics and Computer Engineering, Hanyang University)
Publication Information
Journal of Broadcast Engineering / v.24, no.5, 2019, pp. 870-878
Abstract
In this paper, we propose an attention-based bidirectional LSTM (BLSTM) for predicting the compression standard with which a video was encoded. In natural language processing (NLP), RNN-based models have recently been studied extensively for predicting the next word of a sentence and for classifying and translating sentences according to their semantics, and they have been commercialized in chatbots, AI speakers, and translation applications. LSTM was designed to solve the vanishing gradient problem of RNNs and is used in NLP. The proposed algorithm makes it possible to predict the video compression standard by applying a BLSTM and an attention algorithm, which focuses on the most important word in a sentence, to the bitstream of a video rather than to a sentence of a natural language.
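As a rough illustration of the attention step described in the abstract, the sketch below pools a sequence of per-token hidden states into one vector, giving larger weights to the more informative positions. It is only a minimal sketch under stated assumptions: the abstract does not give the exact formulation, so the tanh-then-softmax scoring used here is one commonly used form of attention pooling over BLSTM outputs, the helper names `bitstream_to_tokens` and `attention_pool` are hypothetical, treating each byte of the bitstream as one "word" is an assumption, and the hidden states are hand-written numbers standing in for BLSTM outputs.

```python
import math


def bitstream_to_tokens(data: bytes) -> list[int]:
    """Assumed tokenization: treat every byte of the bitstream as one 'word' id (0..255)."""
    return list(data)


def attention_pool(hidden_states, w):
    """Attention pooling over a sequence of hidden states (illustrative only).

    hidden_states: list of T vectors (each a list of floats), e.g. BLSTM outputs.
    w: weight vector of the same dimension (learned in practice, fixed here).
    Returns (alphas, pooled), where alphas are the T attention weights and
    pooled is the attention-weighted sum of the hidden states.
    """
    # Score each time step: w . tanh(h_t)
    scores = [sum(wi * math.tanh(hi) for wi, hi in zip(w, h)) for h in hidden_states]
    # Numerically stable softmax over the time axis
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    alphas = [e / total for e in exps]
    # Weighted sum of the hidden states
    dim = len(hidden_states[0])
    pooled = [sum(a * h[d] for a, h in zip(alphas, hidden_states)) for d in range(dim)]
    return alphas, pooled


# Toy inputs: three byte tokens and three made-up 2-dimensional hidden states.
tokens = bitstream_to_tokens(b"\x00\x00\x01")
hidden_states = [[0.1, -0.2], [2.0, 1.5], [0.0, 0.3]]
alphas, pooled = attention_pool(hidden_states, w=[1.0, 1.0])
```

In a full model, the pooled vector would typically be passed to a classification layer with one output per candidate compression standard; that layer is omitted here.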
Keywords
Deep Learning; Attention algorithm; LSTM; NLP; Codec