http://dx.doi.org/10.6109/jkiice.2020.24.3.384

Indoor Scene Classification based on Color and Depth Images for Automated Reverberation Sound Editing  

Jeong, Min-Heuk (School of Electronics and Information Engineering, Korea Aerospace University)
Yu, Yong-Hyun (School of Electronics and Information Engineering, Korea Aerospace University)
Park, Sung-Jun (School of Electronics and Information Engineering, Korea Aerospace University)
Hwang, Seung-Jun (School of Electronics and Information Engineering, Korea Aerospace University)
Baek, Joong-Hwan (School of Electronics and Information Engineering, Korea Aerospace University)
Abstract
The reverberation effect applied to sound in film or VR content production is a very important factor for realism and liveliness. The recommended reverberation time for each kind of space is specified in a standard known as RT60 (Reverberation Time 60 dB). In this paper, we propose a scene recognition technique for automatic reverberation editing. To this end, we devised a classification model that trains on color images and predicted depth images independently within the same model. Indoor scene classification is limited when trained on color information alone because of the similarity of interior structures, so deep-learning-based monocular depth estimation is used to incorporate spatial depth information. Ten scene classes were constructed based on RT60, and model training and evaluation were conducted. Finally, the proposed SCR+DNet (Scene Classification for Reverb + Depth Net) classifier achieves 92.4% accuracy, higher than conventional CNN classifiers.
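The two-branch idea described above — extracting features from a color image and from a depth map predicted from it, then fusing both for a 10-class (RT60-based) scene prediction — can be sketched as follows. This is a minimal conceptual sketch, not the paper's SCR+DNet implementation: the branch functions, layer shapes, and random weights are illustrative assumptions standing in for trained CNN branches.

```python
import numpy as np

rng = np.random.default_rng(0)
NUM_CLASSES = 10  # scene classes grouped by recommended RT60

def branch_features(image, w):
    """Stand-in for a CNN branch: flatten the input and apply a ReLU projection."""
    return np.maximum(image.reshape(-1) @ w, 0.0)

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

H = W = 8                       # toy resolution for illustration
color = rng.random((H, W, 3))   # RGB indoor scene image
depth = rng.random((H, W))      # depth map predicted from the color image

# Random placeholder weights: one projection per branch, one fused head.
w_color = rng.standard_normal((H * W * 3, 16))
w_depth = rng.standard_normal((H * W, 16))
w_head = rng.standard_normal((32, NUM_CLASSES))

# Each branch is processed independently; the feature vectors are then
# concatenated and classified by a shared head.
fused = np.concatenate([branch_features(color, w_color),
                        branch_features(depth, w_depth)])
probs = softmax(fused @ w_head)
pred = int(np.argmax(probs))
print(pred, probs.sum())
```

In this sketch the fusion is a simple concatenation before the classification head; the actual fusion strategy and network depth in SCR+DNet are not specified in the abstract.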
Keywords
Scene Classification; Depth Estimation; Reverberation; Convolutional Neural Network