Knowledge Distillation for Unsupervised Depth Estimation

  • Received: 2022.07.04
  • Reviewed: 2022.08.03
  • Published: 2022.08.31

Abstract

This paper proposes a novel approach to training an unsupervised depth estimation algorithm. The goal of unsupervised depth estimation is to estimate pixel-wise distances from the camera without external supervision. While most previous works focus on model architectures, loss functions, and masking methods that account for dynamic objects, this paper focuses on a training framework that uses depth cues effectively. The main loss function of unsupervised depth estimation algorithms is the photometric error. In this paper, we claim that a direct depth cue is more effective than the photometric error. To obtain the direct depth cue, we adopt knowledge distillation, a teacher-student learning framework. We train a teacher network based on a previous unsupervised method and use its depth predictions as pseudo labels, which are then employed to train a student network. In experiments, the proposed algorithm shows performance comparable to the state-of-the-art algorithm, and we demonstrate that our teacher-student framework is effective for unsupervised depth estimation.
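The distillation step described above can be sketched in a few lines. This is an illustrative toy, not the paper's actual implementation: a frozen "teacher" stands in for the pretrained unsupervised depth network, its predictions serve as pseudo labels, and a "student" is trained to regress them directly (here with an L1 loss, a common choice for depth regression; the paper does not specify its loss). Both models are reduced to scalar functions so the loop stays self-contained.

```python
import numpy as np

# Toy sketch of teacher-student distillation for depth estimation.
# All model and variable names here are illustrative assumptions.

rng = np.random.default_rng(0)

def teacher_predict(x):
    # Stand-in for a pretrained unsupervised depth network (kept frozen).
    return 0.5 * x + 1.0

images = rng.uniform(0.0, 1.0, size=256)
pseudo_labels = teacher_predict(images)  # depth pseudo labels from the teacher

# Toy student: a single weight and bias, trained by subgradient descent
# on the L1 pseudo-label loss |student(x) - teacher(x)|.
w, b = 0.0, 0.0
for step in range(2000):
    lr = 0.05 / (1.0 + 0.01 * step)      # decaying step size for convergence
    sign = np.sign(w * images + b - pseudo_labels)
    w -= lr * np.mean(sign * images)     # subgradient of the L1 loss w.r.t. w
    b -= lr * np.mean(sign)              # subgradient of the L1 loss w.r.t. b

# After training, the student imitates the teacher's depth predictions,
# which is the "direct depth cue" the paper contrasts with photometric error.
```

In the actual setting, the teacher is first trained with the photometric error, after which its outputs replace that indirect signal as the student's supervision.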

Keywords

Acknowledgement

This work was supported by Samsung Electronics.
