http://dx.doi.org/10.6109/jkiice.2020.24.12.1574

2D and 3D Hand Pose Estimation Based on Skip Connection Form  

Ku, Jong-Hoe (Department of Computer Engineering, Pusan National University)
Kim, Mi-Kyung (Software Education Center, Pusan National University)
Cha, Eui-Young (Department of Computer Engineering, Pusan National University)
Abstract
Traditional pose estimation methods rely either on special devices or on camera images analyzed with image processing. Device-based methods are costly and work only in limited environments. Camera-based methods with image processing reduce environmental constraints and cost, but their performance is lower. Convolutional neural networks (CNNs) have therefore been studied for pose estimation using only a camera, without these disadvantages, and various techniques have been proposed to improve recognition performance. In this paper, we examine the effect of skip connections on the network by applying various forms of skip connection to hand joint recognition. The experiments confirm that adding skip connections beyond the basic ones improves performance, and that the network with downward skip connections performs best.
Keywords
Hand joint estimation; Deep learning; Encoder-Decoder; Skip connection; U-Net