http://dx.doi.org/10.3745/KTSDE.2022.11.12.517

Semantic Occlusion Augmentation for Effective Human Pose Estimation  

Hyun-Jae, Bae (Department of Software, Sungkyunkwan University)
Jin-Pyung, Kim (Advanced Institute of Convergence Technology)
Jee-Hyong, Lee (Department of Software, Sungkyunkwan University)
Publication Information
KIPS Transactions on Software and Data Engineering / v.11, no.12, 2022, pp. 517-524
Abstract
Human pose estimation infers a person's posture by extracting the key points of the body's joints. When occlusion occurs, key point extraction performance drops because the joints are hidden. Occlusion falls broadly into three types: self-occlusion, occlusion by other objects, and occlusion by the background. In this paper, we propose an effective pose estimation method that uses an occlusion augmentation technique. Although pose estimation has been studied continuously, research on how pose estimation handles occlusion remains relatively scarce. To address this gap, we propose a data augmentation technique that intentionally occludes human joints. Our experimental results show that training with this intentional occlusion augmentation makes the model more robust to occlusion and improves its performance.
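As a rough illustration of what "intentionally occluding human joints" can look like in a training pipeline, the sketch below covers randomly chosen visible joints with a flat square patch. It is not the authors' implementation: the abstract does not specify how occluders are chosen or drawn, so the function name `occlude_joints`, the parameters `num_joints`, `patch_size`, and `fill`, and the COCO-style `(x, y, visibility)` keypoint layout are all assumptions made for this example.

```python
import numpy as np

def occlude_joints(image, keypoints, num_joints=2, patch_size=24, fill=0, rng=None):
    """Cover randomly chosen visible joints with square patches.

    Hypothetical helper, not the paper's method.
    image:     (H, W, C) array.
    keypoints: (K, 3) array of (x, y, v); v == 2 is assumed to mean "visible".
    Returns the augmented copy and the indices of the occluded joints.
    """
    rng = np.random.default_rng() if rng is None else rng
    out = image.copy()  # leave the caller's image untouched
    h, w = out.shape[:2]

    # Only joints labeled as visible are candidates for occlusion.
    visible = np.flatnonzero(keypoints[:, 2] == 2)
    if visible.size == 0:
        return out, np.array([], dtype=int)

    chosen = rng.choice(visible, size=min(num_joints, visible.size), replace=False)
    half = patch_size // 2
    for k in chosen:
        x = int(round(float(keypoints[k, 0])))
        y = int(round(float(keypoints[k, 1])))
        # Clip the patch to the image so slices never wrap around.
        x0, x1 = int(np.clip(x - half, 0, w)), int(np.clip(x + half, 0, w))
        y0, y1 = int(np.clip(y - half, 0, h)), int(np.clip(y + half, 0, h))
        out[y0:y1, x0:x1] = fill
    return out, chosen
```

In this sketch the keypoint coordinates are left unchanged, so a model trained on the augmented image still has to localize the joints it can no longer see. A flat patch is the simplest possible occluder; pasting object or background crops instead would be a natural variation, but that choice is likewise an assumption and not something the abstract states.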
Keywords
Data Augmentation; Occlusion; Human Pose Estimation; Deep Learning