http://dx.doi.org/10.3837/tiis.2019.04.017

2.5D human pose estimation for shadow puppet animation  

Liu, Shiguang (School of Computer Science and Technology, Division of Intelligence and Computing, Tianjin University)
Hua, Guoguang (School of Information and Electrical Engineering, Hebei University of Engineering)
Li, Yang (School of Computer Science and Technology, Division of Intelligence and Computing, Tianjin University)
Publication Information
KSII Transactions on Internet and Information Systems (TIIS) / v.13, no.4, 2019, pp. 2042-2059
Abstract
Digital shadow puppetry has traditionally relied on expensive motion capture equipment and complex design. This paper presents a low-cost driving technique that captures human pose data from real scenes with a simple camera and uses it to drive a virtual Chinese shadow play in a 2.5D scene. We propose a method for extracting the human pose data needed to drive the virtual shadow play, which we call 2.5D human pose estimation. First, we use a 3D human pose estimation method to obtain the initial data. In the subsequent transformation, we treat depth as an implicit feature and map the body joints into a constrained range. We call the resulting data 2.5D pose data. However, 2.5D pose data cannot control a shadow puppet well directly, because the motion patterns and structural composition of a real human pose differ from those of a shadow puppet. To address this, the 2.5D pose data is transformed in an implicit pose mapping space based on a self-network, producing the final 2.5D pose expression data used to animate the shadow puppets. Experimental results demonstrate the effectiveness of the method.
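The abstract does not specify how depth is made implicit or which constraints are applied, so the following is only an illustrative sketch of one possible reading, not the authors' method. It assumes (hypothetically) that depth contributes only through the 3D bone length, that each bone's in-plane angle is clamped to a per-joint range, and that joints are listed so every parent precedes its children. The function name `to_2_5d` and its parameters are invented for this example.

```python
import math

def to_2_5d(joints_3d, parents, angle_limits):
    """Flatten 3D joints into constrained 2D joints (illustrative assumption only).

    joints_3d    : list of (x, y, z) tuples, parents listed before children
    parents      : parent index for each joint, -1 for the root
    angle_limits : dict mapping joint index -> (min_deg, max_deg) for the
                   in-plane angle of the bone that ends at that joint
    """
    out = []
    for j, (x, y, z) in enumerate(joints_3d):
        p = parents[j]
        if p < 0:
            # The root keeps its image-plane position; its depth is dropped.
            out.append((x, y))
            continue
        px, py, pz = joints_3d[p]
        # Depth is never emitted, but it still affects the result through
        # the full 3D bone length (the "implicit" use assumed here).
        length = math.dist((x, y, z), (px, py, pz))
        # In-plane bone direction, in degrees within (-180, 180].
        angle = math.degrees(math.atan2(y - py, x - px))
        lo, hi = angle_limits.get(j, (-180.0, 180.0))
        angle = min(max(angle, lo), hi)  # simple clamp, no angle wrap-around
        qx, qy = out[p]
        rad = math.radians(angle)
        out.append((qx + length * math.cos(rad), qy + length * math.sin(rad)))
    return out

# Usage: a two-joint chain whose bone is limited to the range [0, 45] degrees.
pose = to_2_5d([(0, 0, 0), (0, 2, 0)], [-1, 0], {1: (0.0, 45.0)})
```

A bone pointing straight at the camera has no defined in-plane direction; in this sketch it falls back to an angle of 0 degrees, which a real system would need to handle more carefully.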
Keywords
Human pose estimation; shadow puppet; mapping network; 2.5D pose data; CNN