Multi-View 3D Human Pose Estimation Based on Transformer (트랜스포머 기반의 다중 시점 3차원 인체자세추정)

  • Seoung Wook Choi;Jin Young Lee;Gye Young Kim
    • Smart Media Journal / v.12 no.11 / pp.48-56 / 2023
  • Three-dimensional (3D) human pose estimation is used in sports, motion recognition, and special effects for video media. Among the available approaches, multi-view 3D human pose estimation is essential for precise estimation even in complex real-world environments. However, existing multi-view 3D human pose estimation models have high time complexity because they use 3D feature maps. This paper proposes a method that extends an existing Transformer-based monocular multi-frame model, which has lower time complexity, to multi-view 3D human pose estimation. To extend the model to multiple viewpoints, the proposed method first generates 8-dimensional joint coordinates by concatenating the 2D coordinates of each of the 17 joints across 4 viewpoints, obtained with the 2D human pose detector CPN (Cascaded Pyramid Network). It then converts them into 17×32 data through patch embedding and feeds the result into a Transformer model. As a result, the MLP (Multi-Layer Perceptron) block that outputs the 3D human pose updates the 3D pose estimates for all 4 viewpoints simultaneously at every iteration. Compared with the method of Zheng [5], the proposed method has 48.9% of the model parameters, reduces MPJPE (Mean Per Joint Position Error) by 20.6 mm (43.8%), and trains more than 20 times faster per epoch on average.
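
The input construction and the MPJPE metric described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The function names (`stack_views`, `embed`, `mpjpe`), the view-major ordering of the 8 coordinates, and the use of a single linear projection with random weights as the patch embedding are all assumptions made for illustration, since the abstract does not specify them.

```python
import numpy as np

NUM_VIEWS = 4    # viewpoints, as stated in the abstract
NUM_JOINTS = 17  # joints per pose, as stated in the abstract
EMBED_DIM = 32   # embedding width implied by the 17x32 data


def stack_views(joints_2d):
    """Concatenate per-view 2D joints into one 8-dimensional vector per joint.

    joints_2d: array of shape (NUM_VIEWS, NUM_JOINTS, 2).
    Returns an array of shape (NUM_JOINTS, NUM_VIEWS * 2).
    The ordering (x0, y0, x1, y1, ...) is an assumption.
    """
    joints_2d = np.asarray(joints_2d, dtype=np.float64)
    if joints_2d.shape != (NUM_VIEWS, NUM_JOINTS, 2):
        raise ValueError(f"unexpected shape {joints_2d.shape}")
    # (views, joints, 2) -> (joints, views, 2) -> (joints, 8)
    return joints_2d.transpose(1, 0, 2).reshape(NUM_JOINTS, NUM_VIEWS * 2)


def embed(tokens, weight, bias):
    """Hypothetical patch embedding: a linear map from 8 to EMBED_DIM per joint."""
    return tokens @ weight + bias


def mpjpe(pred, target):
    """Mean Per Joint Position Error: mean Euclidean distance over joints."""
    pred = np.asarray(pred, dtype=np.float64)
    target = np.asarray(target, dtype=np.float64)
    return float(np.linalg.norm(pred - target, axis=-1).mean())


# Usage with random stand-in data (real inputs would come from a 2D detector).
rng = np.random.default_rng(0)
detections = rng.normal(size=(NUM_VIEWS, NUM_JOINTS, 2))
weight = rng.normal(size=(NUM_VIEWS * 2, EMBED_DIM))  # untrained, illustrative
bias = np.zeros(EMBED_DIM)
model_input = embed(stack_views(detections), weight, bias)  # shape (17, 32)
```

Treating each joint as one token keeps the sequence length at 17 regardless of the number of views; only the per-token input width grows (2 coordinates per added view), which is consistent with the lower time complexity the abstract claims relative to 3D feature maps.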