SAAnnot-C3Pap: Ground Truth Collection Technique of Playing Posture Using Semi Automatic Annotation Method

  • So-Hyun Park (Research Center for Big Data Utilization, Sookmyung Women's University) ;
  • Seo-Yeon Kim (Data Team, MegazoneCloud) ;
  • Young-Ho Park (Dept. of IT Engineering, Sookmyung Women's University)
  • Received : 2021.11.18
  • Accepted : 2022.06.15
  • Published : 2022.10.31

Abstract

In this paper, we propose SAAnnot-C3Pap, a semi-automatic annotation method for obtaining the ground truth of a player's posture. To obtain the ground truth of two-dimensional joint positions in the existing music domain, either OpenPose, a two-dimensional pose estimation method, was used, or the positions were labeled manually. However, automatic annotation methods such as OpenPose are fast but produce inaccurate results, while manual annotation, in which the user creates every annotation directly, requires a great deal of labor. Therefore, this paper proposes SAAnnot-C3Pap, a semi-automatic annotation method that is a compromise between the two. The proposed approach consists of three main steps: extracting postures using OpenPose, correcting the erroneous parts of the extracted postures using Supervisely, and synchronizing the results of OpenPose and Supervisely. Through the proposed method, we were able to correct the incorrect 2D joint positions detected by OpenPose, resolve the problem of two or more people being detected, and obtain the ground truth of the playing posture. In the experiments, we compare and analyze the results of OpenPose and of the proposed SAAnnot-C3Pap. The comparison shows that the proposed method corrects the posture information incorrectly collected by OpenPose.

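The three-step pipeline described in the abstract (OpenPose extraction, Supervisely correction, synchronization) can be sketched roughly as follows. The flat `pose_keypoints_2d` layout `[x0, y0, c0, x1, y1, c1, ...]` is OpenPose's documented per-person JSON output; the player-selection heuristic, the function names, and the shape of the Supervisely correction export are illustrative assumptions, not the paper's exact implementation.

```python
def select_player(openpose_frame):
    """Pick the single most plausible detection from an OpenPose frame.

    OpenPose emits one entry per detected person; when bystanders or
    reflections cause extra detections, keeping the skeleton with the
    highest mean keypoint confidence is one simple way (an assumption
    here) to isolate the player.
    """
    def mean_conf(person):
        kp = person["pose_keypoints_2d"]  # flat [x0, y0, c0, x1, y1, c1, ...]
        confs = kp[2::3]                  # every third value is a confidence
        return sum(confs) / len(confs)

    return max(openpose_frame["people"], key=mean_conf)


def merge_corrections(person, corrections):
    """Overwrite erroneous OpenPose joints with manually corrected points.

    `corrections` maps joint index -> (x, y), e.g. taken from a Supervisely
    export (format assumed); corrected joints get confidence 1.0 to mark
    them as manually verified ground truth.
    """
    kp = list(person["pose_keypoints_2d"])
    for j, (x, y) in corrections.items():
        kp[3 * j], kp[3 * j + 1], kp[3 * j + 2] = x, y, 1.0
    return {"pose_keypoints_2d": kp}
```

The merged keypoints can then be written back out per frame as the synchronized ground-truth annotation.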
Acknowledgement

This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korean government (MSIT) (No. NRF-2021R1C1C2004282).

References

  1. S. H. Park and Y. H. Park, "Audio-visual tensor fusion network for piano player posture classification," Applied Sciences, Vol.10, No.19, p.6857, 2020. https://doi.org/10.3390/app10196857
  2. Github-CMU-Perceptual-Computing-Lab/openpose [Internet], https://github.com/CMU-Perceptual-Computing-Lab/openpose.
  3. E. Shlizerman, L. Dery, H. Schoen, and I. Kemelmacher-Shlizerman, "Audio to Body Dynamics," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Utah, pp.7574-7583, 2018.
  4. Supervisely - Web platform for computer vision. Annotation ... [Internet], https://supervise.ly.
  5. K. W. Lee, "Video data retrieval system using annotation and feature information," Journal of Korea Academia-Industrial Cooperation Society, Vol.7, No.6, pp.1129-1133, 2006.
  6. Y. H. Lee, K. J. Oh, H. M. Deok, S. B. Park, and G. S. Jo, "A collaborative video annotation and browsing system using linked data," Journal of Intelligence and Information Systems, No.5, pp.455-460, 2011.
  7. J. S. Kim, T. E. Kim, and M. H. Kim, "A video annotation system that automatically generates spatio-temporal relationships among objects," KIISE Transactions on Computing Practices, Vol.18, No.10, pp.701-710, 2012.
  8. K. W. Lee, T. H. Oh, and K. M. Cho, "A multimedia database system using method of automatic annotation update," Journal of Korea Academia-Industrial cooperation Society, Vol.11, No.6, pp.417-420, 2006.
  9. W. S. Sohn, J. K. Kim, S. B. Lim, and Y. C. Choy, "A method of context based free-form annotation in XML documents," Journal of KIISE: Software and Applications, Vol.30, No.9.10, pp.850-861, 2003.
  10. H. G. An and J. J. Koh, "Development of multimedia annotation and retrieval system using MPEG-7 based semantic metadata model," The KIPS Transactions: Part D, Vol.14, No.6, pp.573-584, 2007.
  11. J. S. Kim, K. Y. Kim, H. I. Kim, and Y. S. Kim, "A video annotation system with automatic human detection from video surveillance data," KIISE Transactions on Computing Practices, Vol.18, No.11, pp.808-812, 2012.
  12. D. W. Jang, J. W. Lee, and J. S. Lee, "Development of semiautomatic annotation tool for building land cover image data set," Journal of Broadcast Engineering, Vol.2019, No.11, pp.69-70, 2019.
  13. K. H. Park, B. C. Ko, and J. Y. Nam, "Medical image automatic annotation using multi-class SVM and annotation code array," The KIPS Transactions: Part B, Vol.16, No.4, pp.281-288, 2009.
  14. K. G. Lee and M. Cremer, "Automatic labeling of training data for vocal/non-vocal discrimination," The Journal of the Korean Electro-Acoustic Music Society, No.7, pp.29-41, 2009.
  15. G. Won et al., "Development of multi-modal sensor data annotation system for elderly behavior analysis," IEIE Transactions on Smart Processing & Computing, Vol.2019, No.11, pp.781-783, 2019.
  16. S. W. Lim and G. M. Park, "Development of python-based annotation tool program for constructing object recognition deep-learning model," Journal of Broadcast Engineering, Vol.25, No.3, pp.386-398, 2020. https://doi.org/10.5909/JBE.2020.25.3.386
  17. E. K. Lee, Y. W. Kim, and S. S. Kim, "Research: Annotation technique development based on apparel attributes for visual apparel search technology," Fashion & Textile Research Journal, Vol.17, No.5, pp.731-740, 2015. https://doi.org/10.5805/SFTI.2015.17.5.731
  18. Y. J. Na, Y. L. Cho, and J. H. Kim, "AnsNGS: An annotation system to sequence variations of next generation sequencing data for disease-related phenotypes," Healthcare Informatics Research, Vol.19, No.1, pp.50-55, 2013. https://doi.org/10.4258/hir.2013.19.1.50
  19. M. H. Jeon, Y. J. Lee, Y. S. Shin, H. S. Jang, T. K. Yeu, and A. Y. Kim, "Synthesizing image and automated annotation tool for CNN based underwater object detection," Journal of Korea Robotics Society, Vol.14, No.2, pp.139-149, 2019. https://doi.org/10.7746/jkros.2019.14.2.139
  20. S. S. Yoon, H. Y. Moon, and Y. W. Rhee, "A WWW images automatic annotation based on multi-cues integration," Journal of the Korea Society of Computer and Information, Vol.13, No.4, pp.79-86, 2008.
  21. J. Y. Choi, W. De Neve, Y. M. Ro, and K. N. Plataniotis, "Automatic face annotation in personal photo collections using context-based unsupervised clustering and face information fusion," IEEE Transactions on Circuits and Systems for Video Technology, Vol.20, No.10, pp.1292-1309, 2010. https://doi.org/10.1109/TCSVT.2010.2058470
  22. F. Meng et al., "Procrustes: A python library to find transformations that maximize the similarity between matrices," Computer Physics Communications, Vol.276, p.108334, 2022. https://doi.org/10.1016/j.cpc.2022.108334
  23. G. H. Golub and C. F. Van Loan, "Matrix Computations," 3rd ed. Baltimore, MD: Johns Hopkins, 1996.
  24. Eric W. Weisstein, "Frobenius Norm," From MathWorld--A Wolfram Web Resource.
  25. "Simple examples of multidimensional scaling (MDS, Multidimensional Scaling) and Procrustes Analysis on the joint points of human posture (python)", ProgrammerSought [Internet], https://www.programmersought.com/article/74856266632.