3D ResNet-based Children's Behavior Recognition Method Using Video Image Sequence

  • 박재석 (Department of Industrial and Management Engineering, Gachon University) ;
  • 차기주 (Department of Early Childhood Education, Gachon University) ;
  • 최아영 (School of AI and Software, Gachon University)
  • Received: 2023.04.02
  • Accepted: 2023.06.02
  • Published: 2023.06.30

Abstract

In this study, we develop a deep-learning-based method for recognizing children's behavior in video sequences that may contain multiple children. Because young children express the same action in widely varying ways, the model must be robust to diverse types of input. We preprocess the video signal to match the network's input format and propose an action recognition algorithm based on a 3D ResNet. Video data of 50 children performing 13 actions were collected, and in experiments the model recognized the 13 actions with an average accuracy of 72.21%, including 90.74% for standing, 88.89% for pushing and pulling, and 90.74% for sitting. These results can support future research that automatically analyzes children's behavior patterns in daily life and provides related services.
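The abstract describes a pipeline in which video frames are preprocessed to fit the network input and then classified with a 3D ResNet. As a rough, hypothetical sketch only (the paper's exact architecture, clip length, and preprocessing are not specified here), the example below uses torchvision's off-the-shelf r3d_18 video model with a 13-way output head; the clip length, spatial resolution, and batch shape are all assumptions.

```python
# Minimal sketch of 3D-ResNet video action classification (PyTorch).
# Assumptions: torchvision's r3d_18 stands in for the paper's 3D ResNet;
# the 16-frame clip length and 112x112 spatial size are illustrative only.
import torch
from torchvision.models.video import r3d_18

NUM_ACTIONS = 13  # 13 children's actions, per the abstract

# Build the 3D ResNet with a 13-class head instead of the default
# 400-class Kinetics head (weights=None trains from scratch).
model = r3d_18(weights=None, num_classes=NUM_ACTIONS)
model.eval()

# One video clip batch: (batch, channels, frames, height, width).
clip = torch.randn(1, 3, 16, 112, 112)

with torch.no_grad():
    logits = model(clip)           # shape: (1, 13)
    action = logits.argmax(dim=1)  # predicted action index

print(action.item())
```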

Keywords

Acknowledgments

This work was supported by the Ministry of Education of the Republic of Korea and the National Research Foundation of Korea in 2020 (2020S1A5A2A03041734).
