Aural-visual two-stream based infant cry recognition

  • Bo, Zhao (Dept. of Computer Information Science, Korea University) ;
  • Lee, Jonguk (Dept. of Computer Convergence Software, Korea University) ;
  • Atif, Othmane (Dept. of Computer Information Science, Korea University) ;
  • Park, Daihee (Dept. of Computer Convergence Software, Korea University) ;
  • Chung, Yongwha (Dept. of Computer Convergence Software, Korea University)
  • Published : 2021.05.12

Abstract

Infants communicate their feelings and needs to the outside world through non-verbal means such as crying and facial expressions. However, inexperienced parents tend to misinterpret these non-verbal messages and respond inappropriately, which may affect the bond they build with their babies and the cognitive development of the newborns. In this paper, we propose an aural-visual two-stream infant cry recognition system to help parents understand the feelings and needs of crying babies. The proposed system first extracts features from the pre-processed audio and video data using the VGGish model and a 3D-CNN model, respectively; it then fuses the extracted features using a fully connected layer, and finally applies a SoftMax function to classify the fused features and recognize the corresponding type of cry. The experimental results show that the proposed system achieves an F1-score above 0.92, which is 0.08 and 0.10 higher than the single-stream aural model and the single-stream visual model, respectively.
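
The fusion and classification stage described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract gives no layer sizes, class count, or activation, so the 3D-CNN feature size, the hidden width, the number of cry classes, the ReLU, and the separate classification layer are all assumptions, and random vectors stand in for the real VGGish and 3D-CNN outputs. Only the 128-dimensional audio size reflects the published VGGish embedding size. All function names are hypothetical.

```python
import math
import random

# Assumed sizes. VGGish emits 128-d embeddings; every other number here
# is a placeholder, since the abstract does not state it.
AUDIO_DIM = 128
VIDEO_DIM = 256      # assumed 3D-CNN feature size
HIDDEN_DIM = 64      # assumed width of the fusion layer
NUM_CLASSES = 3      # assumed number of cry types


def init_dense(in_dim, out_dim, rng):
    """Return (weights, biases) for a dense layer with small random weights."""
    scale = 1.0 / math.sqrt(in_dim)
    weights = [[rng.uniform(-scale, scale) for _ in range(in_dim)]
               for _ in range(out_dim)]
    biases = [0.0] * out_dim
    return weights, biases


def dense(x, weights, biases):
    """Compute W x + b for a single input vector."""
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, biases)]


def relu(x):
    """Element-wise ReLU (an assumed activation)."""
    return [max(0.0, v) for v in x]


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_and_classify(audio_feat, video_feat, fusion, classifier):
    """Concatenate both streams, fuse with a dense layer, then apply softmax."""
    joint = list(audio_feat) + list(video_feat)
    fused = relu(dense(joint, *fusion))
    return softmax(dense(fused, *classifier))


rng = random.Random(0)
fusion = init_dense(AUDIO_DIM + VIDEO_DIM, HIDDEN_DIM, rng)
classifier = init_dense(HIDDEN_DIM, NUM_CLASSES, rng)

# Random stand-ins for the VGGish and 3D-CNN feature vectors.
audio_feat = [rng.random() for _ in range(AUDIO_DIM)]
video_feat = [rng.random() for _ in range(VIDEO_DIM)]

probs = fuse_and_classify(audio_feat, video_feat, fusion, classifier)
predicted_class = max(range(NUM_CLASSES), key=probs.__getitem__)
```

Here `probs` holds one probability per assumed cry class and `predicted_class` is the index of the largest one. In a trained system the weights would be learned jointly with, or on top of, the two feature extractors rather than drawn at random.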

Acknowledgement

This research was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF), funded by the Ministry of Education (NRF-2018R1D1A3B07044938 and NRF-2020R1I1A3070835), and by BK21 FOUR (Fostering Outstanding Universities for Research), funded by the Ministry of Education (MOE, Korea) and the National Research Foundation of Korea (NRF).