Aural-visual two-stream based infant cry recognition

  • Bo, Zhao (Dept. of Computer Information Science, Korea University) ;
  • Lee, Jonguk (Dept. of Computer Convergence Software, Korea University) ;
  • Atif, Othmane (Dept. of Computer Information Science, Korea University) ;
  • Park, Daihee (Dept. of Computer Convergence Software, Korea University) ;
  • Chung, Yongwha (Dept. of Computer Convergence Software, Korea University)
  • Published: 2021.05.12

Abstract

Infants communicate their feelings and needs to the outside world through non-verbal means such as crying and displaying diverse facial expressions. However, inexperienced parents tend to decode these non-verbal messages incorrectly and take inappropriate actions, which can affect the bond they build with their babies and the cognitive development of the newborns. In this paper, we propose an aural-visual two-stream based infant cry recognition system to help parents understand the feelings and needs of crying babies. The proposed system first extracts features from the pre-processed audio and video data using the VGGish model and a 3D-CNN model, respectively, fuses the extracted features with a fully connected layer, and finally applies a softmax function to classify the fused features and recognize the corresponding type of cry. The experimental results show that the proposed system achieves an F1-score exceeding 0.92, which is 0.08 and 0.10 higher than the single-stream aural model and the single-stream visual model, respectively.
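The following is a minimal sketch, not the authors' code, of the two-stream fusion idea described in the abstract: precomputed aural embeddings (e.g., 128-dimensional VGGish features) and visual features from a small 3D-CNN are concatenated, passed through a fully connected layer, and classified with softmax. All layer sizes, the number of cry classes, and the clip shape are illustrative assumptions.

```python
import torch
import torch.nn as nn


class Simple3DCNN(nn.Module):
    """Tiny 3D-CNN stand-in for the visual stream (assumed architecture)."""

    def __init__(self, out_dim=128):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv3d(3, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool3d(2),
            nn.Conv3d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool3d(1),          # -> (B, 32, 1, 1, 1)
        )
        self.fc = nn.Linear(32, out_dim)

    def forward(self, clips):                 # clips: (B, 3, T, H, W)
        x = self.features(clips).flatten(1)
        return self.fc(x)


class TwoStreamCryClassifier(nn.Module):
    """Fuses aural and visual features with a fully connected layer + softmax."""

    def __init__(self, audio_dim=128, visual_dim=128, num_classes=4):
        super().__init__()
        self.visual_net = Simple3DCNN(out_dim=visual_dim)
        self.fusion = nn.Linear(audio_dim + visual_dim, num_classes)

    def forward(self, audio_feat, clips):
        # audio_feat: (B, audio_dim) precomputed VGGish-style embeddings
        visual_feat = self.visual_net(clips)
        fused = torch.cat([audio_feat, visual_feat], dim=1)
        logits = self.fusion(fused)
        return torch.softmax(logits, dim=1)    # probability per cry type


if __name__ == "__main__":
    model = TwoStreamCryClassifier()
    audio = torch.randn(2, 128)                # batch of 2 audio embeddings
    video = torch.randn(2, 3, 16, 112, 112)    # 16-frame RGB face clips
    print(model(audio, video).shape)           # torch.Size([2, 4])
```

In practice the fused representation would be trained with a cross-entropy loss over the annotated cry types; the dummy tensors above only illustrate the expected input and output shapes.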

Keywords

Acknowledgement

This research was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education (NRF-2018R1D1A3B07044938 and NRF-2020R1I1A3070835) and by BK21 FOUR (Fostering Outstanding Universities for Research) funded by the Ministry of Education (MOE, Korea) and the National Research Foundation of Korea (NRF).