Design and Implementation of CNN-based HMI System using Doppler Radar and Voice Sensor


  • Oh, Seunghyun (School of Electronics and Information Engineering, Korea Aerospace University) ;
  • Bae, Chanhee (School of Electronics and Information Engineering, Korea Aerospace University) ;
  • Kim, Seryeong (School of Electronics and Information Engineering, Korea Aerospace University) ;
  • Cho, Jaechan (School of Electronics and Information Engineering, Korea Aerospace University) ;
  • Jung, Yunho (School of Electronics and Information Engineering, Korea Aerospace University)
  • Received : 2020.08.31
  • Accepted : 2020.09.16
  • Published : 2020.09.30

Abstract

In this paper, we propose a CNN-based HMI system that uses a Doppler radar and a voice sensor, and present the hardware design and implementation results for accelerating it. To overcome the limitations of single-sensor monitoring, the proposed HMI system fuses the data from the two sensors to improve classification performance. In noisy environments, the proposed system achieves 3.5% and 12% higher accuracy than classifiers based on a single radar sensor and a single voice sensor, respectively. In addition, hardware that accelerates the computationally complex layers of the CNN was implemented and verified on an FPGA test system. Performance evaluation shows that the proposed HMI acceleration platform reduces computation time by 95% compared to a software-only implementation.
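The sensor fusion described in the abstract can be pictured as a two-branch CNN: one branch extracts features from the Doppler-radar spectrogram, the other from the voice spectrogram, and the two feature maps are combined before classification. The PyTorch sketch below only illustrates this idea; the layer sizes, input shapes, number of classes, and feature-level fusion point are assumptions for the example, not the architecture reported in the paper.

import torch
import torch.nn as nn

class FusionHMIClassifier(nn.Module):
    """Illustrative two-branch CNN fusing radar and voice spectrograms (assumed sizes)."""

    def __init__(self, num_classes=7):
        super().__init__()

        def branch():
            # Small CNN applied to one sensor's spectrogram (1 x F x T input).
            return nn.Sequential(
                nn.Conv2d(1, 8, kernel_size=3, padding=1), nn.ReLU(),
                nn.MaxPool2d(2),
                nn.Conv2d(8, 16, kernel_size=3, padding=1), nn.ReLU(),
                nn.MaxPool2d(2),
                nn.AdaptiveAvgPool2d((4, 4)),
            )

        self.radar_branch = branch()  # Doppler-radar spectrogram branch
        self.voice_branch = branch()  # voice spectrogram branch

        # Feature-level fusion: the two 16x4x4 feature maps are concatenated
        # along the channel axis and classified by a small fully connected head.
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(2 * 16 * 4 * 4, 64), nn.ReLU(),
            nn.Linear(64, num_classes),
        )

    def forward(self, radar_spec, voice_spec):
        r = self.radar_branch(radar_spec)
        v = self.voice_branch(voice_spec)
        fused = torch.cat([r, v], dim=1)  # combine the sensor features
        return self.classifier(fused)

# Example usage: a batch of two samples, each sensor giving a 64x64 spectrogram.
if __name__ == "__main__":
    model = FusionHMIClassifier(num_classes=7)
    radar = torch.randn(2, 1, 64, 64)
    voice = torch.randn(2, 1, 64, 64)
    logits = model(radar, voice)
    print(logits.shape)  # torch.Size([2, 7])

In a design like the one described in the abstract, the convolutional branches dominate the computation, which is why they are the natural candidates for offloading to the FPGA accelerator.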

Keywords


Cited by

  1. Design and Implementation of a CNN-Based Human Activity Recognition System Using WiFi Signals, vol.25, no.4, 2021, https://doi.org/10.12673/jant.2021.25.4.299