Energy-Efficient DNN Processor on Embedded Systems for Spontaneous Human-Robot Interaction

  • Received : 2021.01.07
  • Accepted : 2021.04.22
  • Published : 2021.06.30

Abstract

Recently, deep neural networks (DNNs) have been actively used for action control so that autonomous systems, such as robots, can perform human-like behaviors and operations. Unlike recognition tasks, real-time operation is essential in action control, and remote learning on a server communicating through a network is too slow. New learning techniques, such as reinforcement learning (RL), are needed to determine and select the correct robot behavior locally. In this paper, we propose an energy-efficient DNN processor with a LUT-based processing engine and a near-zero skipper. A CNN-based facial emotion recognition model and an RNN-based emotional dialogue generation model are integrated for a natural HRI system and tested on the proposed processor. The processor supports 1b-to-16b variable weight bit precision and consumes 57.6% and 28.5% less energy than conventional MAC arithmetic units at 1b and 16b weight precision, respectively. In addition, the near-zero skipper reduces MAC operations by 36% and lowers energy consumption by 28% for facial emotion recognition tasks. Implemented in a 65nm CMOS process, the proposed processor occupies a 1784×1784 μm² area and dissipates 0.28 mW and 34.4 mW at 1 fps and 30 fps facial emotion recognition, respectively.
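The near-zero skipping idea described above can be illustrated with a minimal software sketch. This is a hypothetical model, not the paper's hardware implementation: the function name `near_zero_skip_mac` and the `threshold` parameter are assumptions for illustration. It shows how MAC operations whose activation magnitude falls below a small threshold can be bypassed, trading negligible accuracy for fewer multiplications.

```python
import numpy as np

def near_zero_skip_mac(activations, weights, threshold=0.05):
    """Sketch of near-zero skipping in a MAC reduction.

    MAC operations whose activation magnitude is below `threshold`
    are skipped entirely; only the remaining products are summed.
    Returns the (approximate) dot product and the number of
    skipped operations.
    """
    # Mask of activations large enough to contribute meaningfully.
    mask = np.abs(activations) >= threshold
    skipped = int(activations.size - np.count_nonzero(mask))
    # Multiply-accumulate only over the non-near-zero entries.
    result = float(np.sum(activations[mask] * weights[mask]))
    return result, skipped
```

With an input vector containing two near-zero activations, half of the MAC operations are skipped while the accumulated result stays close to the exact dot product, which is the effect the abstract's 36% operation reduction relies on.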

Acknowledgement

This research was supported by the MSIT (Ministry of Science and ICT), Korea, under the ITRC (Information Technology Research Center) support program (IITP-2020-0-01847) supervised by the IITP (Institute for Information & Communications Technology Planning & Evaluation).
