AB9: A neural processor for inference acceleration |
Cho, Yong Cheol Peter (AI Processor Research Team, AI SoC Research Department, Electronics and Telecommunications Research Institute)
Chung, Jaehoon (AI Processor Research Team, AI SoC Research Department, Electronics and Telecommunications Research Institute)
Yang, Jeongmin (AI Processor Research Team, AI SoC Research Department, Electronics and Telecommunications Research Institute)
Lyuh, Chun-Gi (AI Processor Research Team, AI SoC Research Department, Electronics and Telecommunications Research Institute)
Kim, HyunMi (AI Processor Research Team, AI SoC Research Department, Electronics and Telecommunications Research Institute)
Kim, Chan (AI Processor Research Team, AI SoC Research Department, Electronics and Telecommunications Research Institute)
Ham, Je-seok (AI Processor Research Team, AI SoC Research Department, Electronics and Telecommunications Research Institute)
Choi, Minseok (AI Processor Research Team, AI SoC Research Department, Electronics and Telecommunications Research Institute)
Shin, Kyoungseon (AI Processor Research Team, AI SoC Research Department, Electronics and Telecommunications Research Institute)
Han, Jinho (AI Processor Research Team, AI SoC Research Department, Electronics and Telecommunications Research Institute)
Kwon, Youngsu (AI Processor Research Team, AI SoC Research Department, Electronics and Telecommunications Research Institute) |
1 | A. Ignatov et al., AI benchmark: All about deep learning on smartphones in 2019, in Proc. IEEE/CVF Int. Conf. Comput. Vision Workshop (Seoul, Rep. of Korea), Oct. 2019, pp. 3617-3635. |
2 | ETRI Technology, Aldebaran microcontroller SoC for mobile robot (low power MCU core technology), 2017, available at https://www.etri.re.kr/eng/bbs/view.etri?b_board_id=ENG03&b_idx=16719 |
3 | J. Han et al., A 1GHz fault tolerant processor with dynamic lockstep and self-recovering cache for ADAS SoC complying with ISO26262 in automotive electronics, in Proc. IEEE Asian Solid-State Circuits Conf. (Seoul, Rep. of Korea), Nov. 2017, pp. 313-316. |
4 | Y. Jia, Learning semantic image representations at a large scale, Ph.D. Thesis, EECS Department, Univ. of California, Berkeley, May 2014. |
5 | S. Gupta et al., Deep learning with limited numerical precision, Int. Conf. Mach. Learn. 37 (2015), 1737-1746. |
6 | J. Redmon and A. Farhadi, YOLO9000: Better, faster, stronger, 2016, available at https://arxiv.org/abs/1612.08242, preprint. |
7 | J. Kim, J. K. Lee, and K. M. Lee, Accurate image super-resolution using very deep convolutional networks, in Proc. IEEE Conf. Comput. Vision Pattern Recognit. (Las Vegas, NV, USA), 2016, pp. 1646-1654. |
8 | Coral, Edge TPU performance benchmarks, available at https://coral.ai/docs/edgetpu/benchmarks/ |
9 | AI-Benchmark, available at http://www.ai-benchmark.com |
10 | J. Johnson, Benchmarks for popular CNN models, available at https://github.com/jcjohnson/cnn-benchmarks |
11 | T. Narayan and Intel AI Academy, A comparison of performance of deep learning models on Edge using Intel Movidius Neural Compute Stick and Raspberry PI3, available at https://medium.com/intel-student-ambassadors/object-detection-a-comparison-of-performance-of-deep-learning-models-on-edge-using-intel-f66eb7f45b17 |
12 | S. Hossain and D. Lee, Deep learning-based real-time multiple-object detection and tracking from aerial imagery via a flying robot with GPU-based embedded devices, Sensors 19 (2019), no. 15, 3371:1-3424. |
13 | J. Guerreiro et al., Modeling and decoupling the GPU power consumption for cross-domain DVFS, IEEE Trans. Parallel Distrib. Syst. 30 (2019), no. 11, 2494-2506.