Recent Research Trends in Hardware Accelerators for Artificial Neural Network Computation

  • Published: 2016.09.26

Abstract

Keywords

References

  1. Jouppi, Norm. "Google supercharges machine learning tasks with TPU custom chip." goo.gl/mXNQFI, 2016.
  2. Chen, Tianshi, et al. "DianNao: A small-footprint high-throughput accelerator for ubiquitous machine-learning." ACM SIGPLAN Notices. Vol. 49. No. 4. ACM, 2014.
  3. NVIDIA. "NVIDIA Tesla P100 - The Most Advanced Data Center Accelerator Ever Built. Featuring Pascal GP100, the World's Fastest GPU." 2016.
  4. Chetlur, Sharan, et al. "cuDNN: Efficient primitives for deep learning." arXiv preprint arXiv:1410.0759 (2014).
  5. Albericio, J., et al. "Cnvlutin: Ineffectual-Neuron-Free Deep Neural Network Computing." Proceedings of ISCA. Vol. 43. 2016.
  6. Han, Song, Huizi Mao, and William J. Dally. "Deep compression: Compressing deep neural networks with pruning, trained quantization and Huffman coding." arXiv preprint arXiv:1510.00149 (2015).
  7. Han, Song, et al. "EIE: efficient inference engine on compressed deep neural network." arXiv preprint arXiv:1602.01528 (2016).
  8. Chi, Ping, et al. "PRIME: A Novel Processing-In-Memory Architecture for Neural Network Computation in ReRAM-based Main Memory." Proceedings of ISCA. Vol. 43. 2016.
  9. Kim, Duckhwan, et al. "Neurocube: A Programmable Digital Neuromorphic Architecture with High-Density 3D Memory." Proceedings of ISCA. Vol. 43. 2016.
  10. Chen, Yu-Hsin, Joel Emer, and Vivienne Sze. "Eyeriss: A Spatial Architecture for Energy-Efficient Dataflow for Convolutional Neural Networks." Proceedings of ISCA. Vol. 43. 2016.
  11. Chen, Yunji, et al. "DaDianNao: A machine-learning supercomputer." Proceedings of the 47th Annual IEEE/ACM International Symposium on Microarchitecture. IEEE Computer Society, 2014.
  12. Liu, Daofu, et al. "PuDianNao: A polyvalent machine learning accelerator." ACM SIGARCH Computer Architecture News. Vol. 43. No. 1. ACM, 2015.
  13. Du, Zidong, et al. "ShiDianNao: shifting vision processing closer to the sensor." ACM SIGARCH Computer Architecture News. Vol. 43. No. 3. ACM, 2015.
  14. Liu, Shaoli, et al. "Cambricon: An instruction set architecture for neural networks." Proceedings of the 43rd ACM/IEEE International Symposium on Computer Architecture. 2016.
  15. Park, Seongwook, et al. "4.6 A 1.93TOPS/W scalable deep learning/inference processor with tetra-parallel MIMD architecture for big-data applications." 2015 IEEE International Solid-State Circuits Conference (ISSCC) Digest of Technical Papers. IEEE, 2015.
  16. Sim, Jaehyeong, et al. "14.6 A 1.42 TOPS/W deep convolutional neural network recognition processor for intelligent IoE systems." 2016 IEEE International Solid-State Circuits Conference (ISSCC). IEEE, 2016.
  17. Kim, Yong-Deok, et al. "Compression of deep convolutional neural networks for fast and low power mobile applications." arXiv preprint arXiv:1511.06530 (2015).
  18. LeCun, Yann, Yoshua Bengio, and Geoffrey Hinton. "Deep learning." Nature 521.7553 (2015): 436-444. https://doi.org/10.1038/nature14539