
Recent Research Trends in Hardware Accelerators for Artificial Neural Network Computation

Park, Jong-Hyeon (Yonsei University)
Kim, Min-Sik (Yonsei University)
Kim, Yun-Su (Yonsei University)
Lee, Gyeong-Min (Yonsei University)
Yun, Myeong-Guk (Yonsei University)
No, Won-U (Yonsei University)
References
1 Chetlur, Sharan, et al. "cuDNN: Efficient Primitives for Deep Learning." arXiv preprint arXiv:1410.0759, 2014.
2 Albericio, J., et al. "Cnvlutin: Ineffectual-Neuron-Free Deep Neural Network Computing." Proceedings of ISCA. Vol. 43. 2016.
3 Han, Song, et al. "EIE: Efficient Inference Engine on Compressed Deep Neural Network." arXiv preprint arXiv:1602.01528, 2016.
4 Chi, Ping, et al. "PRIME: A Novel Processing-in-Memory Architecture for Neural Network Computation in ReRAM-Based Main Memory." Proceedings of ISCA. Vol. 43. 2016.
5 Kim, Duckhwan, et al. "Neurocube: A Programmable Digital Neuromorphic Architecture with High-Density 3D Memory." Proceedings of ISCA. Vol. 43. 2016.
6 Chen, Yu-Hsin, Joel Emer, and Vivienne Sze. "Eyeriss: A Spatial Architecture for Energy-Efficient Dataflow for Convolutional Neural Networks." Proceedings of ISCA. Vol. 43. 2016.
7 Chen, Yunji, et al. "DaDianNao: A Machine-Learning Supercomputer." Proceedings of the 47th Annual IEEE/ACM International Symposium on Microarchitecture. IEEE Computer Society, 2014.
8 Liu, Daofu, et al. "PuDianNao: A Polyvalent Machine Learning Accelerator." ACM SIGARCH Computer Architecture News. Vol. 43. No. 1. ACM, 2015.
9 Du, Zidong, et al. "ShiDianNao: Shifting Vision Processing Closer to the Sensor." ACM SIGARCH Computer Architecture News. Vol. 43. No. 3. ACM, 2015.
10 Jouppi, Norman. "Google Supercharges Machine Learning Tasks with TPU Custom Chip." goo.gl/mXNQFI, 2016.
11 Chen, Tianshi, et al. "DianNao: A Small-Footprint High-Throughput Accelerator for Ubiquitous Machine-Learning." ACM SIGPLAN Notices. Vol. 49. No. 4. ACM, 2014.
12 Liu, Shaoli, et al. "Cambricon: An Instruction Set Architecture for Neural Networks." Proceedings of the 43rd ACM/IEEE International Symposium on Computer Architecture. 2016.
13 Park, Seongwook, et al. "4.6 A 1.93TOPS/W Scalable Deep Learning/Inference Processor with Tetra-Parallel MIMD Architecture for Big-Data Applications." 2015 IEEE International Solid-State Circuits Conference (ISSCC) Digest of Technical Papers. IEEE, 2015.
14 Sim, Jaehyeong, et al. "14.6 A 1.42TOPS/W Deep Convolutional Neural Network Recognition Processor for Intelligent IoE Systems." 2016 IEEE International Solid-State Circuits Conference (ISSCC). IEEE, 2016.
15 Kim, Yong-Deok, et al. "Compression of Deep Convolutional Neural Networks for Fast and Low Power Mobile Applications." arXiv preprint arXiv:1511.06530, 2015.
16 LeCun, Yann, Yoshua Bengio, and Geoffrey Hinton. "Deep Learning." Nature 521.7553 (2015): 436-444.
17 NVIDIA. "NVIDIA Tesla P100 - The Most Advanced Data Center Accelerator Ever Built. Featuring Pascal GP100, the World's Fastest GPU." 2016.
18 Han, Song, Huizi Mao, and William J. Dally. "Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding." arXiv preprint arXiv:1510.00149, 2015.