Acknowledgment
The EDA Tool was supported by the IC Design Education Center.
References
- S. Chetlur, C. Woolley, P. Vandermersch, J. Cohen, J. Tran et al. "cuDNN: Efficient Primitives for Deep Learning", arXiv preprint arXiv:1410.0759, 2014.
- N. P. Jouppi, C. Young, N. Patil, D. Patterson, G. Agrawal et al., "In-Datacenter Performance Analysis of a Tensor Processing Unit", Int. Symp. on Computer Architecture (ISCA), 2017, pp. 1-12.
- S. Markidis, S. W. D. Chien, E. Laure, I. B. Peng, J. S. Vetter, "NVIDIA Tensor Core Programmability, Performance & Precision", Int. Parallel and Distributed Processing Symp. Workshops (IPDPSW), 2018, pp. 522-531.
- A. G. Howard, M. Zhu, B. Chen, D. Kalenichenko, W. Wang et al., "Mobilenets: Efficient Convolutional Neural Networks for Mobile Vision Applications", arXiv preprint arXiv:1704.04861, 2017.
- M. Sandler, A. Howard, M. Zhu, A. Zhmoginov, L. C. Chen, "MobileNetV2: Inverted Residuals and Linear Bottlenecks", Computer Vision and Pattern Recognition (CVPR), 2018, pp. 4510-4520.
- A. Howard, M. Sandler, G. Chu, L. C. Chen, B. Chen, M. Tan, "Searching for MobileNetV3", Int. Conf. on Computer Vision (ICCV), 2019, pp. 1314-1324.
- M. Tan, Q. Le, "EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks", Proc. Machine Learning Research (PMLR), 2019, pp. 6105-6114.
- Z. Liu, H. Mao, C. Y. Wu, C. Feichtenhofer, T. Darrell, S. Xie, "A ConvNet for the 2020s", arXiv preprint arXiv:2201.03545, 2022.
- Z. Dai, H. Liu, Q. V. Le, M. Tan, "CoAtNet: Marrying Convolution and Attention for All Data Sizes", Advances in Neural Information Processing Systems 34, 2021, pp. 3965-3977.
- S. Ghodrati, B. H. Ahn, J. Kim, S. Kinzer, B. R. Yatham et al., "Planaria: Dynamic Architecture Fission for Spatial Multi-Tenant Acceleration of Deep Neural Networks", Int. Symp. on Microarchitecture (MICRO), 2020, pp. 681-697.
- J. Lee, J. Choi, J. Kim, J. Lee, Y. Kim, "Dataflow Mirroring: Architectural Support for Highly Efficient Fine-Grained Spatial Multitasking on Systolic Array NPUs", Design Automation Conf. (DAC), 2021, pp. 247-252.
- H. Cho, "RiSA: A Reinforced Systolic Array for Depthwise Convolutions and Embedded Tensor Reshaping", ACM Trans. Embedded Computing Systems (TECS), vol. 20, no. 5s, 2021, pp. 1-20. https://doi.org/10.1145/3476984
- R. Xu, S. Ma, Y. Wang, Y. Guo, "CMSA: Configurable Multi-directional Systolic Array for Convolutional Neural Networks", Int. Conf. on Computer Design (ICCD), 2020, pp. 494-497.
- L. Bai, Y. Zhao and X. Huang, "A CNN Accelerator on FPGA Using Depthwise Separable Convolution", IEEE Trans. Circuits and Syst. II, Exp. Briefs, vol. 65, no. 10, pp. 1415-1419, Oct. 2018.
- R. Xu, S. Ma, Y. Wang, Y. Guo, "HeSA: Heterogeneous Systolic Array Architecture for Compact CNNs Hardware Accelerators", Design, Automation & Test in Europe Conf. & Exhibit. (DATE), 2021, pp. 657-662.
- H. T. Kung, B. McDanel, S. Q. Zhang, "Adaptive Tiling: Apply Fixed-size Systolic Arrays to Sparse Convolutional Neural Networks", Int. Conf. on Pattern Recognition (ICPR), 2018, pp. 1006-1011.