References
- Y. Lecun et al., "Gradient-based learning applied to document recognition," Proceedings of the IEEE, vol. 86, no. 11, 1998, pp. 2278-2324. https://doi.org/10.1109/5.726791
- A. Krizhevsky, I. Sutskever, and G.E. Hinton, "Imagenet classification with deep convolutional neural networks," Adv. Neural Inf. Process. Syst. vol. 25, 2012, pp. 1097-1105.
- K. He et al., "Deep residual learning for image recognition," in Proc. Conf. Comput. Vis. Pattern Recognit. Las Vegas, NV, USA, June 2016.
- H. Pham et al., "Meta pseudo labels," arXiv preprint, CoRR, 2020, arXiv:2003.10580
- https://mlcommons.org/en/
- Arm, "Arm NN future roadmap," https://developer.arm.com/ip-products/processors/machine-learning/arm-nn
- https://github.com/ARM-software/ComputeLibrary
- https://github.com/NervanaSystems/ngraph
- Intel, "nGrapph developer guide," https://docs.openvinotoolkit.org/latest/openvino_docs_nGraph_DG_Introduction.html
- Android, https://developer.android.com/ndk/guides/neuralnetworks
- https://github.com/RadeonOpenCompute/ROCm
- https://github.com/NVIDIA/TensorRT
- https://github.com/Xilinx/Vitis-AI
- https://github.com/tensorflow/tensorflow
- https://github.com/pytorch/pytorch
- https://github.com/onnx/onnx
- https://github.com/microsoft/onnxruntime
- https://github.com/xianyi/OpenBLAS
- https://github.com/math-atlas/math-atlas
- https://github.com/oneapi-src/oneDNN
- https://developer.nvidia.com/cuda-toolkit
- https://developer.nvidia.com/CUDnn
- C. Nugteren, "CLBlast: A tuned OpenCL BLAS library," in Proc. Int. Workshop OpenCL, Oxford, UK, May 2018, 5:1-10.
- https://github.com/NVIDIA/cutlass
- https://www.tensorflow.org/xla
- C. Lattner et al., "MLIR: A compiler infrastructure for the end of Moore's law," arXiv preprint, CoRR, 2020, arXiv:2002.11054
- https://github.com/onnx/onnx-mlir
- T.D. Le et al., "Compiling ONNX neural network models using MLIR," arXiv preprint, CoRR, 2020, arXiv:2008.08272
- Y.C.P. Cho et al., "AB9: A neural processor for inference acceleration," ETRI J. vol. 42, no. 4, 2020, pp. 491-504. https://doi.org/10.4218/etrij.2020-0134
- J. Han, M. Choi, and Y. Kwon, "40-TFLOPS artificial intelligence processor with function-safe programmable many-cores for ISO26262 ASIL-D," ETRI J. vol. 42, no. 4, 2020, pp. 468-479. https://doi.org/10.4218/etrij.2020-0128
- H.M. Kim, C.G. Lyuh, and Y. Kwon, "Automated optimization for memory-efficient high-performance deep neural network accelerators," ETRI J. vol. 42, no. 4, 2020, pp. 505-517. https://doi.org/10.4218/etrij.2020-0125
- 이미영 외, "인공지능프로세서 기술 동향," 전자통신동향분석, 제35권 제3호, 2020, pp. 66-75. https://doi.org/10.22648/ETRI.2020.J.350307
- 한진호, 권영수, "병렬 컴퓨팅 기반 인공지능 프로세서 기술동향," IITP 주간기술동향, 제1964호, 2020, pp. 16-29.