• Title/Summary/Keyword: Versatile Tensor Accelerator (VTA)

Search Result 1

Optimizing 2-stage Tiling-based Matrix Multiplication in FPGA-based Neural Network Accelerator

  • Jinse Kwon; Jemin Lee; Yongin Kwon; Jeman Park; Misun Yu; Taeho Kim; Hyungshin Kim
    • IEMEK Journal of Embedded Systems and Applications, v.17 no.6, pp.367-374, 2022
  • Accelerating neural networks has become an important topic in computer vision, and hardware accelerators are essential for running even lightweight models efficiently. However, most accelerator-supported operators focus on direct convolution; when an accelerator does not provide a GEMM (general matrix multiplication) operation, GEMM is usually executed on the CPU instead. In this paper, we propose an optimization technique for 2-stage tiling-based GEMM routines on VTA. We improve the performance of the matrix multiplication routine by maximizing reuse of the input matrix and by optimizing operation pipelining. We also applied the proposed technique to the DarkNet framework to measure the resulting improvement. The proposed GEMM method is more than 2.4 times faster than the non-optimized GEMM method, and inference in our DarkNet framework is at least 2.3 times faster.
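To illustrate the 2-stage tiling idea the abstract describes, here is a minimal CPU-side sketch in C (DarkNet's language). The tile sizes, loop order, and function name are illustrative assumptions, not the paper's actual schedule: the outer tiles stand in for blocks sized to VTA's on-chip buffers, and the inner tiles stand in for the fixed-size matrix-multiply block that VTA's GEMM core executes per instruction.

```c
#include <stddef.h>

/* Hypothetical tile sizes (not from the paper): the outer tile models a
 * block that fits in the accelerator's on-chip buffers; the inner tile
 * models the accelerator's native GEMM block size. */
#define OUTER_TILE 64
#define INNER_TILE 16

static inline size_t min_sz(size_t a, size_t b) { return a < b ? a : b; }

/* Two-stage tiled GEMM: C[M][N] += A[M][K] * B[K][N].
 * Stage 1 (outer loops) partitions the matrices into on-chip-sized
 * blocks; keeping the jo loop innermost lets one loaded block of A be
 * reused across all column blocks of B, which is the input-reuse idea.
 * Stage 2 (inner loops) walks accelerator-sized sub-tiles inside each
 * block; on VTA these iterations would become load/gemm/store
 * instructions that the hardware can pipeline. */
void gemm_two_stage(size_t M, size_t N, size_t K,
                    const float *A, const float *B, float *C)
{
    for (size_t io = 0; io < M; io += OUTER_TILE)
    for (size_t ko = 0; ko < K; ko += OUTER_TILE)
    for (size_t jo = 0; jo < N; jo += OUTER_TILE) {   /* A block reused */
        size_t imax = min_sz(io + OUTER_TILE, M);
        size_t kmax = min_sz(ko + OUTER_TILE, K);
        size_t jmax = min_sz(jo + OUTER_TILE, N);
        /* Stage 2: accelerator-native sub-tiles within the block. */
        for (size_t ii = io; ii < imax; ii += INNER_TILE)
        for (size_t kk = ko; kk < kmax; kk += INNER_TILE)
        for (size_t jj = jo; jj < jmax; jj += INNER_TILE)
            for (size_t i = ii; i < min_sz(ii + INNER_TILE, imax); ++i)
            for (size_t k = kk; k < min_sz(kk + INNER_TILE, kmax); ++k) {
                float a = A[i * K + k];
                for (size_t j = jj; j < min_sz(jj + INNER_TILE, jmax); ++j)
                    C[i * N + j] += a * B[k * N + j];
            }
    }
}
```

The design point this sketch captures is that each level of tiling targets a different level of the memory hierarchy: the outer level amortizes DRAM-to-SRAM transfers by reusing loaded blocks, while the inner level feeds the compute unit in its native block size so that data movement and computation can overlap.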