http://dx.doi.org/10.17662/ksdim.2021.17.4.053

Sparse Matrix Compression Technique and Hardware Design for Lightweight Deep Learning Accelerators  

Kim, Sunhee (Department of System Semiconductor Engineering, Sangmyung University)
Shin, Dongyeob (Smart Network Research Center, Korea Electronics Technology Institute)
Lim, Yong-Seok (Smart Network Research Center, Korea Electronics Technology Institute)
Publication Information
Journal of Korea Society of Digital Industry and Information Management, Vol. 17, No. 4, 2021, pp. 53-62
Abstract
Deep learning models such as convolutional neural networks and recurrent neural networks process huge amounts of data, so they require a lot of storage and consume considerable time and power due to memory access. Recently, research has been conducted on reducing memory usage and access by compressing data, exploiting the fact that much deep learning data is highly sparse and localized. In this paper, we propose a compression-decompression method that stores only the non-zero data and its location information, discarding the zero data. To construct the location information, the matrix data is divided into uniform sections, and for each section a flag indicates whether it contains any non-zero data. This division is not performed just once but is repeated, and location information is stored at each step. Therefore, the data can be compressed appropriately according to the ratio and distribution of zero values. In addition, we propose a hardware structure that performs compression and decompression without complex operations. It was designed and verified in Verilog, and it was confirmed that it can be used in hardware deep learning accelerators.
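As an illustration of the scheme described above, the following is a minimal Python sketch of hierarchical section bitmaps: only sections that contain non-zero data are divided again, a presence bit is recorded at every step, and only the non-zero values themselves are kept. The names compress/decompress, the division factor of 4, and the power-of-factor input length are illustrative assumptions, not the authors' Verilog design.

# Minimal sketch of hierarchical bitmap compression (assumption: the input
# length is a power of `factor`; the flat-list representation is illustrative).

def compress(data, factor=4):
    """Repeatedly split sections that still contain non-zero data into
    `factor` equal parts, recording one presence bit per part at every
    step, until the surviving sections are single elements."""
    bitmaps = []                              # one bitmap per division step (location information)
    sections = [data]
    while sections and len(sections[0]) > 1:
        size = len(sections[0]) // factor
        bitmap, survivors = [], []
        for sec in sections:
            for i in range(0, len(sec), size):
                sub = sec[i:i + size]
                flag = int(any(sub))          # 1 if this section holds any non-zero value
                bitmap.append(flag)
                if flag:
                    survivors.append(sub)     # only non-zero sections are divided further
        bitmaps.append(bitmap)
        sections = survivors
    values = [sec[0] for sec in sections]     # the non-zero data, in order
    return bitmaps, values

def decompress(bitmaps, values, length, factor=4):
    """Rebuild the original flat list by walking the bitmaps level by level."""
    out = [0] * length
    offsets, size = [0], length               # start offsets of the surviving sections
    for bitmap in bitmaps:
        size //= factor
        new_offsets = []
        bit = iter(bitmap)
        for off in offsets:
            for k in range(factor):
                if next(bit):
                    new_offsets.append(off + k * size)
        offsets = new_offsets
    for off, val in zip(offsets, values):
        out[off] = val
    return out

# Example: a 16-element row with two non-zero values.
row = [0, 0, 0, 0, 0, 7, 0, 0, 0, 0, 0, 0, 3, 0, 0, 0]
bitmaps, values = compress(row)
print(bitmaps)   # [[0, 1, 0, 1], [0, 1, 0, 0, 1, 0, 0, 0]]
print(values)    # [7, 3]
assert decompress(bitmaps, values, len(row)) == row

Because an all-zero section is never subdivided, denser regions cost more bits than sparse ones, which is how the format adapts to the ratio and distribution of zero data mentioned in the abstract.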
Keywords
Accelerator; Bitmap; Compression; Decompression; Sparse Matrix;