Web Service Platform for Optimal Quantization of CNN Models

  • Roh, Jaewon (노재원) (Division of Computer Engineering, Hankuk University of Foreign Studies) ;
  • Lim, Chaemin (임채민) (Division of Computer Engineering, Hankuk University of Foreign Studies) ;
  • Cho, Sang-Young (조상영) (Division of Computer Engineering, Hankuk University of Foreign Studies)
  • Received : 2021.12.04
  • Reviewed : 2021.12.14
  • Published : 2021.12.31

Abstract

Low-end IoT devices lack the computation and memory resources needed for DNN training and inference. Integer quantization of real-valued (floating-point) neural network models can reduce model size, hardware computational burden, and power consumption. This paper describes the design and implementation of a web-based quantization platform for CNN deep learning accelerator chips. The platform provides model visualization through a convenient UI, step-by-step analysis of inference, and detailed editing of the model. It also provides a data augmentation function and management of the files that store models and intermediate inference results. The implemented functions were verified using three YOLO models.
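To make the idea of integer quantization concrete, the following is a minimal sketch of symmetric per-tensor 8-bit quantization. It is an illustration only: the abstract does not state which quantization scheme the platform uses, so the scheme and the names `quantize_symmetric` and `dequantize` are assumptions made for this example.

```python
def quantize_symmetric(values, num_bits=8):
    """Map real values to signed integers using one shared scale.

    Assumed scheme (not taken from the paper): symmetric, per-tensor,
    with the integer range clamped to [-qmax, qmax].
    """
    qmax = 2 ** (num_bits - 1) - 1  # 127 for 8 bits
    max_abs = max((abs(v) for v in values), default=0.0)
    # Fall back to a scale of 1.0 when every value is zero.
    scale = max_abs / qmax if max_abs > 0 else 1.0
    quantized = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate real values from the integers."""
    return [q * scale for q in quantized]


weights = [0.5, -1.0, 0.25, 0.0]
q, scale = quantize_symmetric(weights)
restored = dequantize(q, scale)
# The round-trip error per value is bounded by half a quantization step.
max_error = max(abs(w - r) for w, r in zip(weights, restored))
print(q, scale, max_error)
```

Each weight is stored as a single signed byte plus one shared scale, which is where the reduction in model size comes from; the cost is a rounding error of at most half a quantization step per value.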

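Funding Information

This paper is a research result of the SW-centered university support program of the Ministry of Science and ICT and the Institute of Information & Communications Technology Planning & Evaluation (2019-0-01816). This work was also supported by the Hankuk University of Foreign Studies research fund for the 2021 academic year.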
