Performance Analysis of DNN inference using OpenCV Built in CPU and GPU Functions  

Park, Chun-Su (Computer Education, Sungkyunkwan University)
Publication Information
Journal of the Semiconductor & Display Technology / v.21, no.1, 2022, pp. 75-78
Abstract
Deep neural networks (DNNs) have become an essential data processing architecture for many computer vision tasks. DNN-based algorithms now achieve much higher recognition accuracy than traditional algorithms based on shallow learning. However, DNN training and inference require far more computational power than everyday computing tasks. Moreover, as DNNs grow in size and depth, CPUs may be inadequate because they process data largely serially. GPUs offer a solution, delivering greater speed than CPUs because of their parallel processing architecture. In this paper, we analyze the inference time complexity of DNNs using the well-known computer vision library OpenCV. We measure and analyze inference time complexity for three cases: CPU, GPU-Float32, and GPU-Float16.
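The three measurement cases named in the abstract can be sketched with OpenCV's `cv2.dnn` Python bindings. This is a minimal illustration under stated assumptions, not the paper's actual benchmark code: the helper names `time_inference` and `benchmark_opencv`, the warm-up and repeat counts, the 416x416 input size, and the model, config, and image paths are all hypothetical. The CUDA backend and targets typically require an OpenCV build with CUDA support.

```python
import statistics
import time

# The three cases from the abstract, mapped to cv2.dnn constant names.
# Names are kept as strings so this module imports even without OpenCV installed.
CASES = {
    "CPU": ("DNN_BACKEND_OPENCV", "DNN_TARGET_CPU"),
    "GPU-Float32": ("DNN_BACKEND_CUDA", "DNN_TARGET_CUDA"),
    "GPU-Float16": ("DNN_BACKEND_CUDA", "DNN_TARGET_CUDA_FP16"),
}


def time_inference(run_once, warmup=3, repeats=20):
    """Return (mean_ms, stdev_ms) of run_once() over `repeats` timed calls.

    The warm-up calls are not timed; they absorb one-time setup costs
    such as memory allocation or GPU initialization.
    """
    for _ in range(warmup):
        run_once()
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        run_once()
        samples.append((time.perf_counter() - start) * 1000.0)
    stdev = statistics.stdev(samples) if len(samples) > 1 else 0.0
    return statistics.mean(samples), stdev


def benchmark_opencv(model_path, config_path, image_path, input_size=(416, 416)):
    """Time one forward pass per case; all paths are hypothetical placeholders."""
    import cv2  # imported lazily: only needed for the real benchmark

    image = cv2.imread(image_path)
    if image is None:
        raise FileNotFoundError(image_path)
    # Assumed preprocessing: scale pixels to [0, 1], resize, swap BGR to RGB.
    blob = cv2.dnn.blobFromImage(image, 1 / 255.0, input_size, swapRB=True, crop=False)

    results = {}
    for name, (backend, target) in CASES.items():
        net = cv2.dnn.readNet(model_path, config_path)
        net.setPreferableBackend(getattr(cv2.dnn, backend))
        net.setPreferableTarget(getattr(cv2.dnn, target))

        def run_once():
            net.setInput(blob)
            net.forward()

        results[name] = time_inference(run_once)
    return results
```

Calling `benchmark_opencv` with, for example, a YOLO weights file, its config file, and a test image would return a dictionary mapping each case name to a `(mean_ms, stdev_ms)` pair. Timing only `setInput` and `forward` keeps model loading and image decoding out of the measurement.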
Keywords
Computer Vision; OpenCV; Deep Neural Networks; CPU; GPU acceleration;