http://dx.doi.org/10.3745/KTCCS.2022.11.10.323

A Study on the System for AI Service Production  

Hong, Yong-Geun (Department of AI Convergence, Daejeon University)
Publication Information
KIPS Transactions on Computer and Communication Systems / v.11, no.10, 2022, pp. 323-332
Abstract
As more services built on AI technology are developed, AI service production is attracting considerable attention. AI technology is now recognized as one category of ICT service, and much research is under way on general-purpose AI service production. In this paper, I describe research results on systems for AI service production, focusing on the distribution and production of machine learning models, which are the final steps of a typical machine learning development procedure. Three different Ubuntu systems were built, and experiments were run on them using images from the COCO 2017 validation dataset, combining different AI models (RFCN, SSD-Mobilenet) with different communication methods (gRPC, REST) to request and perform AI services through TensorFlow Serving. The experiments showed that the type of AI model influences AI service inference time more than the communication method between machines does, and that for an object detection AI service, inference time is affected more by the number and complexity of objects in the image than by the file size of the image. They also confirmed that when the AI service is performed remotely, inference takes longer than when it is performed locally, even if the remote machine has good performance. The results of this study are expected to support system design suited to service goals, AI model development, and efficient AI service production.
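The REST side of such an experiment can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the code used in the paper: the host, the model name `ssd-mobilenet`, the port 8501 (the REST port used in TensorFlow Serving's Docker examples), and the base64 input format are all hypothetical choices, and the helper names are invented for this sketch. Only the `/v1/models/<name>:predict` URL layout and the `instances` request body follow the TensorFlow Serving REST API.

```python
import base64
import json
import time
import urllib.request


def predict_url(host: str, model: str, port: int = 8501) -> str:
    """REST predict endpoint in TensorFlow Serving's URL layout."""
    return f"http://{host}:{port}/v1/models/{model}:predict"


def build_payload(image_bytes: bytes) -> bytes:
    """Wrap one encoded image as a base64 'instances' request body.

    Assumption: the served model accepts encoded image strings. A model
    that expects a pixel tensor would need nested lists instead.
    """
    b64 = base64.b64encode(image_bytes).decode("ascii")
    return json.dumps({"instances": [{"b64": b64}]}).encode("utf-8")


def post_json(url: str, body: bytes, timeout: float = 30.0) -> dict:
    """POST the body and decode the JSON response (needs a live server)."""
    req = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read())


def mean_latency(send, url: str, body: bytes, repeats: int = 5) -> float:
    """Average wall-clock seconds per request over `repeats` calls."""
    start = time.perf_counter()
    for _ in range(repeats):
        send(url, body)
    return (time.perf_counter() - start) / repeats
```

With a server running, `mean_latency(post_json, predict_url("localhost", "ssd-mobilenet"), build_payload(image_bytes))` gives a rough per-request time. Passing the sender as an argument keeps the timing loop independent of the transport, so a gRPC client could be timed the same way.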
Keywords
Artificial Intelligence; Object Detection; AI Inference; AI Production; Edge Computing;
Citations & Related Records
Times Cited By KSCI: 4