Proactive Virtual Network Function Live Migration using Machine Learning

  • Jeong, Seyeon (Department of Computer Science and Engineering, Pohang University of Science and Technology)
  • Yoo, Jae-Hyoung (Department of Computer Science and Engineering, Pohang University of Science and Technology)
  • Hong, James Won-Ki (Department of Computer Science and Engineering, Pohang University of Science and Technology)
  • Received : 2021.07.05
  • Accepted : 2021.08.13
  • Published : 2021.08.31

Abstract

VM (Virtual Machine) live migration is a server virtualization technique that moves a running VM to another server node while minimizing the downtime of the service the VM provides. In cloud data centers, VM live migration is widely used to balance CPU workload and network traffic, to reduce electricity consumption by consolidating active VMs onto servers in specific locations, and to provide uninterrupted service during hardware maintenance and software updates on servers. It is also critical to use VM live migration as a prevention or mitigation measure when a failure is predicted or its indications are detected. In this paper, we propose two proactive VNF (Virtual Network Function) live migration methods: one for predictive load balancing and the other for failure avoidance based on failure prediction. Both rely on machine learning models trained on periodically collected monitoring data, namely resource usage and logs from servers and VMs/VNFs, and we verify their effectiveness through experiments. We apply the second method to a vEPC (Virtual Evolved Packet Core) failure scenario to provide a detailed case study.

Acknowledgement

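This research was supported by an Institute of Information & Communications Technology Planning & Evaluation (IITP) grant funded by the Korea government (Ministry of Science and ICT) in 2021 (No. 2018-0-00749, Development of Virtual Network Management Technology based on Artificial Intelligence).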
