Dynamic Window Adjustment and Model Stability Improvement Algorithm for K-Asynchronous Federated Learning

  • 김효상 (Major in Information and Communication Engineering, Chungbuk National University)
  • 김태준 (School of Information and Communication Engineering, Chungbuk National University)
  • Received: 2023.04.26
  • Accepted: 2023.08.16
  • Published: 2023.08.30

Abstract

Federated learning is divided into synchronous and asynchronous federated learning. Asynchronous federated learning offers a time advantage over synchronous federated learning, but challenges remain in achieving good model performance. In particular, preventing performance degradation on non-IID training datasets, selecting appropriate clients, and managing stale gradient information are important for improving model performance. This paper addresses K-asynchronous federated learning trained on non-IID datasets. Unlike existing methods that select a static number K of clients, we propose an algorithm that adjusts K dynamically, which reduces training time. In addition, we show that handling stale gradients improves model performance. Finally, we evaluate model performance during training to obtain strong model stability. Experimental results show that the overall algorithm reduces training time, improves model accuracy, and improves model stability.
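
As a concrete illustration of the pieces the abstract describes, the sketch below shows, under simplifying assumptions, a K-asynchronous server loop that (1) aggregates the first K arriving updates with staleness-dependent weights, (2) adjusts the window size K dynamically from the observed staleness, and (3) accepts a new global model only when it does not degrade a validation score. The toy clients, the weighting rule, and the K-adjustment and acceptance criteria are assumptions made for this sketch, not the paper's actual algorithm.

```python
import numpy as np

# Minimal sketch of a K-asynchronous federated-learning server loop.
# Everything here (toy clients, staleness proxy, thresholds) is illustrative.

rng = np.random.default_rng(0)

class ToyClient:
    """Simulated client: holds a private data mean (non-IID across clients)
    and remembers the global round it last synchronized with."""
    def __init__(self, data_mean):
        self.data_mean = data_mean
        self.base_version = 0

    def compute_update(self, global_model, version):
        self.base_version = version
        # Descent direction (negative gradient) of 0.5 * ||model - data_mean||^2.
        return self.data_mean - global_model

def staleness_weight(staleness, a=0.5):
    # Older gradients (larger staleness) receive smaller aggregation weights.
    return 1.0 / (1.0 + a * staleness)

def evaluate(model, true_mean):
    # Toy "validation score": closeness of the model to the population mean.
    return -np.linalg.norm(model - true_mean)

def k_async_server(clients, true_mean, dim=5, rounds=50,
                   k=4, k_min=2, k_max=8, lr=0.3):
    model = np.zeros(dim)
    version = 0
    best_score = evaluate(model, true_mean)
    for _ in range(rounds):
        # Collect the first K updates to arrive; arrival order is simulated by a
        # random permutation.  Staleness is approximated as the number of rounds
        # since the client last synchronized with the server.
        order = rng.permutation(len(clients))[:k]
        grads, stalenesses = [], []
        for idx in order:
            c = clients[idx]
            stalenesses.append(version - c.base_version)
            grads.append(c.compute_update(model, version))

        # Staleness-aware weighted aggregation.
        w = np.array([staleness_weight(s) for s in stalenesses])
        w /= w.sum()
        candidate = model + lr * sum(wi * g for wi, g in zip(w, grads))

        # Stability check: keep the candidate only if it does not degrade the
        # validation score; otherwise discard it.
        score = evaluate(candidate, true_mean)
        if score >= best_score:
            model, best_score = candidate, score
        version += 1

        # Dynamic window adjustment (illustrative rule): fresher updates allow a
        # smaller K (shorter waits per round); stale updates push K back up.
        k = max(k_min, k - 1) if np.mean(stalenesses) < 1.0 else min(k_max, k + 1)
    return model

# Usage: non-IID clients whose local means differ from the population mean.
true_mean = np.ones(5)
clients = [ToyClient(true_mean + rng.normal(0.0, 2.0, 5)) for _ in range(10)]
print(k_async_server(clients, true_mean))
```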
