A Comparative Performance Analysis of Spark-Based Distributed Deep-Learning Frameworks

  • Jaehee Jang (Department of Electrical and Computer Engineering, Seoul National University) ;
  • Jaehong Park (Department of Electrical and Computer Engineering, Seoul National University) ;
  • Hanjoo Kim (Department of Electrical and Computer Engineering, Seoul National University) ;
  • Sungroh Yoon (Department of Electrical and Computer Engineering, Seoul National University)
  • Received : 2016.10.07
  • Accepted : 2017.02.07
  • Published : 2017.05.15

Abstract

By stacking hidden layers in artificial neural networks, deep learning delivers outstanding performance on high-level abstraction problems such as object/speech recognition and natural language processing. However, deep-learning users often struggle with the tremendous amounts of time and resources required to train deep neural networks. To alleviate this computational challenge, many approaches have been proposed across a diversity of areas. In this work, two existing Apache Spark-based acceleration frameworks for deep learning (SparkNet and DeepSpark) are compared and analyzed in terms of training accuracy and time demands. In our experiments with the CIFAR-10 and CIFAR-100 benchmark datasets, SparkNet showed more stable convergence behavior than DeepSpark, but DeepSpark delivered approximately 15% higher classification accuracy. In some cases, DeepSpark also outperformed the sequential implementation running on a single machine in terms of both accuracy and running time.

Deep learning has achieved remarkable results on high-level problems such as object/speech recognition and natural language processing by increasing the number of layers in artificial neural networks while also introducing effective training methodologies. However, the time and resources required for training remain a limitation, and research aimed at reducing them is actively under way. In this study, we measured and analyzed, in terms of training accuracy and speed, the performance of two tools (DeepSpark and SparkNet) that distribute deep learning on the Apache Spark cluster-computing framework. In experiments with the CIFAR-10/CIFAR-100 datasets, SparkNet showed small fluctuations in accuracy over the course of training, whereas DeepSpark fluctuated widely in the early stages of training; as the fluctuations gradually diminished, DeepSpark reached approximately 15% higher accuracy than SparkNet and, under certain conditions, converged faster and to a higher accuracy than a single machine.
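
The synchronous update scheme behind SparkNet's stable convergence is simple enough to sketch. Below is a minimal, illustrative PySpark sketch of SparkNet-style synchronous parameter averaging, written under assumptions rather than taken from either framework: a toy logistic-regression model stands in for a Caffe network, and the names and hyperparameters (local_sgd, tau, the learning rate, the toy data) are hypothetical. Each round, every worker runs a fixed number of SGD steps on its own partition starting from the broadcast weights, and the driver averages the resulting weight vectors before the next round.

    # A minimal sketch of SparkNet-style synchronous parameter averaging
    # (illustrative only; not code from SparkNet or DeepSpark). A toy
    # logistic-regression model stands in for a Caffe network.
    import numpy as np
    from pyspark import SparkContext

    def local_sgd(weights, partition, tau=50, lr=0.1):
        """Run tau SGD steps on this worker's data shard,
        starting from the broadcast weights."""
        w = weights.copy()
        data = list(partition)
        if not data:
            return [w]
        for step in range(tau):
            x, y = data[step % len(data)]          # cycle through the shard
            pred = 1.0 / (1.0 + np.exp(-(w @ x)))  # sigmoid output
            w -= lr * (pred - y) * x               # logistic-loss gradient step
        return [w]

    if __name__ == "__main__":
        sc = SparkContext(appName="sync-averaging-sketch")
        rng = np.random.default_rng(0)

        # Hypothetical toy binary-classification data in place of CIFAR-10.
        dim, n = 10, 1000
        true_w = rng.normal(size=dim)
        xs = rng.normal(size=(n, dim))
        ys = (xs @ true_w > 0).astype(float)
        rdd = sc.parallelize(list(zip(xs, ys)), numSlices=4).cache()

        w = np.zeros(dim)
        for it in range(20):                       # outer synchronization rounds
            bw = sc.broadcast(w)
            # Every partition trains locally from the same broadcast weights...
            local = rdd.mapPartitions(
                lambda part, bw=bw: local_sgd(bw.value, part)).collect()
            # ...and the driver averages the results: the synchronous barrier.
            w = np.mean(local, axis=0)

        acc = np.mean((xs @ w > 0).astype(float) == ys)
        print("training accuracy after averaging:", acc)
        sc.stop()

In this sketch the collect() call acts as the synchronization barrier: no round begins until the slowest worker finishes, which favors the stable convergence observed for SparkNet at the cost of idle time. DeepSpark's main departure is to relax this barrier with asynchronous, EASGD-style elastic averaging through a parameter server (references 9 and 12).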

Acknowledgement

Supported by : the Police Science and Technology R&D Program and the Institute for Information & Communications Technology Promotion (IITP)

References

  1. Y. LeCun, Y. Bengio, and G. Hinton, "Deep learning," Nature, Vol. 521, No. 7553, pp. 436-444, 2015. https://doi.org/10.1038/nature14539
  2. C. Szegedy et al., "Rethinking the Inception Architecture for Computer Vision," arXiv preprint arXiv:1512.00567, 2015.
  3. K. He et al., "Deep Residual Learning for Image Recognition," arXiv preprint arXiv:1512.03385, 2015.
  4. A. Krizhevsky, I. Sutskever, and G. E. Hinton, "ImageNet Classification with Deep Convolutional Neural Networks," Adv. Neural Inf. Process. Syst., pp. 1097-1105, 2012.
  5. F. Niu et al., "HOGWILD!: A Lock-Free Approach to Parallelizing Stochastic Gradient Descent," Adv. Neural Inf. Process. Syst., 2011.
  6. J. Dean et al., "Large Scale Distributed Deep Networks," Adv. Neural Inf. Process. Syst., 2012.
  7. M. Abadi et al., "TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems," arXiv preprint arXiv:1603.04467, 2016.
  8. M. Zaharia et al., "Spark: Cluster Computing with Working Sets," Proc. 2nd USENIX Conf. on Hot Topics in Cloud Computing (HotCloud), 2010.
  9. H. Kim, J. Park, J. Jang, and S. Yoon, "DeepSpark: Spark-Based Deep Learning Supporting Asynchronous Updates and Caffe Compatibility," arXiv preprint arXiv:1602.08191, 2016.
  10. P. Moritz, R. Nishihara, I. Stoica, and M. I. Jordan, "SparkNet: Training Deep Networks in Spark," arXiv preprint arXiv:1511.06051, 2015.
  11. A. Krizhevsky and G. Hinton, "Learning multiple layers of features from tiny images," Technical Report, University of Toronto, 2009.
  12. S. Zhang, A. Choromanska, and Y. LeCun, "Deep learning with Elastic Averaging SGD," ICLR Workshop Track, 2015.
  13. M. Li, D. Andersen, J. Park, and A. Smola, "Scaling distributed machine learning with the parameter server," Proc. 11th USENIX Symp. on Operating Systems Design and Implementation (OSDI), 2014.
  14. X. Lian, Y. Huang, Y. Li, and J. Liu, "Asynchronous Parallel Stochastic Gradient for Nonconvex Optimization," Adv. Neural Inf. Process. Syst. 28, pp. 2737-2745, 2015.
  15. Y. Jia et al., "Caffe: Convolutional Architecture for Fast Feature Embedding," arXiv preprint arXiv:1408.5093, 2014.
  16. O. Russakovsky et al., "ImageNet Large Scale Visual Recognition Challenge," Int. J. Comput. Vis., Vol. 115, No. 3, pp. 211-252, 2015. https://doi.org/10.1007/s11263-015-0816-y