Acknowledgement
Grant: Development of an HPC System for High-Speed Processing of Large-Scale Deep Learning
Supported by: Institute for Information & Communications Technology Promotion (IITP)
References
- E.P. Xing and Q. Ho, "A New Look at the System, Algorithm and Theory Foundations of Large-Scale Distributed Machine Learning," KDD 2015 Tutorial.
- L. Rokach, "Ensemble-Based Classifiers," Artif. Intell. Rev., vol. 33, no. 1-2, Feb. 2010, pp. 1-39. https://doi.org/10.1007/s10462-009-9124-7
- J. Ngiam et al., "Multimodal Deep Learning," Proc. Int. Conf. Mach. Learning, Bellevue, USA, 2011, pp. 1-9.
- S.J. Pan and Q. Yang, "A Survey on Transfer Learning," IEEE Trans. Knowl. Data Eng., vol. 22, no. 10, 2010, pp. 1345-1359. https://doi.org/10.1109/TKDE.2009.191
- S.Y. Ahn et al., "Trends in Distributed Deep Learning Processing Technology," Electronics and Telecommunications Trends, vol. 31, no. 3, 2016, pp. 131-141. https://doi.org/10.22648/ETRI.2016.J.310314
- Training with Multiple GPUs Using Model Parallelism. https://mxnet.incubator.apache.org/faq/model_parallel_lstm.html
- T. Chen et al., "MXNet: A Flexible and Efficient Machine Learning Library for Heterogeneous Distributed Systems," In Proc. LearningSys, Montreal, Canada, Oct. 10, 2015.
- A. Krizhevsky, "One Weird Trick for Parallelizing Convolutional Neural Networks," 2014, arXiv preprint arXiv:1404.5997.
- K. Zhang, "Data Parallel and Model Parallel Distributed Training with TensorFlow," http://kuozhangub.blogspot.kr/2017/08/data-parallel-and-model-parallel.html
- A. Oland and B. Raj, "Reducing Communication Overhead in Distributed Learning by an Order of Magnitude (Almost)," In IEEE Int. Conf. Acoustics, Speech Signal Process., Brisbane, Australia, 2015, pp. 2219-2223.
- T. Xiao et al., "Fast Parallel Training of Neural Language Models," Int. Joint Conf. Artif. Intell., Melbourne, Australia, Aug. 2017, pp. 4193-4199.
- P. Goyal et al., "Accurate, Large Minibatch SGD: Training ImageNet in 1 Hour," June 2017, arXiv preprint arXiv:1706.02677.
- D. Amodei et al., "Deep Speech 2: End-to-End Speech Recognition in English and Mandarin," ICML, New York, NY, USA, June 2016, pp. 173-182.
- E.P. Xing et al., "Petuum: A New Platform for Distributed Machine Learning on Big Data," IEEE Trans. Big Data, vol. 1, no. 2, 2015, pp. 49-67. https://doi.org/10.1109/TBDATA.2015.2472014
- S. Lee et al., "On Model Parallelization and Scheduling Strategies for Distributed Machine Learning," Int. Conf. Neural Inform. Process. Syst., vol. 2, 2014, pp. 2834-2842.
- J.K. Kim et al., "STRADS: a Distributed Framework for Scheduled Model Parallel Machine Learning," Proc. Eur. Conf. Comput. Syst., London, UK, Apr. 2016, pp. 1-16.
- W. Wang et al., "SINGA: Putting Deep Learning in the Hands of Multimedia Users," In ACM Multimedia, Brisbane, Australia, Oct. 2015, pp. 25-34.
- M. Abadi et al., "TensorFlow: A System for Large-Scale Machine Learning," Proc. USENIX Symp. Oper. Syst. Des. Implement., Savannah, GA, USA, 2016, pp. 265-283.
- Y. Jia et al., "Caffe: Convolutional Architecture for Fast Feature Embedding," In Proc. Int. Conf. Multimedia, Orlando, FL, USA, Nov. 2014, pp. 675-678.
- S.Y. Ahn et al., "A Novel Shared Memory Framework for Distributed Deep Learning in High-Performance Computing Architecture," accepted at ICSE 2018.
- T.M. Breuel, "The Effects of Hyperparameters on SGD Training of Neural Networks," 2015, arXiv preprint arXiv:1508.02788.
- P. Goyal et al., "Accurate, Large Minibatch SGD: Training ImageNet in 1 Hour," 2017, arXiv preprint arXiv:1706.02677.
- J. Dean et al., "Large Scale Distributed Deep Networks," NIPS'12, vol. 1, Dec. 2012, pp. 1223-1231.
- A. Gaunt et al., "AMPNet: Asynchronous Model-Parallel Training for Dynamic Neural Networks," 2018, arXiv preprint arXiv:1705.09786.
- D. Shrivastava et al., "A Data and Model-Parallel, Distributed and Scalable Framework for Training of Deep Networks in Apache Spark," 2017, arXiv preprint arXiv:1708.05840.