
HW Systems and SW Libraries for Deep Learning

Jeong, U-Geun (Seoul National University)
Kim, Jeong-Uk (Seoul National University)
Park, Jeong-Ho (Seoul National University)
Park, Ji-Yeong (Seoul National University)
Sin, Jae-Ho (Seoul National University)
Jeong, Jae-Hun (Seoul National University)
Jo, Gang-Won (Seoul National University)
Kim, Hui-Hun (Seoul National University)
Nam, Hyeong-Uk (Seoul National University)
Lee, Jae-Jin (Seoul National University)
References
1 Rivera, J., "Gartner Reveals Top Predictions for IT Organizations and Users for 2014 and Beyond," Gartner, 2013. http://www.gartner.com/newsroom/id/2603215
2 Woods, V., "Gartner Identifies the Top 10 Strategic Technology Trends for 2016," Gartner, 2015. http://www.gartner.com/newsroom/id/3143521
3 Google Scholar. https://scholar.google.com/
4 McCulloch, W. S. and Pitts, W., "A logical calculus of the ideas immanent in nervous activity," Bulletin of Mathematical Biophysics, vol. 5, no. 4, pp. 115-133, 1943.
5 LeCun, Y., Boser, B., Denker, J. S., Henderson, D., Howard, R. E., Hubbard, W., and Jackel, L. D., "Backpropagation applied to handwritten zip code recognition," Neural Computation, vol. 1, no. 4, pp. 541-551, 1989.
6 Birdsall, J. W., "The Sun Hardware Reference," 1995. http://www.sunhelp.org/faq/sunref1.html
7 "NVIDIA Tesla P100," NVIDIA Whitepaper, 2016. https://images.nvidia.com/content/pdf/tesla/whitepaper/pascal-architecture-whitepaper.pdf
8 Min, S., Lee, B., and Yoon, S., "Deep Learning in Bioinformatics," arXiv preprint arXiv:1603.06430, 2016.
9 Fehrer, R. and Feuerriegel, S., "Improving Decision Analytics with Deep Learning: The Case of Financial Disclosures," arXiv preprint arXiv:1508.01993, 2015.
10 Jia, Y., Shelhamer, E., Donahue, J., Karayev, S., Long, J., Girshick, R., Guadarrama, S., and Darrell, T., "Caffe: Convolutional Architecture for Fast Feature Embedding," Proceedings of the 22nd ACM International Conference on Multimedia, pp. 675-678, 2014.
11 Abadi, M. et al., "TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems," arXiv preprint arXiv:1603.04467, 2016.
12 Bergstra, J., Bastien, F., Breuleux, O., Lamblin, P., Pascanu, R., Delalleau, O., Desjardins, G., Warde-Farley, D., Goodfellow, I., Bergeron, A., and Bengio, Y., "Theano: Deep Learning on GPUs with Python," Journal of Machine Learning Research, vol. 1, pp. 1-48, 2011.
13 Torch: A scientific computing framework for LuaJIT. http://torch.ch/
14 ImageNet. http://image-net.org/
15 Chetlur, S., Woolley, C., Vandermersch, P., Cohen, J., and Tran, J., "cuDNN: Efficient Primitives for Deep Learning," arXiv preprint arXiv:1410.0759, 2014.
16 Mathieu, M., Henaff, M., and LeCun, Y., "Fast Training of Convolutional Networks through FFTs," arXiv preprint arXiv:1312.5851, 2013.
17 Jouppi, N., "Google supercharges machine learning tasks with TPU custom chip," Google Cloud Platform Blog, 2016. https://cloudplatform.googleblog.com/2016/05/Google-supercharges-machine-learning-tasks-with-custom-chip.html
18 Chen, Y., Luo, T., Liu, S., Zhang, S., He, L., Wang, J., Li, L., Chen, T., Xu, Z., Sun, N., and Temam, O., "DaDianNao: A Machine-Learning Supercomputer," Proceedings of the 47th Annual IEEE/ACM International Symposium on Microarchitecture, pp. 609-622, 2014.
19 Lacey, G., Taylor, G. W., and Areibi, S., "Deep Learning on FPGAs: Past, Present, and Future," arXiv preprint arXiv:1602.04283, 2016.
20 Han, S., Liu, X., Mao, H., Pu, J., Pedram, A., Horowitz, M. A., and Dally, W. J., "EIE: Efficient Inference Engine on Compressed Deep Neural Network," arXiv preprint arXiv:1602.01528, 2016.
21 Reagen, B., Whatmough, P., Adolf, R., Rama, S., Lee, H., Lee, S. K., Hernandez-Lobato, J. M., Wei, G.-Y., and Brooks, D., "Minerva: Enabling Low-Power, Highly-Accurate Deep Neural Network Accelerators," Proceedings of the 43rd International Symposium on Computer Architecture, 2016.
22 Tallada, M. G., "Coarse Grain Parallelization of Deep Neural Networks," Proceedings of the 21st ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, Article no. 1, 2016.
23 Using the GPU - Theano 0.8.2 documentation. http://deeplearning.net/software/theano/tutorial/using_gpu.html
24 cltorch. https://github.com/hughperkins/cltorch
25 OpenCL Caffe. https://github.com/BVLC/caffe/tree/opencl
26 Ovtcharov, K., Ruwase, O., Fowers, J., Strauss, K., and Chung, E., "Accelerating Deep Convolutional Neural Networks Using Specialized Hardware," Microsoft Research Whitepaper, 2015. https://www.microsoft.com/en-us/research/publication/accelerating-deep-convolutional-neural-networks-using-specalized-hardware/
27 Dean, J., Corrado, G. S., Monga, R., Chen, K., Devin, M., Le, Q. V., Mao, M. Z., Ranzato, M., Senior, A., Tucker, P., Yang, K., and Ng, A. Y., "Large Scale Distributed Deep Networks," Advances in Neural Information Processing Systems, vol. 25, pp. 1232-1240, 2012.
28 Wu, R., Yan, S., Shan, Y., Dang, Q., and Sun, G., "Deep Image: Scaling up Image Recognition," arXiv preprint arXiv:1501.02876, 2015.
29 Adhikari, R., "Google, Movidius to Bring Deep Learning to Mobile Devices," Tech News World, 2016. http://www.technewsworld.com/story/83052.html
30 Qualcomm Zeroth Platform. https://www.qualcomm.com/invention/cognitive-technologies/zeroth
31 LiKamWa, R., Hou, Y., Gao, J., Polansky, M., and Zhong, L., "RedEye: Analog ConvNet Image Sensor Architecture for Continuous Mobile Vision," Proceedings of the 43rd International Symposium on Computer Architecture, 2016.
32 Caffe tutorial. http://caffe.berkeleyvision.org/tutorial/layers.html
33 Krizhevsky, A., Sutskever, I., and Hinton, G. E., "ImageNet Classification with Deep Convolutional Neural Networks," Advances in Neural Information Processing Systems, 2012.
34 Lavin, A. and Gray, S., "Fast Algorithms for Convolutional Neural Networks," arXiv preprint arXiv:1509.09308, 2015.
35 Smith, C., Nguyen, C., and De, U., "Distributed TensorFlow: Scaling Google's Deep Learning Library on Spark," ARIMO, 2016. https://arimo.com/machine-learning/deeplearning/2016/arimo-distributed-tensorflow-on-spark/
36 Vishnu, A., Siegel, C., and Daily, J., "Distributed TensorFlow with MPI," arXiv preprint arXiv:1603.02339, 2016.
37 Multi node caffe. https://github.com/BVLC/caffe/pull/3441
38 "GPU-Based Deep Learning Inference: A Performance and Power Analysis," NVIDIA Whitepaper, 2015. https://www.nvidia.com/content/tegra/embedded-systems/pdf/jetson_tx1_whitepaper.pdf
39 Elephas: Distributed Deep learning with Keras & Spark. https://github.com/maxpumperla/elephas/
40 tensorflow-opencl. https://github.com/benoitsteiner/tensorflow-opencl
41 OpenCL. https://www.khronos.org/opencl/
42 Song, F. and Dongarra, J., "A Scalable Framework for Heterogeneous GPU-Based Clusters," Proceedings of the 24th Annual ACM Symposium on Parallelism in Algorithms and Architectures, pp. 91-100, 2012.
43 Dean, J. and Ghemawat, S., "MapReduce: Simplified Data Processing on Large Clusters," Communications of the ACM, vol. 51, no. 1, pp. 107-113, 2008.
44 Petitet, A., Whaley, R. C., Dongarra, J., and Cleary, A., "HPL - A Portable Implementation of the High-Performance Linpack Benchmark for Distributed Memory Computers," 2016. http://www.netlib.org/benchmark/hpl
45 IPC. https://github.com/twitter/torch-ipc
46 DistLearn. https://github.com/twitter/torch-distlearn
47 Kim, J., Seo, S., Lee, J., Nah, J., Jo, G., and Lee, J., "SnuCL: An OpenCL Framework for Heterogeneous CPU/GPU Clusters," Proceedings of the 26th ACM International Conference on Supercomputing, pp. 341-351, 2012.
48 Kim, J., Jo, G., Jung, J., Kim, J., and Lee, J., "A Distributed OpenCL Framework using Redundant Computation and Data Replication," Proceedings of the 37th ACM SIGPLAN Conference on Programming Language Design and Implementation, pp. 553-569, 2016.