Unified Heterogeneous Programming Environment

  • Published : 2017.10.27

Acknowledgement

Grant: Development of a PF-class (petaflops-scale) Heterogeneous Supercomputer

Supported by: National Research Foundation of Korea, Seoul National University

References

  1. D. Kirk and W. Hwu. "Programming Massively Parallel Processors: A Hands-on Approach," Morgan Kaufmann, 2010.
  2. K. Ovtcharov, O. Ruwase, J.-Y. Kim, J. Fowers, K. Strauss, and E. S. Chung. "Accelerating Deep Convolutional Neural Networks Using Specialized Hardware," Microsoft Research Whitepaper, 2015.
  3. J. Ouyang, S. Lin, W. Qi, Y. Wang, B. Yu, and S. Jiang. "SDA: Software-Defined Accelerator for Large-Scale DNN Systems," In Hot Chips, vol. 26, 2014.
  4. J. Kim, T. Dao, J. Jung, J. Joo, and J. Lee. "Bridging OpenCL and CUDA: A Comparative Analysis and Translation," In Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, Article No. 82, 2015.
  5. Advanced Micro Devices, Inc. "HIP: C++ Heterogeneous Compute Interface for Portability," https://gpuopen.com/compute-product/hip-convert-cuda-to-portable-c-code, 2017.
  6. Advanced Micro Devices, Inc. "Welcome to MIOpen," https://gpuopen.com/compute-product/miopen, 2017.
  7. S. Chetlur, C. Woolley, P. Vandermersch, J. Cohen, J. Tran, B. Catanzaro, and E. Shelhamer. "cuDNN: Efficient Primitives for Deep Learning," arXiv:1410.0759, 2014.
  8. NVIDIA Corporation. "NVIDIA CUDA Programming Guide, v8.0," 2017.
  9. Khronos Group. "OpenCL: the Open Standard for Parallel Programming of Heterogeneous Systems," https://www.khronos.org/opencl, 2017.
  10. HSA Foundation. "HSA Programmer's Reference Manual Version 1.1.1," http://www.hsafoundation.com/html_spec111/HSA_Library.htm, 2017.
  11. S. Wienke, P. Springer, C. Terboven, and D. an Mey. "OpenACC: First Experiences with Real-world Applications," In Proceedings of the 18th International Conference on Parallel Processing, Euro-Par '12, pp. 859-870, 2012.
  12. OpenMP. "OpenMP 4.0 Specifications," http://www.openmp.org/specifications, 2017.
  13. S. Memeti, L. Li, S. Pllana, and J. Kolodziej. "Benchmarking OpenCL, OpenACC, OpenMP, and CUDA: Programming Productivity, Performance, and Energy Consumption," arXiv:1704.05316, 2017.
  14. "Top500 Supercomputer Sites," http://top500.org, 2017.
  15. "Intel FPGA SDK for OpenCL," https://www.altera.com/opencl, 2017.
  16. "Xilinx SDAccel Development Environment," http://www.xilinx.com/products/design-tools/software-zone/sdaccel.html, 2017.
  17. J. Kim, S. Seo, J. Lee, J. Nah, G. Jo, and J. Lee. "SnuCL: an OpenCL Framework for Heterogeneous CPU/GPU Clusters," In Proceedings of the 26th International Conference on Supercomputing, pp. 341-352, 2012.
  18. J. Kim, Y. Lee, J. Park, and J. Lee. "Translating OpenMP Device Constructs to OpenCL Using Unnecessary Data Transfer Elimination," In Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, Article No. 51, 2016.
  19. J. Kim, H. Kim, J. Lee, and J. Lee. "Achieving a Single Compute Device Image in OpenCL for Multiple GPUs," In Proceedings of the 16th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, pp. 277-288, 2011.
  20. J. Lee, M. Samadi, Y. Park, and S. Mahlke. "SKMD: Single Kernel on Multiple Devices for Transparent CPU-GPU Collaboration," ACM Transactions on Computer Systems, vol. 33, no. 3, Article No. 9, 2015.
  21. Y. Gao and P. Zhang. "A Survey of Homogeneous and Heterogeneous System Architectures in High Performance Computing," In Proceedings of 2016 IEEE International Conference on Smart Cloud, pp. 170-175, 2016.
  22. J. Shift. "LibreOffice Calc To Get GPU Support," http://i-programmer.info/news/202-number-crunching/6073-libreoffice-calc-to-get-gpu-support.html, 2013.
  23. G. Jo, J. Jung, J. Park, and J. Lee. "Memory-Access-Pattern Analysis Techniques for OpenCL Kernels," In Proceedings of the 30th International Workshop on Languages and Compilers for Parallel Computing, 2017.