Technology Trends in High-Level GPGPU Programming Models and Algorithms

  • Published: 2014.07.30

Abstract

Keywords

References

  1. Swaine, Michael. "New Chip from Intel Gives High-Quality Displays." March 14, 1983, p. 16.
  2. NVIDIA, GeForce 256, http://www.nvidia.com/page/geforce256.html
  3. NVIDIA, CUDA, http://www.nvidia.com/cuda
  4. Khronos Group, OpenCL, http://www.khronos.org/opencl/
  5. NVIDIA. "NVIDIA Kepler GK110 Architecture Whitepaper." 2012.
  6. NVIDIA. "TechBrief: Dynamic Parallelism in CUDA." 2012.
  7. Harris, Mark. "Unified Memory in CUDA 6." NVIDIA, 18 Nov. 2013. (Online) http://devblogs.nvidia.com/parallelforall/unified-memory-in-cuda-6/
  8. Jablin, Thomas B., et al. "Automatic CPU-GPU communication management and optimization." ACM SIGPLAN Notices 46.6 (2011): 142-151.
  9. Guihot, Herve. "RenderScript." Pro Android Apps Performance Optimization. Apress, 2012. 231-263.
  10. OpenACC, http://www.openacc-standard.org/
  11. Gregory, K. Overview and C++ AMP approach. Technical report. Microsoft, Providence, 2011.
  12. Klöckner, Andreas, et al. "PyCUDA and PyOpenCL: A scripting-based approach to GPU run-time code generation." Parallel Computing 38.3 (2012): 157-174. https://doi.org/10.1016/j.parco.2011.09.001
  13. Han, Tianyi David, and Tarek S. Abdelrahman. "hiCUDA: High-level GPGPU programming." IEEE Transactions on Parallel and Distributed Systems 22.1 (2011): 78-90. https://doi.org/10.1109/TPDS.2010.62
  14. Yan, Yonghong, Max Grossman, and Vivek Sarkar. "JCUDA: A programmer-friendly interface for accelerating Java programs with CUDA." Euro-Par 2009 Parallel Processing. Springer Berlin Heidelberg, 2009. 887-899.
  15. Stratton, John A., Sam S. Stone, and Wen-mei W. Hwu. "MCUDA: An efficient implementation of CUDA kernels for multicore CPUs." Languages and Compilers for Parallel Computing. Springer Berlin Heidelberg, 2008. 16-30.
  16. Hong, Chuntao, et al. "MapCG: writing parallel program portable between CPU and GPU." Proceedings of the 19th international conference on Parallel architectures and compilation techniques. ACM, 2010.
  17. Lee, Seyong, Seung-Jai Min, and Rudolf Eigenmann. "OpenMP to GPGPU: a compiler framework for automatic translation and optimization." ACM SIGPLAN Notices 44.4 (2009): 101-110. https://doi.org/10.1145/1594835.1504194
  18. Ohshima, Satoshi, Shoichi Hirasawa, and Hiroki Honda. "OMPCUDA: OpenMP execution framework for CUDA based on omni OpenMP compiler." Beyond loop level parallelism in OpenMP: accelerators, tasking and more. Springer Berlin Heidelberg, 2010. 161-173.
  19. Jacob, Ferosh, et al. "CUDACL: A tool for CUDA and OpenCL programmers." 2010 International Conference on High Performance Computing (HiPC). IEEE, 2010.
  20. Bell, Nathan, and Jared Hoberock. "Thrust: A productivity-oriented library for CUDA." GPU Computing Gems 7 (2011).
  21. Owens, John D., et al. "A Survey of General-Purpose Computation on Graphics Hardware." Computer graphics forum. Vol. 26. No. 1. Blackwell Publishing Ltd, 2007.
  22. Williams, Lance. "Pyramidal parametrics." ACM SIGGRAPH Computer Graphics. Vol. 17. No. 3. ACM, 1983.
  23. Hensley, Justin, et al. "Fast Summed-Area Table Generation and its Applications." Computer Graphics Forum. Vol. 24. No. 3. Blackwell Publishing, 2005.
  24. Harris, Mark, Shubhabrata Sengupta, and John D. Owens. "Parallel prefix sum (scan) with CUDA." GPU gems 3.39 (2007): 851-876.
  25. Batcher, Kenneth E. "Sorting networks and their applications." Proceedings of the April 30-May 2, 1968, spring joint computer conference. ACM, 1968.
  26. Purcell, Timothy J., et al. "Photon mapping on programmable graphics hardware." Proceedings of the ACM SIGGRAPH/EUROGRAPHICS conference on Graphics hardware. Eurographics Association, 2003.