Analysis of Programming Techniques for Creating Optimized CUDA Software |
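Kim, Sung-Soo (Department of Computer Engineering, Sogang University)
Kim, Dong-Heon (Department of Computer Engineering, Sogang University)
Woo, Sang-Kyu (Department of Computer Engineering, Sogang University)
Ihm, In-Sung (Department of Computer Engineering, Sogang University) |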
1 | Victor Podlozhnyuk, Image Convolution with CUDA, NVIDIA CUDA 2.0 SDK document, 2007. |
2 | NVIDIA. NVIDIA CUDA Visual Profiler (Version 2.3), 2009. |
3 | Joe Stam, Convolution Soup, NVIDIA, 2009. |
4 | NVIDIA. NVIDIA CUDA Compute Unified Device Architecture: Technical Brief NVIDIA GeForce GTX 200 GPU Architectural Overview, 2008. |
5 | NVIDIA. Optimizing CUDA, 2009. |
6 | B. Parhami. Introduction to Parallel Processing: Algorithms and Architectures, Plenum Press, New York, pp.377-379, 1999. |
7 | Sobel, I., Feldman, G., A 3x3 Isotropic Gradient Operator for Image Processing, presented at a talk at the Stanford Artificial Intelligence Project, 1968. |
8 | Mark Segal, Kurt Akeley, The OpenGL Graphics System: A Specification (Version 2.1 - December 1), 2006. |
9 | Shane Ryoo, Christopher I. Rodrigues, Sara S. Baghsorkhi, Sam S. Stone, David B. Kirk, and Wen-mei W. Hwu, Optimization Principles and Application Performance Evaluation of a Multithreaded GPU Using CUDA, Proc. 13th ACM SIGPLAN Symp. Principles and Practice of Parallel Programming, ACM Press, 2008. |
10 | Shuai Che, Michael Boyer, Jiayuan Meng, David Tarjan, Jeremy W. Sheaffer, and Kevin Skadron, A Performance Study of General-Purpose Applications on Graphics Processors Using CUDA, Journal of Parallel and Distributed Computing, University of Virginia, 2008. |
11 | NVIDIA. http://www.nvidia.com/object/product_geforce_gtx_280_us.html, 2009. |
12 | NVIDIA. NVIDIA CUDA Compute Unified Device Architecture: Programming Guide (Version 2.3), 2009. |
13 | Maryam Moazeni, Alex Bui, and Majid Sarrafzadeh, A Memory Optimization Technique for Software-Managed Scratchpad Memory in GPUs, University of California, 2009. |