
Analysis on Memory Characteristics of Graphics Processing Units for Designing Memory System of General-Purpose Computing on Graphics Processing Units  

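Choi, Hongjun (School of Electronics and Computer Engineering, Chonnam National University)
Kim, Cheolhong (School of Electronics and Computer Engineering, Chonnam National University)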
Publication Information
Smart Media Journal / v.3, no.1, 2014, pp. 33-38
Abstract
Even though microprocessor performance continues to improve, the performance of computing systems is becoming harder to increase because of drawbacks such as increased power consumption. To solve this problem, general-purpose computing on graphics processing units (GPGPU), which executes general-purpose applications on graphics processing units (GPUs), a class of specialized parallel-processing devices, has drawn attention. However, the characteristics of graphics applications differ substantially from those of general-purpose applications. As a result, GPUs cannot fully exploit their outstanding computational resources when executing general-purpose applications, owing to various constraints. When designing GPUs for GPGPU, the memory system is important for exploiting the GPU effectively, since general-purpose applications typically require more memory accesses than graphics applications. In particular, external memory accesses, which have long latency, impose a large overhead on GPU performance. GPU performance can therefore be expected to improve if a hierarchical memory architecture that reduces the number of external memory accesses is applied. For this reason, we analyze GPU performance under different hierarchical cache architectures when executing various benchmarks.
Keywords
GPU; GPGPU; Memory; Hierarchical Cache Architecture