1 |
E. Lindholm, M. J. Kilgard, and H. P. Moreton, "A user-programmable vertex engine," In Proceedings of 28th Annual Conference on Computer Graphics (SIGGRAPH), pp.149-158, 2001.
|
2 |
J. D. Owens, D. Luebke, N. Govindaraju, M. Harris, J. Kruger, A. E. Lefohn, and T. J. Purcell, "A Survey of General-Purpose Computation on Graphics Hardware," Eurographics 2005, State of the Art Reports, pp.21-51, 2005.
|
3 |
http://developer.nvidia.com/object/cuda_3_1_downloads.html
|
4 |
http://www.khronos.org/opencl/
|
5 |
J. Helin, "Performance analysis of the CM-2, a massively parallel SIMD computer," In Proceedings of 6th International Conference on Supercomputing, pp.45-52, 1992.
|
6 |
A. Levinthal and T. Porter, "Chap - a SIMD graphics processor," In Proceedings of 11th Annual Conference on Computer Graphics (SIGGRAPH), pp.77-82, 1984.
|
7 |
S. Che, J. Meng, J. Sheaffer, and K. Skadron, "A performance study of general purpose applications on graphics processors using CUDA," Journal of Parallel and Distributed Computing, Vol.68, No.10, pp.1370-1380, 2008.
|
8 |
R. A. Lorie and H. R. Strong, "Method for conditional branch execution in SIMD vector processors," US Patent 4435758, Vol.6, 1984(3).
|
9 |
S. Moy and E. Lindholm, "Method and system for programmable pipelined graphics processing with branching instructions," US Patent 6947047, Vol.20, 2005(9).
|
10 |
E. Rotenberg, Q. Jacobson, and J. E. Smith, "A study of control independence in superscalar processors," In Proceedings of 5th International Symposium on High-Performance Computer Architecture, pp.115-124, 1999.
|
11 |
B. W. Coon and J. E. Lindholm, "System and method for managing divergent threads in a SIMD architecture," US Patent 7353369, Vol.1, 2008(4).
|
12 |
http://www.nvidia.com/object/product_quadro_fx_5800_us.html
|
13 |
http://nocs.stanford.edu/booksim.html
|
14 |
http://developer.download.nvidia.com/compute/cuda/sdk/website/samples.html
|
15 |
http://www.nvidia.com/content/cudazone/
|
16 |
M. J. Flynn, "Very high-speed computing systems," Proceedings of the IEEE, Vol.54, No.12, pp.1901-1909, 1966.
|
17 |
Y. H. Jang, C. Park, J. H. Park, N. Kim, and K. H. Yoo, "Parallel Processing for Integral Imaging Pickup using Multiple Threads," International Journal of Korea Contents, Vol.5, No.4, pp.30-34, 2009.
|
18 |
V. Agarwal, M. S. Hrishikesh, S. W. Keckler, and D. Burger, "Clock rate versus IPC: the end of the road for conventional microarchitectures," In Proceedings of 27th International Symposium on Computer Architecture, pp.248-259, 2000.
|
19 |
N. P. Jouppi and D. W. Wall, "Available instruction-level parallelism for superscalar and superpipelined machines," In Proceedings of 3rd International Conference on Architectural Support for Programming Languages and Operating Systems, pp.272-282, 1989.
|
20 |
D. M. Tullsen, S. J. Eggers, and H. M. Levy, "Simultaneous multithreading: maximizing on-chip parallelism," In Proceedings of 22nd International Symposium on Computer Architecture, pp.392-403, 1995.
|
21 |
I. Buck, T. Foley, D. Horn, J. Sugerman, K. Fatahalian, M. Houston, and P. Hanrahan, "Brook for GPUs: stream computing on graphics hardware," In Proceedings of 31st Annual Conference on Computer Graphics (SIGGRAPH), pp.777-786, 2004.
|
22 |
H. J. Choi, H. G. Jeon, and C. H. Kim, "Quantitative Analysis of the Negative Factors on the GPU Performance," Journal of KIISE : Computing Practices and Letters, Vol.18, No.4, pp.282-287, 2012.
|
23 |
E. Rotenberg, Q. Jacobson, and J. Smith, "A study of control independence in superscalar processors," In Proceedings of 5th International Symposium on High-Performance Computer Architecture, pp.115-124, 1999.
|
24 |
W. W. L. Fung, I. Sham, G. Yuan, and T. M. Aamodt, "Dynamic Warp Formation and Scheduling for Efficient GPU Control Flow," In Proceedings of 40th International Symposium on Microarchitecture, pp.407-420, 2007.
|
25 |
H. J. Choi and C. H. Kim, "Performance Evaluation of the GPU Architecture Executing Parallel Applications," Journal of the Korea Contents Association, Vol.12, No.5, pp.10-21, 2012.
|
26 |
H. J. Choi, S. G. Kang, J. M. Kim, and C. H. Kim, "Analysis of the CPU/GPU Temperature and Energy Efficiency depending on Executed Applications," Journal of The Korea Society of Computer and Information, Vol.17, No.5, pp.9-20, 2012.
|
27 |
http://www.amd.com/stream
|
28 |
https://developer.nvidia.com/cg-toolkit
|
29 |
http://msdn2.microsoft.com/en-us/library/bb509638.aspx
|
30 |
http://www.opengl.org/registry/doc/GLSLangSpec.Full.1.20.8.pdf
|
31 |
http://www.simplescalar.com
|
32 |
A. Bakhoda, G. L. Yuan, W. W. L. Fung, H. Wong, and T. M. Aamodt, "Analyzing CUDA Workloads Using a Detailed GPU Simulator," In Proceedings of 9th International Symposium on Performance Analysis of Systems and Software, pp.163-174, 2009.
|