[KSCI] Korea Science Citation Index Service

http://dx.doi.org/10.5139/JKSAS.2002.30.7.068

Acceleration of LU-SGS Code on Latest Microprocessors Considering the Increase of Level 2 Cache Hit-Rate

Choi, J.Y. (Department of Aerospace Engineering, Pusan National University)
Oh, Se-Jong (Department of Aerospace Engineering, Pusan National University)

Publication Information

Journal of the Korean Society for Aeronautical & Space Sciences / v.30, no.7, 2002, pp. 68-80

Abstract

An approach to writing performance-optimized computational code for the latest microprocessors is suggested. The optimization concept, referred to here as localization, maximizes use of the second-level (L2) cache, which is common to all the latest computer systems, and minimizes access to system main memory. In this study, localized optimization of an LU-SGS (Lower-Upper Symmetric Gauss-Seidel) code for the solution of fluid dynamic equations was carried out at three different levels and tested on several of the most widely used microprocessor architectures. The localized code showed a remarkable performance gain, running up to 7.35 times faster than the baseline algorithm, depending on the system, while producing exactly the same solution on the same computer system.
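The abstract does not describe how the localization is implemented, so the following is only a generic illustration of the underlying idea, not the authors' LU-SGS code. It is a minimal sketch of loop blocking (tiling) on a matrix product: the loops are reordered so that a small sub-block of data is reused before the computation moves on. The function names and the `tile` parameter are hypothetical. Because each output element still accumulates its terms in the same order, the blocked version returns exactly the same values as the baseline.

```python
def matmul_naive(a, b, n):
    """Baseline triple loop: walks a full column of b for every output element."""
    c = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            s = 0.0
            for k in range(n):
                s += a[i][k] * b[k][j]
            c[i][j] = s
    return c


def matmul_blocked(a, b, n, tile=4):
    """Same arithmetic, reordered to work on tile-by-tile sub-blocks.

    `tile` is a hypothetical tuning parameter; in a compiled code it would be
    chosen so that the sub-blocks in use fit in the cache.
    """
    c = [[0.0] * n for _ in range(n)]
    for ii in range(0, n, tile):
        for jj in range(0, n, tile):
            for kk in range(0, n, tile):
                # Work entirely inside one sub-block before moving on.
                for i in range(ii, min(ii + tile, n)):
                    for j in range(jj, min(jj + tile, n)):
                        s = c[i][j]
                        for k in range(kk, min(kk + tile, n)):
                            s += a[i][k] * b[k][j]
                        c[i][j] = s
    return c
```

In pure Python the interpreter overhead hides any cache effect, so this sketch shows only the loop restructuring. A measurable speedup would be expected only in a compiled language with a tile size matched to the target cache.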

Keywords

Computer Code Optimization; Localization; LU-SGS (Lower-Upper Symmetric Gauss-Seidel) scheme; Microprocessors; Level 2 Cache; Cache Hit-Rate

