Computational Methods for On-Node Performance Optimization and Inter-Node Scalability of HPC Applications

  • Received : 2012.09.17
  • Reviewed : 2012.10.28
  • Published : 2012.12.30

Abstract

In the age of multi-core processors and specialized accelerators in high performance computing (HPC) systems, it is critical to understand application characteristics and apply suitable optimizations in order to fully utilize advanced computing systems. Often, the process involves multiple stages of application performance diagnosis and a trial-and-error approach to optimization. In this study, a general guideline for performance optimization is demonstrated with two class-representative applications. The main focus is on node-level optimization and inter-node scalability improvement. While the number of optimization case studies in this paper is somewhat limited, the results provide insight into a systematic approach to HPC application performance engineering.
