http://dx.doi.org/10.5626/JCSE.2012.6.4.287

A Comparative Performance Study for Compute Node Sharing  

Park, Jeho (Computing and Information Services, Harvey Mudd College)
Lam, Shui F. (Computer Engineering and Computer Science, California State University Long Beach)
Publication Information
Journal of Computing Science and Engineering / v.6, no.4, 2012, pp. 287-293
Abstract
We introduce a methodology for studying the application-level performance of time-sharing parallel jobs on a set of compute nodes in high performance clusters, and we report our findings. We assume that parallel jobs arriving at a cluster must share a set of nodes with other users' jobs: they compete for processor time in a time-sharing manner and for other limited resources, such as memory and I/O, in a space-sharing manner. Under this assumption, we developed a methodology to simulate job arrivals at a set of compute nodes and to gather and process performance data in order to calculate the percentage slowdown of parallel jobs. Our goal is to identify combinations of jobs that minimize the performance degradation caused by resource sharing and contention. In our experiments, we observed a couple of interesting behaviors of overlapped parallel jobs, which may be used to suggest alternative job allocation schemes that reduce the slowdowns inevitably caused by resource sharing on a high performance computing cluster. We suggest three job allocation strategies based on our empirical results and propose further study of these results using a supercomputing facility at the San Diego Supercomputer Center.
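The abstract does not state how percentage slowdown is computed. As a minimal sketch, assuming the conventional definition (the increase in run time under sharing relative to a run on dedicated nodes, expressed as a percentage), the calculation and a simple ranking of job combinations might look like the following. The function names, job names, and timings are hypothetical and are not taken from the paper.

```python
def percentage_slowdown(dedicated_seconds: float, shared_seconds: float) -> float:
    """Percent increase in run time relative to the dedicated-node run.

    Assumed definition (not confirmed by the abstract):
    (shared - dedicated) / dedicated * 100.
    """
    if dedicated_seconds <= 0:
        raise ValueError("dedicated_seconds must be positive")
    return (shared_seconds - dedicated_seconds) / dedicated_seconds * 100.0


def rank_pairings(dedicated, shared):
    """Order overlapped job pairs from least to most mean slowdown.

    dedicated: {job: seconds when run alone}
    shared:    {(job_a, job_b): (seconds_a, seconds_b) when overlapped}
    """
    scored = []
    for (job_a, job_b), (time_a, time_b) in shared.items():
        mean = (
            percentage_slowdown(dedicated[job_a], time_a)
            + percentage_slowdown(dedicated[job_b], time_b)
        ) / 2
        scored.append(((job_a, job_b), mean))
    return sorted(scored, key=lambda item: item[1])


# Hypothetical timings, for illustration only.
dedicated = {"job_a": 100.0, "job_b": 200.0, "job_c": 50.0}
shared = {
    ("job_a", "job_b"): (150.0, 220.0),
    ("job_a", "job_c"): (110.0, 55.0),
}
for pair, mean in rank_pairings(dedicated, shared):
    print(pair, round(mean, 1))
```

Averaging the two slowdowns is only one possible way to score a pairing; a scheduler could equally well minimize the worst-case slowdown within each pair.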
Keywords
Resource sharing; Resource allocation; Time-sharing cluster; Job scheduling; High performance computing; Percentage slowdown;