
Enhanced NOW-Sort on a PC Cluster with a Low-Speed Network  

Kim, Ji-Hyoung (Korea University)
Kim, Dong-Seung (Dept. of Electrical Engineering, Korea University)
Abstract
External sorting on cluster computers requires not only fast internal sorting but also careful scheduling of disk input/output and of interprocessor communication over the network. The overall execution time reflects the time spent in every one of these activities, and interprocessor communication and disk I/O account for a significant share of it. In this paper, we improve sorting performance (sorting throughput) on a cluster of PCs with a low-speed network by developing a new algorithm that distributes the load evenly among processors and schedules disk reads and writes together with the other computation and communication activities during the sort. Experimental results support the effectiveness of the algorithm: it reduces the sort time by 45% compared to the previous NOW-sort [1], and it also scales better as computing nodes are added to the cluster.
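The abstract does not spell out how the load is evened out across processors. As a minimal sketch only, and not the authors' algorithm, one common way to balance a distributed sort is to pick splitter keys from a random sample so that each node receives a similar number of records. All names below (`choose_splitters`, `partition`, `distributed_sort`, `sample_size`) are hypothetical, and the nodes are simulated as in-memory lists rather than separate machines with disks.

```python
import random
from bisect import bisect_right


def choose_splitters(keys, num_nodes, sample_size=256, seed=0):
    """Pick num_nodes - 1 splitter keys from a sorted random sample (assumed scheme)."""
    rng = random.Random(seed)
    sample = sorted(rng.sample(keys, min(sample_size, len(keys))))
    # Evenly spaced sample quantiles approximate the quantiles of the full input.
    return [sample[(i * len(sample)) // num_nodes] for i in range(1, num_nodes)]


def partition(keys, splitters):
    """Route each key to the bucket (simulated node) whose key range contains it."""
    buckets = [[] for _ in range(len(splitters) + 1)]
    for k in keys:
        buckets[bisect_right(splitters, k)].append(k)
    return buckets


def distributed_sort(keys, num_nodes):
    """Partition by sampled splitters, then sort each bucket independently."""
    splitters = choose_splitters(keys, num_nodes)
    return [sorted(bucket) for bucket in partition(keys, splitters)]


if __name__ == "__main__":
    rng = random.Random(1)
    # Skewed input: a naive equal-width key split would overload the first node.
    data = [int(rng.expovariate(1.0) * 1000) for _ in range(10000)]
    runs = distributed_sort(data, 4)
    print([len(r) for r in runs])
```

Because the buckets cover disjoint, increasing key ranges, concatenating the per-node sorted runs yields the globally sorted output. A real external sort would additionally stream records from and to disk and overlap that I/O with communication, which this sketch leaves out.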
Keywords
I/O; sort; cluster computing; parallel algorithm
Citations & Related Records
Times Cited By KSCI: 2
1 C. Nyberg, T. Barclay, Z. Cvetanovic, J. Gray, D. Lomet, 'AlphaSort: A Cache-Sensitive Parallel External Sort.' ACM SIGMOD Record, Proceedings of the 1994 ACM SIGMOD international conference on Management of data, Volume 23 Issue 2, 1994
2 B. Ahn and D. Kim, 'External sort on a cluster of PCs.' 2000 Int'l Conf. Parallel and Distributed Processing Techniques and Applications, pp. 1443-1448, Las Vegas, Nevada, USA, June 25-29, 2000
3 W. A. Martin, 'Sorting.' ACM Computing Surveys, Vol. 3, No. 4, pp. 147-174, 1971
4 http://research.microsoft.com/barc/SortBenchmark, Sort Benchmark Home Page
5 Y.C. Kim, M. Jeon, D. Kim, A. Sohn, 'Communication-efficient bitonic sort on a distributed memory parallel computer.' Proc. Int'l Conference on Parallel and Distributed Systems (ICPADS 2001), pp. 165-170, Kyung-Ju, Korea, June 26-29, 2001
6 S-J Lee, M. Jeon, A. Sohn and D. Kim, 'Partitioned Parallel Radix Sort.' Journal of Parallel and Distributed Computing, Vol. 62, pp. 656-668, Academic Press, April 2002
7 T.E. Anderson, D.E. Culler, and D.A. Patterson, 'A Case for NOW (Networks of Workstations).' IEEE Micro, Feb. 1994
8 A. Sohn, Y. Kodama, 'Load balanced parallel radix sort.' Proc. the 1998 international conference on Supercomputing, pp. 305-312, 1998
9 K.E. Batcher, 'Sorting networks and their applications.' Proc. AFIPS Conference, pp. 307-314, 1968
10 A.C. Arpaci-Dusseau, R.H. Arpaci-Dusseau, D.E. Culler, J.M. Hellerstein, and D.A. Patterson, 'High-Performance Sorting on Networks of Workstations.' ACM SIGMOD '97, Tucson, Arizona, May 1997
11 J. Wyllie, 'SPsort: How to sort a terabyte quickly.' Technical Report, IBM Almaden Lab., Feb. 1999, http://www.almaden.ibm.com/cs/gpfsspsort.html
12 L. Rivera, X. Zhang, A. Chien, 'HPVM Minutesort.' Sort Benchmark Home Page, http://research.microsoft.com/barc/SortBenchmark/
13 J.-H. Kim, 'Improving the performance of parallel external sorting through optimization of communication and disk I/O.' Master's thesis, Korea University, Jan. 2002
14 D. Taniar and J.W. Rahayu, 'Sorting in parallel database systems.' Proc. High Performance Computing in the Asia Pacific Region, 2000: The Fourth Int'l Conf. and Exhibition, Vol. 2, pp. 830-835, 2000
15 L.M. Wegner, J.I. Teuhola, 'The external heapsort.' IEEE Trans. Software Engineering, Vol. 15, No. 7, pp. 917-925, July 1989
16 C. Cerin, 'An out-of-core sorting algorithm for clusters with processors at different speed.' Proc. 2002 Parallel and Distributed Processing Symp., April 15-18, Fort Lauderdale, FL, USA
17 A.C. Arpaci-Dusseau, R.H. Arpaci-Dusseau, D.E. Culler, J.M. Hellerstein and D.A. Patterson, 'Searching for the sorting record: experiences in tuning NOW-Sort.' Proc. the SIGMETRICS symposium on Parallel and distributed tools, pp. 124-133, 1998
18 F. Popovici, J. Bent, B. Forney, A.A. Dusseau, R.A. Dusseau, 'Datamation 2001: A Sorting Odyssey.' Sort Benchmark Home Page, http://research.microsoft.com/barc/SortBenchmark/
19 Anon et al., 'A Measure of Transaction Processing Power.' Datamation, V.31(7):112-118. Also in Readings in Database Systems, M. Stonebraker, ed., Morgan Kaufmann, San Mateo, 1989
20 LAM/MPI Parallel Computing, http://www.lammpi.org
21 The Beowulf Project, http://www.beowulf.org