Efficient All-to-All Personalized Communication Algorithms in Wormhole Networks

Efficient Complete Exchange Communication Algorithms in Wormhole-Routed Networks

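  • Si-Gwan Kim (Department of Computer Science, KAIST)
  • Seung-Ryoul Maeng (Department of Computer Science, KAIST)
  • Jung-Wan Cho (Department of Computer Science, KAIST)
  • Published: 2000.05.15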

Abstract

All-to-all personalized communication, or complete exchange, is at the heart of numerous applications, such as matrix transposition, the fast Fourier transform (FFT), and distributed table lookup. We present an efficient all-to-all personalized communication algorithm for a 2D torus in wormhole-routed networks. Our complete exchange algorithm adopts a divide-and-conquer approach to significantly reduce the number of start-up latencies, which is a good metric for network performance in wormhole networks. First, we divide the whole network into 2x2 basic cells. After specially designated nodes, called master nodes, have collected the messages of the other nodes in their basic cell, only the master nodes perform complete exchange on a network of reduced size, N/2 x N/2. When this complete exchange among the master nodes is finished, the master nodes distribute the messages to the rest of their basic cell, which results in the desired complete exchange communication. After presenting our algorithms, we analyze their time complexities and compare them with several previous algorithms. We show that our algorithm is better by a factor of 2 in the required start-up time, which means that it is suitable for wormhole-routed networks.

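Complete exchange communication is widely used in many applications, such as matrix transposition, the Fourier transform, and distributed table lookup. This paper proposes an efficient complete exchange communication algorithm that uses a divide-and-conquer approach to reduce the start-up latency in a 2D torus employing wormhole routing. The whole network is divided into 2x2 basic cells, and in each basic cell a specific node, called the master node, is designated to collect the messages of the other nodes in the cell. After the master nodes have collected the messages destined for all other nodes, only the master nodes of the basic cells form a virtual network and run the complete exchange algorithm with the network size reduced to N/2 x N/2. After the complete exchange among the master nodes, each master node redistributes the messages to the other nodes it is responsible for, which completes the given complete exchange operation. A comparative analysis with several existing algorithms is presented, showing that the proposed algorithm is better by a factor of about 2 in start-up latency.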
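
The three phases described in the abstract (gather to master nodes, complete exchange among master nodes, scatter within each basic cell) can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not the paper's algorithm: the choice of the top-left node of each 2x2 cell as master, the names `master_of` and `complete_exchange`, and the direct master-to-master hand-off in phase 2 are all hypothetical, since the abstract does not specify them. The sketch only checks that every message reaches its destination; it does not model wormhole routing, torus wraparound links, link contention, or start-up cost.

```python
from collections import defaultdict


def master_of(node):
    """Master of the 2x2 basic cell containing `node`.

    Assumption: the top-left node of each cell acts as master.
    """
    r, c = node
    return (r - r % 2, c - c % 2)


def complete_exchange(n):
    """Simulate the three phases on an n x n grid of nodes (n even).

    Returns a dict mapping each node to the list of sources it received from.
    """
    nodes = [(r, c) for r in range(n) for c in range(n)]
    # Every node starts with one personalized message (src, dst) per other node.
    held = {u: [(u, v) for v in nodes if v != u] for u in nodes}

    # Phase 1 (gather): each non-master hands all its messages to its master.
    for u in nodes:
        m = master_of(u)
        if m != u:
            held[m].extend(held[u])
            held[u] = []

    # Phase 2 (exchange among masters): each message moves to the master of
    # its destination's cell. Stand-in for the reduced N/2 x N/2 exchange.
    masters = [u for u in nodes if master_of(u) == u]
    staged = defaultdict(list)
    for m in masters:
        for msg in held[m]:
            staged[master_of(msg[1])].append(msg)
        held[m] = []

    # Phase 3 (scatter): each master delivers messages inside its own cell.
    inbox = defaultdict(list)
    for m in masters:
        for src, dst in staged[m]:
            inbox[dst].append(src)
    return inbox
```

Calling `complete_exchange(4)` simulates a 4x4 network with four master nodes; afterwards each of the 16 nodes has received exactly one message from each of the other 15 nodes.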

References

  1. W.C. Athas and C.L. Seitz, 'Multicomputers: Message-Passing Concurrent Computers,' IEEE Computer, Vol. 21, No. 8, pp.9-24, Aug. 1988 https://doi.org/10.1109/2.73
  2. S. Bokhari and H. Berryman, 'Complete exchange on a circuit switched mesh,' Proc. of the 1992 Scalable High Performance Computing Conference, pp.300-306, 1992 https://doi.org/10.1109/SHPCC.1992.232628
  3. J. Bruck, C. T. Ho, and D. Weatherby, 'Efficient Algorithms for All-to-All Communications in Multi-Port Message-Passing Systems,' Proc. of Symposium on Parallel Algorithms and Architectures, pp. 298-309, 1994 https://doi.org/10.1145/181014.181756
  4. W. J. Dally and C. L. Seitz, 'The Torus Routing Chip,' Journal of Parallel and Distributed Computing, Vol. 1, No. 3, pp.187-196, 1986 https://doi.org/10.1007/BF01660031
  5. S. L. Johnsson and C. T. Ho, 'Optimum Broadcasting and Personalized Communications in Hypercubes,' IEEE Transactions on Computers, Vol. 38, No. 9, pp.1249-1268, Sep. 1989 https://doi.org/10.1109/12.29465
  6. S. G. Kim, S. R. Maeng and J. W. Cho, 'Complete Exchange Algorithms in Wormhole-Routed Torus Networks: A Divide-and-Conquer Strategy,' Proc. of Int'l Symposium on Parallel Architectures, Algorithms and Networks, pp. 296-301, 1999 https://doi.org/10.1109/ISPAN.1999.778955
  7. P.K. McKinley, Y.J. Tsai and D.F. Robinson, 'A Survey of Collective Communication in Wormhole-Routed Massively Parallel Computers,' Technical Report, MSU-CPS-94-35, Michigan State University, June 1994
  8. Message Passing Interface Forum, 'Document for standard message-passing interface,' Technical Report CS-93-214, University of Tennessee, Nov. 1993
  9. L.M. Ni and P.K. McKinley, 'A Survey of Wormhole Routing Techniques in Direct Networks,' IEEE Computer, Vol. 26, No. 2, pp.62-76, Feb. 1993 https://doi.org/10.1109/2.191995
  10. D.A. Reed and R.M. Fujimoto, Multicomputer Networks: Message Based Parallel Processing, MIT Press, Cambridge, MA, 1987
  11. D.F. Robinson, D. Judd, P.K. McKinley, and B.H.C. Cheng, 'Efficient Collective Data Distribution in All-Port Wormhole-Routed Hypercubes,' Proc. of Supercomputing '93, Nov. 1993, pp.792-801 https://doi.org/10.1109/SUPERC.1993.1263537
  12. D. S. Scott, 'Efficient All-to-All Patterns in Hypercube and Mesh Topologies,' Proc. of the 6th Conference Distributed Memory Concurrent Computers, pp. 398-403, 1991
  13. S. R. Seidel, 'Circuit-Switched vs. Store-and-Forward Solutions to Symmetric Communication Problems,' Proc. of the 4th Conference Hypercube Concurrent Computers Applications, pp. 253-255, 1989
  14. Y. Suh and S. Yalamanchili, 'Efficient Algorithms for Complete exchange in 2D Tori,' Proc. of the 9th IASTED Int'l Conference Parallel and Distributed Computing and Systems, pp.113-119, 1997
  15. N. Sundar, D. Jayasimha, D. Panda, and P. Sadayappan, 'Complete exchange in 2D Meshes,' Proc. of the 1994 Scalable High Performance Computing Conference, pp.406-413, 1994 https://doi.org/10.1109/SHPCC.1994.296672
  16. R. Thakur and A. Choudhary, 'All-to-all communication on meshes with wormhole routing,' Proc. of the 1994 International Parallel Processing Symposium, pp.561-565, 1994 https://doi.org/10.1109/IPPS.1994.288248
  17. Y.C. Tseng and S. Gupta, 'All-to-All Personalized Communication in a Wormhole-Routed Torus,' IEEE Trans. on Parallel and Distributed Systems, Vol. 7, No. 5, pp.498-505, May 1996 https://doi.org/10.1109/71.503775
  18. Y.C. Tseng, T.H. Lin, S. Gupta and D.K. Panda, 'Bandwidth-Optimal Complete Exchange on Wormhole-Routed 2D/3D Torus Networks: A Diagonal-Propagation Approach,' IEEE Trans. on Parallel and Distributed Systems, Vol. 8, No. 4, pp.380-396, Apr. 1997 https://doi.org/10.1109/71.588613