Synchronization of Network Interfaces in System Area Networks  

Song, Hyo-Jung (Samsung Electronics, Corporate Technology Office, Principal Engineer)
Abstract
Many cluster computing applications require Quality of Service (QoS) guarantees. Because performance predictability is essential to QoS, the underlying systems must provide predictable performance guarantees. One way to obtain such guarantees from the network subsystem is to generate global schedules from applications' network requests and to execute the local portion of each schedule at each network interface. Executing the schedules accurately requires that the local clocks at the network interfaces maintain a global time base. Providing this single time base is the synchronization problem, which this paper addresses for system area networks. To solve it, FM-QoS (1) proposed a simple mechanism called FBS (Feedback-Based Synchronization), which uses built-in flow control signals. This paper extends the basic notion of FM-QoS into a theoretical framework and generalizes it in two ways: 1) it identifies a set of built-in network flow control signals usable for synchronization and formalizes it as a synchronizing schedule, and 2) it analyzes the synchronization precision of FBS in terms of flow control parameters. Based on this generalization, two application classes are studied, for a single-switch network and a multiple-switch network. For each class, a synchronizing schedule is proposed and its bounded skew is analyzed. Unlike in FM-QoS, the synchronizing schedule is proven to minimize the bounded skew for a single-switch network. To relate the analysis to practical networks, skew values are obtained using the flow control parameters of Myrinet-1280/SAN. The maximum bounded skew of FBS was 9.2 µsec or less across all our experiments. Based on this result, we conclude that FBS is a feasible synchronization mechanism for system area networks.
Keywords
synchronization; link level flow control; system area networks; cluster computing;
Citations & Related Records
  • Reference
1 J. H. Kim. Bandwidth and Latency Guarantees in Low-cost, High Performance Networks. PhD thesis, University of Illinois at Urbana-Champaign, January 1997.
2 E. Horowitz and S. Sahni. Fundamentals of Computer Algorithms. Computer Science Press, 1978.
3 P. F. Reynolds Jr., C. Williams, and R. R. Wagner Jr. Isotach networks. IEEE Transactions on Parallel and Distributed Systems, 8(4), 1997.
4 K. Connelly. FM-QoS: Real-time communications using self-synchronizing schedules. In Proceedings of SC'97, November 1997.
5 Myrinet on VME protocol specification standard. VITA 26-199x Draft 1.1, August 1998. VITA Standards Organization.
6 A. Chien et al. Design and evaluation of an HPVM-based Windows NT supercomputer. The International Journal of High Performance Computing Applications, 13(3):201-219, Fall 1999.
7 R. Horst. TNet: A reliable system area network for I/O and IPC. In Proceedings of the IEEE Symposium on Hot Interconnects, 1994.
8 N. Vasanthavada and P. N. Marinos. Synchronization of fault-tolerant clocks in the presence of malicious faults. IEEE Transactions on Computers, 37(4):440-448, 1988.
9 I. Foster et al. A distributed resource management architecture that supports advance reservations and co-allocation. In Proceedings of the International Workshop on Quality of Service, 1999.
10 J. C. R. Bennett and H. Zhang. Hierarchical packet fair queuing algorithms. In Proceedings of ACM SIGCOMM, August 1996.
11 F. Schmuck and F. Cristian. Continuous clock amortization need not affect the precision of a clock synchronization algorithm. In Proceedings of the ACM Symposium on Principles of Distributed Computing, pages 133-143, 1990.
12 A. Chien et al. HPVM software distributions, 1999. Available from http://www-csag.ucsd.edu/projects/hpvm/sw-distributions/index.html