
A Kernel-Level Group Communication System for Highly Available Linux Cluster  

이상균 (Systems Programmer, DVS Division, Samsung Electronics Co., Ltd.)
박성용 (Department of Computer Science, Sogang University)
Abstract
As interest in clusters has grown, a number of research efforts have addressed high availability in cluster environments. However, no kernel-level group communication system exists to support the development of kernel-level applications, and traditional user-level group communication systems are not easy to use from kernel-level applications. This paper presents the design and implementation of KCGCS (Kernel-level Cluster Group Communication System), a kernel-level group communication module for Linux clusters. Unlike traditional user-level group communication systems, KCGCS uses lightweight heartbeat messages and a ring-based heartbeat mechanism, which allows users to implement scalable failure detection. Moreover, KCGCS improves reliability by using distributed coordinators to maintain membership information.
Keywords
Group Communication; Linux Cluster; High Availability; Linux Kernel;
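
The ring-based heartbeat idea named in the abstract can be illustrated with a small user-space simulation. This is only a sketch under stated assumptions, not the KCGCS implementation: the class `RingMonitor`, its method names, the 3-second timeout, and the rule that a silent node is reported by its nearest live successor are all hypothetical choices made for illustration. The point it shows is that each node sends one heartbeat per period to a single neighbor, so heartbeat traffic grows linearly with cluster size rather than quadratically as in all-to-all monitoring.

```python
from dataclasses import dataclass, field


@dataclass
class RingMonitor:
    """Hypothetical ring-based failure detector (illustrative, not KCGCS code)."""
    members: list                 # node ids; sorted order defines the ring
    timeout: float = 3.0          # assumed: seconds of silence before a node is dropped
    last_seen: dict = field(default_factory=dict)

    def __post_init__(self):
        self.members = sorted(self.members)
        for node in self.members:
            self.last_seen.setdefault(node, 0.0)

    def successor(self, node):
        """Return the next node on the ring, i.e. the one that receives `node`'s heartbeat."""
        i = self.members.index(node)
        return self.members[(i + 1) % len(self.members)]

    def heartbeat(self, sender, now):
        """Record that `sender`'s heartbeat arrived at its successor at time `now`."""
        if sender in self.last_seen:
            self.last_seen[sender] = now

    def detect_failures(self, now):
        """Drop timed-out nodes and return (detector, failed_node) pairs."""
        failed = [n for n in self.members if now - self.last_seen[n] > self.timeout]
        alive = [n for n in self.members if n not in failed]
        reports = []
        for n in failed:
            # Simplification: the nearest live successor reports the silent node.
            i = self.members.index(n)
            for k in range(1, len(self.members)):
                candidate = self.members[(i + k) % len(self.members)]
                if candidate in alive:
                    reports.append((candidate, n))
                    break
        self.members = alive      # the ring closes around the failed nodes
        return reports


ring = RingMonitor(["n1", "n2", "n3", "n4"])
for node in ring.members:
    ring.heartbeat(node, now=1.0)
for node in ("n1", "n2", "n4"):   # n3 stops sending heartbeats
    ring.heartbeat(node, now=5.0)
reports = ring.detect_failures(now=5.0)
```

After the call, `reports` holds the single pair `("n4", "n3")` and `n2`'s successor becomes `n4`. A real kernel module would also have to handle message loss, clock differences, and agreement on the new membership among coordinators, none of which this sketch attempts.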