The Design and Implementation of RISE for Managing a Large Scale Cluster in Distributed Environment  

Park Doo-Sik (Samsung Electronics, Computer Server Team)
Yang Woo-Jin (Samsung Advanced Institute of Technology)
Ban Min-Ho (Samsung Electronics, Computer Server Team)
Jeong Karp-Joo (BK21 u-Science-Based New Technology Convergence Project Group)
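Lee Jong-Hyun (Konkuk University, Department of Computer Engineering)
Lee Sang-Moon (Samsung Advanced Institute of Technology)
Lee Chang-Sung (Samsung Electronics, Computer Server Team)
Shin Soon-Churl (Samsung Advanced Institute of Technology)
Lee In-Ho (Samsung Electronics, Computer Server Team)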
Abstract
This paper introduces a 3-tier remote installation and backup method for efficiently utilizing cluster system resources distributed across several sites. Cluster systems are now often built with hundreds of nodes or more on complex networks that mix public and private networks. An installation method suited to large-scale cluster systems, together with remote recovery of failed nodes, is therefore important. However, previous approaches based on a 2-tier architecture may not provide efficient cluster installation and image backup when the cluster's network is composed of several private and public networks. To solve this problem, we propose RISE (Remote Installation Service and Environment), which is based on a 3-tier architecture. In our approach, the role of the managing node is divided between a global master node (GRISE) and a local master node (LRISE) to provide efficient initial system deployment and remote failure recovery for distributed cluster systems across various network configurations. In addition, an auto-synchronization mechanism between GRISE and LRISE ensures LRISE's availability in complex network environments. The experiments use a 64-node cluster system with a gigabit network. In the experimental results, a system image containing 1.86 GB of data was obtained in 5 minutes 53 seconds, and image-based installation of the 64-node system was completed in 17 minutes 53 seconds.
Keywords
RISE; large scale cluster; 3-tier architecture