
BeanFS: A Distributed File System for Large-scale E-mail Services  

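Jung, Wook (Department of Computer Science, KAIST)
Lee, Dae-Woo (Department of Computer Science, KAIST)
Park, Eun-Ji (Department of Computer Science, KAIST)
Lee, Young-Jae (Department of Computer Science, KAIST)
Kim, Sang-Hoon (Department of Computer Science, KAIST)
Kim, Jin-Soo (School of Information and Communication Engineering, Sungkyunkwan University)
Kim, Tae-Woong (Virtualization Platform Development Team, NHN Corp.)
Jun, Sung-Won (Virtualization Platform Development Team, NHN Corp.)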
Abstract
Distributed file systems running on clusters of inexpensive commodity hardware are increasingly recognized as an effective way to meet the explosive growth in storage demand at large-scale Internet service companies. This paper presents the design and implementation of BeanFS, a distributed file system for large-scale e-mail services. BeanFS is tailored to e-mail services in three ways. First, its volume-based replication scheme reduces the metadata management overhead on the central metadata server when handling a very large number of small files. Second, BeanFS employs a lightweight consistency maintenance protocol tailored to the simple access patterns of e-mail messages. Third, transient and permanent failures are handled separately, so that recovery from transient failures is fast and incurs less overhead.
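The abstract does not describe BeanFS's internal data structures, so the following is only a minimal sketch of the general idea behind volume-based replication: if the metadata server records replica locations per volume instead of per file, the size of its location table depends on the number of volumes, not the number of messages. Everything here is a hypothetical illustration and not taken from the paper: the class and function names, the mapping of file IDs to volumes by integer division, the round-robin replica placement, and the figures of 10,000 files per volume and three replicas.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class VolumeLocationTable:
    """Hypothetical metadata table: one entry per volume, not per file."""
    files_per_volume: int
    locations: Dict[int, Tuple[str, ...]] = field(default_factory=dict)

    def add_volume(self, volume_id: int, nodes: Tuple[str, ...]) -> None:
        # Record which storage nodes hold a replica of this volume.
        self.locations[volume_id] = nodes

    def volume_of(self, file_id: int) -> int:
        # Assumption: consecutive file IDs are grouped into fixed-size volumes.
        return file_id // self.files_per_volume

    def locate(self, file_id: int) -> Tuple[str, ...]:
        # A file's replicas are found through its volume's single entry.
        return self.locations[self.volume_of(file_id)]


def build_table(total_files: int, files_per_volume: int,
                nodes: List[str], replicas: int = 3) -> VolumeLocationTable:
    """Create enough volumes for total_files, placing replicas round-robin."""
    table = VolumeLocationTable(files_per_volume)
    volume_count = -(-total_files // files_per_volume)  # ceiling division
    for volume_id in range(volume_count):
        chosen = tuple(nodes[(volume_id + i) % len(nodes)]
                       for i in range(replicas))
        table.add_volume(volume_id, chosen)
    return table


table = build_table(1_000_000, 10_000,
                    ["node-a", "node-b", "node-c", "node-d"])
print(len(table.locations))   # 100 entries instead of 1,000,000
print(table.locate(123_456))  # replicas of the volume holding file 123456
```

Under these assumed numbers, one million small files need only 100 location entries, which is the kind of reduction in central metadata that the abstract attributes to replicating at volume granularity.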
Keywords
Distributed file system; e-mail system