Performance Analysis of Open Source Based Distributed Deduplication File System

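  • S. O. Jung (정성욱), Department of Computer Engineering, Chungnam National University;
  • H. Choi (최훈), Department of Computer Engineering, Chungnam National University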
  • Received : 2014.09.02
  • Accepted : 2014.10.20
  • Published : 2014.12.15


A comparison of two representative deduplication file systems, LessFS and SDFS, shows that LessFS is better in execution time and CPU utilization, while SDFS is better in storage usage (around 1/8 less than general file systems). This paper proposes a new system that combines the advantages of SDFS and LessFS. The new system uses multiple Dedup File Engines (DFEs) and a single Dedup Storage Engine (DSE) to maintain the integrity and consistency of the data. An evaluation comparing a Single DFE configuration with a Dual DFE configuration indicates that the Dual DFE performed better: it reduced CPU usage and shortened deduplication time. These results suggest that the proposed system can help address the growth in large-scale data storage and the associated power consumption.

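Comparing the data deduplication file systems LessFS and SDFS, LessFS performs better in CPU utilization and execution time, while SDFS has an advantage of about 1/8 over other file systems in storage usage after deduplication. This paper proposes a new scheme that combines the strength of SDFS, reduced storage usage after deduplication, with the strengths of LessFS, low CPU utilization and shorter execution time. The scheme uses n SDFS Dedup File Engines (DFEs) together with a single Dedup Storage Engine (DSE), which maintains the integrity and consistency of the deduplicated data. The proposed scheme is implemented in a test environment with two DFEs and one DSE, and its performance is compared.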
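The arrangement of several DFEs sharing one DSE can be illustrated with a minimal sketch. This is not the SDFS implementation or the authors' code: the class names, the fixed 4096-byte chunk size, the use of SHA-256 as the chunk fingerprint, and the in-memory store are all assumptions made for illustration. It only shows why a single storage engine keeps deduplicated data consistent when more than one file engine writes to it.

```python
import hashlib
import threading


class DedupStorageEngine:
    """Hypothetical stand-in for the single DSE: one shared store of unique chunks."""

    def __init__(self):
        self._chunks = {}              # fingerprint -> chunk bytes
        self._lock = threading.Lock()  # serializes access from multiple DFEs

    def put(self, digest, chunk):
        """Store the chunk only if its fingerprint is new; return True if stored."""
        with self._lock:
            if digest in self._chunks:
                return False
            self._chunks[digest] = chunk
            return True

    def get(self, digest):
        with self._lock:
            return self._chunks[digest]

    def stored_bytes(self):
        with self._lock:
            return sum(len(c) for c in self._chunks.values())


class DedupFileEngine:
    """Hypothetical stand-in for a DFE: chunks files, fingerprints them, keeps file recipes."""

    def __init__(self, dse, chunk_size=4096):  # chunk size is an assumed value
        self.dse = dse
        self.chunk_size = chunk_size
        self.files = {}                # file name -> ordered list of fingerprints

    def write(self, name, data):
        recipe = []
        for start in range(0, len(data), self.chunk_size):
            chunk = data[start:start + self.chunk_size]
            digest = hashlib.sha256(chunk).hexdigest()
            self.dse.put(digest, chunk)  # duplicate chunks are ignored by the DSE
            recipe.append(digest)
        self.files[name] = recipe

    def read(self, name):
        return b"".join(self.dse.get(d) for d in self.files[name])


# Two DFEs write the same payload through one shared DSE.
dse = DedupStorageEngine()
dfe_a = DedupFileEngine(dse)
dfe_b = DedupFileEngine(dse)
payload = b"A" * 8192 + b"B" * 4096
dfe_a.write("a.bin", payload)
dfe_b.write("b.bin", payload)
print(dse.stored_bytes(), "bytes stored for", 2 * len(payload), "bytes written")
```

Because both file engines consult the same chunk index, a chunk written through one DFE is recognized as a duplicate when it arrives through the other, so only unique chunks occupy storage while the chunking and hashing work is spread across the DFEs.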


