File Deduplication using Logical Partition of Storage System

저장 시스템의 논리 파티션을 이용한 파일 중복 제거

  • Received : 2012.09.10
  • Accepted : 2012.10.24
  • Published : 2012.12.31

Abstract

In a traditional target-based data deduplication system, every file must be chunked and compared in order to eliminate duplicate data blocks. A critical problem with this approach arises as the number of files increases: the system suffers from computational delay in calculating hash values and processing the metadata needed to handle each file. To overcome this problem, we propose a novel data deduplication system that uses the logical partitions of a storage system. The system applies the deduplication scheme to each logical partition rather than to each file. Experimental results show that, when the logical partition is full of files, the proposed system is 50% more efficient than the traditional deduplication scheme in terms of deduplication capacity and processing time.
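
The contrast between file-level and partition-level deduplication can be illustrated with a minimal Python sketch. It is not the paper's implementation: the fixed 4 KiB chunk size, the use of SHA-1, and all function names are assumptions made for illustration, and the "partition image" here is simply a byte string standing in for the raw contents of a logical partition (a real image would also contain file system metadata and free space).

```python
import hashlib

CHUNK_SIZE = 4096  # assumed fixed chunk size; the abstract does not specify one


def chunks(data: bytes, size: int = CHUNK_SIZE):
    """Yield consecutive fixed-size chunks of data (the last one may be shorter)."""
    for offset in range(0, len(data), size):
        yield data[offset:offset + size]


def dedup_stream(data: bytes, store: dict) -> list:
    """Add unseen chunks to `store` and return the ordered list of chunk hashes."""
    recipe = []
    for chunk in chunks(data):
        digest = hashlib.sha1(chunk).hexdigest()  # SHA-1 is an assumption
        store.setdefault(digest, chunk)           # keep only the first copy
        recipe.append(digest)
    return recipe


def dedup_per_file(files: dict) -> tuple:
    """Traditional approach: one recipe (metadata entry) per file."""
    store = {}
    recipes = {name: dedup_stream(content, store) for name, content in files.items()}
    return store, recipes


def dedup_per_partition(partition_image: bytes) -> tuple:
    """Partition-level approach: one recipe for the whole partition image."""
    store = {}
    recipe = dedup_stream(partition_image, store)
    return store, recipe


def restore(recipe: list, store: dict) -> bytes:
    """Rebuild the original byte stream from its recipe."""
    return b"".join(store[digest] for digest in recipe)


# Hypothetical data: two files that share one 4 KiB block of content.
files = {"a.bin": b"A" * 8192, "b.bin": b"A" * 4096 + b"B" * 4096}
# Stand-in for a partition image: the file contents laid out back to back.
image = files["a.bin"] + files["b.bin"]

file_store, file_recipes = dedup_per_file(files)
part_store, part_recipe = dedup_per_partition(image)
```

In this sketch both approaches store the same unique chunks, but the per-file variant maintains one metadata entry per file while the per-partition variant maintains a single entry for the whole image, which is the per-file overhead the abstract describes.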
