Data Deduplication Method using PRAM Cache in SSD Storage System

  • Received : 2013.01.23
  • Published : 2013.04.25

Abstract

In recent cloud storage environments, SSDs (Solid-State Drives) are increasingly replacing traditional hard disk drives. An SSD provides fast I/O performance because it has no mechanical parts, but it wears out with writes and does not support in-place updates, so managing its space efficiently has become important. Data de-duplication is frequently used to improve the space efficiency of SSDs. However, this technique incurs considerable overhead because it consists of data chunking, hashing, and hash matching operations. In this paper, we propose a new data de-duplication method that uses a PRAM cache. The proposed method uses hierarchical hash tables and an LRU (Least Recently Used) policy for data replacement in PRAM. The first hash table, in DRAM, stores the hash values of data cached in PRAM, and the second hash table, in PRAM, stores the hash values of data in SSD storage. The method also enhances data reliability against power failure by maintaining a backup of the first hash table in PRAM. Experimental results show that the average write count and operation time of the proposed method are 44.2% and 38.8% lower, respectively, than those of the existing data de-duplication method across three workloads.

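Recently, in cloud storage environments, the use of SSDs (Solid-State Drives) has been increasing as they replace hard disks, the traditional storage device. SSDs offer fast I/O performance because they have no mechanical operation, but they cannot overwrite data in place, so managing them for space efficiency is important. Data de-duplication is used to manage the space efficiency of SSDs, which are subject to wear, effectively. However, data de-duplication incurs overhead because it includes data chunking, hashing, and hash value lookup operations. In this paper, we propose a data de-duplication method using a PRAM cache in an SSD storage system. The proposed method stores the hash values for data cached in PRAM in a first-level hash table in DRAM and manages them with the LRU (Least Recently Used) policy. The second-level hash table in PRAM stores the hash values for data stored in SSD storage, and a backup of the first-level DRAM hash table is maintained in PRAM, which improves reliability against events such as power loss. In the experiments, the proposed method reduced the number of SSD writes and the operation time by an average of 44.2% and 38.8% across workloads, respectively, compared with an existing method that stores and manages all hash values in DRAM.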
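
The two-level lookup described in the abstract can be illustrated with a small Python model. This is only a sketch under stated assumptions, not the authors' implementation: fixed-size chunking, SHA-1 fingerprints, the rule that a chunk evicted from the PRAM cache is written to the SSD, and all names (`PramCacheDedup`, `split_chunks`, `write_chunk`, `recover`) are hypothetical choices made for illustration. DRAM, PRAM, and the SSD are simulated with ordinary Python containers.

```python
import hashlib
from collections import OrderedDict

CHUNK_SIZE = 4096  # assumption: fixed-size chunks; the abstract names no chunking scheme


def split_chunks(data, size=CHUNK_SIZE):
    """Split a byte string into fixed-size chunks (the last one may be shorter)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


class PramCacheDedup:
    """Toy model: the DRAM table indexes the PRAM cache, the PRAM table indexes the SSD."""

    def __init__(self, pram_capacity):
        self.pram_capacity = pram_capacity  # max chunks cached in PRAM (assumed >= 1)
        self.dram_table = OrderedDict()     # first table (DRAM): digests in LRU order
        self.pram_cache = {}                # PRAM cache: digest -> chunk data
        self.pram_backup = set()            # backup of the first table, kept in PRAM
        self.pram_table = {}                # second table (PRAM): digest -> SSD slot
        self.ssd = []                       # simulated SSD chunk store
        self.ssd_writes = 0                 # number of chunk writes that reached the SSD

    def write_chunk(self, chunk):
        digest = hashlib.sha1(chunk).hexdigest()  # assumption: SHA-1 fingerprints
        if digest in self.dram_table:             # duplicate of a PRAM-cached chunk
            self.dram_table.move_to_end(digest)   # mark as most recently used
            return "pram-hit"
        if digest in self.pram_table:             # duplicate of a chunk already on the SSD
            return "ssd-hit"
        if len(self.dram_table) >= self.pram_capacity:
            self._evict_lru()                     # make room in the PRAM cache
        self.dram_table[digest] = True
        self.pram_cache[digest] = chunk
        self.pram_backup.add(digest)
        return "miss"

    def _evict_lru(self):
        # Assumption: the least recently used chunk is flushed to the SSD on eviction.
        digest, _ = self.dram_table.popitem(last=False)
        chunk = self.pram_cache.pop(digest)
        self.pram_backup.discard(digest)
        self.pram_table[digest] = len(self.ssd)
        self.ssd.append(chunk)
        self.ssd_writes += 1

    def recover(self):
        """Rebuild the DRAM table from the PRAM backup (recency order is not preserved)."""
        self.dram_table = OrderedDict((d, True) for d in self.pram_backup)
```

In this model, a write that matches either table never reaches the SSD, and only chunks evicted from the PRAM cache increase `ssd_writes`. Calling `recover()` after clearing `dram_table` mimics restoring the first hash table from its PRAM backup after a power failure.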
