Design and Implementation of an In-Memory File System Cache with Selective Compression

Design and Implementation of an In-Memory Cache Supporting Selective Compression for Large-Scale File Systems

  • 최형원 (Department of Electrical and Computer Engineering, Sungkyunkwan University) ;
  • 서의성 (Department of Software, Sungkyunkwan University)
  • Received : 2017.03.07
  • Accepted : 2017.04.10
  • Published : 2017.07.15

Abstract

The demand for large-scale storage systems continues to grow with the emergence of multimedia, social-network, and big-data services. To improve response time and reduce the load on such systems, DRAM-based in-memory caches are becoming popular. However, the high cost of DRAM severely restricts their capacity. Compressing cache entries has been proposed to mitigate this capacity limit, but compression and decompression are difficult to parallelize, induce significant processing overhead, and in turn lengthen the response time. This paper proposes a selective compression scheme for in-memory file system caches that rapidly estimates the compressibility of each incoming cache entry from its Shannon entropy and compresses only the entries expected to compress well. It also describes the design and implementation of an in-kernel in-memory file system cache that uses the proposed scheme. In the evaluation, the proposed scheme reduced benchmark execution time by approximately 18% compared with a conventional non-compressing in-memory cache. Compared with an all-compressing cache, it achieved a similar cache hit ratio while reducing execution time by 7.5% through lower compression overhead, and it reduced the CPU time spent on compression by 28%.

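DRAM-based in-memory caches are costly, which limits how far their capacity can be increased. To address this, techniques that use compression to cache more data have been studied. However, compression causes high processing load and response delay. This paper proposes a selective compression technique that uses Shannon entropy to predict a file's compressibility quickly and with low overhead, and compresses only files that are highly compressible. The technique is implemented at the kernel level as an in-memory cache for file systems so that it can be used in practice within a file system. Experimental results show that selective compression reduces execution time by about 18% compared with no compression. Compared with compressing all cached data, it minimizes the performance loss caused by a lower cache hit ratio while reducing compression overhead, cutting execution time by 7.5%. It also reduces the CPU time spent on compression by 28% compared with compressing everything.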
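The entropy-based decision described in the abstract can be illustrated with a small user-space sketch. It is not the paper's in-kernel implementation: the function names, the 6.0 bits-per-byte threshold, and the use of zlib from Python's standard library are all assumptions made for illustration. The idea is that the byte-level Shannon entropy H = -Σ p_i log2 p_i is cheap to compute in one pass, and entries whose entropy is close to 8 bits per byte are unlikely to shrink, so they are stored uncompressed.

```python
import math
import zlib
from collections import Counter

# Hypothetical cutoff in bits per byte; the threshold used in the paper
# is not stated in the abstract.
ENTROPY_THRESHOLD = 6.0


def shannon_entropy(data: bytes) -> float:
    """Return the byte-level Shannon entropy of data in bits per byte (0 to 8)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)  # frequency of each byte value
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def cache_store(data: bytes, threshold: float = ENTROPY_THRESHOLD):
    """Return (is_compressed, payload) for an incoming cache entry.

    Entries with entropy below the threshold are compressed; all others
    are stored as-is so no CPU time is spent on data that will not shrink.
    """
    if shannon_entropy(data) < threshold:
        packed = zlib.compress(data)
        if len(packed) < len(data):  # keep the compressed form only if it helps
            return True, packed
    return False, data


def cache_load(is_compressed: bool, payload: bytes) -> bytes:
    """Recover the original bytes of a cache entry."""
    return zlib.decompress(payload) if is_compressed else payload
```

For example, `cache_store(b"a" * 4096)` takes the compressing path because the entropy of a single repeated byte is zero, while a buffer in which all 256 byte values occur equally often has an entropy of 8 bits per byte and is stored unchanged. A real cache would also have to keep the per-entry compressed flag alongside the payload, as the returned tuple does here.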

Acknowledgement

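Grant : Overall management of the ICBMS core technology development project and development of exascale cloud storage technology

Supported by : Institute for Information & communications Technology Promotion (IITP)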
