BLOCS: Block Correlation Aware Sequential Pattern Mining based Caching Algorithm for Hybrid Storages

  • Lee, Seongjin (Department of Electronics and Computer Engineering, Hanyang University)
  • Won, Youjip (Department of Computer and Software, Hanyang University)
  • Received : 2014.06.11
  • Accepted : 2014.07.09
  • Published : 2014.07.31

Abstract

In this paper, we propose the BLOCS algorithm, which identifies the sequences of data that should be stored in the cache device of a hybrid storage system that uses an SSD as a cache. BLOCS uses sequential pattern mining to build a set of frequently requested sectors, taking into account the order in which the file system requests them. For comparison, we also introduce a seek-distance-based (DIST) scheme, a request-frequency-based (FREQ) scheme, and a frequency-times-size-based (F-S) scheme. To evaluate the caching schemes, we developed a hybrid storage caching simulator that measures hit ratio and I/O latency. We collected the I/O trace of a booting workload together with the I/O traces of ten application launch scenarios and used them as input to the cache simulator. With the booting workload, BLOCS achieves a hit ratio of 61%, which is about 15% higher than that of DIST, the scheme with the lowest hit ratio.
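The comparison described in the abstract can be illustrated with a small Python sketch. It is not the authors' implementation: the trace format (a list of `(start_sector, size)` pairs), the function names, and the toy data are all assumptions made for illustration. In particular, `frequent_sequences` counts only contiguous runs of sectors as a simple stand-in for sequential pattern mining, which in general also allows gaps between items. The DIST scheme is omitted because the abstract does not define it in enough detail.

```python
from collections import Counter

# Hypothetical toy trace: (start_sector, size_in_sectors) per request.
TOY_TRACE = [(100, 8), (200, 64), (100, 8), (300, 8), (100, 8), (200, 64)]


def freq_scores(trace):
    """FREQ-style score: number of times each start sector is requested."""
    return Counter(sector for sector, _size in trace)


def fs_scores(trace):
    """F-S-style score: summed request size per sector.

    Summing sizes equals request count times average request size.
    """
    scores = Counter()
    for sector, size in trace:
        scores[sector] += size
    return scores


def frequent_sequences(trace, length=2, min_support=2):
    """Count contiguous sector sequences of a fixed length (assumed
    simplification) and keep those seen at least min_support times."""
    sectors = [sector for sector, _size in trace]
    counts = Counter(
        tuple(sectors[i:i + length])
        for i in range(len(sectors) - length + 1)
    )
    return {seq: n for seq, n in counts.items() if n >= min_support}


def seq_scores(trace, length=2, min_support=2):
    """Score each sector by the support of the frequent sequences it is in."""
    scores = Counter()
    for seq, support in frequent_sequences(trace, length, min_support).items():
        for sector in seq:
            scores[sector] += support
    return scores


def select_cache(scores, capacity):
    """Pin the `capacity` highest-scoring sectors in the cache."""
    return {sector for sector, _score in scores.most_common(capacity)}


def hit_ratio(trace, cached):
    """Fraction of requests whose start sector is in the cached set."""
    if not trace:
        return 0.0
    hits = sum(1 for sector, _size in trace if sector in cached)
    return hits / len(trace)


if __name__ == "__main__":
    for name, scorer in [("FREQ", freq_scores), ("F-S", fs_scores),
                         ("SEQ", seq_scores)]:
        cached = select_cache(scorer(TOY_TRACE), capacity=2)
        print(name, sorted(cached), round(hit_ratio(TOY_TRACE, cached), 2))
```

For brevity the sketch selects and evaluates on the same trace. A more realistic evaluation would select the cached set from one run of a workload and measure the hit ratio on a separate run.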
