Wall Cuckoo: A Method for Reducing Memory Access Using Hash Function Categorization


  • Received : 2019.02.15
  • Accepted : 2019.04.17
  • Published : 2019.06.30

Abstract

Data response speed is a critical issue for cloud services because it is directly related to the user experience. In-memory databases are therefore widely adopted in cloud-based applications to achieve fast data response. However, most current in-memory databases are implemented with linked-list-based hash tables, which cannot guarantee constant data response time. Cuckoo hashing was introduced as an alternative, but it has the disadvantage that only half of the allocated memory can be used for storing data. Bucketized cuckoo hashing (BCH) subsequently improved the memory efficiency of cuckoo hashing, but it still does not overcome the overhead of the insert operation. In this paper, we propose a data management method called Wall Cuckoo, which aims to improve both the insert performance and the lookup performance of BCH. The key idea of Wall Cuckoo is to separate the data within a bucket according to the hash function that was used to place it. This narrows the search range within the bucket, which reduces the number of slot accesses required for a lookup. Insert performance improves as well, because every insert operation includes a lookup. According to our analysis, the expected number of slot accesses in Wall Cuckoo is smaller than that in BCH. We conducted experiments comparing Wall Cuckoo with BCH and Sorting Cuckoo, and show that Wall Cuckoo requires fewer slot accesses for lookup and insert operations at load factors from 10% to 95%.
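The key idea described in the abstract can be illustrated with a small toy model. The sketch below is not the authors' implementation: the class name `WallCuckooSketch`, the hash family built on Python's `hash()`, the random kick-out policy, and the way each bucket is modeled as two lists (one per hash function) are all assumptions made for illustration. It only shows why restricting a lookup to the region that belongs to the probing hash function can never touch more slots than scanning the whole bucket, as plain BCH would.

```python
import random


class WallCuckooSketch:
    """Toy bucketized cuckoo table whose buckets are split by hash function."""

    def __init__(self, num_buckets=64, slots_per_bucket=4, max_kicks=100, seed=0):
        self.n = num_buckets
        self.b = slots_per_bucket
        self.max_kicks = max_kicks
        self.rng = random.Random(seed)
        # regions[i][f] lists the keys stored in bucket i via hash function f.
        self.regions = [([], []) for _ in range(num_buckets)]

    def _bucket(self, key, f):
        # Illustrative hash family built on Python's hash(); an assumption.
        return hash((f, key)) % self.n

    def _is_full(self, i):
        return len(self.regions[i][0]) + len(self.regions[i][1]) >= self.b

    def lookup(self, key):
        """Return (found, slot_accesses), scanning only the matching region."""
        accesses = 0
        for f in (0, 1):
            for stored in self.regions[self._bucket(key, f)][f]:
                accesses += 1
                if stored == key:
                    return True, accesses
        return False, accesses

    def bch_accesses(self, key):
        """Slot accesses if every occupied slot of each candidate bucket is scanned."""
        accesses = 0
        for i in dict.fromkeys(self._bucket(key, f) for f in (0, 1)):
            for stored in self.regions[i][0] + self.regions[i][1]:
                accesses += 1
                if stored == key:
                    return accesses
        return accesses

    def insert(self, key):
        """Insert key; False means max_kicks ran out and the last victim was dropped."""
        if self.lookup(key)[0]:  # an insert starts with a lookup
            return True
        for _ in range(self.max_kicks):
            for f in (0, 1):
                i = self._bucket(key, f)
                if not self._is_full(i):
                    self.regions[i][f].append(key)
                    return True
            # Both candidate buckets are full: kick out a random resident.
            f = self.rng.randrange(2)
            i = self._bucket(key, f)
            vf = self.rng.choice([g for g in (0, 1) if self.regions[i][g]])
            region = self.regions[i][vf]
            victim = region.pop(self.rng.randrange(len(region)))
            self.regions[i][f].append(key)
            key = victim  # the victim is re-inserted on the next iteration
        return False


if __name__ == "__main__":
    table = WallCuckooSketch()
    for k in range(100):
        table.insert(k)
    found, wall_cost = table.lookup(42)
    print(found, wall_cost, table.bch_accesses(42))
```

In this model the gap is largest for negative lookups, where a whole-bucket scan must visit every occupied slot of both candidate buckets while the region-restricted scan visits only the slots filled through the probing hash function. A real table would store slots in a fixed array and rehash or resize when the kick-out limit is reached, which this sketch omits.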

Fig. 1. Insert Process of Bucketized Cuckoo Hashing

Fig. 2. Comparing the Search Range of (a) BCH and (b) Wall Cuckoo

Fig. 3. Example of the Insertion Process when an Empty Slot Exists in the Bucket

Fig. 4. Example of the Insertion Process when a Kick-out Process is Needed

Fig. 5. Comparison of the Number of Slot Accesses in Negative Lookup Process

Fig. 6. Comparison of the Number of Slot Accesses in Positive Lookup Process

Fig. 7. Comparison of the Number of Slot Accesses in Insert Process

Fig. 8. Comparison of Negative Lookup Time

Fig. 9. Comparison of Positive Lookup Time

Fig. 10. Comparison of Insert Time

Fig. 11. Comparison of the Number of Slot Accesses for Different Insert/Lookup Ratios

Fig. 12. Comparison of Time for Different Insert/Lookup Ratios

Table 1. Experimental Environment

Table 2. Experiment Workload

Table 3. Experiment Workload
