Sorting Cuckoo: Enhancing Lookup Performance of Cuckoo Hashing Using Insertion Sort

  • Min, Dae-hong (Inha University Department of Computer Science Engineering)
  • Jang, Rhong-ho (Inha University Department of Computer Science Engineering)
  • Nyang, Dae-hun (Inha University Department of Computer Science Engineering)
  • Lee, Kyung-hee (Suwon University Department of Electrical Engineering)
  • Received: 2017.01.23
  • Accepted: 2017.02.16
  • Published: 2017.03.31

Abstract

Key-value stores have proven their value in a range of NoSQL databases such as Redis and Memcached. Lookup performance matters because key-value store applications perform more lookups than inserts in most environments. However, in traditional applications lookups may be slow because the hash tables are built from linked lists. Cuckoo hashing has therefore attracted attention from academia for its constant lookup time, and bucketized cuckoo hashing (BCH) was later proposed because it achieves a higher load factor. In this paper, we introduce Sorting Cuckoo, which inserts data into the BCH structure using insertion sort. Because the data in each bucket are kept sorted, Sorting Cuckoo can determine whether a key exists with relatively few memory accesses. In particular, the higher the load factor, the larger its lookup advantage over BCH. Experimental results show that, at a load factor of 95%, Sorting Cuckoo makes up to about 19 million (25%) fewer memory accesses than BCH over 10 million negative lookups (the key is not in the table) and up to about 4 million (10%) fewer over 10 million positive lookups (the key is in the table).
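
The idea of keeping each bucket sorted so that a lookup can stop early can be illustrated with a minimal Python sketch. This is not the authors' implementation. The class name `SortedBucketTable`, the four slots per bucket, the BLAKE2b-based stand-in hash, the random-victim eviction policy, and the `max_kicks` limit are all assumptions made for illustration. Each bucket is modeled as a list of occupied slots only, so the read counts ignore empty slots and are only a rough proxy for the memory accesses measured in the paper.

```python
import hashlib
import random

SLOTS = 4  # slots per bucket; an assumed value, not taken from the paper


class SortedBucketTable:
    """Toy bucketized cuckoo table whose buckets are kept in ascending order."""

    def __init__(self, n_buckets, max_kicks=500, seed=1):
        self.n = n_buckets
        self.buckets = [[] for _ in range(n_buckets)]  # occupied slots only
        self.max_kicks = max_kicks
        self.rng = random.Random(seed)

    def _candidates(self, key):
        # Stand-in hash: split one BLAKE2b digest into two bucket indices.
        d = hashlib.blake2b(key.to_bytes(8, "little"), digest_size=16).digest()
        return (int.from_bytes(d[:8], "little") % self.n,
                int.from_bytes(d[8:], "little") % self.n)

    @staticmethod
    def _sorted_insert(bucket, key):
        # One insertion-sort step: shift larger keys right, then fill the gap.
        bucket.append(key)
        i = len(bucket) - 1
        while i > 0 and bucket[i - 1] > key:
            bucket[i] = bucket[i - 1]
            i -= 1
        bucket[i] = key

    def insert(self, key):
        """Insert a distinct key. Return None on success, or the key left
        without a slot once max_kicks evictions have been tried."""
        for _ in range(self.max_kicks):
            cands = self._candidates(key)
            for c in cands:
                if len(self.buckets[c]) < SLOTS:
                    self._sorted_insert(self.buckets[c], key)
                    return None
            # Both buckets full: evict a random victim, then re-place it.
            c = self.rng.choice(cands)
            victim = self.buckets[c].pop(self.rng.randrange(SLOTS))
            self._sorted_insert(self.buckets[c], key)
            key = victim
        return key

    def lookup(self, key, sorted_scan=True):
        """Return (found, slots_read) for the two candidate buckets."""
        reads = 0
        for c in self._candidates(key):
            for stored in self.buckets[c]:
                reads += 1
                if stored == key:
                    return True, reads
                if sorted_scan and stored > key:
                    break  # every later slot in this bucket is larger
        return False, reads


if __name__ == "__main__":
    table = SortedBucketTable(n_buckets=1024)
    keys = random.Random(42).sample(range(0, 2_000_000, 2), 3890)  # ~95% load
    for k in keys:
        table.insert(k)
    absent = [k + 1 for k in keys]  # odd keys were never inserted
    for mode in (False, True):
        total = sum(table.lookup(k, sorted_scan=mode)[1] for k in absent)
        print("sorted scan" if mode else "full scan", total)
```

Passing `sorted_scan=False` scans every occupied slot of both candidate buckets, which stands in for a plain BCH lookup on the same table. The sorted scan can never read more slots than the full scan, and negative lookups in nearly full buckets benefit most, which matches the trend the abstract reports at high load factors.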
