A Selecting-Ordering-Mapping-Searching Approach for Minimal Perfect Hash Functions

  • 이하규 (Division of Computer Information, Sungkonghoe University)
  • Published: 2000.01.15

Abstract

This paper describes a method for generating minimal perfect hash functions (MPHFs) for large static sets of search keys. The Mapping-Ordering-Searching (MOS) approach is currently widely used for MPHF generation. This work improves on MOS by proposing a Selecting-Ordering-Mapping-Searching (SOMS) approach, which introduces a new Selecting step and performs the Ordering step before the Mapping step in order to generate MPHFs more effectively. The proposed MPHF generation algorithm is probabilistic, with an expected running time that is linear in the number of keys. Experimental results show that MPHFs are generated quickly and that the space needed to represent the hash functions is small.
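To make the term concrete: an MPHF maps n distinct keys onto the n slots 0..n-1 with no collisions and no unused slots. The sketch below is not the SOMS algorithm of this paper, whose steps are not detailed in the abstract. It is a generic hash-and-displace construction, included only to illustrate how mapping, ordering, and searching phases can fit together. The names `_h`, `build_mphf`, `lookup`, and `max_seed`, and the choice of BLAKE2b as the underlying hash, are all assumptions made for illustration.

```python
import hashlib


def _h(key: str, seed: int, mod: int) -> int:
    """Seeded hash of key into range(mod) (hypothetical helper)."""
    digest = hashlib.blake2b(
        key.encode("utf-8"),
        digest_size=8,
        salt=seed.to_bytes(8, "little"),
    ).digest()
    return int.from_bytes(digest, "little") % mod


def build_mphf(keys, max_seed=1_000_000):
    """Return one seed per bucket such that lookup() is a bijection onto range(n)."""
    n = len(keys)
    if n == 0 or len(set(keys)) != n:
        raise ValueError("keys must be non-empty and distinct")

    # Mapping: distribute keys into buckets with a fixed first-level hash (seed 0).
    num_buckets = max(1, n // 2)
    buckets = [[] for _ in range(num_buckets)]
    for k in keys:
        buckets[_h(k, 0, num_buckets)].append(k)

    # Ordering: place the largest buckets first, while most slots are still free.
    order = sorted(range(num_buckets), key=lambda b: -len(buckets[b]))

    # Searching: for each bucket, try seeds until all its keys land on free slots.
    seeds = [0] * num_buckets
    used = [False] * n
    for b in order:
        if not buckets[b]:
            continue
        for seed in range(1, max_seed):
            slots = [_h(k, seed, n) for k in buckets[b]]
            if len(set(slots)) == len(slots) and not any(used[s] for s in slots):
                for s in slots:
                    used[s] = True
                seeds[b] = seed
                break
        else:
            raise RuntimeError("no suitable seed found; increase max_seed")
    return seeds


def lookup(key: str, seeds, n: int) -> int:
    """Evaluate the hash function: bucket first, then the bucket's seed."""
    return _h(key, seeds[_h(key, 0, len(seeds))], n)


keys = ["apple", "banana", "cherry", "date", "elder", "fig", "grape", "honey"]
seeds = build_mphf(keys)
table = sorted(lookup(k, seeds, len(keys)) for k in keys)
```

After construction, `table` should contain each slot index from 0 to n-1 exactly once, which is the defining property of a minimal perfect hash function. Only the `seeds` array has to be stored to evaluate the function, which is why the space needed to represent an MPHF is a central measure in this line of work.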

References

  1. R. Cichelli, 'Minimal Perfect Hash Functions Made Simple,' CACM, Vol. 23, pp. 17-19, 1980 https://doi.org/10.1145/358808.358813
  2. N. Cercone, M. Krause, and J. Boates, 'Minimal and Almost Minimal Perfect Hash Function Search with Application to Natural Language Lexicon Design,' Computers and Mathematics with Applications, Vol. 9, pp. 215-231, 1983
  3. E. Fox, Q. Chen, A. Daoud, and L. Heath, 'Order Preserving Minimal Perfect Hash Functions and Information Retrieval,' ACM Transactions on Information Systems, Vol. 9, pp. 281-308, 1991 https://doi.org/10.1145/125187.125200
  4. E. Fox, Q. Chen, and L. Heath, 'A Faster Algorithm for Constructing Minimal Perfect Hash Functions,' In 15th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR'92), pp. 266-273, 1992 https://doi.org/10.1145/133160.133209
  5. E. Fox, L. Heath, Q. Chen, and A. Daoud, 'Practical Minimal Perfect Hash Functions for Large Databases,' CACM, Vol. 35, pp. 105-121, 1992 https://doi.org/10.1145/129617.129623
  6. G. Havas and B. Majewski, 'Optimal Algorithms for Minimal Perfect Hashing,' Technical Report 234, Department of Computer Science, The University of Queensland, 1992
  7. G. Havas and B. Majewski, 'Graph Theoretical Obstacles to Perfect Hashing,' Congressus Numerantium, Vol. 98, pp. 81-93, 1993
  8. G. Havas, B. Majewski, N. Wormald, and Z. Czech, 'Graphs, Hypergraphs and Hashing,' In 19th International Workshop on Graph-Theoretic Concepts in Computer Science (WG'93), Vol. 790 of Lecture Notes in Computer Science, pp. 153-165, 1994, Springer-Verlag
  9. T. Sager, 'A Polynomial Time Generator for Minimal Perfect Hashing Functions,' CACM, Vol. 28, pp. 523-532, 1985 https://doi.org/10.1145/3532.3538
  10. S. Wartik, E. Fox, L. Heath, and Q. Chen, 'Hashing Algorithms,' Information Retrieval: Data Structures & Algorithms (ed. W. Frakes and R. Baeza-Yates), pp. 293-318, Prentice Hall, New Jersey, 1992
  11. P. Pearson, 'Fast Hashing of Variable-Length Text Strings,' CACM, Vol. 33, No. 6, pp. 677-680, 1990 https://doi.org/10.1145/78973.78978
  12. S. Park and K. Miller, 'Random Number Generators: Good Ones Are Hard to Find,' CACM, Vol. 31, No. 10, pp. 1192-1201, 1988 https://doi.org/10.1145/63039.63042