GC-Tree: A Hierarchical Index Structure for Image Databases

GC-트리 : 이미지 데이타베이스를 위한 계층 색인 구조

  • Guang-Ho Cha (Department of Multimedia, Sookmyung Women's University)
  • Published : 2004.02.01

Abstract

With the proliferation of multimedia data, there is an increasing need to support the indexing and retrieval of high-dimensional image data. Despite many efforts, existing multidimensional indexing methods do not perform satisfactorily in high dimensions. Dimensionality reduction and approximate-solution methods have therefore been tried to deal with the so-called curse of dimensionality, but they inevitably sacrifice the precision of query results. More recently, vector approximation-based methods such as the VA-file and the LPC-file were developed to preserve the precision of query results. However, the performance of these methods depends largely on the size of the approximation file, and they lose the advantage of multidimensional indexing methods, which prune much of the search space. In this paper, we propose a new index structure called the GC-tree for efficient similarity search in image databases. The GC-tree is based on a special subspace partitioning strategy that is optimized for clustered high-dimensional images. It adaptively partitions the data space based on a density function and dynamically constructs an index structure. The resulting index structure adapts well to the strongly clustered distribution of high-dimensional images.

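As the use of multimedia data grows, efficient indexing and retrieval techniques for high-dimensional image data are in great demand. Despite many efforts, however, current multidimensional indexing techniques do not perform satisfactorily in high-dimensional data spaces. To overcome this so-called curse of dimensionality, approaches such as reducing the dimensionality or computing approximate solutions have recently been tried, but these methods inherently suffer from a loss of precision. To preserve precision, vector approximation-based techniques such as the VA-file and the LPC-file were recently developed. However, the search performance of these techniques is strongly affected by the size of the index file, and they lose the advantage of hierarchical index structures, which can eliminate a large portion of the search space at once. In this paper, we propose the GC-tree, a new hierarchical index structure for similarity queries in image databases. The GC-tree adaptively partitions the data space based on a density function and dynamically constructs the index structure. With these properties, the GC-tree performs very well for searching clustered high-dimensional image data.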
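
The abstract does not spell out the GC-tree's partitioning rules or its density function, so the following Python fragment is only a minimal sketch of the general idea of density-based adaptive partitioning, not the paper's algorithm. It assumes that density is measured as the number of points falling in a cell, that a cell is halved along every dimension, and that only cells holding at least `min_points` points are refined further. The names `build`, `count_points`, `min_points`, and `max_depth` are hypothetical.

```python
from collections import defaultdict


def build(points, lo, hi, min_points=4, max_depth=8, depth=0):
    """Recursively partition the box [lo, hi) around dense groups of points.

    Assumption (not from the paper): a cell is "dense" when it holds at
    least min_points points. Dense cells become child nodes and are split
    again; points in sparse cells are kept in the parent as outliers.
    """
    node = {"lo": list(lo), "hi": list(hi), "children": [], "outliers": []}
    if depth == max_depth:
        # Stop refining: store whatever is left directly in this node.
        node["outliers"] = list(points)
        return node

    mid = [(a + b) / 2 for a, b in zip(lo, hi)]

    # Group points by the half-cell they fall into. Only non-empty cells
    # are materialized, so the 2^d possible cells are never enumerated.
    cells = defaultdict(list)
    for p in points:
        key = tuple(int(x >= m) for x, m in zip(p, mid))
        cells[key].append(p)

    for key, members in cells.items():
        if len(members) >= min_points:
            child_lo = [m if k else a for k, a, m in zip(key, lo, mid)]
            child_hi = [b if k else m for k, b, m in zip(key, hi, mid)]
            node["children"].append(
                build(members, child_lo, child_hi, min_points, max_depth, depth + 1)
            )
        else:
            node["outliers"].extend(members)
    return node


def count_points(node):
    """Total number of points stored in a node and all of its descendants."""
    return len(node["outliers"]) + sum(count_points(c) for c in node["children"])


# Toy 2-D data: one tight cluster plus two isolated points.
data = [
    (0.10, 0.10), (0.11, 0.12), (0.12, 0.09), (0.09, 0.11), (0.13, 0.13),
    (0.90, 0.20), (0.60, 0.80),
]
tree = build(data, lo=[0.0, 0.0], hi=[1.0, 1.0], min_points=4, max_depth=3)
```

In this toy run the clustered points end up in progressively smaller cells, while the two isolated points stay at the root as outliers. This mirrors, in a very simplified way, how a density-driven partition can adapt to strongly clustered data.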
