High-Dimensional Image Indexing based on Adaptive Partitioning and Vector Approximation

적응 분할과 벡터 근사에 기반한 고차원 이미지 색인 기법

  • Cha, Gwang-Ho (Dept. of Multimedia, Sookmyung Women's University) ;
  • Jeong, Jin-Wan (Dept. of Computer Science, Korea Advanced Institute of Science and Technology)
  • 차광호 (숙명여자대학교 멀티미디어학과) ;
  • 정진완 (한국과학기술원 전산학과)
  • Published : 2002.04.01

Abstract

In this paper, we propose the LPC+-file for efficient indexing of high-dimensional image data. With the proliferation of multimedia data, there is an increasing need to support the indexing and retrieval of high-dimensional image data. Recently, the LPC-file (5), which is based on vector approximation, was developed for indexing high-dimensional data. The LPC-file performs well, especially when the dataset is uniformly distributed. However, its performance degrades when the dataset is clustered. We improve the performance of the LPC-file for strongly clustered image datasets. The basic idea is to adaptively partition the data space to find subspaces containing high-density clusters and to assign more bits to those subspaces than to others, which increases the discriminatory power of the vector approximations. Even so, the total number of bits used to represent the vector approximations is smaller than in the LPC-file, because the partitioned cells in the LPC+-file share bits. An empirical evaluation shows that the LPC+-file yields significant performance improvements on real image datasets that are strongly clustered.

이 논문은 고차원 이미지 데이터의 효율적인 색인을 위한 LPC+-file을 제시한다. 멀티미디어 데이터의 사용이 증가하면서 고차원 이미지 데이터의 색인과 검색의 지원에 대한 요구가 증가하고 있다. 최근에 고차원 데이터의 색인을 위해 벡터 근사에 기반한 LPC-file (5)이 개발되었다. LPC-file은 특히, 데이터 집합이 균일하게 분포할 때는 좋은 성능을 나타내지만 클러스터(cluster)를 이룰 때는 성능이 하락한다. 본 논문은 강하게 클러스터를 이루는 이미지 데이터 집합에 대해 LPC-file의 성능을 향상시킨 LPC+-file을 제시한다. 기본 아이디어는 고밀도 클러스터를 갖는 부분 공간을 찾기 위해 데이터 공간을 적응적으로 분할하고, 그 공간에 대해 벡터 근사의 식별 능력을 향상시키기 위해 더 많은 수의 비트를 할당하는 것이다. 그러나 분할된 공간이 비트들을 공유하기 때문에 사용되는 전체 비트 수는 오히려 줄어든다. 실험 결과에 따르면 LPC+-file은 강하게 클러스터를 이루는 이미지 데이터 집합에 대해 LPC-file의 성능을 크게 향상시킨다.
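
As a rough illustration of the basic idea in the abstract (spend more approximation bits where the data are dense), the following Python sketch may help. It is not the LPC+-file algorithm, whose details the abstract does not give. It assumes vectors normalized to [0, 1), uses a plain uniform grid, and simply refines any coarse cell that holds at least `min_points` vectors. All function and parameter names (`cell_index`, `coarse_cell`, `find_dense_cells`, `approximate`, `coarse_bits`, `extra_bits`, `min_points`) are hypothetical.

```python
from collections import Counter


def cell_index(x, bits):
    """Quantize a coordinate in [0, 1] to one of 2**bits equal-width slices."""
    n = 1 << bits
    return min(int(x * n), n - 1)


def coarse_cell(vec, bits):
    """Grid cell of a vector when every dimension gets `bits` bits."""
    return tuple(cell_index(x, bits) for x in vec)


def find_dense_cells(data, coarse_bits, min_points):
    """Coarse cells that hold at least `min_points` vectors (assumed density rule)."""
    counts = Counter(coarse_cell(v, coarse_bits) for v in data)
    return {cell for cell, count in counts.items() if count >= min_points}


def approximate(vec, coarse_bits, extra_bits, dense_cells):
    """Return (coarse cell, fine sub-cell); the sub-cell is None in sparse cells.

    Only vectors in dense cells receive `extra_bits` more bits per dimension.
    """
    cell = coarse_cell(vec, coarse_bits)
    if cell not in dense_cells:
        return cell, None
    total_bits = coarse_bits + extra_bits
    # Position of the vector inside its coarse cell, at the finer resolution.
    fine = tuple(
        cell_index(x, total_bits) - (c << extra_bits) for x, c in zip(vec, cell)
    )
    return cell, fine


# Toy data: four clustered vectors and one isolated vector.
data = [(0.10, 0.10), (0.12, 0.11), (0.20, 0.05), (0.22, 0.21), (0.90, 0.60)]
dense = find_dense_cells(data, coarse_bits=2, min_points=3)
approximations = [approximate(v, 2, 2, dense) for v in data]
```

In this toy data set the vectors (0.10, 0.10) and (0.22, 0.21) fall into the same coarse cell, so a fixed 2-bit grid cannot tell them apart. The extra bits spent only on the dense cell separate them, while the isolated vector keeps the shorter code.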

References

  1. Arya, S. et al., 'An Optimal Algorithm for Approximate Nearest Neighbor Searching in Fixed Dimensions,' Journal of the ACM, 45(6), 891-923, Nov. 1998 https://doi.org/10.1145/293347.293348
  2. Beckmann, N. et al., 'The R*-tree: An efficient and robust access method for points and rectangles,' Proc. of ACM SIGMOD Int'l Conf. on Management of Data, 322-331, 1990 https://doi.org/10.1145/93597.98741
  3. Berchtold, S., Boehm, C., and Kriegel, H.-P., 'The Pyramid-Technique: Towards Breaking the Curse of Dimensionality,' Proc. of the ACM SIGMOD Int'l Conf., 142-153, 1998 https://doi.org/10.1145/276304.276318
  4. Berchtold, S., Keim, D. A., Kriegel, H.-P., 'The X-tree: An Index Structure for High-Dimensional Data,' Proc. of the Int'l Conf. on Very Large Data Bases, 28-39, 1996
  5. Cha, G.-H., Zhu, X., Petkovic, D., and Chung, C.-W., 'An Efficient Indexing Method for Nearest Neighbor Searches in High-Dimensional Image Databases,' IEEE Transactions on Multimedia, 4(1), 76-87, March 2002 https://doi.org/10.1109/6046.985556
  6. Cha, G.-H. and Chung, C.-W., 'A New Indexing Scheme for Content-Based Image Retrieval,' Multimedia Tools and Applications, 6(3), 263-288, May 1998 https://doi.org/10.1023/A:1009608331551
  7. Chakrabarti, K. and Mehrotra, S., 'Local Dimensionality Reduction: A New Approach to Indexing High Dimensional Spaces,' Proc. of the Int'l Conf. on VLDB, 89-100, 2000
  8. Flickner, M. et al., 'Query by image and video content: the QBIC system,' IEEE Computer, 28, 23-32, 1995 https://doi.org/10.1109/2.410146
  9. Indyk, P. and Motwani, R., 'Approximate Nearest Neighbors: Towards Removing the Curse of Dimensionality,' Proc. of the ACM Symp. on the Theory of Computing, 604-613, 1998 https://doi.org/10.1145/276698.276876
  10. Kanth, K. V. R., Agrawal, D. and Singh, A., 'Dimensionality Reduction for Similarity Searching in Dynamic Databases,' Proc. of the ACM SIGMOD Int'l Conf., 166-176, 1998 https://doi.org/10.1145/276304.276320
  11. Katayama, N. and Satoh, S., 'The SR-tree: An Index Structure for High-Dimensional Nearest Neighbor Queries,' Proc. of the ACM SIGMOD Int'l Conf., 369-380, 1997 https://doi.org/10.1145/253260.253347
  12. Kushilevitz, E., Ostrovsky, R. and Rabani, Y., 'Efficient Search for Approximate Nearest Neighbor in High Dimensional Spaces,' Proc. of the ACM Symp. on the Theory of Computing, 614-623, 1998 https://doi.org/10.1145/276698.276877
  13. Lin, K.-I., Jagadish, H. V., Faloutsos, C., 'The TV-tree: An Index Structure for High-Dimensional Data,' The VLDB Journal, 3(4), 517-542, 1994 https://doi.org/10.1007/BF01231606
  14. Megiddo, N. and Shaft, U., 'Efficient Nearest Neighbor Indexing Based on a Collection of Space-Filling Curves,' Technical Report RJ 10093, IBM Almaden Research Center, Nov. 1997
  15. Miyahara, M. and Yoshida, Y., 'Mathematical Transform of (R,G,B) Color Data to Munsell (H,V,C) Color Data,' Visual Communication and Image Processing, 1001, 650-657, SPIE, 1992
  16. Niblack, W. et al., 'The QBIC Project: Querying Images By Content Using Color, Texture, and Shape,' Proc. of the SPIE Conf., 173-187, 1993 https://doi.org/10.1117/12.143648
  17. Shepherd, J., Zhu, X., Megiddo, N., 'A Fast Indexing Method for Multidimensional Nearest Neighbor Search,' Proc. of the SPIE Conf., 350-355, 1999 https://doi.org/10.1117/12.333854
  18. Weber, R., Schek, H.-J., Blott, S., 'A Quantitative Analysis and Performance Study for Similarity-Search Methods in High-Dimensional Spaces,' Proc. of the Int'l Conf. on VLDB, 194-205, 1998