A KD-Tree-Based Nearest Neighbor Search for Large Quantities of Data

  • Yen, Shwu-Huey (Department of Computer Science and Information Engineering, Tamkang University)
  • Hsieh, Ya-Ju (Department of Computer Science and Information Engineering, Tamkang University)
  • Received : 2012.07.07
  • Accepted : 2013.03.16
  • Published : 2013.03.31

Abstract

Finding nearest neighbors without prior training has many applications, such as mosaic image formation, image matching, image retrieval, and image stitching. When the data set is very large and high-dimensional, identifying a nearest neighbor (NN) efficiently becomes critical. This study proposes a variation of the KD-tree, the arbitrary KD-tree (KDA), which is constructed without evaluating variances. When the amount of data is large, multiple KDAs with independent tree structures can be constructed efficiently. Tests on extended synthetic databases and real-world SIFT data indicate that the KDA method improves computational efficiency while producing satisfactory accuracy on NN problems.
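As a rough illustration of building a KD-tree without evaluating variances, the sketch below picks each split dimension at random instead of choosing the dimension with the largest variance. The abstract does not say how the KDA selects its dimensions, so the random choice, the median split, the leaf size, and the names `build_kda`, `sq_dist`, and `search` are all assumptions made for this example and should not be read as the authors' method.

```python
import random


def build_kda(points, indices=None, rng=None, leaf_size=4):
    """Build a KD-tree whose split dimension is chosen at random.

    Assumption: "arbitrary" is modeled here as a uniformly random
    dimension; no per-dimension variance is ever computed.
    """
    rng = rng or random.Random(0)
    if indices is None:
        indices = list(range(len(points)))
    if len(indices) <= leaf_size:
        return {"leaf": indices}
    dim = rng.randrange(len(points[0]))          # arbitrary split dimension
    indices = sorted(indices, key=lambda i: points[i][dim])
    mid = len(indices) // 2                      # median split along dim
    return {
        "dim": dim,
        "split": points[indices[mid]][dim],
        "left": build_kda(points, indices[:mid], rng, leaf_size),
        "right": build_kda(points, indices[mid:], rng, leaf_size),
    }


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def search(node, points, query, best=None):
    """Exact NN search with backtracking; returns (squared distance, index)."""
    if best is None:
        best = (float("inf"), -1)
    if "leaf" in node:
        for i in node["leaf"]:
            best = min(best, (sq_dist(points[i], query), i))
        return best
    diff = query[node["dim"]] - node["split"]
    if diff < 0:
        near, far = node["left"], node["right"]
    else:
        near, far = node["right"], node["left"]
    best = search(near, points, query, best)
    # Visit the far side only if the splitting plane is closer than the best match.
    if diff * diff < best[0]:
        best = search(far, points, query, best)
    return best
```

Calling `build_kda` several times with differently seeded `random.Random` instances yields trees with independent structures over the same points, which is the property the abstract attributes to multiple KDAs. The search shown here is exact; an approximate variant that limits backtracking would be a separate design choice not covered by this sketch.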

Cited by

  1. Parallel Implementation Strategy for Content Based Video Copy Detection Using a Multi-core Processor vol.8, pp.10, 2014, https://doi.org/10.3837/tiis.2014.10.014