http://dx.doi.org/10.3745/JIPS.02.0167

The Kernel Trick for Content-Based Media Retrieval in Online Social Networks  

Cha, Guang-Ho (Dept. of Computer Science and Engineering, Seoul National University of Science and Technology)
Publication Information
Journal of Information Processing Systems / v.17, no.5, 2021, pp. 1020-1033
Abstract
Online and mobile social network services (SNS) are now widely used in daily life to share, disseminate, and search for information instantly. In particular, SNS such as YouTube, Flickr, Facebook, and Amazon allow users to upload billions of images and videos and provide users with a large amount of multimedia information. Information retrieval in multimedia-rich SNS is a very useful but challenging task. Content-based media retrieval (CBMR) is the process of obtaining the image or video objects relevant to a given query from a collection of information sources. However, CBMR suffers from the curse of dimensionality because the features of media data are inherently high-dimensional. This paper investigates the effectiveness of the kernel trick in CBMR, specifically kernel principal component analysis (KPCA) for feature extraction or dimensionality reduction. KPCA is a nonlinear extension of linear principal component analysis (LPCA) that discovers nonlinear embeddings using the kernel trick. The fundamental idea of KPCA is to map the input data into a high-dimensional feature space through a nonlinear kernel function and then compute the principal components in that mapped space. Using the Gaussian kernel in our experiments, we compute the principal components of an image dataset in the transformed space and then use them as new feature dimensions for the image dataset. Moreover, KPCA can be applied to many domains, including CBMR, where LPCA has been used to extract features and where the nonlinear extension would be effective. The results of our extensive experiments show that KPCA is very encouraging compared with LPCA in CBMR.
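The KPCA procedure outlined in the abstract (Gaussian kernel, principal components in the mapped space, components reused as new feature dimensions) can be illustrated with a minimal NumPy sketch. This is not the paper's implementation: the function names `gaussian_kernel`, `kpca_fit_transform`, and `nearest_neighbors`, the kernel width `gamma`, and the random feature vectors are all assumptions made for illustration only.

```python
import numpy as np


def gaussian_kernel(X, Y, gamma):
    """Gaussian kernel k(x, y) = exp(-gamma * ||x - y||^2) for all row pairs."""
    sq_dist = (X**2).sum(axis=1)[:, None] + (Y**2).sum(axis=1)[None, :] - 2.0 * X @ Y.T
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-gamma * np.maximum(sq_dist, 0.0))


def kpca_fit_transform(X, n_components, gamma):
    """Project the rows of X onto the top kernel principal components.

    Hypothetical helper: gamma (kernel width) is an assumed parameter.
    """
    n = X.shape[0]
    K = gaussian_kernel(X, X, gamma)

    # Center the kernel matrix, i.e. center the data in the mapped feature space.
    one_n = np.full((n, n), 1.0 / n)
    K_centered = K - one_n @ K - K @ one_n + one_n @ K @ one_n

    # eigh returns eigenvalues in ascending order; keep the largest ones.
    eigvals, eigvecs = np.linalg.eigh(K_centered)
    top = np.argsort(eigvals)[::-1][:n_components]
    lam = np.maximum(eigvals[top], 0.0)
    alpha = eigvecs[:, top]

    # Projection of training point j on component c is sqrt(lam_c) * alpha[j, c].
    return alpha * np.sqrt(lam)


def nearest_neighbors(Z, query_index, k):
    """Indices of the k rows of Z closest (Euclidean) to row query_index."""
    dist = np.linalg.norm(Z - Z[query_index], axis=1)
    order = np.argsort(dist)
    return order[order != query_index][:k]


# Illustrative data only: 50 synthetic 16-dimensional "image feature" vectors.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 16))
Z = kpca_fit_transform(X, n_components=4, gamma=1.0 / 16)
neighbors = nearest_neighbors(Z, query_index=0, k=5)
```

Here `Z` holds the reduced feature vectors, and the nearest-neighbor query runs in that lower-dimensional space instead of the original one. Projecting an unseen query image would additionally require evaluating and centering its kernel values against the training set, which this sketch omits.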
Keywords
Content-Based Retrieval; Dimensionality Curse; Nearest Neighbor Query; Online Social Network; Kernel Method; Kernel Principal Component Analysis; Similarity Search; Social Network Service