A Comparative Study of Local Features in Face-based Video Retrieval

  • Zhou, Juan (Department of Computer Science, Yangtze University)
  • Huang, Lan (Department of Computer Science, Yangtze University)
  • Received : 2016.04.13
  • Accepted : 2017.03.09
  • Published : 2017.03.30

Abstract

Face-based video retrieval has become an active and important branch of intelligent video analysis. Face profiling and matching are a fundamental step and are crucial to the effectiveness of video retrieval. Although many algorithms have been developed for processing static face images, their effectiveness in face-based video retrieval remains unknown, because videos differ in resolution and the faces in them vary in scale, lighting conditions, and viewing angle. In this paper, we combine content-based and semantic-based image analysis techniques and systematically evaluate four mainstream local features for representing face images in the video retrieval task: Harris operators, SIFT descriptors, SURF descriptors, and eigenfaces. Results of ten independent runs of 10-fold cross-validation on datasets consisting of TED (Technology Entertainment Design) talk videos show the effectiveness of our approach: the SIFT descriptors achieved an average F-score of 0.725 in video retrieval and were thus the most effective, while the SURF descriptors took 0.3 seconds per image on average to compute and were the most efficient in most cases.
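As an illustration of the evaluation metric named in the abstract, the following minimal sketch computes an F-score for a set of retrieved videos and averages it over folds. It assumes the balanced form (F1, the harmonic mean of precision and recall), which the abstract does not specify. The function names `f_score` and `mean_f_score` and the video IDs are hypothetical and are not taken from the paper.

```python
def f_score(retrieved, relevant):
    """Balanced F-score (F1) of a retrieved set against the relevant set.

    Assumption: F1 is used; the abstract only says "F-score".
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    if hits == 0:
        # No correct retrievals: precision and recall are both zero.
        return 0.0
    precision = hits / len(retrieved)
    recall = hits / len(relevant)
    return 2 * precision * recall / (precision + recall)


def mean_f_score(folds):
    """Average F-score over (retrieved, relevant) pairs, one pair per fold."""
    scores = [f_score(retrieved, relevant) for retrieved, relevant in folds]
    return sum(scores) / len(scores)


# Hypothetical video IDs, for illustration only (not data from the paper).
folds = [
    (["v1", "v2", "v3", "v4"], ["v1", "v2", "v5"]),
    (["v6", "v7"], ["v6", "v7", "v8", "v9"]),
]
print(round(mean_f_score(folds), 3))
```

In a repeated cross-validation setup such as the one described, a function like `mean_f_score` would be applied within each run and the per-run averages averaged again. That protocol detail is an assumption here, not something the abstract states.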
