Active Learning on Sparse Graph for Image Annotation

  • Li, Minxian (School of Computer Science, Nanjing University of Science and Technology)
  • Tang, Jinhui (School of Computer Science, Nanjing University of Science and Technology)
  • Zhao, Chunxia (School of Computer Science, Nanjing University of Science and Technology)
  • Received: 2012.08.22
  • Reviewed: 2012.09.25
  • Published: 2012.10.31

Abstract

Because of the semantic gap, the performance of automatic image annotation is still far from satisfactory. Active learning offers a possible solution: it selects the most effective samples and asks users to label them for training. A key research question in active learning is how to select those samples. In this paper, we propose a novel active learning approach based on a sparse graph. Unlike existing active learning approaches, the proposed method selects samples according to two criteria: uncertainty and representativeness. Representativeness measures how much a sample's label contributes to the other samples through label propagation, a factor that existing approaches do not take into consideration. Extensive experiments show that adding the representativeness criterion to the sample selection process can significantly improve the effectiveness of active learning.
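
The abstract does not give the selection formula, so the following is only a minimal sketch of how the two criteria could be combined, not the authors' method. It assumes that uncertainty is the binary entropy of a predicted relevance score, that representativeness is the total sparse-graph weight with which a sample contributes to reconstructing the other samples, and that the two are mixed linearly. The names `binary_entropy`, `representativeness`, `select_sample`, and the trade-off parameter `alpha` are hypothetical.

```python
import math


def binary_entropy(p):
    """Assumed uncertainty measure: entropy of a score p in [0, 1], peaking at 0.5."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))


def representativeness(weights, j):
    """Assumed representativeness: total weight of sample j in other samples' rows.

    weights[i][j] is taken to be the sparse-graph weight of sample j in the
    reconstruction of sample i, so column j reflects how far j's label spreads.
    """
    return sum(abs(row[j]) for i, row in enumerate(weights) if i != j)


def select_sample(scores, weights, labeled, alpha=0.5):
    """Return the index of the unlabeled sample with the highest combined score.

    alpha = 1.0 uses uncertainty only; alpha = 0.0 uses representativeness only.
    """
    n = len(scores)
    rep = [representativeness(weights, j) for j in range(n)]
    max_rep = max(rep) or 1.0  # fall back to 1.0 when the graph has no edges
    best, best_value = None, -1.0
    for j in range(n):
        if j in labeled:
            continue
        value = alpha * binary_entropy(scores[j]) + (1.0 - alpha) * rep[j] / max_rep
        if value > best_value:
            best, best_value = j, value
    return best
```

When two unlabeled samples are equally uncertain, this sketch prefers the one whose label would reach more of the graph, which is the behavior the representativeness criterion is meant to add.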
