[KSCI] Korea Science Citation Index Service

Correlation-based Automatic Image Captioning

Hyungjeong, Yang
Pinar, Duygulu
Christos, Falout

Publication Information

Journal of KIISE: Software and Applications, v.31, no.10, 2004, pp. 1386-1399

Abstract

This paper presents correlation-based automatic image captioning. Given a training set of annotated images, we want to discover correlations between visual features and textual features, so that we can automatically generate descriptive textual features for a new, unseen image. We develop models with multiple design alternatives, such as 1) adaptively clustering visual features, 2) weighting visual features and textual features, and 3) reducing dimensionality for noise suppression. We experiment thoroughly on 10 data sets of various content styles from the Corel image database (about 680 MB). The major contributions of this work are: (a) we show that carefully weighting visual and textual features, as well as clustering visual features adaptively, leads to consistent performance improvements, and (b) our proposed methods achieve a relative improvement of up to 45% in annotation accuracy over the state-of-the-art EM approach.
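The pipeline outlined in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the visual features have already been clustered into discrete "blob" ids, uses a tf-idf-like weighting as a stand-in for the paper's weighting schemes, and uses a truncated SVD for the dimensionality reduction step. All function names, the toy data, and the vocabulary are hypothetical.

```python
import numpy as np

def idf_weight(counts):
    """Down-weight features that occur in many images (assumed tf-idf-like scheme)."""
    n_images = counts.shape[0]
    doc_freq = np.count_nonzero(counts, axis=0)
    idf = np.log((1.0 + n_images) / (1.0 + doc_freq)) + 1.0
    return counts * idf

def low_rank(matrix, rank):
    """Keep only the top `rank` singular values of `matrix` (noise suppression)."""
    u, s, vt = np.linalg.svd(matrix, full_matrices=False)
    return (u[:, :rank] * s[:rank]) @ vt[:rank, :]

def fit_translation_table(blob_counts, word_counts, rank):
    """Build an (n_blobs x n_words) blob-to-word correlation table from training images."""
    weighted_blobs = idf_weight(blob_counts)
    weighted_words = idf_weight(word_counts)
    return low_rank(weighted_blobs.T @ weighted_words, rank)

def caption(table, new_blob_counts, vocabulary, top_k=2):
    """Score every word for a new image and return the `top_k` best-scoring words."""
    scores = new_blob_counts @ table
    best = np.argsort(scores)[::-1][:top_k]
    return [vocabulary[i] for i in best]

# Toy training set (hypothetical): rows are images, columns are blob ids / words.
vocabulary = ["sky", "grass", "tiger"]
blob_counts = np.array([[2, 0, 0],
                        [0, 2, 0],
                        [0, 1, 2],
                        [1, 0, 1]], dtype=float)
word_counts = np.array([[1, 0, 0],
                        [0, 1, 0],
                        [0, 1, 1],
                        [1, 0, 1]], dtype=float)

table = fit_translation_table(blob_counts, word_counts, rank=2)
# Caption an unseen image made up only of blob 0.
print(caption(table, np.array([3.0, 0.0, 0.0]), vocabulary, top_k=1))
```

In this sketch the `rank` argument controls how aggressively the correlation table is smoothed; choosing it, and choosing the number of visual clusters, are the kinds of design alternatives the abstract refers to.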

Keywords

Image annotation; Correlation; Singular Value Decomposition; Clustering

