Comparison of Fine-Tuned Convolutional Neural Networks for Clipart Style Classification

  • Lee, Seungbin (Department of Computer Science and Engineering, Sogang University)
  • Kim, Hyungon (Department of Computer Science and Engineering, Sogang University)
  • Seok, Hyekyoung (Department of Computer Science and Engineering, Sogang University)
  • Nang, Jongho (Department of Computer Science and Engineering, Sogang University)
  • Received: 2017.09.05
  • Accepted: 2017.09.30
  • Published: 2017.11.30

Abstract

Clipart is artificial visual content created with tools such as Illustrator to highlight information. The style of a clipart image plays a critical role in determining how it looks. However, previous studies on clipart have focused only on object recognition [16], segmentation, and retrieval of clipart images using hand-crafted image features. Recently, some CNN-based studies on clipart classification by style similarity have been proposed, but they use different CNN models and different benchmark datasets, which makes their performance very hard to compare. This paper presents an experimental analysis of clipart classification based on style similarity, using two well-known CNN models (Inception Resnet V2 [13] and VGG-16 [14]) and transfer learning on the same benchmark dataset (Microsoft Style Dataset 3.6K). The experiments show that Inception Resnet V2 is more accurate than VGG-16 for clipart style classification because of its depth and its parallel convolution maps of various sizes. They also show that end-to-end training improves accuracy by more than 20% in both CNN models.

References

  1. M. F. Barroso, M. J. Fonseca, B. Barroso, P. Ribeiro, and J. A. Jorge, "Retrieving ClipArt Images by Content," in Proceedings of the 3rd International Conference on Image and Video Retrieval, pp. 500-507, 2004.
  2. M. J. Fonseca, A. Ferreira, and J. A. Jorge, "Sketch-Based Retrieval of Complex Drawings Using Hierarchical Topology and Geometry," Computer-Aided Design, vol. 41, no. 12, pp. 1067-1081, 2009. https://doi.org/10.1016/j.cad.2009.09.004
  3. P. Sousa and M. J. Fonseca, "Geometric Matching for Clip-Art Drawing Retrieval," Journal of Visual Communication and Image Representation, vol. 20, no. 2, pp. 71-83, 2009. https://doi.org/10.1016/j.jvcir.2008.11.005
  4. P. Martins, R. Jesus, M. J. Fonseca, and N. Correia, "Clip Art Retrieval Combining Raster and Vector Methods," in Proceedings of 11th International Workshop on Content-Based Multimedia Indexing, pp. 35-40, 2013.
  5. E. Garces, A. Agarwala, D. Gutierrez, and A. Hertzmann, "A Similarity Measure for Illustration Style," ACM Transactions on Graphics, vol. 33, no. 4, 2014.
  6. B. Saleh, M. Dontcheva, A. Hertzmann, and Z. Liu, "Learning Style Similarity for Searching Infographics," in Proceedings of the 41st Graphics Interface Conference, pp. 59-64, 2015.
  7. T. Furuya, S. Kuriyama, and R. Ohbuchi, "An Unsupervised Approach for Comparing Styles of Illustrations," in Proceedings of 13th International Workshop on Content-Based Multimedia Indexing, pp. 35-40, 2013.
  8. S. Karayev, M. Trentacoste, H. Han, A. Agarwala, T. Darrell, A. Hertzmann, and H. Winnemoeller, "Recognizing Image Style," in Proceedings of British Machine Vision Conference, no. 121, 2014.
  9. S. Bell and K. Bala, "Learning Visual Similarity for Product Design with Convolutional Neural Networks," ACM Transactions on Graphics, vol. 34, no. 4, 2015.
  10. C. Kuo, Y. Chou, and P. Chang, "Using Deep Convolutional Neural Networks for Image Retrieval," in Proceedings of Visual Information Processing and Communication VII, pp. 1-6, 2016.
  11. J. Wan, D. Wang, S. Hoi, P. Wu, J. Zhu, Y. Zhang, and J. Li, "Deep Learning for Content-based Image Retrieval: A Comprehensive Study," in Proceedings of the 22nd ACM International Conference on Multimedia, pp. 157-166, 2014.
  12. E. Garces, A. Agarwala, A. Hertzmann, and D. Gutierrez, "Style-Based Exploration of Illustration Datasets," Multimedia Tools and Applications, vol. 76, pp. 13067-13086, 2017. https://doi.org/10.1007/s11042-016-3702-x
  13. C. Szegedy, S. Ioffe, V. Vanhoucke, and A. Alemi, "Inception-v4, Inception-ResNet and the Impact of Residual Connections on Learning," in Proceedings of the 31st AAAI Conference on Artificial Intelligence, pp. 4278-4284, 2017.
  14. K. Simonyan and A. Zisserman, "Very Deep Convolutional Networks for Large-Scale Image Recognition," arXiv preprint arXiv:1409.1556, 2014.
  15. B. Chu, V. Madhavan, O. Beijbom, J. Hoffman, and T. Darrell, "Best Practices for Fine-tuning Visual Classifiers to New Domains," in Proceedings of the European Conference on Computer Vision 2016, pp. 435-442, 2016.
  16. A. Poernomo and D. Kang, "Content-Aware Convolutional Neural Network for Object Recognition Task," International Journal of Advanced Smart Convergence, vol. 5, no. 3, pp. 1-7, 2016. https://doi.org/10.7236/IJASC.2016.5.3.1
  17. OpenClipart Library. [Online]. Available: https://openclipart.org
  18. J. Donahue, Y. Jia, O. Vinyals, J. Hoffman, N. Zhang, E. Tzeng, and T. Darrell, "DeCAF: A Deep Convolutional Activation Feature for Generic Visual Recognition," in Proceedings of the International Conference on Machine Learning, pp. 647-655, 2014.
  19. A. Krizhevsky, I. Sutskever, and G. E. Hinton, "ImageNet Classification with Deep Convolutional Neural Networks," in Proceedings of the 25th International Conference on Neural Information Processing Systems, pp. 1097-1105, 2012.
  20. G. Chechik, V. Sharma, U. Shalit, and S. Bengio, "Large Scale Online Learning of Image Similarity Through Ranking," Journal of Machine Learning Research, vol. 11, pp. 1109-1135, 2010.
  21. P. O'Donovan, J. Libeks, A. Agarwala, and A. Hertzmann, "Exploratory Font Selection Using Crowdsourced Attributes," ACM Transactions on Graphics, vol. 33, no. 4, 2014.
  22. B. Kulis, "Metric Learning: A Survey," Foundations and Trends in Machine Learning, vol. 5, no. 4, pp. 287-364, 2013. https://doi.org/10.1561/2200000019
  23. C. Szegedy, W. Liu, Y. Jia, P. Sermanet, S. Reed, D. Anguelov, D. Erhan, V. Vanhoucke, and A. Rabinovich, "Going Deeper with Convolutions," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1-9, 2015.
  24. B. Kim, "Combining Empirical Feature Map and Conjugate Least Squares Support Vector Machine for Real Time Image Recognition: Research with Jade Solution Company," International Journal of Internet, Broadcasting and Communication, vol. 9, no. 1, pp. 9-17, 2017. https://doi.org/10.7236/IJIBC.2017.9.1.9