http://dx.doi.org/10.6109/jicce.2015.13.3.205

CRF-Based Figure/Ground Segmentation with Pixel-Level Sparse Coding and Neighborhood Interactions

Zhang, Lihe (School of Information and Communication Engineering, Dalian University of Technology)
Piao, Yongri (School of Information and Communication Engineering, Dalian University of Technology)

Publication Information

Journal of Information and Communication Convergence Engineering, v.13, no.3, 2015, pp. 205-214

Abstract

In this paper, we propose a new approach to learning a discriminative model for figure/ground segmentation that combines the bag-of-features and conditional random field (CRF) techniques. We advocate the use of image patches instead of superpixels as the basic processing unit. A superpixel has a homogeneous appearance and adheres to object boundaries, whereas an image patch often contains more discriminative information (e.g., local image structure) for distinguishing its category. We represent each image patch using pixel-level sparse coding. With this feature representation, the unary classifier achieves strong binary segmentation performance on its own. We then integrate unary and pairwise potentials into the CRF model to refine the segmentation results. The pairwise potentials comprise color and texture potentials with neighborhood interactions, and an edge potential. High segmentation accuracy is demonstrated on three benchmark datasets: the Weizmann horse dataset, the VOC2006 cow dataset, and the MSRC multiclass dataset. Extensive experiments show that the proposed approach performs favorably against state-of-the-art approaches.
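
The abstract does not give the energy function, so the following is only a rough sketch of how unary and pairwise potentials are typically combined in a binary CRF on a pixel grid. It assumes a generic contrast-sensitive Potts pairwise term in place of the paper's color, texture, and edge potentials, and uses iterated conditional modes (ICM) as a simple stand-in for whatever inference the authors use. The names `crf_energy`, `icm`, `w_pair`, and `beta` are hypothetical.

```python
import numpy as np


def crf_energy(labels, unary, image, w_pair=1.0, beta=1.0):
    """Energy of a binary labeling on a 4-connected grid.

    labels: (H, W) integer array with values in {0, 1}
    unary:  (H, W, 2) cost of assigning each label at each site
    image:  (H, W) intensities used by the contrast-sensitive pairwise term
    """
    h, w = labels.shape
    rows, cols = np.indices((h, w))
    energy = unary[rows, cols, labels].sum()
    # Right neighbors (dy=0, dx=1) and bottom neighbors (dy=1, dx=0).
    for dy, dx in ((0, 1), (1, 0)):
        a, b = labels[: h - dy, : w - dx], labels[dy:, dx:]
        ia, ib = image[: h - dy, : w - dx], image[dy:, dx:]
        # Assumed Potts penalty, weakened across strong intensity edges.
        weight = np.exp(-beta * (ia - ib) ** 2)
        energy += w_pair * (weight * (a != b)).sum()
    return float(energy)


def icm(unary, image, w_pair=1.0, beta=1.0, sweeps=5):
    """Refine the unary-only labeling by greedy per-site updates (ICM)."""
    labels = unary.argmin(axis=2)
    h, w = labels.shape
    for _ in range(sweeps):
        changed = False
        for y in range(h):
            for x in range(w):
                costs = unary[y, x].astype(float)
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if 0 <= ny < h and 0 <= nx < w:
                        diff = image[y, x] - image[ny, nx]
                        wgt = w_pair * np.exp(-beta * diff ** 2)
                        # Penalize the label that disagrees with this neighbor.
                        costs[1 - labels[ny, nx]] += wgt
                best = int(costs.argmin())
                if best != labels[y, x]:
                    labels[y, x] = best
                    changed = True
        if not changed:
            break
    return labels
```

In this sketch the unary costs play the role of the patch classifier's output, and the pairwise term smooths isolated misclassifications: a site whose unary cost only weakly prefers one label is flipped when all of its neighbors agree on the other.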

Keywords

Conditional random field; Figure/ground segmentation; Neighborhood interaction; Sparse coding
