Semantic Segmentation of Urban Scenes Using Location Prior Information

사전위치정보를 이용한 도심 영상의 의미론적 분할

  • Received : 2017.04.13
  • Accepted : 2017.08.15
  • Published : 2017.08.31

Abstract

This paper proposes a method for semantic segmentation of urban scenes based on location prior information. Because the major scene elements in urban environments, such as roads, buildings, and vehicles, tend to appear at specific locations, using location priors for these elements can improve segmentation performance. The location priors are defined in a special 2D coordinate system, referred to as road-normal coordinates, which is perpendicular to the orientation of the road. Using the depth information for each element, all pixels that can be projected are mapped into these coordinates, and the learned prior information is applied to those pixels. The proposed location prior is modeled by defining the unary potential of a conditional random field (CRF) as the sum of two sub-potentials: an appearance feature-based potential and a location potential. The proposed method was validated on the publicly available KITTI dataset, which provides urban images and corresponding 3D depth measurements.
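The unary potential described above can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how either sub-potential is computed, so the sketch assumes that the appearance term is the negative log of a per-class probability from some classifier, that the location prior is a per-class histogram over a 2D grid of (lateral offset, height) positions, and that a hypothetical weight `lam` balances the two terms. The function names, the grid ranges, and the toy values are all illustrative.

```python
import numpy as np

def location_potential(coords, prior, x_range, y_range, eps=1e-6):
    """Negative log of a location prior, looked up per pixel.

    coords : (N, 2) positions in the assumed road-normal plane (lateral, height).
    prior  : (C, H, W) per-class histogram over that plane (assumed learned offline).
    """
    _, h, w = prior.shape
    # Map metric coordinates to histogram bin indices, clamped to the grid.
    u = (coords[:, 0] - x_range[0]) / (x_range[1] - x_range[0]) * w
    v = (coords[:, 1] - y_range[0]) / (y_range[1] - y_range[0]) * h
    u = np.clip(u.astype(int), 0, w - 1)
    v = np.clip(v.astype(int), 0, h - 1)
    return -np.log(prior[:, v, u].T + eps)  # shape (N, C)

def unary_potential(appearance_prob, coords, has_depth, prior,
                    x_range, y_range, lam=1.0, eps=1e-6):
    """Sum of an appearance sub-potential and a location sub-potential, shape (N, C)."""
    psi = -np.log(appearance_prob + eps)
    loc = location_potential(coords, prior, x_range, y_range, eps)
    # Only pixels with a depth measurement can be projected into the
    # road-normal plane, so only those receive the location term.
    psi[has_depth] += lam * loc[has_depth]
    return psi

# Toy example: 2 classes (0 = road, 1 = building) on a 2x2 prior grid.
prior = np.array([
    [[0.9, 0.9], [0.1, 0.1]],   # road: likely in the low-height row
    [[0.1, 0.1], [0.9, 0.9]],   # building: likely in the high-height row
])
appearance_prob = np.full((3, 2), 0.5)          # uninformative appearance
coords = np.array([[0.0, 0.1], [0.0, 0.9], [0.0, 0.5]])
has_depth = np.array([True, True, False])       # third pixel has no depth

psi = unary_potential(appearance_prob, coords, has_depth, prior,
                      x_range=(-1.0, 1.0), y_range=(0.0, 1.0))
labels = psi.argmin(axis=1)  # lowest-energy class per pixel (unary term only)
```

In this toy setting the appearance term is deliberately uninformative, so any preference between the two classes comes from the location term alone, and the pixel without depth keeps equal potentials for both classes. A full CRF would add pairwise terms and minimize the total energy rather than taking a per-pixel minimum.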
