http://dx.doi.org/10.3837/tiis.2022.02.009

Deep Facade Parsing with Occlusions  

Ma, Wenguang (Faculty of Information Technology, Beijing University of Technology)
Ma, Wei (Faculty of Information Technology, Beijing University of Technology)
Xu, Shibiao (National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences)
Publication Information
KSII Transactions on Internet and Information Systems (TIIS) / v.16, no.2, 2022, pp. 524-543
Abstract
Correct facade image parsing is essential to the semantic understanding of outdoor scenes. Unfortunately, buildings are often partly hidden by various occlusions, which cause many existing methods to fail. In this paper, we propose an end-to-end deep network for facade parsing with occlusions. The network learns to decompose an input image into visible and invisible parts by occlusion reasoning. A context aggregation module then collects nonlocal cues for semantic segmentation of the visible part. In addition, considering the regularity of man-made buildings, a repetitive pattern completion branch is designed to infer the contents of the invisible regions by referring to the visible part. Finally, the parsing map of the input facade image is generated by fusing the results for the visible and invisible parts. Experiments on both synthetic and real datasets demonstrate that the proposed method outperforms state-of-the-art methods in parsing facades with occlusions. Moreover, we apply our method to image inpainting and 3D semantic modeling.
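The final fusion step described in the abstract can be illustrated with a minimal sketch. It assumes that fusion is a per-pixel selection driven by a binary occlusion mask; the abstract does not specify the actual fusion operator, so this is only one plausible reading. The function name `fuse_parsing_maps`, the label encoding (0 = wall, 1 = window), and the toy data are all hypothetical.

```python
from typing import List

Grid = List[List[int]]


def fuse_parsing_maps(visible_labels: Grid,
                      completed_labels: Grid,
                      occlusion_mask: Grid) -> Grid:
    """Fuse two label maps pixel by pixel.

    Assumption (not taken from the paper): where the mask marks a pixel
    as occluded (non-zero), use the label inferred by the completion
    branch; elsewhere keep the label predicted for the visible part.
    """
    fused = []
    for vis_row, comp_row, mask_row in zip(visible_labels,
                                           completed_labels,
                                           occlusion_mask):
        fused.append([
            comp if occluded else vis
            for vis, comp, occluded in zip(vis_row, comp_row, mask_row)
        ])
    return fused


# Toy 2x4 facade: 0 = wall, 1 = window (hypothetical label encoding).
# The right-hand window column is hidden, so the visible branch sees wall.
visible = [[0, 1, 0, 0],
           [0, 1, 0, 0]]
# The completion branch repeats the window pattern into the hidden area.
completed = [[0, 1, 0, 1],
             [0, 1, 0, 1]]
# 1 marks an occluded pixel.
mask = [[0, 0, 0, 1],
        [0, 0, 1, 1]]

fused = fuse_parsing_maps(visible, completed, mask)
print(fused)
```

In this toy case the occluded window column is recovered from the completion branch, while unoccluded pixels keep the labels predicted for the visible part.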
Keywords
Facade parsing; occlusion; repetitive pattern; man-made structure
Citations & Related Records
Times Cited By KSCI: 1