http://dx.doi.org/10.3837/tiis.2020.02.022

Deep Window Detection in Street Scenes  

Ma, Wenguang (Faculty of Information Technology, Beijing University of Technology)
Ma, Wei (Faculty of Information Technology, Beijing University of Technology)
Publication Information
KSII Transactions on Internet and Information Systems (TIIS) / v.14, no.2, 2020, pp. 855-870
Abstract
Windows are key components of building facades. Detecting windows, which is crucial to 3D semantic reconstruction and scene parsing, is a challenging task in computer vision. Early methods approach window detection with hand-crafted features and traditional classifiers. However, these methods cannot handle the diversity of window instances in real scenes and suffer from heavy computational costs. Recently, object detection algorithms based on convolutional neural networks have attracted much attention due to their good performance. Unfortunately, directly training them for the challenging task of window detection does not achieve satisfactory results. In this paper, we propose an approach for window detection. It involves an improved Faster R-CNN architecture featuring a window region proposal network, RoI feature fusion, and a context enhancement module. In addition, a post-optimization process based on the regular distribution of windows is designed to refine the detection results obtained by the improved deep architecture. Furthermore, we present a newly collected dataset, which is the largest one for window detection in real street scenes to date. Experimental results on both existing datasets and the new dataset show that the proposed method has outstanding performance.
Keywords
Window dataset; window detection; regular distribution; context enhancement; convolutional neural network