http://dx.doi.org/10.7848/ksgpc.2018.36.6.469

Evaluation of Building Detection from Aerial Images Using Region-based Convolutional Neural Network for Deep Learning  

Lee, Dae Geon (Dept. of Environment, Energy & Geoinformatics, Sejong University)
Cho, Eun Ji (Dept. of Environment, Energy & Geoinformatics, Sejong University)
Lee, Dong-Cheon (Dept. of Environment, Energy & Geoinformatics, Sejong University)
Publication Information
Journal of the Korean Society of Surveying, Geodesy, Photogrammetry and Cartography, Vol. 36, No. 6, 2018, pp. 469-481
Abstract
DL (Deep Learning) is becoming popular in various fields as a way to implement artificial intelligence that resembles human learning and cognition. Because DL is based on the complicated structure of the ANN (Artificial Neural Network), it requires substantial computing power and computation cost. A variety of DL models with improved performance have been developed as powerful computing hardware has become available. The main purpose of this paper is to detect buildings from aerial images and to evaluate the performance of Mask R-CNN (Region-based Convolutional Neural Network), recently developed by the FAIR (Facebook AI Research) team. Mask R-CNN is an R-CNN that is regarded as one of the best ANN models for semantic segmentation with pixel-level accuracy. The performance of a DL model is determined by its training as well as by the architecture of the ANN. In this paper, we investigate the characteristics of Mask R-CNN with various types of images and evaluate the possibility of generalization, which is the ultimate goal of DL. As future work, it is expected that the reliability and generalization of DL will be improved by using a variety of spatial information data for training the DL models.
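
The paper itself does not include source code; as an illustrative aid only, the short Python sketch below shows how a Mask R-CNN of the kind evaluated here can be applied to a single aerial image tile using torchvision's reference implementation. The COCO-pretrained weights, the file name aerial_tile.png, and the 0.7 score threshold are assumptions for this sketch, not details from the paper; detecting a building class would in practice require fine-tuning the network on labeled building footprints.

import torch
import torchvision
from torchvision.transforms.functional import to_tensor
from PIL import Image

# Load torchvision's Mask R-CNN (ResNet-50 FPN backbone) with COCO-pretrained
# weights; torchvision >= 0.13 is assumed for the weights="DEFAULT" argument.
model = torchvision.models.detection.maskrcnn_resnet50_fpn(weights="DEFAULT")
model.eval()

# Read one aerial image tile (hypothetical file name) as an RGB tensor in [0, 1].
image = to_tensor(Image.open("aerial_tile.png").convert("RGB"))

with torch.no_grad():
    # For each input image the model returns bounding boxes, class labels,
    # confidence scores, and per-instance soft masks.
    prediction = model([image])[0]

# Keep confident detections and binarize their masks to obtain pixel-level output.
keep = prediction["scores"] > 0.7                       # assumed score threshold
masks = (prediction["masks"][keep] > 0.5).squeeze(1)    # (N, H, W) boolean masks
boxes = prediction["boxes"][keep]
print(f"{masks.shape[0]} instances detected")

To produce building footprints rather than COCO categories, the box and mask heads would be replaced and the network fine-tuned on building annotations before this inference step is run.
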
Keywords
Deep Learning; Region-based Convolutional Neural Network; Object Detection; Semantic Segmentation;
Citations & Related Records
Times Cited by KSCI: 3
1 Back, C.S. and Yom, J.H. (2018), Comparison of point cloud volume calculated by artificial intelligence learning method and photogrammetric method, Proceedings of Korean Society of Surveying, Geodesy, Photogrammetry and Cartography, 19-20 April, Yongin, Korea, pp. 227-230.
2 Ball, J., Anderson, D., and Chan, C. (2017), A comprehensive survey of deep learning in remote sensing: Theories, tools and challenges for the community, Journal of Applied Remote Sensing, Vol. 11, No. 4, pp. 1-54.
3 Campos-Taberner, M., Romero-Soriano, A., Gatta, C., Camps-Valls, G., Lagrange, A., Le Saux, B., Beaupere, A., Boulch, A., Chan-Hon-Tong, A., Herbin, S., Randrianarivo, H., Ferecatu, M., Shimoni, M., Moser, G., and Tuia, D. (2016), Processing of extremely high-resolution LiDAR and RGB data: Outcome of the 2015 IEEE GRSS data fusion contest-Part A: 2-D contest, IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, Vol. 9, No. 12, pp. 5547-5559.   DOI
4 Choe, Y.J. and Yom, J.H. (2017), Downscaling of MODIS land surface temperature to LANDSAT scale using multi-layer perceptron, Journal of the Korean Society of Surveying, Geodesy, Photogrammetry and Cartography, Vol. 35, No. 4, pp. 313-318. (in Korean with English abstract)   DOI
5 Chung, D. and Lee, I. (2017), Point cloud classification based on deep learning, Proceedings of Korean Society of Surveying, Geodesy, Photogrammetry and Cartography, Yeosu, Korea, pp. 110-113. (in Korean with English abstract)
6 Deng, Z., Sun, H., Zhou, S., Zhao, J., Lei, L., and Zou, H. (2018), Multi-scale object detection in remote sensing imagery with convolutional neural networks, ISPRS Journal of Photogrammetry and Remote Sensing, Vol. 145, pp. 3-22.   DOI
7 Garcia-Garcia, A., Orts-Escolano, S., Oprea, S., Villena-Martinez, V., and Garcia-Rodriguez, J. (2017), A review on deep learning techniques applied to semantic segmentation, arXiv:1704.06857.
8 Girshick, R. (2015), Fast R-CNN, IEEE International Conference on Computer Vision, ICCV 2015, 13-16 December, Santiago, Chile, pp. 1440-1448.
9 Girshick, R., Donahue, J., Darrell, T., and Malik, J. (2016), Region-based convolutional networks for accurate object detection and segmentation, IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 38, No. 1, pp. 1-16.   DOI
10 Hazirbas, C., Ma, L., Domokos, C., and Cremers, D. (2016), FuseNet: Incorporating depth into semantic segmentation via fusion-based CNN architecture, Proceedings of the Asian Conference on Computer Vision, Vol. 2, 20-24 November, Taipei, Taiwan.
11 He, K., Gkioxari, G., Dollar, P., and Girshick, R. (2017), Mask R-CNN, Proceedings of IEEE International Conference on Computer Vision (ICCV) 2017, 22-29 October, Venice, Italy, pp. 2980-2988.
12 Hertz, J., Krogh, A., and Palmer, R. (1991), Introduction to the Theory of Neural Computation, Addison-Wesley, Reading, MA, 327p.
13 Kang, J., Korner, M., Wang, Y., Taubenbock, H., and Zhu, X. (2018), Building instance classification using street view images, ISPRS Journal of Photogrammetry and Remote Sensing, Vol. 145, pp. 44-59.   DOI
14 Kemker, R., Salvaggio, C., and Kanan, C. (2018), Algorithms for semantic segmentation of multispectral remote sensing imagery using deep learning, ISPRS Journal of Photogrammetry and Remote Sensing, Vol. 145, pp. 60-77.   DOI
15 Kim, H. and Bae, T. (2017), Preliminary study of deep learning-based precipitation prediction, Journal of the Korean Society of Surveying, Geodesy, Photogrammetry and Cartography, Vol. 35, No. 5, pp. 423-430.   DOI
16 Marmanis, D., Wegner, J., Galliani, S., Schindler, K., Datcu, M., and Stilla, U. (2016), Semantic segmentation of aerial images with an ensemble of CNNs, ISPRS Annals of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Vol. III-3, XXIII ISPRS Congress, 12-19 July, Prague, Czech Republic, pp. 473-480.
17 Krizhevsky, A., Sutskever, I., and Hinton, G. (2012), ImageNet classification with deep convolutional neural networks, Proceedings of the 25th International Conference on Neural Information Processing Systems, Vol. 1, 3-8 December, Lake Tahoe, Nevada, pp. 1097-1105.
18 LeCun, Y., Boser, B., Denker, J., Henderson, D., Howard, R., Hubbard, W., and Jackel, L. (1989), Backpropagation applied to handwritten zip code recognition, Neural Computation, Vol. 1, No. 4, pp. 541-551.   DOI
19 Lee, G. and Yom, J.H. (2018), Design and implementation of web-based automatic preprocessing system of remote sensing imagery for machine learning modeling, Journal of the Korean Society for Geospatial Information Science, Vol. 26, No. 1, pp. 61-67. (in Korean with English abstract)
20 Long, J., Shelhamer, E., and Darrell, T. (2015), Fully convolutional networks for semantic segmentation, Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 7-12 June, Boston, MA, pp. 3431-3440.
21 Maturana, D. and Scherer, S. (2015), 3D convolutional neural networks for landing zone detection from LiDAR, IEEE International Conference on Robotics and Automation, 26-30 May, Seattle, Washington, pp. 3471-3478.
22 McCulloch, W. and Pitts, W. (1943), A logical calculus of the ideas immanent in nervous activity, Bulletin of Mathematical Biophysics, Vol. 5, pp. 115-133.
23 Oh, H. (2010), Landslide detection and landslide susceptibility mapping using aerial photos and artificial neural networks, Korean Journal of Remote Sensing, Vol. 26, No. 1, pp. 47-57. (in Korean with English abstract)
24 Russakovsky, O., Deng, J., Su, H., Krause, J., Satheesh, S., Ma, S., Huang, Z., Karpathy, A., Khosla, A., Bernstein, M., and Berg, A. (2015), ImageNet large scale visual recognition challenge, International Journal of Computer Vision, Vol. 115, No. 3, pp. 211-252.   DOI
25 Pang, Y., Sun, M., Jiang, X., and Li, X. (2018), Convolution in convolution for network in network, IEEE Transactions on Neural Networks and Learning Systems, Vol. 29, No. 5, pp. 1587-1597.   DOI
26 Parthasarathy, D. (2017), A brief history of CNNs in image segmentation: From R-CNN to Mask R-CNN, https://blog.athelas.com/a-brief-history-of-cnns-in-image-segmentation-from-r-cnn-to-mask-r-cnn-34ea83205de4 (last date accessed: 6 September 2018).
27 Ren, S., He, K., Girshick, R., and Sun, J. (2017), Faster R-CNN: Towards real-time object detection with region proposal networks, IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 39, No. 6, pp. 1137-1149.   DOI
28 Rosenblatt, F. (1958), The perceptron: A probabilistic model for information storage and organization in the brain, Psychological Review, Vol. 65, No. 6, pp. 386-408.   DOI
29 Rumelhart, D., Hinton, G., and Williams, R. (1986), Learning representations by back-propagating errors, Nature, Vol. 323, pp. 533-536.   DOI
30 Schenk, T. (1999), Digital Photogrammetry: Volume 1, TerraScience, Laurelville, OH, 428p.
31 Wang, S., Quan, D., Liang, X., Ning, M., Guo, Y., and Jiao, L. (2018), A deep learning framework for remote sensing image registration, ISPRS Journal of Photogrammetry and Remote Sensing, Vol. 145, pp. 148-164.   DOI
32 Shaikh, F. (2018), Automatic image captioning using deep learning (CNN and LSTM) in PyTorch, Analytics Vidhya, https://www.analyticsvidhya.com/blog/2018/04/solving-an-image-captioning-task-using-deep-learning/ (last date accessed: 31 October 2018).
33 Simard, P., Steinkraus, D., and Platt, J. (2003), Best practices for convolutional neural networks applied to visual document analysis, Proceedings of the Seventh International Conference on Document Analysis and Recognition, ICDAR 2003, 3-6 August, Vol. 2, pp. 958-962.
34 Tokarczyk, P., Wegner, J., Walk, S., and Schindler, K. (2015), Features, color spaces, and boosting: New insights on semantic classification of remote sensing images, IEEE Transactions on Geoscience and Remote Sensing, Vol. 53, No. 1, pp. 280-295.   DOI
35 You, Q., Jin, H., Wang, Z., Fang, C., and Luo, J. (2016), Image captioning with semantic attention, IEEE Conference on Computer Vision and Pattern Recognition, 26 June-1 July, Las Vegas, Nevada, pp. 4651-4659.
36 Vo, A.V., Truong-Hong, L., Laefer, D., Tiede, D., d'Oleire-Oltmanns, S., Baraldi, A., Shimoni, M., Moser, G., and Tuia, D. (2016), Processing of extremely high-resolution LiDAR and RGB data: Outcome of the 2015 IEEE GRSS data fusion contest-Part B: 3-D contest, IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, Vol. 9, No. 12, pp. 5560-5575.   DOI
37 Audebert, N., Le Saux, B., and Lefevre, S. (2018), Beyond RGB: Very high resolution urban remote sensing with multimodal deep networks, ISPRS Journal of Photogrammetry and Remote Sensing, Vol. 140, pp. 20-32.   DOI
38 Zhang, B., Gu, J., Chen, C., Han, J., Su, X., Cao, X., and Liu, J. (2018), One-two-one networks for compression artifacts reduction in remote sensing, ISPRS Journal of Photogrammetry and Remote Sensing, Vol. 145, pp. 184-196.   DOI
39 Xing, Y., Wang, M., Yang, S., and Jiao, L. (2018), Pansharpening via deep metric learning, ISPRS Journal of Photogrammetry and Remote Sensing, Vol. 145, pp. 165-183.   DOI
40 Xu, K., Ba, J., Kiros, R., Cho, K., Courville, A., Salakhutdinov, R., Zemel, R., and Bengio, Y. (2015), Show, attend and tell: Neural image caption generation with visual attention, International Conference on Machine Learning, 6-11 July, Lille, France, pp. 2048-2057.