http://dx.doi.org/10.12989/gae.2021.25.1.001

Landslide susceptibility assessment using feature selection-based machine learning models  

Liu, Lei-Lei (Key Laboratory of Metallogenic Prediction of Nonferrous Metals and Geological Environment Monitoring, Ministry of Education, School of Geosciences and Info-Physics, Central South University)
Yang, Can (Key Laboratory of Metallogenic Prediction of Nonferrous Metals and Geological Environment Monitoring, Ministry of Education, School of Geosciences and Info-Physics, Central South University)
Wang, Xiao-Mi (School of Resources and Environmental Science, Hunan Normal University)
Publication Information
Geomechanics and Engineering / v.25, no.1, 2021, pp. 1-16
Abstract
Machine learning models have been widely used for landslide susceptibility assessment (LSA) in recent years. The large number of inputs, or conditioning factors, for these models, however, can reduce computational efficiency and make data collection more difficult. Feature selection addresses this problem by selecting the most important features among all factors, thereby reducing the number of input variables. However, two important questions remain to be answered: (1) how do feature selection methods affect the performance of machine learning models? and (2) which feature selection method is the most suitable for a given machine learning model? This paper addresses these two questions by comparing the predictive performance of 13 feature selection-based machine learning (FS-ML) models and 5 ordinary machine learning models on LSA. First, five commonly used machine learning models (i.e., logistic regression, support vector machine, artificial neural network, Gaussian process and random forest) and six typical feature selection methods from the literature are adopted to constitute the proposed models. Then, fifteen conditioning factors are chosen as input variables and 1,017 landslides are used as recorded data. Next, the feature selection methods are used to rank the importance of the conditioning factors and create feature subsets, based on which the 13 FS-ML models are constructed. For each machine learning model, the best FS-ML model is selected according to the area under the curve (AUC) value. Finally, the five optimal FS-ML models obtained are applied to the LSA of the study area. The predictive abilities of the FS-ML models on LSA are verified and compared using the receiver operating characteristic (ROC) curve and statistical indicators such as sensitivity, specificity and accuracy. The results show that different feature selection methods have different effects on the performance of machine learning models for LSA. FS-ML models generally outperform the ordinary machine learning models. The best FS-ML model is the random forest (RF) optimized by recursive feature elimination (RFE), and RFE is an optimal method for feature selection.
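The evaluation indicators named in the abstract (sensitivity, specificity, accuracy and the area under the ROC curve) can be illustrated with a small sketch. This is not the authors' code, which is not given on this page. The labels, scores, the 0.5 threshold and all function names below are hypothetical and chosen only for illustration. AUC is computed here through its rank interpretation: the probability that a randomly chosen landslide cell scores higher than a randomly chosen non-landslide cell.

```python
from typing import Sequence, Tuple


def sensitivity_specificity_accuracy(
    y_true: Sequence[int], y_score: Sequence[float], threshold: float = 0.5
) -> Tuple[float, float, float]:
    """Confusion-matrix indicators at a fixed (assumed) decision threshold."""
    tp = fp = tn = fn = 0
    for label, score in zip(y_true, y_score):
        predicted = 1 if score >= threshold else 0
        if label == 1 and predicted == 1:
            tp += 1
        elif label == 1 and predicted == 0:
            fn += 1
        elif label == 0 and predicted == 1:
            fp += 1
        else:
            tn += 1
    sensitivity = tp / (tp + fn)          # share of landslides correctly flagged
    specificity = tn / (tn + fp)          # share of stable cells correctly cleared
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    return sensitivity, specificity, accuracy


def auc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison (ties count as 0.5)."""
    pos = [s for label, s in zip(y_true, y_score) if label == 1]
    neg = [s for label, s in zip(y_true, y_score) if label == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = landslide, 0 = non-landslide.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_score = [0.9, 0.8, 0.6, 0.4, 0.7, 0.3, 0.2, 0.1]

print(sensitivity_specificity_accuracy(y_true, y_score))  # (0.75, 0.75, 0.75)
print(auc(y_true, y_score))                               # 0.875
```

Unlike the three threshold-based indicators, AUC does not depend on the chosen cut-off, which is one reason it is a convenient single criterion for ranking candidate models.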
Keywords
landslide; susceptibility assessment; machine learning; feature selection; Geographic Information System