Improved prediction of soil liquefaction susceptibility using ensemble learning algorithms

  • Satyam Tiwari (Department of Civil Engineering, Indian Institute of Technology (ISM) Dhanbad)
  • Sarat K. Das (Department of Civil Engineering, Indian Institute of Technology (ISM) Dhanbad)
  • Madhumita Mohanty (Department of Civil Engineering, Indian Institute of Technology (ISM) Dhanbad)
  • Prakhar (Department of Civil Engineering, National Institute of Technology Jaipur)
  • Received : 2023.04.06
  • Accepted : 2024.05.09
  • Published : 2024.06.10

Abstract

Predicting the susceptibility of soil to liquefaction from a limited set of parameters is challenging, particularly when the available databases are highly unbalanced. The current study evaluates different ensemble learning classification algorithms on highly unbalanced databases of results from three in-situ tests: the standard penetration test (SPT), the shear wave velocity (Vs) test, and the cone penetration test (CPT). The input parameters for these datasets consist of earthquake intensity parameters, strong ground motion parameters, and in-situ soil testing parameters, with the liquefaction index serving as the binary output parameter. After a rigorous comparison with existing literature, extreme gradient boosting (XGBoost), bagging, and random forest (RF) emerge as the most efficient models for classifying liquefaction instances across the different datasets. For the SPT- and Vs-based models, XGBoost performs best, followed by light gradient boosting machine (LightGBM) and bagging; for the CPT-based models, bagging ranks highest, followed by gradient boosting and RF. The CPT-based models show lower Gmean(error), which makes them preferable for predicting soil liquefaction susceptibility. Key parameters influencing model performance include the internal friction angle of the soil (ϕ) and the percentage of fines smaller than 75 µm (F75) for the SPT and Vs data, and the normalized average cone tip resistance (qc) and peak horizontal ground acceleration (amax) for the CPT data. Adding Vs measurements to the SPT data also improved prediction efficiency compared with using SPT data alone. Furthermore, to enhance usability, a graphical user interface (GUI) is proposed for classification based on user-provided input parameters.
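
To illustrate why a G-mean type metric suits highly unbalanced liquefaction databases, the following is a minimal sketch that computes the standard G-mean (the geometric mean of the accuracies on the positive and negative classes) alongside plain accuracy. The abstract does not give the exact definition of Gmean(error), so this sketch covers only the standard G-mean and is not the authors' implementation. The function names and the ten-case sample are hypothetical and chosen purely for illustration.

```python
from math import sqrt


def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels, where 1 = liquefied, 0 = not liquefied."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def accuracy(y_true, y_pred):
    """Fraction of cases classified correctly, regardless of class."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    return (tp + tn) / (tp + tn + fp + fn)


def g_mean(y_true, y_pred):
    """Geometric mean of sensitivity (positive class) and specificity (negative class)."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return sqrt(sensitivity * specificity)


# Hypothetical unbalanced sample (not data from the study):
# 2 liquefied cases and 8 non-liquefied cases.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]

# A trivial classifier that never predicts liquefaction.
always_negative = [0] * 10

# A classifier that catches one liquefied case and raises one false alarm.
one_hit = [1, 0, 0, 0, 0, 0, 0, 0, 0, 1]

for name, y_pred in [("always_negative", always_negative), ("one_hit", one_hit)]:
    print(name, round(accuracy(y_true, y_pred), 3), round(g_mean(y_true, y_pred), 3))
```

Both hypothetical classifiers score the same accuracy on this sample, yet the one that never predicts liquefaction has a G-mean of zero, because G-mean collapses whenever either class is missed entirely. This is the property that makes it a stricter yardstick than accuracy when liquefied and non-liquefied cases are unevenly represented.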

Acknowledgement

The authors thank the Ministry of Education, Government of India, for funding this research through the Prime Minister Research Fellowship and Grant (PMRF ID: 1601650).
