Predicting As Contamination Risk in Red River Delta using Machine Learning Algorithms

  • Ottong, Zheina J. (School of Earth Sciences and Environmental Engineering, Gwangju Institute of Science and Technology (GIST)) ;
  • Puspasari, Reta L. (School of Earth Sciences and Environmental Engineering, Gwangju Institute of Science and Technology (GIST)) ;
  • Yoon, Daeung (Chonnam National University) ;
  • Kim, Kyoung-Woong (School of Earth Sciences and Environmental Engineering, Gwangju Institute of Science and Technology (GIST))
  • Received : 2022.03.03
  • Reviewed : 2022.04.03
  • Published : 2022.04.28

Abstract

Elevated arsenic (As) levels in groundwater are a major health problem worldwide. In the Red River Delta in Vietnam, several million residents face a high risk of chronic As poisoning. As is released into groundwater by a natural process, the microbially driven reductive dissolution of Fe(III) oxides. Unaware of the contamination, Red River Delta residents have extracted this groundwater through private tube wells for drinking and daily use. Long-term consumption of As-contaminated groundwater can lead to various health problems. A predictive model would therefore be useful for identifying the contamination risk of wells in the Red River Delta. This study used four machine learning algorithms to predict the probability of As contamination at study sites in the Red River Delta, Vietnam. The gradient boosting machine (GBM) was the best-performing model, with accuracy, precision, sensitivity, and specificity of 98.7%, 100%, 95.2%, and 100%, respectively. It also achieved the highest area under the curve (AUC), 92% for the precision-recall curve (PRC) and 96% for the receiver operating characteristic (ROC) curve, with Eh and Fe as the most important variables. The partial dependence plots of As concentration on the model parameters showed that a high probability of elevated As is associated with low values of well depth, Eh, and SO₄, along with high PO₄³⁻ and NH₄⁺. These conditions trigger the reductive dissolution of iron phases, thus releasing As into groundwater.
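
The four classification metrics reported in the abstract can all be derived from the counts in a binary confusion matrix. The following is a minimal Python sketch of those definitions, not the authors' code. The function name `classification_metrics` and the counts passed to it are illustrative assumptions: the counts were chosen only because they reproduce the reported percentages, and they are not taken from the study's test set.

```python
def classification_metrics(tp, fp, tn, fn):
    """Return standard binary-classification metrics from confusion counts.

    tp/fn: contaminated wells predicted correctly / missed
    tn/fp: safe wells predicted correctly / flagged by mistake
    Assumes every denominator is non-zero (no guard for empty classes).
    """
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,      # share of all wells classified correctly
        "precision": tp / (tp + fp),        # share of flagged wells that are truly high-As
        "sensitivity": tp / (tp + fn),      # share of high-As wells that were flagged
        "specificity": tn / (tn + fp),      # share of safe wells that were cleared
    }


# Hypothetical counts (an assumption, not the paper's data): 77 test wells,
# 21 of them high-As, with a single high-As well missed and no false alarms.
metrics = classification_metrics(tp=20, fp=0, tn=56, fn=1)

for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```

With these assumed counts, a single missed contaminated well lowers sensitivity to about 95% while precision and specificity stay at 100%, which illustrates why the four figures should be read together rather than relying on accuracy alone.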

Acknowledgements

The authors gratefully acknowledge Lenny H. E. Winkel, Pham Thi Kim Trang, Vi Mai Lan, Caroline Stengel, Manouchehr Amini, Nguyen Thi Ha, Pham Hung Viet, and Michael Berg for the publicly available hydrochemical As data of the Red River Delta. This work was supported by a National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (No. 2021R1A2C1094272).
