http://dx.doi.org/10.7850/jkso.2022.27.2.071

WQI Class Prediction of Sihwa Lake Using Machine Learning-Based Models  

KIM, SOO BIN (Marine Environmental Research Center, Korea Institute of Ocean Science & Technology (KIOST))
LEE, JAE SEONG (Marine Environmental Research Center, Korea Institute of Ocean Science & Technology (KIOST))
KIM, KYUNG TAE (Marine Environmental Research Center, Korea Institute of Ocean Science & Technology (KIOST))
Publication Information
The Sea: JOURNAL OF THE KOREAN SOCIETY OF OCEANOGRAPHY / v.27, no.2, 2022, pp. 71-86
Abstract
The water quality index (WQI) has been widely used to evaluate marine water quality. In Korea, the WQI is categorized into five classes under the marine environmental standards. However, calculating the WQI for huge datasets is a complex and time-consuming process. The current study therefore proposes machine learning (ML) based models that predict the WQI class from water quality datasets. Sihwa Lake, one of the specially managed coastal zones, was selected as the modeling site. Adaptive boosting (AdaBoost) and tree-based pipeline optimization (TPOT) algorithms were used to train the models, and the performance of each model was evaluated with classification metrics (accuracy, precision, F1, and log loss). Before training, feature importance and sensitivity analyses were conducted to find the best input combination for each algorithm. The results showed that bottom dissolved oxygen (DOBot) was the most important variable affecting model performance. Conversely, surface dissolved inorganic nitrogen (DINSur) and dissolved inorganic phosphorus (DIPSur) had weaker effects on the prediction of the WQI class. In addition, a comparison of the spatio-temporal and class sensitivities of each best model showed that performance varied across stations, seasons, and WQI classes. In conclusion, the modeling results showed that the TPOT algorithm performs better than the AdaBoost algorithm when feature selection is not considered. Moreover, the WQI class of unknown water quality datasets could be reliably predicted using a TPOT model trained with satisfactory training datasets.
Keywords
Water quality index; Machine learning; Sihwa Lake; Adaptive boosting (AdaBoost); Tree-based pipeline optimization (TPOT);
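The four classification metrics named in the abstract (accuracy, precision, F1, and log loss) can be illustrated with a minimal sketch. The abstract does not state how per-class scores were averaged, so macro averaging is an assumption here, as are the function names, the zero-division convention, and the tiny three-class label vectors, which are invented for illustration and are not data from the study.

```python
import math


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted class equals the true class."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def _counts(y_true, y_pred, label):
    """True positives, false positives, and false negatives for one class."""
    tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
    fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
    fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
    return tp, fp, fn


def macro_precision(y_true, y_pred, labels):
    """Unweighted mean of per-class precision (assumed averaging scheme)."""
    scores = []
    for label in labels:
        tp, fp, _ = _counts(y_true, y_pred, label)
        # Assumed convention: a class that is never predicted scores 0.
        scores.append(tp / (tp + fp) if tp + fp else 0.0)
    return sum(scores) / len(labels)


def macro_f1(y_true, y_pred, labels):
    """Unweighted mean of per-class F1 = 2*TP / (2*TP + FP + FN)."""
    scores = []
    for label in labels:
        tp, fp, fn = _counts(y_true, y_pred, label)
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(labels)


def log_loss(y_true, proba, labels, eps=1e-15):
    """Mean negative log of the probability assigned to the true class.

    Each row of `proba` holds class probabilities in the order of `labels`.
    """
    index = {label: i for i, label in enumerate(labels)}
    total = 0.0
    for t, row in zip(y_true, proba):
        p = min(max(row[index[t]], eps), 1 - eps)  # clip to avoid log(0)
        total -= math.log(p)
    return total / len(y_true)


# Hypothetical WQI class labels, invented for illustration only.
labels = [1, 2, 3]
y_true = [1, 2, 2, 3, 3, 3]
y_pred = [1, 2, 3, 3, 3, 2]
uniform = [[1 / 3, 1 / 3, 1 / 3] for _ in y_true]

print("accuracy :", accuracy(y_true, y_pred))
print("precision:", macro_precision(y_true, y_pred, labels))
print("F1       :", macro_f1(y_true, y_pred, labels))
print("log loss :", log_loss(y_true, uniform, labels))
```

Accuracy, precision, and F1 score only the predicted label, while log loss also penalizes poorly calibrated class probabilities, which is one reason to report it alongside the other three when comparing classifiers.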