http://dx.doi.org/10.3741/JKWRA.2021.54.S-1.1037

Novel two-stage hybrid paradigm combining data pre-processing approaches to predict biochemical oxygen demand concentration  

Kim, Sungwon (Department of Railroad Construction and Safety Engineering, Dongyang University)
Seo, Youngmin (Department of Constructional and Environmental Engineering, Kyungpook National University)
Zakhrouf, Mousaab (URMER Laboratory, Department of Hydraulics, Faculty of Technology, University of Tlemcen)
Malik, Anurag (Punjab Agricultural University, Regional Research Station)
Publication Information
Journal of Korea Water Resources Association / v.54, no.spc1, 2021, pp. 1037-1051
Abstract
Biochemical oxygen demand (BOD) concentration, one of the important water quality indicators, is treated as a measurement item for the ecological condition of lakes and rivers. This investigation employed a novel two-stage hybrid paradigm (i.e., wavelet-based gated recurrent unit, wavelet-based generalized regression neural networks, and wavelet-based random forests) to predict BOD concentration at the Dosan and Hwangji stations, South Korea. These models were compared with the corresponding independent models (i.e., gated recurrent unit, generalized regression neural networks, and random forests). Diverse water quality and quantity indicators were used to develop the independent and two-stage hybrid models based on several input combinations (i.e., Divisions 1-5). The models were evaluated using three statistical indices: the root mean square error (RMSE), Nash-Sutcliffe efficiency (NSE), and correlation coefficient (CC). The results indicate that the two-stage hybrid models do not always improve the predictive accuracy of the independent models. The DWT-RF5 model (RMSE = 0.108 mg/L) provided more accurate prediction of BOD concentration than the other optimal models at Dosan station, and the DWT-GRNN4 model (RMSE = 0.132 mg/L) was the best for predicting BOD concentration at Hwangji station.
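The three indices named in the abstract can be sketched as follows. This is a minimal illustration using the commonly cited textbook definitions of RMSE, NSE, and CC; the exact formulations used in the paper are not shown in this record, and the function names and the sample values below are hypothetical, not data from the study.

```python
import math


def rmse(observed, predicted):
    """Root mean square error, in the units of the data (e.g. mg/L)."""
    n = len(observed)
    squared_errors = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    return math.sqrt(squared_errors / n)


def nse(observed, predicted):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 matches the observed mean."""
    mean_obs = sum(observed) / len(observed)
    residual = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    variance = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - residual / variance


def cc(observed, predicted):
    """Pearson correlation coefficient between observed and predicted values."""
    mean_obs = sum(observed) / len(observed)
    mean_pred = sum(predicted) / len(predicted)
    covariance = sum(
        (o - mean_obs) * (p - mean_pred) for o, p in zip(observed, predicted)
    )
    spread_obs = math.sqrt(sum((o - mean_obs) ** 2 for o in observed))
    spread_pred = math.sqrt(sum((p - mean_pred) ** 2 for p in predicted))
    return covariance / (spread_obs * spread_pred)


if __name__ == "__main__":
    # Hypothetical BOD values in mg/L, for illustration only.
    bod_observed = [1.2, 0.9, 1.5, 1.1, 0.8]
    bod_predicted = [1.1, 1.0, 1.4, 1.2, 0.9]
    print("RMSE:", rmse(bod_observed, bod_predicted))
    print("NSE:", nse(bod_observed, bod_predicted))
    print("CC:", cc(bod_observed, bod_predicted))
```

A lower RMSE and NSE and CC values closer to 1 indicate a better fit, which is how figures such as RMSE = 0.108 mg/L would be compared across models.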
Keywords
Biochemical oxygen demand; Gated recurrent unit; Generalized regression neural networks; Random forests; Discrete wavelet transform; Water quality indicator