http://dx.doi.org/10.15681/KSWE.2019.35.5.432

A Study on Predicting TDI(Trophic Diatom Index) in tributaries of Han river basin using Correlation-based Feature Selection technique and Random Forest algorithm  

Kim, Minkyu (ETWaters Inc.)
Yoon, Chun Gyeong (Department of Environmental Health Science, Konkuk University)
Rhee, Han-Pil (ETWaters Inc.)
Hwang, Soon-Jin (Department of Environmental Health Science, Konkuk University)
Lee, Sang-Woo (Department of Forestry and Landscape Architecture, Konkuk University)
Abstract
The purpose of this study is to predict the Trophic Diatom Index (TDI) in tributaries of the Han River watershed using the random forest algorithm. One year (2017) of supplied aquatic ecology health data was used. The data include water quality factors (BOD, T-N, $NH_3-N$, T-P, $PO_4-P$, water temperature, DO, pH, conductivity, turbidity), hydraulic factors (water width, average water depth, average water velocity), and TDI scores. Seven factors (water temperature, BOD, T-N, $NH_3-N$, T-P, $PO_4-P$, and average water depth) were selected by Correlation-based Feature Selection. A TDI prediction model was then generated by random forest using these seven factors. The model was first evaluated on the 2017 data set. $R^2$, % Difference, NSE (Nash-Sutcliffe Efficiency), RMSE (Root Mean Square Error), and accuracy rate indicate that the model is suitable for predicting TDI: $R^2$ is 0.93, % Difference is -0.37, NSE is 0.89, RMSE is 8.22, and the accuracy rate is 70.4%. An additional evaluation was performed using a data set covering more than 17 times the number of measured points, and the results were similar to those obtained with the 2017 data set. The Wilcoxon signed-rank test showed no statistically significant difference between actual and predicted data for the 2017 data set. These results can help identify the factors that are likely to affect aquatic ecology health, and they provide direction for water quality management in watersheds that must be continuously preserved.
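The goodness-of-fit measures named in the abstract can be illustrated with a minimal Python sketch. The abstract does not state the exact formulas used, so the definitions below are assumptions based on common usage: $R^2$ as the squared Pearson correlation, % Difference as the relative difference between predicted and observed totals, and NSE and RMSE in their standard forms. The function names and the sample values are hypothetical and are not taken from the study. The accuracy rate is omitted because it depends on TDI grade boundaries that the abstract does not give.

```python
import math


def rmse(obs, pred):
    """Root mean square error between observed and predicted values."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))


def nse(obs, pred):
    """Nash-Sutcliffe efficiency: 1 minus the residual variance ratio."""
    mean_obs = sum(obs) / len(obs)
    residual = sum((o - p) ** 2 for o, p in zip(obs, pred))
    spread = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - residual / spread


def percent_difference(obs, pred):
    """Assumed definition: 100 * (sum of predicted - sum of observed) / sum of observed."""
    return 100.0 * (sum(pred) - sum(obs)) / sum(obs)


def r_squared(obs, pred):
    """Assumed definition: squared Pearson correlation coefficient."""
    n = len(obs)
    mean_obs = sum(obs) / n
    mean_pred = sum(pred) / n
    cov = sum((o - mean_obs) * (p - mean_pred) for o, p in zip(obs, pred))
    var_obs = sum((o - mean_obs) ** 2 for o in obs)
    var_pred = sum((p - mean_pred) ** 2 for p in pred)
    return cov * cov / (var_obs * var_pred)


# Hypothetical TDI scores for illustration only (not data from the study).
observed = [40.0, 55.0, 70.0, 85.0]
predicted = [42.0, 53.0, 71.0, 80.0]

print("RMSE:", rmse(observed, predicted))
print("NSE:", nse(observed, predicted))
print("% Difference:", percent_difference(observed, predicted))
print("R^2:", r_squared(observed, predicted))
```

Under these definitions, an NSE close to 1 and a % Difference close to 0 both indicate good agreement, which is consistent with how the abstract reads the reported values of 0.89 and -0.37.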
Keywords
Aquatic ecology health; Correlation-based Feature Selection; Random forest; TDI Prediction;