A Study on Predicting TDI (Trophic Diatom Index) in Tributaries of the Han River Basin Using the Correlation-based Feature Selection Technique and the Random Forest Algorithm

  • Received : 2019.06.18
  • Accepted : 2019.09.25
  • Published : 2019.09.30

Abstract

The purpose of this study is to predict the Trophic Diatom Index (TDI) in tributaries of the Han River watershed using the random forest algorithm. One year (2017) of supplied aquatic ecological health data was used. The data include water quality factors (BOD, T-N, $NH_3-N$, T-P, $PO_4-P$, water temperature, DO, pH, conductivity, turbidity), hydraulic factors (water width, average water depth, average water velocity), and the TDI score. Seven factors, namely water temperature, BOD, T-N, $NH_3-N$, T-P, $PO_4-P$, and average water depth, were selected by Correlation-based Feature Selection (CFS). A TDI prediction model was then built with random forest using these seven factors. The model was first evaluated on the 2017 data set. The evaluation metrics, $R^2$, % Difference, NSE (Nash-Sutcliffe Efficiency), RMSE (Root Mean Square Error), and accuracy rate, indicate that the model is suitable for predicting TDI: $R^2$ is 0.93, % Difference is -0.37, NSE is 0.89, RMSE is 8.22, and the accuracy rate is 70.4%. An additional evaluation was performed using a data set covering more than 17 times as many measurement points, and the results were similar to those obtained with the 2017 data set. The Wilcoxon Signed Ranks Test shows no statistically significant difference between actual and predicted values for the 2017 data set. These results can help identify the factors that likely affect aquatic ecological health, and they can provide direction for water quality management in a watershed that must be continuously preserved.
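
The goodness-of-fit measures named in the abstract can be illustrated with a minimal sketch. The abstract does not give the formulas, so the definitions below are assumptions: $R^2$ is taken as the squared Pearson correlation, % Difference as the percent difference between predicted and observed totals, NSE as the standard Nash-Sutcliffe form, and RMSE as the usual root mean square error. The function names and the sample TDI values are hypothetical and are not taken from the study. The accuracy rate is left out because the abstract does not say how it is computed.

```python
import math


def r_squared(obs, pred):
    """Squared Pearson correlation (assumed definition of R^2)."""
    n = len(obs)
    mean_o = sum(obs) / n
    mean_p = sum(pred) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(obs, pred))
    var_o = sum((o - mean_o) ** 2 for o in obs)
    var_p = sum((p - mean_p) ** 2 for p in pred)
    return cov * cov / (var_o * var_p)


def percent_difference(obs, pred):
    """Percent difference of totals (assumed definition of % Difference)."""
    return 100.0 * (sum(pred) - sum(obs)) / sum(obs)


def nse(obs, pred):
    """Nash-Sutcliffe Efficiency: 1 - SSE / variance of observations."""
    mean_o = sum(obs) / len(obs)
    sse = sum((o - p) ** 2 for o, p in zip(obs, pred))
    sst = sum((o - mean_o) ** 2 for o in obs)
    return 1.0 - sse / sst


def rmse(obs, pred):
    """Root Mean Square Error."""
    sse = sum((o - p) ** 2 for o, p in zip(obs, pred))
    return math.sqrt(sse / len(obs))


if __name__ == "__main__":
    # Hypothetical TDI scores for illustration only (not data from the study).
    observed = [40.0, 55.0, 70.0, 85.0]
    predicted = [45.0, 50.0, 75.0, 80.0]
    print("R^2         :", r_squared(observed, predicted))
    print("% Difference:", percent_difference(observed, predicted))
    print("NSE         :", nse(observed, predicted))
    print("RMSE        :", rmse(observed, predicted))
```

Under these definitions, NSE equals 1 for a perfect fit and falls to 0 when the model does no better than the observed mean, while % Difference measures only overall bias and can be near zero even when individual errors are large. This is one reason to report the measures together.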
