Development of empirical formula for imbalanced transverse dispersion coefficient data set using SMOTE

SMOTE를 이용한 편중된 횡 분산계수 데이터에 대한 추정식 개발

  • Lee, Sunmi (Department of Civil Engineering, Seoul National University of Science and Technology) ;
  • Yoon, Taewon (Department of Civil Engineering, Seoul National University of Science and Technology) ;
  • Park, Inhwan (Department of Civil Engineering, Seoul National University of Science and Technology)
  • 이선미 (서울과학기술대학교 건설시스템공학과) ;
  • 윤태원 (서울과학기술대학교 건설시스템공학과) ;
  • 박인환 (서울과학기술대학교 건설시스템공학과)
  • Received: 2021.10.26
  • Accepted: 2021.11.25
  • Published: 2021.12.31

Abstract

In this study, a new empirical formula for the two-dimensional (2D) transverse dispersion coefficient was developed from the results of previous tracer test studies, and its performance was evaluated. Because most tracer tests have been conducted under conditions where the width-to-depth ratio (W/H) is less than 50, existing empirical formulas developed from these imbalanced data have limited applicability to rivers with W/H greater than 50. To develop an empirical formula from the imbalanced tracer test data, the Synthetic Minority Oversampling TEchnique (SMOTE) was therefore used to generate new data that reflect the properties of the existing tracer test data. Hydraulic data and transverse dispersion coefficients for W/H greater than 50 were oversampled using SMOTE, and the reliability of the oversampled data was evaluated using the receiver operating characteristic (ROC) curve. The empirical formula was then developed from a data set that included the oversampled data, and its accuracy was compared with that of the empirical formulas suggested in previous studies using the coefficient of determination (R²). The proposed formula gave R² = 0.81 for W/H < 50 and R² = 0.92 for W/H > 50, an improvement in accuracy over the previous formulas.
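
The oversampling and scoring steps summarized in the abstract can be illustrated with a minimal sketch. It assumes the basic SMOTE interpolation rule, in which each synthetic row is placed at a random point on the segment between a minority sample and one of its k nearest minority neighbors. The function names `smote` and `r_squared`, the column layout, the value of `k`, and all numbers below are hypothetical and chosen for illustration only. They are not the authors' implementation or data, and the R² definition used here (1 - SS_res/SS_tot) is likewise an assumption.

```python
import math
import random


def smote(minority, n_new, k=3, seed=0):
    """Create n_new synthetic rows by interpolating between a randomly chosen
    minority row and one of its k nearest minority neighbors."""
    if len(minority) < 2:
        raise ValueError("SMOTE needs at least two minority rows")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest neighbors of `base` among the other minority rows
        # (Euclidean distance on the raw, unscaled columns).
        neighbors = sorted(
            (row for row in minority if row is not base),
            key=lambda row: math.dist(row, base),
        )[:k]
        mate = rng.choice(neighbors)
        gap = rng.random()  # uniform in [0, 1)
        synthetic.append(tuple(b + gap * (m - b) for b, m in zip(base, mate)))
    return synthetic


def r_squared(observed, predicted):
    """Coefficient of determination, assumed here to be 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Hypothetical minority rows for W/H > 50: (W/H, U/u*, D_T/(H u*)).
# These values are invented for illustration, not taken from the tracer tests.
wide_rows = [
    (55.0, 12.0, 0.45),
    (62.0, 15.0, 0.60),
    (80.0, 10.0, 0.52),
    (120.0, 18.0, 0.90),
]
new_rows = smote(wide_rows, n_new=6, k=2, seed=42)
```

Because every synthetic row is a convex combination of two existing rows, it stays inside the range spanned by the original minority data, so the oversampled set cannot extrapolate beyond the measured W/H range. In practice the columns would likely need to be scaled before the neighbor search, since W/H would otherwise dominate the Euclidean distance.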

Acknowledgement

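This work was supported by the Korea Environmental Industry & Technology Institute (KEITI) through the Measurement and Risk Assessment Program for Management of Microplastics, funded by the Korea Ministry of Environment (Grant No. 2021003110003).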
