A Comparison of Machine Learning Models for Predicting Forest Gross Primary Production

Predicting Forest Gross Primary Production Using Machine Learning Algorithms

  • Lee, Bora (Forest Ecology & Climate Change Division, National Institute of Forest Science) ;
  • Jang, Keunchang (Forest Ecology & Climate Change Division, National Institute of Forest Science) ;
  • Kim, Eunsook (Forest Ecology & Climate Change Division, National Institute of Forest Science) ;
  • Kang, Minseok (National Center for AgroMeteorology) ;
  • Chun, Jung-Hwa (Research Planning and Coordination, National Institute of Forest Science) ;
  • Lim, Jong-Hwan (Forest Ecology & Climate Change Division, National Institute of Forest Science)
  • Received : 2019.03.08
  • Reviewed : 2019.03.27
  • Published : 2019.03.30

Abstract

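In forest ecosystems, gross primary production (GPP) is an indicator of forest productivity under climate change and of the factors that influence it, such as phenology, forest health, and carbon cycling. GPP is estimated from eddy covariance tower data or satellite observations, and mechanism-based modeling is also used to account for physical and topographic limitations or climate change. However, mechanism-based modeling of the forest carbon cycle, including GPP, is difficult: the biological, physiological, and chemical processes of plants are intricately entangled with environmental conditions such as topography, climate, and time, so the models are nonlinear and inflexible, and it is hard to incorporate every condition that affects the response. In this study, we built forest productivity estimation models with machine learning algorithms, using eddy covariance data and satellite imagery, and examined their usability and potential for extension. Meteorological variables from the eddy covariance data and the satellite data were used as explanatory variables, and GPP observed at the eddy covariance tower was used as validation data. The models were built for three input cases: 1) air temperature ($T_{air}$), solar radiation ($R_d$), relative humidity (RH), precipitation (PPT), and evapotranspiration (ET) observed by eddy covariance; 2) MODIS temperature (T), insolation ($R_{sd}$), and VPD, excluding the enhanced vegetation index; and 3) MODIS temperature (T), insolation ($R_{sd}$), VPD, and the enhanced vegetation index (EVI). The models were trained on data from 2006 - 2013 and validated on data from 2014 and 2015. The machine learning algorithms were support vector machine (SVM), random forest (RF), and artificial neural network (ANN); the classical multiple linear regression model (LM) was used for a simple comparison. Models trained on eddy covariance inputs reached Pearson correlation coefficients of 0.89 - 0.92 (MSE = 1.24 - 1.62). Models trained on MODIS inputs reached 0.82 - 0.86 (MSE = 1.99 - 2.45) without EVI and 0.92 - 0.93 (MSE = 1.00 - 1.24) with EVI. These results show that machine learning algorithms based on MODIS satellite imagery have high potential for building forest GPP estimation models.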

Terrestrial Gross Primary Production (GPP) is the largest global carbon flux, and forest ecosystems are important because they can store substantially more carbon than other terrestrial ecosystems. There have been several attempts to estimate GPP using mechanism-based models. However, mechanism-based models, which include biological, chemical, and physical processes, are limited by a lack of flexibility in predicting non-stationary ecological processes caused by local and global change. Instead, mechanism-free methods are strongly recommended for estimating nonlinear dynamics that occur in nature, such as GPP. Therefore, we used mechanism-free machine learning techniques to estimate daily GPP. In this study, support vector machine (SVM), random forest (RF), and artificial neural network (ANN) were used and compared with the traditional multiple linear regression model (LM). MODIS products and meteorological parameters from eddy covariance data from 2006 to 2013 were used to train the machine learning and LM models. The GPP predictions were compared with daily GPP from eddy covariance measurements in a deciduous forest in South Korea in 2014 and 2015. Statistical measures including the correlation coefficient (R), root mean square error (RMSE), and mean squared error (MSE) were used to evaluate model performance. In general, the machine learning models (R = 0.85 - 0.93, MSE = 1.00 - 2.05, p < 0.001) performed better than the linear regression model (R = 0.82 - 0.92, MSE = 1.24 - 2.45, p < 0.001). These results indicate high predictability and suggest that mechanism-free machine learning models combined with remote sensing can be extended to predicting non-stationary ecological processes such as seasonal GPP.
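The train-then-validate protocol described in the abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' code: the record layout, the field names `t_air` and `gpp`, the helper names, and the data are all made up, and a one-predictor least-squares line stands in for the multiple linear regression model (LM).

```python
import math

TRAIN_YEARS = range(2006, 2014)  # 2006-2013 inclusive, as in the abstract
VALID_YEARS = (2014, 2015)


def split_by_year(records):
    """Split daily records into training and validation sets by year."""
    train = [r for r in records if r["year"] in TRAIN_YEARS]
    valid = [r for r in records if r["year"] in VALID_YEARS]
    return train, valid


def fit_line(x, y):
    """Least-squares fit of y = a + b * x (one predictor only)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    b = sxy / sxx
    return my - b * mx, b


def mse(obs, pred):
    """Mean squared error between observed and predicted values."""
    return sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs)


# Made-up records: GPP is an exact linear function of temperature here,
# so the fitted line reproduces the validation data without error.
records = [
    {"year": year, "t_air": float(t), "gpp": 0.5 * t + 1.0}
    for year in range(2006, 2016)
    for t in (5, 10, 15, 20, 25)
]

train, valid = split_by_year(records)
a, b = fit_line([r["t_air"] for r in train], [r["gpp"] for r in train])

obs = [r["gpp"] for r in valid]
pred = [a + b * r["t_air"] for r in valid]
error = mse(obs, pred)
rmse = math.sqrt(error)
```

Splitting by calendar year rather than at random keeps the two validation years entirely unseen during training, which matches the evaluation the abstract describes.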

[Image: NRGSBM_2019_v21n1_29_f0001.png]

Fig. 1. Schematic diagram showing data flow and analysis. Models were trained over the training period (2006 - 2013) on three types of input data sets. Type 1: air temperature (Tair), relative humidity (RH), daily net radiation (Rd), precipitation (PPT), and evapotranspiration (ET) from eddy flux measurements. Type 2: air temperature (T), daily shortwave radiation (Rsd), and vapor pressure deficit (VPD) from MODIS. Type 3: air temperature (T), daily shortwave radiation (Rsd), vapor pressure deficit (VPD), and EVI from MODIS. Gross Primary Production (GPP) was calculated with the multiple linear regression model (LM), support vector machine (SVM), random forest (RF), and artificial neural network (ANN), and evaluated against eddy covariance GPP in 2014 and 2015.
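The three input types in Fig. 1 amount to three column selections applied before any model is trained. A minimal Python sketch of that bookkeeping follows; the dictionary keys, function names, and sample values are hypothetical, and only the variable lists and algorithm abbreviations come from the caption.

```python
# Variable lists taken from the Fig. 1 caption; the key names are made up.
INPUT_SETS = {
    "type1_ec": ["Tair", "RH", "Rd", "PPT", "ET"],
    "type2_modis": ["T", "Rsd", "VPD"],
    "type3_modis_evi": ["T", "Rsd", "VPD", "EVI"],
}
ALGORITHMS = ["LM", "SVM", "RF", "ANN"]


def design_matrix(records, input_set):
    """Return one feature row per daily record for the chosen input set."""
    columns = INPUT_SETS[input_set]
    return [[record[name] for name in columns] for record in records]


def experiment_grid():
    """List every (input set, algorithm) pair to be trained and evaluated."""
    return [(s, a) for s in INPUT_SETS for a in ALGORITHMS]


# One made-up MODIS day; the numbers are illustrative only.
day = {"T": 12.5, "Rsd": 18.0, "VPD": 0.8, "EVI": 0.45}
row_without_evi = design_matrix([day], "type2_modis")
row_with_evi = design_matrix([day], "type3_modis_evi")
```

Keeping Type 2 and Type 3 identical apart from the EVI column makes the contribution of EVI the only difference between the two MODIS experiments.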

[Image: NRGSBM_2019_v21n1_29_f0002.png]

Fig. 2. Daily GPP predicted by the linear regression model (LM), Support Vector Machine (SVM), Random Forest (RF), and Artificial Neural Network (ANN) (black open circles) and GPP obtained from eddy covariance (EC) measurements (gray closed circles) in 2014 and 2015. The models used EC measurement datasets as input.

[Image: NRGSBM_2019_v21n1_29_f0003.png]

Fig. 3. Comparison of eddy covariance (EC) measured GPP and GPP modeled by the models trained on EC measurement datasets, in 2014 and 2015. LM = linear regression model, SVM = Support Vector Machine, RF = Random Forest, ANN = Artificial Neural Network.

[Image: NRGSBM_2019_v21n1_29_f0004.png]

Fig. 4. Daily GPP predicted by the linear regression model (LM), Support Vector Machine (SVM), Random Forest (RF), and Artificial Neural Network (ANN) (black open circles) and GPP obtained from eddy covariance (EC) measurements (gray closed circles) in 2014 and 2015. The models used MODIS datasets without the Enhanced Vegetation Index (EVI) as input.

[Image: NRGSBM_2019_v21n1_29_f0005.png]

Fig. 5. Comparison of eddy covariance (EC) measured GPP and GPP modeled by the models trained on MODIS datasets without the Enhanced Vegetation Index (EVI), in 2014 and 2015. LM = linear regression model, SVM = Support Vector Machine, RF = Random Forest, ANN = Artificial Neural Network.

[Image: NRGSBM_2019_v21n1_29_f0006.png]

Fig. 6. Daily GPP predicted by the linear regression model (LM), Support Vector Machine (SVM), Random Forest (RF), and Artificial Neural Network (ANN) (black open circles) and GPP obtained from eddy covariance (EC) measurements (gray closed circles) in 2014 and 2015. The models used MODIS datasets with EVI as input.

[Image: NRGSBM_2019_v21n1_29_f0007.png]

Fig. 7. Comparison of eddy covariance (EC) measured GPP and GPP modeled by the models trained on MODIS datasets with EVI, in 2014 and 2015. LM = linear regression model, SVM = Support Vector Machine, RF = Random Forest, ANN = Artificial Neural Network.

Table 1. Summary of MODIS Land Products used as explanatory variables from 2006 to 2015

[Image: NRGSBM_2019_v21n1_29_t0001.png]

Table 2. Summary of MODIS Atmosphere Products used as explanatory variables from 2006 to 2015

[Image: NRGSBM_2019_v21n1_29_t0002.png]

Table 3. Summary of statistics for the different algorithms based on the three different input sets. R, RMSE, STD, and MSE denote correlation coefficient, root mean square error, standard deviation, and mean squared error, respectively.

[Image: NRGSBM_2019_v21n1_29_t0003.png]
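For reference, the statistics named in the Table 3 caption can be written out using their standard textbook definitions. The page itself does not give the formulas, so the notation is an assumption: $O_i$ is the observed eddy covariance GPP on day $i$, $P_i$ is the modeled GPP, bars denote means over the validation period, and $n$ is the number of validation days. STD is left out because the caption does not say which series it refers to.

```latex
R = \frac{\sum_{i=1}^{n} (O_i - \bar{O})(P_i - \bar{P})}
         {\sqrt{\sum_{i=1}^{n} (O_i - \bar{O})^2}\,
          \sqrt{\sum_{i=1}^{n} (P_i - \bar{P})^2}},
\qquad
\mathrm{MSE} = \frac{1}{n} \sum_{i=1}^{n} (P_i - O_i)^2,
\qquad
\mathrm{RMSE} = \sqrt{\mathrm{MSE}}
```

Under these definitions R measures only how well the modeled and observed series co-vary, while MSE and RMSE also penalize bias, which is why the abstract reports both kinds of statistic.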
