http://dx.doi.org/10.7780/kjrs.2021.37.6.1.21

Comparative Assessment of Linear Regression and Machine Learning for Analyzing the Spatial Distribution of Ground-level NO2 Concentrations: A Case Study for Seoul, Korea  

Kang, Eunjin (Department of Urban and Environmental Engineering, Ulsan National Institute of Science and Technology)
Yoo, Cheolhee (Department of Urban and Environmental Engineering, Ulsan National Institute of Science and Technology)
Shin, Yeji (Department of Urban and Environmental Engineering, Ulsan National Institute of Science and Technology)
Cho, Dongjin (Department of Urban and Environmental Engineering, Ulsan National Institute of Science and Technology)
Im, Jungho (Department of Urban and Environmental Engineering, Ulsan National Institute of Science and Technology)
Publication Information
Korean Journal of Remote Sensing / v.37, no.6_1, 2021, pp. 1739-1756
Abstract
Atmospheric nitrogen dioxide (NO2) originates mainly from anthropogenic emissions. It contributes to the formation of secondary pollutants and ozone through chemical reactions and adversely affects human health. Although ground stations monitor NO2 concentrations in real time in Korea, station data alone make it difficult to analyze the spatial distribution of NO2 concentrations, especially over areas with no stations. This study therefore compared spatial interpolation of NO2 concentrations for the year 2020 using two linear-regression methods (multi linear regression (MLR) and regression kriging (RK)) and two machine learning approaches (random forest (RF) and support vector regression (SVR)). The four approaches were compared using leave-one-out cross-validation (LOOCV). In the daily LOOCV results, MLR, RK, and SVR produced an average daily index of agreement (IOA) of 0.57, higher than that of RF (0.50). The average daily normalized root mean square error of RK was 0.9483%, slightly lower than those of the other models. MLR, RK, and SVR showed similar seasonal distribution patterns and a similar dynamic range of estimated NO2 concentrations, whereas the dynamic range from RF was relatively small. Multivariate linear regression approaches are therefore expected to be promising for spatial interpolation of ground-level NO2 concentrations and other parameters in urban areas.
Keywords
spatial interpolation; gap-filling; ground-level NO2 concentration; random forest; support vector regression; regression kriging; multi linear regression
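The leave-one-out evaluation summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The function names are hypothetical, the station data are synthetic, and only an ordinary least-squares linear model stands in for the four approaches. Two definitions are assumptions, because the abstract does not state them: the IOA is taken in Willmott's original form, and the RMSE is normalized by the observed mean.

```python
import numpy as np


def index_of_agreement(obs, pred):
    """Index of agreement in Willmott's original form (assumed definition)."""
    obs = np.asarray(obs, dtype=float)
    pred = np.asarray(pred, dtype=float)
    mean_obs = obs.mean()
    denom = np.sum((np.abs(pred - mean_obs) + np.abs(obs - mean_obs)) ** 2)
    if denom == 0:
        return float("nan")  # undefined when every value equals the mean
    return 1.0 - np.sum((pred - obs) ** 2) / denom


def nrmse_percent(obs, pred):
    """RMSE divided by the observed mean, in percent (assumed normalization)."""
    obs = np.asarray(obs, dtype=float)
    pred = np.asarray(pred, dtype=float)
    rmse = np.sqrt(np.mean((pred - obs) ** 2))
    return 100.0 * rmse / obs.mean()


def loocv_mlr(X, y):
    """Leave-one-out predictions from an ordinary least-squares linear model."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(y)
    A = np.column_stack([np.ones(n), X])  # prepend an intercept column
    preds = np.empty(n)
    for i in range(n):
        train = np.arange(n) != i  # hold out station i, fit on the rest
        coef, *_ = np.linalg.lstsq(A[train], y[train], rcond=None)
        preds[i] = A[i] @ coef
    return preds


# Synthetic stand-in for one day of station data: 30 stations, 3 predictors.
rng = np.random.default_rng(0)
X = rng.uniform(size=(30, 3))
y = 20.0 + X @ np.array([5.0, -3.0, 2.0]) + rng.normal(scale=0.5, size=30)

preds = loocv_mlr(X, y)
print("IOA:", index_of_agreement(y, preds))
print("nRMSE (%):", nrmse_percent(y, preds))
```

To compare models on the same footing, the linear fit inside the loop would be swapped for another regressor, and the two metrics would be averaged over all days.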