http://dx.doi.org/10.9717/kmms.2020.23.6.747

Improvement of PM Forecasting Performance by Outlier Data Removing  

Jeon, Young Tae (Dept. of Computer Eng., Anyang University)
Yu, Suk Hyun (Dept. of Information & Communication Eng., Anyang University)
Kwon, Hee Yong (Dept. of Computer Eng., Anyang University)
Abstract
This paper addresses the outlier data problem that arises when building a PM2.5 fine dust forecasting system with a neural network. When a neural network is trained, some of the data do not help learning and instead disturb it; such data are called outliers. Including them in the training data causes various problems, such as overfitting. While building a PM2.5 fine dust concentration forecasting system with a neural network, we found a number of outliers in the training data. We therefore remove them and train the network in three ways. The Over_outlier model removes outliers for which the target concentration is low but the model forecast is high. The Under_outlier model removes outliers for which the target concentration is high but the model forecast is low. The All_outlier model removes both the Over_outlier and the Under_outlier data. We compare the three models with a conventional outlier removal model and a non-removal model. Our outlier removal model performs better than the others.
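The three removal rules described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that the forecasts come from a network already trained on the full data set, and that "low" and "high" are decided by two cut-offs, low_thr and high_thr; these names and their default values (in µg/m³) are hypothetical and do not come from the paper.

```python
def outlier_flags(target, forecast, low_thr=35.0, high_thr=75.0):
    """Flag each sample as an over- and/or under-forecast outlier.

    low_thr and high_thr are hypothetical cut-offs (ug/m^3) chosen only
    for illustration; the paper's actual criteria are not given here.
    """
    # Over-forecast: observed concentration is low, forecast is high.
    over = [t <= low_thr and f > high_thr for t, f in zip(target, forecast)]
    # Under-forecast: observed concentration is high, forecast is low.
    under = [t > high_thr and f <= low_thr for t, f in zip(target, forecast)]
    return over, under


def training_indices(target, forecast, low_thr=35.0, high_thr=75.0):
    """Return the sample indices kept for training under each rule."""
    over, under = outlier_flags(target, forecast, low_thr, high_thr)
    n = len(over)
    return {
        # Drop only the over-forecast outliers.
        "Over_outlier": [i for i in range(n) if not over[i]],
        # Drop only the under-forecast outliers.
        "Under_outlier": [i for i in range(n) if not under[i]],
        # Drop both kinds of outlier.
        "All_outlier": [i for i in range(n) if not (over[i] or under[i])],
    }


# Toy data: sample 1 is over-forecast, sample 3 is under-forecast.
target = [10.0, 20.0, 90.0, 80.0, 50.0]
forecast = [12.0, 95.0, 85.0, 15.0, 55.0]
subsets = training_indices(target, forecast)
```

With this toy data, the Over_outlier subset keeps samples 0, 2, 3 and 4, the Under_outlier subset keeps samples 0, 1, 2 and 4, and the All_outlier subset keeps samples 0, 2 and 4. Each subset would then be used to retrain the network.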
Keywords
$PM_{2.5}$ Forecasting; Deep Neural Network; Outlier; AI