http://dx.doi.org/10.15207/JKCS.2022.13.03.033

Linear interpolation and Machine Learning Methods for Gas Leakage Prediction Base on Multi-source Data Integration  

Dashdondov, Khongorzul (Department of Computer Engineering, Chungbuk National University)
Jo, Kyuri (Department of Computer Engineering, Chungbuk National University)
Kim, Mi-Hye (Department of Computer Engineering, Chungbuk National University)
Publication Information
Journal of the Korea Convergence Society / v.13, no.3, 2022, pp. 33-41
Abstract
In this article, we propose predicting natural gas (NG) leakage levels through feature selection based on factor analysis (FA) of an integrated data set that combines Korean Meteorological Agency data with natural gas leakage data, so that complex factors are taken into account. The method consists of three modules. First, we fill missing values in the integrated data set by linear interpolation and select essential features using FA with OrdinalEncoder (OE)-based normalization. Second, the data set is labeled by K-means clustering. The final module uses four algorithms, K-nearest neighbors (KNN), decision tree (DT), random forest (RF), and Naive Bayes (NB), to predict gas leakage levels. The proposed method is evaluated by accuracy, area under the ROC curve (AUC), and mean standard error (MSE). The test results indicate that the OrdinalEncoder-factor analysis (OE-F)-based approach improved classification performance. Moreover, OE-F-based KNN (OE-F-KNN) performed best, with 95.20% accuracy, an AUC of 96.13%, and an MSE of 0.031.
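As an illustration of two steps named in the abstract, the following minimal Python sketch fills gaps in a series by linear interpolation and then classifies a sample with a plain K-nearest-neighbors vote. It is not the authors' implementation: the function names (`fill_linear`, `knn_predict`), the sample readings, the feature pairs, and the labels are all hypothetical, and the sketch skips the OrdinalEncoder normalization, factor analysis, and K-means labeling stages.

```python
from collections import Counter
from math import dist


def fill_linear(values):
    """Fill None gaps by linear interpolation between the nearest known values.

    Leading and trailing gaps stay None, since there is no value on one
    side to interpolate from.
    """
    filled = list(values)
    known = [i for i, v in enumerate(filled) if v is not None]
    for left, right in zip(known, known[1:]):
        step = (filled[right] - filled[left]) / (right - left)
        for i in range(left + 1, right):
            filled[i] = filled[left] + step * (i - left)
    return filled


def knn_predict(train_x, train_y, query, k=3):
    """Return the majority label among the k nearest training points (Euclidean)."""
    nearest = sorted(zip(train_x, train_y), key=lambda pair: dist(pair[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


# Hypothetical sensor readings with gaps (None marks a missing value).
readings = [1.0, None, None, 4.0, None, 6.0]
filled = fill_linear(readings)

# Hypothetical (gas reading, temperature) features with assumed leak-level labels.
train_x = [(1.0, 10.0), (1.5, 11.0), (2.0, 9.0), (5.0, 20.0), (5.5, 21.0), (6.0, 19.0)]
train_y = ["low", "low", "low", "high", "high", "high"]
level = knn_predict(train_x, train_y, (5.2, 20.5), k=3)
```

In practice the features would be normalized before computing distances, since KNN is sensitive to differences in scale between columns.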
Keywords
Natural Gas; Leak prediction; Linear Interpolation; K-nearest neighbors; Convergence;