Multivariate Outlier Removing for the Risk Prediction of Gas Leakage based Methane Gas

  • Dashdondov, Khongorzul (Department of Computer Engineering, Chungbuk National University)
  • Kim, Mi-Hye (Department of Computer Engineering, Chungbuk National University)
  • Received : 2020.10.30
  • Accepted : 2020.12.20
  • Published : 2020.12.28

Abstract

This study uses machine learning algorithms to model the relationship between natural gas (NG) data and gas-related environmental factors, so that the level of gas leakage risk can be predicted without directly measuring gas leakage data. The study is based on open data provided by a server that uses the IoT-based, remotely controlled Picarro gas sensor specification. When natural gas leaks into the air, it poses a serious problem for air quality, the environment, and human health. The proposed method is a multivariate outlier removal method based on Random Forest (RF) classification for predicting the risk of NG leaks. After unsupervised k-means clustering, the experimental dataset was imbalanced. We therefore focus on how well the proposed models predict the medium and high risk levels. We compared the receiver operating characteristic (ROC) curve, accuracy, area under the ROC curve (AUC), and mean squared error (MSE) of each classification model. In our experiments, MOL_RF achieved an accuracy of 99.71%, an AUC of 99.57%, and an MSE of 0.0016.
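The abstract does not state which criterion is used to flag multivariate outliers. As a minimal sketch, the example below assumes a squared Mahalanobis distance compared against a chi-square cutoff, which is one common choice and not necessarily the authors' method. The function name `mahalanobis_filter`, the `keep_prob` level of 0.975, the two features (a methane reading and a wind speed), and all sample values are hypothetical and chosen only for illustration.

```python
import math


def mahalanobis_filter(rows, keep_prob=0.975):
    """Split two-feature rows into (kept, removed) by squared Mahalanobis distance.

    rows: list of (x, y) pairs; two features keep the 2x2 algebra explicit.
    keep_prob: chi-square probability level for the cutoff (assumed value).
    """
    n = len(rows)
    mean_x = sum(x for x, _ in rows) / n
    mean_y = sum(y for _, y in rows) / n

    # Sample covariance matrix [[sxx, sxy], [sxy, syy]].
    sxx = sum((x - mean_x) ** 2 for x, _ in rows) / (n - 1)
    syy = sum((y - mean_y) ** 2 for _, y in rows) / (n - 1)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in rows) / (n - 1)

    det = sxx * syy - sxy ** 2
    if det <= 0:
        raise ValueError("covariance matrix is singular")

    # With 2 degrees of freedom the chi-square quantile has a closed form.
    cutoff = -2.0 * math.log(1.0 - keep_prob)

    kept, removed = [], []
    for x, y in rows:
        dx, dy = x - mean_x, y - mean_y
        # d^2 = [dx dy] * inverse(cov) * [dx dy]^T, written out for the 2x2 case.
        d2 = (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det
        (kept if d2 <= cutoff else removed).append((x, y))
    return kept, removed


# Hypothetical readings: (methane in ppm, wind speed in m/s), plus one spike.
readings = [(x, y) for x in [1.9, 2.0, 2.1, 2.2] for y in [1.0, 1.5, 2.0, 2.5, 3.0]]
readings.append((9.0, 2.0))

kept, removed = mahalanobis_filter(readings)
print(len(kept), removed)  # prints: 20 [(9.0, 2.0)]
```

In a pipeline like the one the abstract outlines, the rows in `kept` would then be passed on to the classifier. The closed-form cutoff holds only for two features; with more features the cutoff would need a general chi-square quantile function.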

