Anomaly Detection in Real Power Plant Vibration Data by MSCRED Base Model Improved by Subset Sampling Validation

  • Hong, Su-Woong (Department of Computer Engineering, Inha University) ;
  • Kwon, Jang-Woo (Department of Computer Engineering, Inha University)
  • Received : 2021.11.20
  • Reviewed : 2022.01.20
  • Published : 2022.01.28

Abstract

This paper applies MSCRED (Multi-Scale Convolutional Recurrent Encoder-Decoder), an expert-independent, unsupervised neural network model for multivariate time series analysis, to data from a real power plant. It also proposes Subset Sampling Validation, a training-data sampling technique designed to overcome a limitation that MSCRED inherits from its Auto-encoder basis: the training data must not be contaminated with abnormal samples. Using labeled vibration data from power plant equipment, the paper compares Anomaly Scores in two cases: 1) the model is trained on data in which abnormal samples have been deliberately mixed, and 2) in the same situation as case 1, Subset Sampling Validation is first used to remove the abnormal samples from the training data. This comparison is used to evaluate the effectiveness of both MSCRED and Subset Sampling Validation. The resulting anomaly diagnosis framework is expert-independent and robust to erroneous data, and offers a concise and accurate approach for various fields that work with multivariate time series data.
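The abstract does not spell out how Subset Sampling Validation decides which training data to discard. Purely as an illustration of the general idea, the sketch below splits the training data into subsets, fits a model on all subsets but one, and treats the reconstruction error on the held-out subset as its anomaly score. Subsets with unusually high scores are dropped. Everything here is an assumption rather than the authors' procedure: a PCA reconstruction stands in for MSCRED, the data are synthetic, and the names `reconstruction_error`, `subset_sampling_filter`, `make_demo_data`, and the threshold rule (median plus `k` times the median absolute deviation) are hypothetical.

```python
import numpy as np


def reconstruction_error(train, test, n_components=2):
    """Mean squared error of `test` reconstructed by a PCA fitted on `train`.

    PCA is only a lightweight stand-in for an autoencoder such as MSCRED.
    """
    mean = train.mean(axis=0)
    # Rows of vt are the principal directions of the centered training data.
    _, _, vt = np.linalg.svd(train - mean, full_matrices=False)
    basis = vt[:n_components]
    centered = test - mean
    reconstructed = centered @ basis.T @ basis
    return float(np.mean((centered - reconstructed) ** 2))


def subset_sampling_filter(x, n_subsets=8, n_components=2, k=5.0):
    """Score each contiguous subset with a model fitted on the other subsets.

    Returns (kept_rows, scores, flagged_subset_indices). The threshold
    median + k * MAD is an illustrative choice, not taken from the paper.
    """
    subsets = np.array_split(x, n_subsets)
    scores = []
    for i, held_out in enumerate(subsets):
        rest = np.concatenate([s for j, s in enumerate(subsets) if j != i])
        scores.append(reconstruction_error(rest, held_out, n_components))
    scores = np.array(scores)
    median = np.median(scores)
    mad = np.median(np.abs(scores - median))
    flagged = np.flatnonzero(scores > median + k * mad)
    flagged_set = set(flagged.tolist())
    kept = np.concatenate(
        [s for j, s in enumerate(subsets) if j not in flagged_set]
    )
    return kept, scores, flagged


def make_demo_data(seed=0):
    """Synthetic 6-channel signal on a 2-D subspace; rows 300-399 are contaminated."""
    rng = np.random.default_rng(seed)
    latent = rng.normal(size=(800, 2))
    mixing = rng.normal(size=(2, 6))
    x = latent @ mixing + 0.05 * rng.normal(size=(800, 6))
    x[300:400] += 2.0 * rng.normal(size=(100, 6))  # simulated abnormal block
    return x


if __name__ == "__main__":
    data = make_demo_data()
    kept, scores, flagged = subset_sampling_filter(data)
    print("subset scores:", np.round(scores, 4))
    print("flagged subsets:", flagged.tolist())
    print("rows kept:", kept.shape[0])
```

In this toy setting the contaminated block falls entirely inside one subset, so its held-out score stands far above the others. With real vibration data the subset size, the scoring model, and the threshold would all need to be chosen and validated against the labeled data.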

Funding Information

This article was supported in part by Korea Hydro & Nuclear Power Co., Ltd., Republic of Korea (No. L18-S065-0000), and in part by the smart factory technology development project (Cloud-based data platform) of the Ministry of SMEs and Startups.
