ADA: Advanced data analytics methods for abnormal frequent episodes in the baseline data of ISD

  • Biswajit Biswal (Department of Computer Science and Mathematics, South Carolina State University)
  • Andrew Duncan (Material Sciences and Technology, Savannah River National Laboratory)
  • Zaijing Sun (Health Physics and Diagnostic Sciences, University of Nevada)
  • Received: 2022.02.07
  • Accepted: 2022.07.05
  • Published: 2022.11.25

Abstract

The data collected by the In-Situ Decommissioning (ISD) sensors are time-specific, age-specific, and developmental stage-specific. In recent years, the stream data collected by the ISD testbed have been studied to find both frequent episodes and abnormal frequent episodes. Frequent episodes in the data stream confirmed the daily cycle of the sensor responses and established sequences among different types of sensors, a finding verified by the experimental setup of the ISD Sensor Network Test Bed. However, discovering abnormal frequent episodes has remained a challenge because they are very small signals that may be buried in the background noise of voltage and current changes. In this work, we propose Advanced Data Analytics (ADA) methods that are applied to the baseline data to identify frequent episodes, and we extend the approach by adding more features extracted from the baseline data to discover abnormal frequent episodes, which may serve as early indicators of ISD system failures. We evaluate the approach on the baseline data, and the performance evaluation results show that it can discover both frequent episodes and abnormal frequent episodes.
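
To make the notion of a frequent episode concrete, the following is a minimal sketch and not the ADA method itself. It assumes a serial episode (an ordered sequence of event types) and uses the non-overlapped occurrence count as the frequency measure; the abstract does not state which frequency definition the authors use. The event labels, the sample stream, and the function name `count_non_overlapped` are hypothetical.

```python
from typing import Iterable, Sequence, Tuple


def count_non_overlapped(
    events: Iterable[Tuple[float, str]], episode: Sequence[str]
) -> int:
    """Count non-overlapped occurrences of a serial episode.

    `events` is a time-ordered stream of (timestamp, event_type) pairs.
    `episode` is the ordered sequence of event types to look for.
    Other events may occur between the episode's elements.
    """
    if not episode:
        return 0
    count = 0
    pos = 0  # index of the next episode element waiting to be matched
    for _timestamp, event_type in events:
        if event_type == episode[pos]:
            pos += 1
            if pos == len(episode):
                # One full occurrence completed; start looking for the next.
                count += 1
                pos = 0
    return count


# Hypothetical sensor event stream: (hour, event label).
stream = [
    (0, "temp_rise"), (1, "strain_up"), (2, "humidity_drop"),
    (24, "temp_rise"), (25, "noise"), (26, "strain_up"), (27, "humidity_drop"),
    (48, "temp_rise"),
]
episode = ("temp_rise", "strain_up", "humidity_drop")
frequency = count_non_overlapped(stream, episode)
print(frequency)
```

An episode would then be called frequent when its count meets a chosen support threshold. How an episode is further judged abnormal depends on the additional features described in the paper and is not modeled here.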

Acknowledgement

This research is supported by the U.S. Department of Energy, Office of Environmental Management (EM) MSIPP program under TOA #T0000456309.
