http://dx.doi.org/10.9708/jksci.2022.27.11.019

MLOps workflow language and platform for time series data anomaly detection

Sohn, Jung-Mo (Epozen's research institute)
Kim, Su-Min (Epozen's research institute)

Publication Information

Journal of the Korea Society of Computer and Information / v.27, no.11, 2022, pp. 19-27

Abstract

In this study, we propose a language and platform for describing and managing MLOps (Machine Learning Operations) workflows for time series anomaly detection. Time series data is collected in many fields, including IoT sensors, system performance indicators, and user access, and it is used in applications such as system monitoring and anomaly detection. Predicting time series data and detecting anomalies in it requires an MLOps platform that can quickly and flexibly deploy an analyzed model to the production environment. We therefore developed the AI/ML Modeling Language (AMML) to make MLOps workflows easy to configure and execute. AMML is based on Python, which is widely used in data analysis. Using AMML, the proposed MLOps platform can extract and preprocess time series data from various data sources (relational databases, NoSQL databases, log files, etc.) and generate predictions with a deep learning model. To verify the applicability of AMML, we configured a workflow in AMML that generates a deep learning model for predicting transformer oil temperature, and confirmed that training completed normally.
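
The extract, preprocess, and predict stages described above can be illustrated with a minimal Python sketch. It is not AMML: the abstract does not show AMML syntax, so the `WORKFLOW` list, the step names, and all parameters below are hypothetical. A moving-average forecast also stands in for the deep learning model, purely to keep the example self-contained.

```python
from statistics import mean


def extract(source):
    # Stand-in for reading from a relational DB, NoSQL DB, or log file.
    return list(source)


def preprocess(values):
    # Forward-fill missing readings (None) with the last seen value.
    filled = []
    for v in values:
        if v is None:
            if not filled:
                continue  # no earlier value to carry forward
            v = filled[-1]
        filled.append(v)
    return filled


def detect(values, window, threshold):
    # Forecast each point as the mean of the previous `window` points
    # (a simple stand-in for a trained model) and flag large residuals.
    anomalies = []
    for i in range(window, len(values)):
        forecast = mean(values[i - window:i])
        if abs(values[i] - forecast) > threshold:
            anomalies.append(i)
    return anomalies


# Hypothetical declarative workflow: an ordered list of steps with parameters.
WORKFLOW = [
    {"step": "extract"},
    {"step": "preprocess"},
    {"step": "detect", "window": 3, "threshold": 6.0},
]

STEPS = {"extract": extract, "preprocess": preprocess, "detect": detect}


def run_workflow(workflow, data):
    # Run each step in order, passing its output to the next step.
    for spec in workflow:
        params = {k: v for k, v in spec.items() if k != "step"}
        data = STEPS[spec["step"]](data, **params)
    return data


# Made-up temperature readings with one gap and one spike.
readings = [30.0, 31.0, None, 30.5, 45.0, 31.0, 30.0]
print(run_workflow(WORKFLOW, readings))  # flags index 4, the 45.0 spike
```

Because the workflow is plain data rather than code, a step can be swapped or re-parameterized without touching the runner, which is the kind of flexibility a workflow description language aims for.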

Keywords

AI; MLOps; Workflow; Time series data; Anomaly detection

Citations & Related Records

Times Cited By KSCI: 2

Reference

1	Bae Seong-Wan and Yu Jung-Suk, "Predicting the real estate price index using machine learning methods and time series analysis model," Housing Studies Review, Vol. 26, No. 1, pp. 107-133, 2018. DOI: http://dx.doi.org/10.24957/hsr.2018.26.1.107
2	Sima Siami-Namini, Neda Tavakoli and Akbar Siami Namin, "A Comparison of ARIMA and LSTM in Forecasting Time Series," 2018 17th IEEE International Conference on Machine Learning and Applications (ICMLA), pp. 1394-1401, 2018. DOI: 10.1109/ICMLA.2018.00227
3	Google, "MLOps: Continuous delivery and automation pipelines in machine learning," https://cloud.google.com/architecture/mlops-continuous-delivery-and-automation-pipelines-in-machine-learning?hl=en
4	The Kubeflow Authors, "Kubeflow," https://www.kubeflow.org
5	Sepp Hochreiter and Jurgen Schmidhuber, "Long Short-Term Memory," Neural Computation, Vol. 9, No. 8, pp. 1735-1780, 15 Nov. 1997. DOI: 10.1162/neco.1997.9.8.1735
6	Haoyi Zhou, Shanghang Zhang, Jieqi Peng, Shuai Zhang, Jianxin Li, Hui Xiong and Wancai Zhang, "Electricity Transformer Dataset (ETDataset)," https://github.com/zhouhaoyi/ETDataset
7	Sasu Makinen, Henrik Skogstrom, Eero Laaksonen and Tommi Mikkonen, "Who needs MLOps: What data scientists seek to accomplish and how can MLOps help?," 2021 IEEE/ACM 1st Workshop on AI Engineering-Software Engineering for AI (WAIN), IEEE, pp. 109-112, 2021. DOI: 10.1109/WAIN52551.2021.00024
8	Microsoft, "Create and run machine learning pipelines using components with the Azure Machine Learning studio (Preview)," https://docs.microsoft.com/en-us/azure/machine-learning/how-to-create-component-pipelines-ui
9	Lee Yo-Seob and Moon Phil-Joo, "A Comparison and Analysis of Deep Learning Framework," The Journal of the Korea institute of electronic communication sciences, Vol. 12, No. 1, pp. 115-122, 2017. DOI: https://doi.org/10.13067/JKIECS.2017.12.1.115
10	Samuel Ackerman, Orna Raz, Marcel Zalmanovici and Aviad Zlotnick, "Automatically detecting data drift in machine learning classifiers," arXiv preprint arXiv:2111.05672, 2021, DOI: https://doi.org/10.48550/arXiv.2111.05672
11	Mun Jong-Hyeok, Kim Do-Hyung, Choi Jong-Sun and Choi Jae-Young, "Deep Learning Description Language for Referring to Analysis Model Based on Trusted Deep Learning," KIPS Transactions on Software and Data Engineering, Vol. 10, No. 4, pp. 133-142, Apr. 2021. DOI: https://doi.org/10.3745/KTSDE.2021.10.4.133
12	Georgios Symeonidis, Evangelos Nerantzis, Apostolos Kazakis and George A. Papakostas, "MLOps - Definitions, Tools and Challenges," 2022 IEEE 12th Annual Computing and Communication Workshop and Conference (CCWC), pp. 0453-0460, 2022. DOI: 10.1109/CCWC54503.2022.9720902
13	Dominik Kreuzberger, Niklas Kuhl and Sebastian Hirschl, "Machine Learning Operations (MLOps): Overview, Definition, and Architecture," arXiv preprint arXiv:2205.02302, 2022. DOI: https://doi.org/10.48550/arXiv.2205.02302
14	MLflow Project, "MLflow - A platform for the machine learning lifecycle," https://mlflow.org
15	Sima Siami-Namini, Neda Tavakoli and Akbar Siami Namin, "A Comparison of ARIMA and LSTM in Forecasting Time Series," 2018 17th IEEE International Conference on Machine Learning and Applications (ICMLA), pp. 1394-1401, 2018. DOI: 10.1109/ICMLA.2018.00227
16	Microsoft, "Machine Learning operations maturity model," https://docs.microsoft.com/en-us/azure/architecture/example-scenario/mlops/mlops-maturity-model
17	YAML Language Development Team, "YAML Ain't Markup Language (YAML™) version 1.2," https://yaml.org/spec/1.2.2/
18	Haoyi Zhou, Shanghang Zhang, Jieqi Peng, Shuai Zhang, Jianxin Li, Hui Xiong and Wancai Zhang, "Informer: Beyond Efficient Transformer for Long Sequence Time-Series Forecasting," In Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 35, No. 12, pp. 11106-11115, May 2021. DOI: https://doi.org/10.48550/arXiv.2012.07436
19	Google, "Tabular Workflows on Vertex AI," https://cloud.google.com/vertex-ai/docs/tabular-data/tabular-workflows/overview?hl=en
20	Gustavo Correa Publio, Diego Esteves, Agnieszka Lawrynowicz, Panče Panov, Larisa Soldatova, Tommaso Soru, Joaquin Vanschoren and Hamid Zafar, "ML-schema: exposing the semantics of machine learning with schemas and ontologies," arXiv preprint arXiv:1807.05351, 2018. DOI: https://doi.org/10.48550/arXiv.1807.05351