• Title/Summary/Keyword: Anomaly Data

Anomaly Detection Technique of Log Data Using Hadoop Ecosystem (하둡 에코시스템을 활용한 로그 데이터의 이상 탐지 기법)

  • Son, Siwoon;Gil, Myeong-Seon;Moon, Yang-Sae
    • KIISE Transactions on Computing Practices / v.23 no.2 / pp.128-133 / 2017
  • In recent years, the number of systems for analyzing large volumes of data has been increasing. Hadoop, a representative big data system, stores and processes large data sets in a distributed environment of multiple servers, where system-resource management is very important. The authors attempt to detect anomalies in rapidly changing log data collected from multiple servers using simple but efficient anomaly-detection techniques. To this end, they design an Apache Hive storage architecture that stores the log data collected from the servers in the Hadoop ecosystem, along with three anomaly-detection techniques based on the moving-average and 3-sigma concepts. The experiments confirm that all three techniques detect the abnormal intervals correctly, and that the weighted anomaly-detection technique is more precise than the basic techniques. These results indicate that simple techniques can detect log-data anomalies effectively in the Hadoop ecosystem.
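
The moving-average and 3-sigma idea mentioned in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name, the trailing-window size, and the sample `cpu_usage` series are all assumptions, and the weighted variant described in the paper is not reproduced because the abstract does not specify it.

```python
from statistics import mean, stdev

def three_sigma_anomalies(values, window=5, k=3.0):
    """Return indices whose value deviates from the trailing moving average
    by more than k standard deviations of that same trailing window."""
    anomalies = []
    for i in range(window, len(values)):
        history = values[i - window:i]   # the `window` points before index i
        mu = mean(history)               # moving average
        sigma = stdev(history)           # sample standard deviation
        # A flat history (sigma == 0) gives no scale to compare against, so skip it.
        if sigma > 0 and abs(values[i] - mu) > k * sigma:
            anomalies.append(i)
    return anomalies

# Hypothetical per-interval CPU usage values with one spike at index 7.
cpu_usage = [10, 11, 9, 10, 12, 11, 10, 95, 11, 10, 9, 11]
print(three_sigma_anomalies(cpu_usage))
```

Note that once the spike enters the trailing window it inflates the window's standard deviation, so nearby points are judged against a looser bound; a production detector would need a policy for that.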

Online anomaly detection algorithm based on deep support vector data description using incremental centroid update (점진적 중심 갱신을 이용한 deep support vector data description 기반의 온라인 비정상 탐지 알고리즘)

  • Lee, Kibae;Ko, Guhn Hyeok;Lee, Chong Hyun
    • The Journal of the Acoustical Society of Korea / v.41 no.2 / pp.199-209 / 2022
  • Typical anomaly detection algorithms are trained on prior data. Such batch-learning algorithms therefore suffer inevitable performance degradation when the characteristics of newly incoming normal data change over time. We propose an online anomaly detection algorithm that accounts for gradual changes in the characteristics of incoming normal data. The proposed algorithm, based on a one-class classification model, includes both offline and online learning procedures. In the offline learning procedure, the algorithm learns to map the prior data close to the centroid of the latent space and then updates that centroid incrementally using new incoming data. In the online learning procedure, the algorithm continues learning with the updated centroid. In experiments on public underwater acoustic data, the proposed online algorithm requires only approximately 2% additional learning time for the incremental centroid update and learning, yet it improves the Area Under the receiver operating characteristic Curve (AUC) by 19.10% compared with the offline learning model when new normal data arrive.
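
The incremental centroid update can be illustrated with a running mean over latent vectors. This is a sketch under stated assumptions: the latent vectors are taken as given (the encoder network is omitted), the class name `OnlineCentroid` is hypothetical, and the running-mean rule is one plausible update, not necessarily the rule used in the paper.

```python
import numpy as np

class OnlineCentroid:
    """Tracks the latent-space centroid and scores points by distance to it."""

    def __init__(self, latent_vectors):
        z = np.asarray(latent_vectors, dtype=float)
        self.centroid = z.mean(axis=0)   # centroid from the offline (prior) data
        self.count = len(z)

    def score(self, z):
        """Squared Euclidean distance to the centroid (larger = more anomalous)."""
        diff = np.asarray(z, dtype=float) - self.centroid
        return float(diff @ diff)

    def update(self, z):
        """Running-mean update: equals the mean of all vectors seen so far."""
        self.count += 1
        self.centroid += (np.asarray(z, dtype=float) - self.centroid) / self.count

# Hypothetical 2-D latent vectors of normal data.
model = OnlineCentroid([[0.0, 0.0], [2.0, 2.0]])
model.update([4.0, 4.0])      # a new normal sample shifts the centroid
print(model.centroid, model.score([5.0, 6.0]))
```

In a full system only samples judged normal would be passed to `update`, so that anomalies do not drag the centroid toward themselves.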

A Moving Window Principal Components Analysis Based Anomaly Detection and Mitigation Approach in SDN Network

  • Wang, Mingxin;Zhou, Huachun;Chen, Jia
    • KSII Transactions on Internet and Information Systems (TIIS) / v.12 no.8 / pp.3946-3965 / 2018
  • Network anomaly detection in Software Defined Networking (SDN), especially the detection of DDoS attacks, has received great attention in recent years. SDN makes it convenient to build the traffic matrix from a global view. However, monitoring and managing high-volume, feature-rich traffic in large networks poses significant challenges. In this paper, we propose a moving window Principal Components Analysis based anomaly detection and mitigation approach that maps data onto a low-dimensional subspace and monitors the network state in real time. Once an anomaly is detected, the controller installs defense flow table rules on the corresponding data plane switches to mitigate the attack. We evaluate our approach experimentally. The Receiver Operating Characteristic curves show that our approach performs well in both detection probability and false alarm probability compared with the entropy-based approach. In addition, the mitigation prevents most of the attacking traffic. Finally, we evaluate the overhead of the system, including the detection delay and CPU utilization, and find that it is not excessive. Our anomaly detection approach is lightweight and effective.
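
A moving-window PCA detector of the kind described here can be sketched as below. The abstract does not give the detection statistic, so this example assumes the squared prediction error (the energy of a sample outside the principal subspace) compared against a fixed threshold; the function names, the sample data, and the choice to exclude flagged samples from the window are all illustrative assumptions.

```python
import numpy as np

def window_pca_spe(window, sample, n_components=1):
    """Fit PCA on the rows of `window` and return the squared prediction
    error of `sample`, i.e. its energy outside the principal subspace."""
    window = np.asarray(window, dtype=float)
    mean = window.mean(axis=0)
    # Rows of vt are principal directions, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(window - mean, full_matrices=False)
    basis = vt[:n_components]
    centered = np.asarray(sample, dtype=float) - mean
    residual = centered - basis.T @ (basis @ centered)
    return float(residual @ residual)

def moving_window_detect(samples, window_size, threshold, n_components=1):
    """Flag sample indices whose SPE exceeds `threshold`; slide the window
    forward only over samples that were not flagged."""
    samples = np.asarray(samples, dtype=float)
    window = list(samples[:window_size])
    flagged = []
    for t in range(window_size, len(samples)):
        if window_pca_spe(window, samples[t], n_components) > threshold:
            flagged.append(t)
        else:
            window.pop(0)
            window.append(samples[t])
    return flagged

# Hypothetical two-feature traffic samples; index 6 breaks the usual correlation.
samples = [[1, 2], [2, 4], [3, 6], [4, 8], [5, 10], [6, 12], [5, 30], [7, 14], [8, 16]]
print(moving_window_detect(samples, window_size=5, threshold=1.0))
```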

An Anomaly Detection Framework Based on ICA and Bayesian Classification for IaaS Platforms

  • Wang, GuiPing;Yang, JianXi;Li, Ren
    • KSII Transactions on Internet and Information Systems (TIIS) / v.10 no.8 / pp.3865-3883 / 2016
  • Infrastructure as a Service (IaaS) encapsulates computer hardware into a large number of manageable virtual instances, mainly in the form of virtual machines (VMs), and rents them to users. VM anomalies occur occasionally and lead to performance issues and even downtime. This paper aims to detect anomalous VMs from their performance metrics. Because of the dynamic nature and increasing scale of IaaS, detecting anomalous VMs from voluminous, correlated, non-Gaussian monitoring data is a challenging task. This paper designs an anomaly detection framework to address this challenge. First, it collects 53 performance metrics that reflect the running state of each VM. The collected metrics are verified not to follow a Gaussian distribution. The framework then employs independent component analysis (ICA) instead of principal component analysis (PCA) to extract independent components from the non-Gaussian metric data. For anomaly detection, it employs multi-class Bayesian classification to determine the current state of each VM. To evaluate the framework, four types of anomalies are injected, separately or jointly, into randomly selected VMs in a campus-wide testbed. The experimental results show that the ICA-based detection mechanism outperforms the PCA-based and LDA-based mechanisms in terms of sensitivity and specificity.
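
The classification stage can be illustrated with a small multi-class naive Bayes classifier. This is only a sketch: the abstract does not say which Bayesian classifier or class-conditional densities the authors use, so the Gaussian likelihoods here are an assumption made for brevity, the ICA step is omitted (the inputs stand in for already-extracted components), and the class labels and data are hypothetical.

```python
import numpy as np

class GaussianNaiveBayes:
    """Multi-class naive Bayes with one Gaussian per feature and class."""

    def fit(self, X, y):
        X = np.asarray(X, dtype=float)
        y = np.asarray(y)
        self.classes = np.unique(y)
        self.means = np.array([X[y == c].mean(axis=0) for c in self.classes])
        # A small floor keeps the variance strictly positive.
        self.variances = np.array([X[y == c].var(axis=0) + 1e-9 for c in self.classes])
        self.log_priors = np.log(np.array([(y == c).mean() for c in self.classes]))
        return self

    def predict(self, X):
        X = np.asarray(X, dtype=float)
        diff = X[:, None, :] - self.means[None, :, :]        # (samples, classes, features)
        log_lik = -0.5 * np.sum(
            np.log(2 * np.pi * self.variances)[None] + diff ** 2 / self.variances[None],
            axis=2,
        )
        # Pick the class with the highest log posterior (up to a constant).
        return self.classes[np.argmax(log_lik + self.log_priors, axis=1)]

# Hypothetical component values labeled with hypothetical VM states.
X_train = [[0.0, 0.0], [0.2, 0.1], [5.0, 5.0], [5.1, 4.9], [10.0, 0.0], [10.2, 0.1]]
y_train = ["normal", "normal", "cpu", "cpu", "mem", "mem"]
clf = GaussianNaiveBayes().fit(X_train, y_train)
print(clf.predict([[0.1, 0.0], [5.0, 5.1], [9.9, 0.2]]))
```

The independence assumption of naive Bayes pairs naturally with ICA outputs, but a density better suited to non-Gaussian components would be a reasonable substitute for the Gaussian used here.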

Development of deep autoencoder-based anomaly detection system for HANARO

  • Seunghyoung Ryu;Byoungil Jeon;Hogeon Seo;Minwoo Lee;Jin-Won Shin;Yonggyun Yu
    • Nuclear Engineering and Technology / v.55 no.2 / pp.475-483 / 2023
  • The high-flux advanced neutron application reactor (HANARO) is a multi-purpose research reactor at the Korea Atomic Energy Research Institute (KAERI). HANARO has been used in scientific and industrial research and development, so its stable operation is necessary for national science and industrial prospects. This study proposes a deep-learning-based anomaly detection system that supports the stable operation of HANARO. The proposed system collects data from multiple sensors, displays system information, analyzes status, and performs anomaly detection using a deep autoencoder. The system comprises communication, visualization, and anomaly-detection modules, and the prototype was implemented on site in 2021. Finally, historical data and synthetic anomalies were analyzed to verify the overall system; simulation results based on the historical data show that 12 out of 19 abnormal events can be detected in advance or on time by the deep learning AD model.

Using artificial intelligence to detect human errors in nuclear power plants: A case in operation and maintenance

  • Ezgi Gursel;Bhavya Reddy;Anahita Khojandi;Mahboubeh Madadi;Jamie Baalis Coble;Vivek Agarwal;Vaibhav Yadav;Ronald L. Boring
    • Nuclear Engineering and Technology / v.55 no.2 / pp.603-622 / 2023
  • Human error (HE) is an important concern in safety-critical systems such as nuclear power plants (NPPs). HE has played a role in many accidents and outage incidents in NPPs. Despite the increased automation in NPPs, HE remains unavoidable. Hence, the need for HE detection is as important as HE prevention efforts. In NPPs, HE is rather rare. Hence, anomaly detection, a widely used machine learning technique for detecting rare anomalous instances, can be repurposed to detect potential HE. In this study, we develop an unsupervised anomaly detection technique based on generative adversarial networks (GANs) to detect anomalies in manually collected surveillance data in NPPs. More specifically, our GAN is trained to detect mismatches between automatically recorded sensor data and manually collected surveillance data, and hence, identify anomalous instances that can be attributed to HE. We test our GAN on both a real-world dataset and an external dataset obtained from a testbed, and we benchmark our results against state-of-the-art unsupervised anomaly detection algorithms, including one-class support vector machine and isolation forest. Our results show that the proposed GAN provides improved anomaly detection performance. Our study is promising for the future development of artificial intelligence based HE detection systems.

Cointegration based modeling and anomaly detection approaches using monitoring data of a suspension bridge

  • Ziyuan Fan;Qiao Huang;Yuan Ren;Qiaowei Ye;Weijie Chang;Yichao Wang
    • Smart Structures and Systems / v.31 no.2 / pp.183-197 / 2023
  • For long-span bridges with a structural health monitoring (SHM) system, responses driven by environmental temperature have proved to be a main component of the measurements. However, anomalous structural behavior may be hidden in complicated recorded data. To obtain a reliable assessment of structural performance, it is important to study the relationship between temperature and monitoring data. This paper applies a cointegration-based methodology to detect anomalies that may be masked by temperature effects and then to forecast the temperature-induced deflection (TID) of long-span suspension bridges. First, temperature effects on girder deflection are analyzed with field-measured data from a suspension bridge. Subsequently, the cointegration testing procedure is conducted. A threshold-based anomaly detection framework that eliminates the influence of environmental temperature is also proposed, in which the cointegrated residual series serves as the index for monitoring anomaly events in bridges. Then, a wavelet separation method is used to obtain TIDs from the recorded data. By combining cointegration theory with an autoregressive moving average (ARMA) model, TIDs for long-span bridges are modeled and forecasted. Finally, in-situ measurements of Xihoumen Bridge are used as an example to demonstrate the effectiveness of the cointegration-based approach. In conclusion, the proposed method is practical for actual structures and supports efficient management and maintenance based on monitoring data.
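
The threshold-based check on the cointegrated residual can be sketched as follows. This example covers only the regression step of an Engle-Granger style analysis: it fits deflection against temperature on a baseline period and flags later residuals that exceed k baseline standard deviations. The stationarity test on the residual, the wavelet separation, and the ARMA forecast are omitted, and the function names, the value of k, and the data are assumptions.

```python
import numpy as np

def fit_cointegration(temperature, deflection):
    """Least-squares fit deflection ~ a + b * temperature on baseline data.
    Returns the intercept a, slope b, and residual standard deviation."""
    T = np.asarray(temperature, dtype=float)
    d = np.asarray(deflection, dtype=float)
    A = np.column_stack([np.ones_like(T), T])
    (a, b), *_ = np.linalg.lstsq(A, d, rcond=None)
    resid = d - (a + b * T)
    return a, b, resid.std(ddof=2)   # two fitted parameters

def residual_alarms(temperature, deflection, a, b, sigma, k=3.0):
    """Indices where the temperature-corrected residual exceeds k * sigma."""
    T = np.asarray(temperature, dtype=float)
    d = np.asarray(deflection, dtype=float)
    resid = d - (a + b * T)
    return [i for i, r in enumerate(resid) if abs(r) > k * sigma]

# Hypothetical baseline: deflection tracks temperature linearly with small noise.
base_T = [10, 12, 14, 16, 18, 20, 22, 24]
noise = [0.1, -0.1, 0.1, -0.1, 0.1, -0.1, 0.1, -0.1]
base_d = [5 - 2 * t + e for t, e in zip(base_T, noise)]
a, b, sigma = fit_cointegration(base_T, base_d)

# Hypothetical monitoring period: the third reading departs from the relation.
print(residual_alarms([11, 15, 19, 23], [-17, -25, -30, -41], a, b, sigma))
```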

MLOps workflow language and platform for time series data anomaly detection

  • Sohn, Jung-Mo;Kim, Su-Min
    • Journal of the Korea Society of Computer and Information / v.27 no.11 / pp.19-27 / 2022
  • In this study, we propose a language and platform for describing and managing the MLOps (Machine Learning Operations) workflow for time series anomaly detection. Time series data are collected in many fields, such as IoT sensors, system performance indicators, and user access, and are used in many applications such as system monitoring and anomaly detection. Prediction and anomaly detection on time series data require an MLOps platform that can quickly and flexibly deploy the analyzed model to the production environment. We therefore developed the Python-based AI/ML Modeling Language (AMML) to configure and execute MLOps workflows easily; Python is widely used in data analysis. Using AMML, the proposed MLOps platform can extract and preprocess time series data from various data sources (R-DB, NoSQL DB, log files, etc.) and make predictions with a deep learning model. To verify the applicability of AMML, we configured a workflow that generates a deep learning model for transformer oil temperature prediction and confirmed that training ran normally.

Wavenumber Correlation Analysis of Satellite Geopotential Anomalies

  • Kim, Jeong-Woo;Kim, Won-Kyun;Kim, Hye-Yun
    • Economic and Environmental Geology / v.33 no.2 / pp.111-116 / 2000
  • Identifying anomaly correlations between data sets is the basis for rationalizing geopotential interpretation and theory. We present a procedure that effectively identifies correlative features between two or more geopotential data sets. Anomaly features that show direct, inverse, or no correlation between the data can be separated by applying filters in the frequency domains of the data sets. The correlation filter passes or rejects wavenumbers between co-registered data sets based on the correlation coefficient between common wavenumbers, which is given by the cosine of their phase difference. An example using a Magsat magnetic anomaly profile illustrates the usefulness of the procedure for extracting correlative features between data sets.
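
The correlation filter described in this abstract can be sketched for two one-dimensional profiles. The example assumes both profiles are co-registered (same length and sampling), computes the per-wavenumber correlation coefficient as the cosine of the phase difference, and keeps only wavenumbers at or above a cutoff. The function name, the cutoff `cc_min`, and the synthetic profiles are illustrative assumptions, not the authors' code or data.

```python
import numpy as np

def correlation_filter(x, y, cc_min=0.5):
    """Keep wavenumbers whose correlation coefficient, cos(phase difference),
    is at least cc_min; return both filtered profiles and the coefficients."""
    X = np.fft.rfft(x)
    Y = np.fft.rfft(y)
    cc = np.cos(np.angle(X) - np.angle(Y))   # one coefficient per wavenumber
    keep = cc >= cc_min
    x_f = np.fft.irfft(np.where(keep, X, 0), n=len(x))
    y_f = np.fft.irfft(np.where(keep, Y, 0), n=len(y))
    return x_f, y_f, cc

# Synthetic profiles: wavenumber 2 is directly correlated, wavenumber 5 inversely.
N = 64
t = np.arange(N)
shared = np.sin(2 * np.pi * 2 * t / N)
opposed = np.sin(2 * np.pi * 5 * t / N)
x = shared + opposed
y = shared - opposed

x_f, y_f, cc = correlation_filter(x, y, cc_min=0.5)
print(round(cc[2], 3), round(cc[5], 3))
```

Passing `cc <= -cc_min` instead would isolate the inversely correlated features. Wavenumbers with near-zero amplitude have unreliable phase, but they contribute negligibly to the filtered output either way.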

LSTM-based Anomaly Detection on Big Data for Smart Factory Monitoring (스마트 팩토리 모니터링을 위한 빅 데이터의 LSTM 기반 이상 탐지)

  • Nguyen, Van Quan;Van Ma, Linh;Kim, Jinsul
    • Journal of Digital Contents Society / v.19 no.4 / pp.789-799 / 2018
  • This article presents a machine-learning approach to analyzing big time series data for anomaly detection in complex industrial systems. Long Short-Term Memory (LSTM) networks are an improved form of the recurrent neural network (RNN) and have become useful for many tasks. The LSTM-based model learns higher-level temporal features as well as temporal patterns, and the trained predictor is then used in the prediction stage to estimate future data. The prediction error is the difference between the predictor's output and the actual incoming values. An error-distribution estimation model built on a Gaussian distribution is used to calculate the anomaly score of each observation. In this manner, we move from the concept of a single anomaly to the idea of a collective anomaly. This work can assist the monitoring and management of smart factories in minimizing failures and improving manufacturing quality.
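
The error-distribution step can be sketched independently of the LSTM itself. The example below assumes predictions already exist (here they are placeholder numbers, not LSTM output), fits a Gaussian to the prediction errors on validation data, and scores a new observation by the negative log-density of its error. The function names and the choice of negative log-density as the score are assumptions; the abstract does not state the exact scoring formula or how collective anomalies are formed.

```python
import math
from statistics import mean, stdev

def fit_error_model(predicted, actual):
    """Estimate the mean and standard deviation of the prediction errors."""
    errors = [a - p for p, a in zip(predicted, actual)]
    return mean(errors), stdev(errors)

def anomaly_score(predicted, actual, mu, sigma):
    """Negative log-density of the prediction error under N(mu, sigma^2);
    larger values mean the error is less likely under normal operation."""
    z = ((actual - predicted) - mu) / sigma
    return 0.5 * z * z + math.log(sigma * math.sqrt(2 * math.pi))

# Placeholder predictions and observations from a normal operating period.
val_predicted = [10.0, 10.0, 10.0, 10.0]
val_actual = [10.1, 9.9, 10.2, 9.8]
mu, sigma = fit_error_model(val_predicted, val_actual)

print(anomaly_score(10.0, 10.05, mu, sigma))   # small error, low score
print(anomaly_score(10.0, 12.0, mu, sigma))    # large error, high score
```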