• Title/Summary/Keyword: Anomaly data detection

Search Result 402, Processing Time 0.023 seconds

A Study of Anomaly Detection for ICT Infrastructure using Conditional Multimodal Autoencoder (ICT 인프라 이상탐지를 위한 조건부 멀티모달 오토인코더에 관한 연구)

  • Shin, Byungjin;Lee, Jonghoon;Han, Sangjin;Park, Choong-Shik
    • Journal of Intelligence and Information Systems
    • /
    • v.27 no.3
    • /
    • pp.57-73
    • /
    • 2021
  • Maintenance and prevention of failure through anomaly detection of ICT infrastructure is becoming important. System monitoring data is multidimensional time series data. When we deal with multidimensional time series data, we have difficulty in considering both characteristics of multidimensional data and characteristics of time series data. When dealing with multidimensional data, correlation between variables should be considered. Existing methods such as probability and linear base, distance base, etc. are degraded due to limitations called the curse of dimensions. In addition, time series data is preprocessed by applying sliding window technique and time series decomposition for self-correlation analysis. These techniques are the cause of increasing the dimension of data, so it is necessary to supplement them. The anomaly detection field is an old research field, and statistical methods and regression analysis were used in the early days. Currently, there are active studies to apply machine learning and artificial neural network technology to this field. Statistically based methods are difficult to apply when data is non-homogeneous, and do not detect local outliers well. The regression analysis method compares the predictive value and the actual value after learning the regression formula based on the parametric statistics and it detects abnormality. Anomaly detection using regression analysis has the disadvantage that the performance is lowered when the model is not solid and the noise or outliers of the data are included. There is a restriction that learning data with noise or outliers should be used. The autoencoder using artificial neural networks is learned to output as similar as possible to input data. It has many advantages compared to existing probability and linear model, cluster analysis, and map learning. It can be applied to data that does not satisfy probability distribution or linear assumption. In addition, it is possible to learn non-mapping without label data for teaching. However, there is a limitation of local outlier identification of multidimensional data in anomaly detection, and there is a problem that the dimension of data is greatly increased due to the characteristics of time series data. In this study, we propose a CMAE (Conditional Multimodal Autoencoder) that enhances the performance of anomaly detection by considering local outliers and time series characteristics. First, we applied Multimodal Autoencoder (MAE) to improve the limitations of local outlier identification of multidimensional data. Multimodals are commonly used to learn different types of inputs, such as voice and image. The different modal shares the bottleneck effect of Autoencoder and it learns correlation. In addition, CAE (Conditional Autoencoder) was used to learn the characteristics of time series data effectively without increasing the dimension of data. In general, conditional input mainly uses category variables, but in this study, time was used as a condition to learn periodicity. The CMAE model proposed in this paper was verified by comparing with the Unimodal Autoencoder (UAE) and Multi-modal Autoencoder (MAE). The restoration performance of Autoencoder for 41 variables was confirmed in the proposed model and the comparison model. The restoration performance is different by variables, and the restoration is normally well operated because the loss value is small for Memory, Disk, and Network modals in all three Autoencoder models. The process modal did not show a significant difference in all three models, and the CPU modal showed excellent performance in CMAE. ROC curve was prepared for the evaluation of anomaly detection performance in the proposed model and the comparison model, and AUC, accuracy, precision, recall, and F1-score were compared. In all indicators, the performance was shown in the order of CMAE, MAE, and AE. Especially, the reproduction rate was 0.9828 for CMAE, which can be confirmed to detect almost most of the abnormalities. The accuracy of the model was also improved and 87.12%, and the F1-score was 0.8883, which is considered to be suitable for anomaly detection. In practical aspect, the proposed model has an additional advantage in addition to performance improvement. The use of techniques such as time series decomposition and sliding windows has the disadvantage of managing unnecessary procedures; and their dimensional increase can cause a decrease in the computational speed in inference.The proposed model has characteristics that are easy to apply to practical tasks such as inference speed and model management.

A Big Data Application for Anomaly Detection in VANETs (VANETs에서 비정상 행위 탐지를 위한 빅 데이터 응용)

  • Kim, Sik;Oh, Sun-Jin
    • The Journal of the Institute of Internet, Broadcasting and Communication
    • /
    • v.14 no.6
    • /
    • pp.175-181
    • /
    • 2014
  • With rapid growth of the wireless mobile computing network technologies, various mobile ad hoc network applications converged with other related technologies are rapidly disseminated nowadays. Vehicular Ad Hoc Networks are self-organizing mobile ad hoc networks that typically have moving vehicle nodes with high speeds and maintaining its topology very short with unstable communication links. Therefore, VANETs are very vulnerable for the malicious noise of sensors and anomalies of the nodes in the network system. In this paper, we propose an anomaly detection method by using big data techniques that efficiently identify malicious behaviors or noises of sensors and anomalies of vehicle node activities in these VANETs, and the performance of the proposed scheme is evaluated by a simulation study in terms of anomaly detection rate and false alarm rate for the threshold ${\epsilon}$.

Anomaly Detection Mechanism against DDoS on BcN (BcN 상에서의 DDoS에 대한 Anomaly Detection 연구)

  • Song, Byung-Hak;Lee, Seung-Yeon;Hong, Choong-Seon;Huh, Eui-Nam;Sohn, Seong-Won
    • Journal of Internet Computing and Services
    • /
    • v.8 no.2
    • /
    • pp.55-65
    • /
    • 2007
  • BcN is a high-quality broadband network for multimedia services integrating telecommunication, broadcasting, and Internet seamlessly at anywhere, anytime, and using any device. BcN is Particularly vulnerable to intrusion because it merges various traditional networks, wired, wireless and data networks. Because of this, one of the most important aspects in BcN is security in terms of reliability. So, in this paper, we suggest the sharing mechanism of security data among various service networks on the BcN. This distributed, hierarchical architecture enables BcN to be robust of attacks and failures, controls data traffic going in and out the backbone core through IP edge routers integrated with IDRS. Our proposed anomaly detection scheme on IDRS for BcN service also improves detection rate compared to the previous conventional approaches.

  • PDF

A study on machine learning-based anomaly detection algorithm using current data of fish-farm pump motor (양식장 펌프 모터 전류 데이터를 이용한 머신러닝 기반 이상 감지 알고리즘에 관한 연구)

  • Sae-yong Park;Tae Uk chang;Taeho Im
    • Journal of Internet Computing and Services
    • /
    • v.24 no.2
    • /
    • pp.37-45
    • /
    • 2023
  • In line with the 4th Industrial Revolution, facility maintenance technologies for building smart factories are receiving attention and are being advanced. In addition, technology is being applied to smart farms and smart fisheries following smart factories. Among them, in the case of a recirculating aquaculture system, there is a motor pump that circulates water for a stable quality environment in the tank. Motor pump maintenance activities for recirculating aquaculture system are carried out based on preventive maintenance and data obtained from vibration sensor. Preventive maintenance cannot cope with abnormalities that occur before prior planning, and vibration sensors are affected by the external environment. This paper proposes an anomaly detection algorithm that utilizes ADTK, a Python open source, for motor pump anomaly detection based on data collected through current sensors that are less affected by the external environment than noise, temperature and vibration sensors.

A Study on Atmospheric Data Anomaly Detection Algorithm based on Unsupervised Learning Using Adversarial Generative Neural Network (적대적 생성 신경망을 활용한 비지도 학습 기반의 대기 자료 이상 탐지 알고리즘 연구)

  • Yang, Ho-Jun;Lee, Seon-Woo;Lee, Mun-Hyung;Kim, Jong-Gu;Choi, Jung-Mu;Shin, Yu-mi;Lee, Seok-Chae;Kwon, Jang-Woo;Park, Ji-Hoon;Jung, Dong-Hee;Shin, Hye-Jung
    • Journal of Convergence for Information Technology
    • /
    • v.12 no.4
    • /
    • pp.260-269
    • /
    • 2022
  • In this paper, We propose an anomaly detection model using deep neural network to automate the identification of outliers of the national air pollution measurement network data that is previously performed by experts. We generated training data by analyzing missing values and outliers of weather data provided by the Institute of Environmental Research and based on the BeatGAN model of the unsupervised learning method, we propose a new model by changing the kernel structure, adding the convolutional filter layer and the transposed convolutional filter layer to improve anomaly detection performance. In addition, by utilizing the generative features of the proposed model to implement and apply a retraining algorithm that generates new data and uses it for training, it was confirmed that the proposed model had the highest performance compared to the original BeatGAN models and other unsupervised learning model like Iforest and One Class SVM. Through this study, it was possible to suggest a method to improve the anomaly detection performance of proposed model while avoiding overfitting without additional cost in situations where training data are insufficient due to various factors such as sensor abnormalities and inspections in actual industrial sites.

Anomaly Data Detection Using Machine Learning in Crowdsensing System (크라우드센싱 시스템에서 머신러닝을 이용한 이상데이터 탐지)

  • Kim, Mihui;Lee, Gihun
    • Journal of IKEEE
    • /
    • v.24 no.2
    • /
    • pp.475-485
    • /
    • 2020
  • Recently, a crowdsensing system that provides a new sensing service with real-time sensing data provided from a user's device including a sensor without installing a separate sensor has attracted attention. In the crowdsensing system, meaningless data may be provided due to a user's operation error or communication problem, or false data may be provided to obtain compensation. Therefore, the detection and removal of the abnormal data determines the quality of the crowdsensing service. The proposed methods in the past to detect these anomalies are not efficient for the fast-changing environment of crowdsensing. This paper proposes an anomaly data detection method by extracting the characteristics of continuously and rapidly changing sensing data environment by using machine learning technology and modeling it with an appropriate algorithm. We show the performance and feasibility of the proposed system using deep learning binary classification model of supervised learning and autoencoder model of unsupervised learning.

Network Anomaly Traffic Detection Using WGAN-CNN-BiLSTM in Big Data Cloud-Edge Collaborative Computing Environment

  • Yue Wang
    • Journal of Information Processing Systems
    • /
    • v.20 no.3
    • /
    • pp.375-390
    • /
    • 2024
  • Edge computing architecture has effectively alleviated the computing pressure on cloud platforms, reduced network bandwidth consumption, and improved the quality of service for user experience; however, it has also introduced new security issues. Existing anomaly detection methods in big data scenarios with cloud-edge computing collaboration face several challenges, such as sample imbalance, difficulty in dealing with complex network traffic attacks, and difficulty in effectively training large-scale data or overly complex deep-learning network models. A lightweight deep-learning model was proposed to address these challenges. First, normalization on the user side was used to preprocess the traffic data. On the edge side, a trained Wasserstein generative adversarial network (WGAN) was used to supplement the data samples, which effectively alleviates the imbalance issue of a few types of samples while occupying a small amount of edge-computing resources. Finally, a trained lightweight deep learning network model is deployed on the edge side, and the preprocessed and expanded local data are used to fine-tune the trained model. This ensures that the data of each edge node are more consistent with the local characteristics, effectively improving the system's detection ability. In the designed lightweight deep learning network model, two sets of convolutional pooling layers of convolutional neural networks (CNN) were used to extract spatial features. The bidirectional long short-term memory network (BiLSTM) was used to collect time sequence features, and the weight of traffic features was adjusted through the attention mechanism, improving the model's ability to identify abnormal traffic features. The proposed model was experimentally demonstrated using the NSL-KDD, UNSW-NB15, and CIC-ISD2018 datasets. The accuracies of the proposed model on the three datasets were as high as 0.974, 0.925, and 0.953, respectively, showing superior accuracy to other comparative models. The proposed lightweight deep learning network model has good application prospects for anomaly traffic detection in cloud-edge collaborative computing architectures.

A Method for Region-Specific Anomaly Detection on Patch-wise Segmented PA Chest Radiograph (PA 흉부 X-선 영상 패치 분할에 의한 지역 특수성 이상 탐지 방법)

  • Hyun-bin Kim;Jun-Chul Chun
    • Journal of Internet Computing and Services
    • /
    • v.24 no.1
    • /
    • pp.49-59
    • /
    • 2023
  • Recently, attention to the pandemic situation represented by COVID-19 emerged problems caused by unexpected shortage of medical personnel. In this paper, we present a method for diagnosing the presence or absence of lesional sign on PA chest X-ray images as computer vision solution to support diagnosis tasks. Method for visual anomaly detection based on feature modeling can be also applied to X-ray images. With extracting feature vectors from PA chest X-ray images and divide to patch unit, region-specific abnormality can be detected. As preliminary experiment, we created simulation data set containing multiple objects and present results of the comparative experiments in this paper. We present method to improve both efficiency and performance of the process through hard masking of patch features to aligned images. By summing up regional specificity and global anomaly detection results, it shows improved performance by 0.069 AUROC compared to previous studies. By aggregating region-specific and global anomaly detection results, it shows improved performance by 0.069 AUROC compared to our last study.

A Study on Anomaly Signal Detection and Management Model using Big Data (빅데이터를 활용한 이상 징후 탐지 및 관리 모델 연구)

  • Kwon, Young-baek;Kim, In-seok
    • The Journal of the Institute of Internet, Broadcasting and Communication
    • /
    • v.16 no.6
    • /
    • pp.287-294
    • /
    • 2016
  • APT attack aimed at the interruption of information and communication facilities and important information leakage of companies. it performs an attack using zero-day vulnerabilities, social engineering base on collected information, such as IT infra, business environment, information of employee, for a long period of time. Fragmentary response to cyber threats such as malware signature detection methods can not respond to sophisticated cyber-attacks, such as APT attacks. In this paper, we propose a cyber intrusion detection model for countermeasure of APT attack by utilizing heterogeneous system log into big-data. And it also utilizes that merging pattern-based detection methods and abnormality detection method.

Flow-based Anomaly Detection Using Access Behavior Profiling and Time-sequenced Relation Mining

  • Liu, Weixin;Zheng, Kangfeng;Wu, Bin;Wu, Chunhua;Niu, Xinxin
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.10 no.6
    • /
    • pp.2781-2800
    • /
    • 2016
  • Emerging attacks aim to access proprietary assets and steal data for business or political motives, such as Operation Aurora and Operation Shady RAT. Skilled Intruders would likely remove their traces on targeted hosts, but their network movements, which are continuously recorded by network devices, cannot be easily eliminated by themselves. However, without complete knowledge about both inbound/outbound and internal traffic, it is difficult for security team to unveil hidden traces of intruders. In this paper, we propose an autonomous anomaly detection system based on behavior profiling and relation mining. The single-hop access profiling model employ a novel linear grouping algorithm PSOLGA to create behavior profiles for each individual server application discovered automatically in historical flow analysis. Besides that, the double-hop access relation model utilizes in-memory graph to mine time-sequenced access relations between different server applications. Using the behavior profiles and relation rules, this approach is able to detect possible anomalies and violations in real-time detection. Finally, the experimental results demonstrate that the designed models are promising in terms of accuracy and computational efficiency.