• Title/Summary/Keyword: outlier detection (이상치 탐지)

Study on Lifelog Anomaly Detection using VAE-based Machine Learning Model (VAE(Variational AutoEncoder) 기반 머신러닝 모델을 활용한 체중 라이프로그 이상탐지에 관한 연구)

  • Kim, Jiyong;Park, Minseo
    • The Journal of the Convergence on Culture Technology / v.8 no.4 / pp.91-98 / 2022
  • Lifelog data continuously collected through wearable devices may contain many outliers, which must be found and removed to improve data quality. Because outliers are generally far fewer than normal data points, a class imbalance problem arises. To address this imbalance, we propose a method that applies a Variational AutoEncoder (VAE) to the outliers. After the outlier data are preprocessed with the proposed method, the result is verified with several machine learning classification models. Verification on body weight data confirmed that performance improved in all classification models. Based on the experimental results, we recommend that analyses of lifelog body weight data preprocess the data with the proposed outlier processing method and then apply the LightGBM model, which performed best.

Intrusion Detection Using Training Data with Intrusion Instances (침입 사례를 포함하는 훈련 데이터를 이용한 침입 탐지)

  • 이재흥;박용수;이영기;조유근
    • Proceedings of the Korean Information Science Society Conference / 2003.04a / pp.383-385 / 2003
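  • Applying anomaly detection to an intrusion detection system requires training on normal system-call sequences. One of the biggest problems at this stage is obtaining intrusion-free training data: if the training data contain an intrusion, it is treated as normal, and the same intrusion goes undetected when it occurs later. Obtaining intrusion-free training data, however, is very difficult. This paper proposes a system-call-based intrusion detection technique that detects intrusions effectively even when the training data contain intrusions. The proposed technique exploits the property that, when an intrusion is present in the training data, very low-frequency data appear consecutively in the intrusion portion. To this end, the training data are grouped into blocks of a fixed size, the average frequency of each block is computed, and any block whose average falls below a threshold is regarded as intrusion data and excluded from the training data. Experiments show that, with an appropriately chosen block size, the technique outperforms the existing Eskin technique.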
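The block-filtering step described in this abstract can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: it assumes each training item is a single system-call name and that "frequency" means the item's relative frequency over the whole trace (the paper may well use call sequences instead). The function name and parameters are hypothetical.

```python
from collections import Counter


def filter_low_frequency_blocks(calls, block_size, threshold):
    """Split `calls` into fixed-size blocks and drop blocks whose mean
    relative item frequency is below `threshold`.

    Assumption: an item's frequency is count(item) / len(calls).
    Returns (kept_items, dropped_blocks).
    """
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    total = len(calls)
    counts = Counter(calls)
    kept, dropped = [], []
    for start in range(0, total, block_size):
        block = calls[start:start + block_size]
        # Average the relative frequencies of the items in this block.
        mean_freq = sum(counts[c] / total for c in block) / len(block)
        if mean_freq < threshold:
            dropped.append(block)   # treated as suspected intrusion data
        else:
            kept.extend(block)      # retained as normal training data
    return kept, dropped


if __name__ == "__main__":
    normal = ["open", "read", "write", "close"]
    trace = normal * 5 + ["execve", "chmod", "setuid", "socket"] + normal * 5
    kept, dropped = filter_low_frequency_blocks(trace, block_size=4, threshold=0.1)
    print(len(kept), dropped)
```

As the abstract notes, the result depends on the block size: a block that is too large dilutes a short intrusion with frequent normal calls, while a very small block makes isolated rare calls look like intrusions.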

Design and Implementation of Machine Learning System for Fine Dust Anomaly Detection based on Big Data (빅데이터 기반 미세먼지 이상 탐지 머신러닝 시스템 설계 및 구현)

  • Jae-Won Lee;Chi-Ho Lin
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.24 no.1 / pp.55-58 / 2024
  • In this paper, we propose the design and implementation of a big data-based machine learning system for fine dust anomaly detection. The proposed system classifies the fine dust air quality index using meteorological information composed of fine dust and big data. It classifies fine dust through a machine learning-based anomaly detection algorithm designed around the outliers for each air quality index category. Images are collected from a camera according to the fine dust level, and their depth data are used to create a fine dust visibility mask. Then, with a learning-based fingerprinting technique that uses a mono depth estimation algorithm, the fine dust level is derived by inferring the visibility distance from the images collected by the monoscope camera. For experimentation and analysis, training data are created by matching fine dust level data with CCTV image data by region and time, and a model is then built and tested in a real environment.

Regression diagnostics for response transformations in a partial linear model (부분선형모형에서 반응변수변환을 위한 회귀진단)

  • Seo, Han Son;Yoon, Min
    • Journal of the Korean Data and Information Science Society / v.24 no.1 / pp.33-39 / 2013
  • When the response variable is transformed in partial linear models, outliers can adversely affect estimation of the transformation parameter, just as in linear models. Solving this problem requires both estimating the transformation parameter and detecting outliers, but these steps are difficult to carry out because of the arbitrariness of the nonparametric function included in the partial linear model. In this study, we suggest procedures for transforming the response variable that are robust to outliers in partial linear models, based on estimation of the nonparametric function and on outlier detection methods such as a sequential test and maximum trimmed likelihood estimation. The effectiveness of the proposed methods is verified and compared through a simulation study and examples.

A Study on the Air Pollution Monitoring Network Algorithm Using Deep Learning (심층신경망 모델을 이용한 대기오염망 자료확정 알고리즘 연구)

  • Lee, Seon-Woo;Yang, Ho-Jun;Lee, Mun-Hyung;Choi, Jung-Moo;Yun, Se-Hwan;Kwon, Jang-Woo;Park, Ji-Hoon;Jung, Dong-Hee;Shin, Hye-Jung
    • Journal of Convergence for Information Technology / v.11 no.11 / pp.57-65 / 2021
  • We propose a novel method that uses deep learning to detect abnormal data showing specific symptoms in an air pollution measurement system. Existing methods generally detect abnormal data by classifying data that show unusual patterns different from the existing time series data. However, these approaches have limitations in detecting specific symptoms. In this paper, we use the DeepLab V3+ model, which is mainly used for foreground segmentation of images, with its structure changed to handle one-dimensional data. Instead of images, the model receives time-series data from multiple sensors and can detect data showing specific symptoms. In addition, we improve the model's performance by using 'piecewise aggregation approximation' to reduce the complexity of noisy time series data. The experimental results confirm that anomalous data can be detected successfully.
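The piecewise aggregation step mentioned in this abstract is commonly defined as replacing a series with the means of consecutive segments. The sketch below shows that generic definition only; the segment count, the handling of lengths that do not divide evenly, and the function name `paa` are assumptions rather than details taken from the paper.

```python
def paa(series, n_segments):
    """Piecewise aggregate approximation of a 1-D series.

    Splits `series` into `n_segments` consecutive, nearly equal segments
    and returns the mean of each segment.
    """
    n = len(series)
    if not 1 <= n_segments <= n:
        raise ValueError("n_segments must be between 1 and len(series)")
    reduced = []
    for i in range(n_segments):
        # Integer boundaries keep every segment non-empty when n_segments <= n.
        start = i * n // n_segments
        end = (i + 1) * n // n_segments
        segment = series[start:end]
        reduced.append(sum(segment) / len(segment))
    return reduced


if __name__ == "__main__":
    readings = [1, 2, 3, 4, 5, 6, 7, 8]
    print(paa(readings, 4))
```

Averaging within segments smooths short-lived noise and shortens the input, which is one plausible reason it would reduce the complexity a downstream model has to handle.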

Data-driven event detection method for efficient management and recovery of water distribution system man-made disasters (상수도관망 재난관리 및 복구를 위한 데이터기반 이상탐지 방법론 개발)

  • Jung, Donghwi;Ahn, Jaehyun
    • Journal of Korea Water Resources Association / v.51 no.8 / pp.703-711 / 2018
  • Water distribution system (WDS) pipe bursts are caused by excessive pressure, pipe aging, and ground shift from temperature change and earthquakes. Prompt detection of and response to a failure event help prevent large-scale service interruption and catastrophic sinkhole generation. To that end, this study proposes an improved Western Electric Company (WECO) method that raises the detection effectiveness and efficiency of the original WECO method. The original WECO method is a univariate Statistical Process Control (SPC) technique used for identifying non-random patterns in system output data. The improved WECO method multiplies each threshold of the WECO sub-rules by a threshold modifier (w) in order to control the sensitivity of anomaly detection in the water distribution network of interest. The Austin network was used to demonstrate the proposed method, with normal random and abnormal pipe flow data generated for it. The best w value was identified from a sensitivity analysis, and the impact of measurement frequency (dt = 5, 10, 15 min, etc.) was also investigated. The proposed method was compared to the original WECO method with respect to detection probability, false alarm rate, and average detection time. Finally, this study provides a set of guidelines on the use of the WECO method for real-life WDS pipe burst detection.
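The idea of scaling WECO thresholds by a modifier w can be illustrated with a short sketch. The abstract does not list the sub-rules, so this example assumes the four commonly cited Western Electric zone rules and assumes that w scales the 1-sigma, 2-sigma, and 3-sigma limits; rule 4 compares against the center line and has no threshold to scale. The function name and return format are hypothetical.

```python
def weco_alarms(values, mean, sigma, w=1.0):
    """Return (index, rule_name) pairs for each sample where a rule fires.

    Assumed rules (standard Western Electric zone rules):
      rule1: 1 point beyond 3*sigma
      rule2: 2 of 3 consecutive points beyond 2*sigma, same side
      rule3: 4 of 5 consecutive points beyond 1*sigma, same side
      rule4: 8 consecutive points on the same side of the center line
    Assumption: the modifier `w` multiplies the sigma limits of rules 1-3.
    """
    # (name, window length, required count, limit in sigma units)
    rules = [
        ("rule1", 1, 1, 3.0 * w),
        ("rule2", 3, 2, 2.0 * w),
        ("rule3", 5, 4, 1.0 * w),
        ("rule4", 8, 8, 0.0),
    ]
    z = [(v - mean) / sigma for v in values]
    alarms = []
    for i in range(len(z)):
        for name, window, needed, limit in rules:
            if i + 1 < window:
                continue  # not enough history for this rule yet
            recent = z[i + 1 - window:i + 1]
            above = sum(1 for x in recent if x > limit)
            below = sum(1 for x in recent if x < -limit)
            if above >= needed or below >= needed:
                alarms.append((i, name))
    return alarms


if __name__ == "__main__":
    flow = [0.5, -0.5, 0.2, 3.5]
    print(weco_alarms(flow, mean=0.0, sigma=1.0, w=1.0))
    print(weco_alarms(flow, mean=0.0, sigma=1.0, w=1.2))
```

Under these assumptions, w greater than 1 widens the limits and lowers sensitivity (fewer false alarms, slower detection), while w less than 1 does the opposite, which is the trade-off a sensitivity analysis over w would explore.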

A Study on the Quality Control Method for Geotechnical Information Using AI (AI를 이용한 지반정보 품질관리 방안에 관한 연구)

  • Park, Ka-Hyun;Kim, Jongkwan;Lee, Seokhyung;Kim, Min-Ki;Lee, Kyung-Ryoon;Han, Jin-Tae
    • Journal of the Korean Geotechnical Society / v.38 no.11 / pp.87-95 / 2022
  • The geotechnical information constructed in the National Geotechnical Information DB System has been extensively used in design, construction, underground safety management, and disaster assessment. However, it is necessary to refine the geotechnical information because it has nearly 300,000 established cases containing a lot of missing or incorrect information. This research proposes a method for automatic quality control of geotechnical information using a fully connected neural network. Significantly, the anomalies in geotechnical information were detected using a database combining the standard penetration test results and strata information of Seoul. Consequently, the misclassification rate for the verification data is confirmed as 5.4%. Overall, the studied algorithm is expected to detect outliers of geotechnical information effectively.

A Design of SMS DDoS Detection and Defense Method using Counting Bloom Filter (Counting Bloom Filter를 이용한 SMS DDoS 탐지 및 방어 기법 설계)

  • Shin, Kwang-Kyoon;Park, Ui-Chung;Jun, Moon-Seog
    • Proceedings of the KAIS Fall Conference / 2011.05a / pp.53-56 / 2011
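  • As the 7.7 DDoS (Distributed Denial of Service) and 3.3 DDoS incidents showed, DDoS attacks have become a prominent major threat to networks, yet they are difficult to detect and respond to in real time. SMS (Short Message Service), which is now used for a great many purposes in many fields, can also be used as a means of DDoS attack and cause major disruption to mobile phone systems. Existing Bloom Filter detection techniques have the advantages of a simple structure and real-time detection, but they suffer from a false positive rate problem. This paper proposes a system that uses a destination-based Counting Bloom Filter with multiple hash functions to detect as an attack the SMS messages sent to the same destination once their count reaches a threshold, and that notifies the SMSC so the messages are blocked.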
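A destination-keyed Counting Bloom Filter of the kind this abstract describes might look like the sketch below. It is an illustration under stated assumptions, not the proposed system: the filter size, the number of hash functions, the use of seeded SHA-256 as the hash family, and the names `CountingBloomFilter` and `detect_flooded_destinations` are all hypothetical, and SMSC notification is left out.

```python
import hashlib


class CountingBloomFilter:
    """Array of counters addressed by several hash functions."""

    def __init__(self, size=1024, num_hashes=3):
        self.size = size
        self.num_hashes = num_hashes
        self.counters = [0] * size

    def _indexes(self, key):
        # Assumption: derive independent hashes by prefixing a seed.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{key}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, key):
        for idx in self._indexes(key):
            self.counters[idx] += 1

    def estimate(self, key):
        # The minimum counter never underestimates the true count,
        # but hash collisions can make it overestimate.
        return min(self.counters[idx] for idx in self._indexes(key))


def detect_flooded_destinations(destinations, threshold, cbf=None):
    """Return destinations whose estimated message count reaches `threshold`."""
    if cbf is None:
        cbf = CountingBloomFilter()
    flagged = []
    for dest in destinations:
        cbf.add(dest)
        if cbf.estimate(dest) >= threshold and dest not in flagged:
            flagged.append(dest)
    return flagged


if __name__ == "__main__":
    stream = ["dest-A"] * 5 + ["dest-B"] * 2
    print(detect_flooded_destinations(stream, threshold=5))
```

Because the estimate can only err upward, a filter like this may occasionally flag a legitimate destination; enlarging the counter array or adding hash functions reduces that risk at the cost of memory and computation.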

Detection of Anomaly VMS Messages Using Bi-Directional GPT Networks (양방향 GPT 네트워크를 이용한 VMS 메시지 이상 탐지)

  • Choi, Hyo Rim;Park, Seungyoung
    • The Journal of The Korea Institute of Intelligent Transport Systems / v.21 no.4 / pp.125-144 / 2022
  • When a variable message sign (VMS) system displays false traffic-safety information as a result of malicious attacks, it can pose a serious risk to drivers. If the normal message patterns displayed on the VMS system are learned, anomalous messages can be detected and responded to quickly. This paper proposes a method for detecting anomalous messages by learning the normal patterns of messages using a bi-directional generative pre-trained transformer (GPT) network. In particular, the proposed method was trained on normal messages and their system parameters to minimize the corresponding negative log-likelihood (NLL) values. After adequate training, the proposed method could detect an anomalous message when its NLL value was larger than a pre-specified threshold value. The experimental results showed that the proposed method could detect malicious messages as well as cases in which system errors occur.
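The thresholding step, flagging a message when its NLL under a model of normal messages exceeds a preset value, can be shown with a much simpler stand-in model. The sketch below swaps the paper's bi-directional GPT network for a left-to-right bigram model with add-one smoothing, purely to keep the example self-contained; the class name, the sample messages, and the way the threshold is chosen are all hypothetical.

```python
import math
from collections import Counter


class BigramModel:
    """Add-one smoothed bigram model (a stand-in for the paper's GPT network)."""

    def __init__(self, messages):
        self.vocab = {"<s>", "</s>"}
        self.bigrams = Counter()
        self.contexts = Counter()
        for msg in messages:
            tokens = ["<s>"] + msg.split() + ["</s>"]
            self.vocab.update(tokens)
            for prev, cur in zip(tokens, tokens[1:]):
                self.bigrams[(prev, cur)] += 1
                self.contexts[prev] += 1

    def nll(self, message):
        """Mean negative log-likelihood per token transition."""
        tokens = ["<s>"] + message.split() + ["</s>"]
        v = len(self.vocab) + 1  # one extra slot for unseen tokens
        total = 0.0
        for prev, cur in zip(tokens, tokens[1:]):
            p = (self.bigrams[(prev, cur)] + 1) / (self.contexts[prev] + v)
            total -= math.log(p)
        return total / (len(tokens) - 1)


def is_anomalous(model, message, threshold):
    """Flag a message whose NLL is larger than the threshold."""
    return model.nll(message) > threshold


if __name__ == "__main__":
    normal = [
        "slow down fog ahead",
        "slow down accident ahead",
        "lane closed ahead",
    ]
    model = BigramModel(normal)
    # Hypothetical choice: highest NLL seen on normal data plus a margin.
    threshold = max(model.nll(m) for m in normal) + 0.1
    print(is_anomalous(model, "speed limit 200 ahead", threshold))
```

Whatever model produces the likelihood, the threshold sets the trade-off between missed attacks and false alarms, so in practice it would be tuned on held-out normal messages.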

Deep Learning-based Time Series Data Prediction Research for Performance Enhancement in Cloud Monitoring Systems (클라우드 모니터링 시스템의 성능 향상을 위한 딥러닝을 이용한 시계열 데이터 예측 연구)

  • 김동완;홍두표;신용태
    • Proceedings of the Korea Information Processing Society Conference / 2023.05a / pp.342-344 / 2023
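  • With the growth of the cloud market and the emergence of the microservice approach, research on managing IT infrastructure has recently been active. However, observability is difficult to secure in advanced, distributed environments. This study therefore proposes a method to address the difficulty of analyzing the data collected by a monitoring system. The proposed method visualizes data from the NAB dataset using STUMPY and performs classification using a CNN. The classified data are organized into a dataset of anomaly data, anomaly-precursor data, and normal data. The deep learning model trained on this dataset analyzes the graph patterns of data collected in a load-test environment to detect anomaly data and anomaly-precursor data.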