• Title/Summary/Keyword: 이상치탐지 (outlier detection)

Machine Learning-based Phishing Website Detection Model (머신러닝 기반 피싱 사이트 탐지 모델)

  • Sumin Oh;Minseo Park
    • The Journal of the Convergence on Culture Technology / v.10 no.4 / pp.575-580 / 2024
  • Detecting whether a website is normal or phishing is necessary to defend against intelligent phishing attacks. We propose a machine learning-based classification approach to predict website status. First, we collect URL information, convert it into numerical data, and remove outliers. Second, we apply the VIF (Variance Inflation Factor) to assess the correlation and independence between variables. Finally, we develop a phishing website detection model using machine learning classifiers to predict website status. On the test datasets, Random Forest showed the best performance, with a precision of 93.74%, a recall of 92.26%, and an accuracy of 93.14%. In the future, we expect to apply our model to detecting various phishing crimes.
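The abstract does not say how the VIF was computed. As a minimal sketch, assuming the standard definition VIF_j = 1 / (1 - R_j^2) (equivalently, the j-th diagonal entry of the inverse correlation matrix), the check could look like this in plain Python. The feature names and values are hypothetical and are not taken from the paper.

```python
def correlation_matrix(columns):
    """Pearson correlation matrix for a list of equal-length numeric columns."""
    n = len(columns[0])
    means = [sum(c) / n for c in columns]
    centered = [[v - m for v in c] for c, m in zip(columns, means)]
    norms = [sum(v * v for v in c) ** 0.5 for c in centered]
    return [[sum(a * b for a, b in zip(ci, cj)) / (ni * nj)
             for cj, nj in zip(centered, norms)]
            for ci, ni in zip(centered, norms)]


def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def vif(columns):
    """Return one VIF per column: the diagonal of the inverse correlation matrix."""
    inv = invert(correlation_matrix(columns))
    return [inv[j][j] for j in range(len(columns))]


# Hypothetical numeric URL features (illustration only).
url_length = [23, 45, 31, 60, 52, 38]
num_dots = [1, 3, 2, 4, 3, 2]
num_digits = [0, 5, 1, 2, 7, 3]

for name, value in zip(["url_length", "num_dots", "num_digits"],
                       vif([url_length, num_dots, num_digits])):
    print(f"{name}: VIF = {value:.2f}")
```

A large VIF suggests that a feature is nearly a linear combination of the others; the cut-off used to drop such features is a modelling choice that the abstract does not state.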

Detection of Cold Water Mass along the East Coast of Korea Using Satellite Sea Surface Temperature Products (인공위성 해수면온도 자료를 이용한 동해 연안 냉수대 탐지 알고리즘 개발)

  • Won-Jun Choi;Chan-Su Yang
    • Korean Journal of Remote Sensing / v.39 no.6_1 / pp.1235-1243 / 2023
  • This study proposes a detection algorithm for the cold water mass (CWM) along the eastern coast of the Korean Peninsula using sea surface temperature (SST) data provided by the Korea Institute of Ocean Science and Technology (KIOST). Considering the occurrence and distribution of the CWM, the eastern coast of the Korean Peninsula is classified into 3 regions ("Goseong-Uljin", "Samcheok-Guryongpo", "Pohang-Gijang"), and K-means clustering is first applied to the SST field of each region. The three resulting K-means clusters are then used to determine the CWM by applying a double threshold filter, predetermined from the standard deviation and the difference in average SST among the 3 groups. A sea area is judged to be a CWM if its standard deviation is 0.6℃ or higher and the average water temperature difference is 2℃ or higher. In the 2022 detection results, the CWM occurred most frequently in "Pohang-Gijang", on 77 days, and performance indicators from the confusion matrix were calculated for quantitative evaluation. The accuracy for the three regions was 0.83 or higher, and the F1 score reached a maximum of 0.95 in "Pohang-Gijang". The detection algorithm proposed in this study has been applied to the KIOST SST system, which provides a CWM map by email.
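The double threshold described above can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not specify exactly which standard deviation and which mean difference are compared, so the sketch assumes the standard deviation of all SST values in the region and the gap between the warmest and coldest cluster means. The function names, the quantile-based initialisation, and the sample values are hypothetical.

```python
from statistics import mean, pstdev


def kmeans_1d(values, k=3, iterations=50):
    """Plain 1-D k-means; initial centroids are evenly spaced quantiles of the data."""
    ordered = sorted(values)
    centroids = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # Keep the old centroid if a cluster ends up empty.
        updated = [mean(c) if c else centroids[i] for i, c in enumerate(clusters)]
        if updated == centroids:
            break
        centroids = updated
    return clusters


def is_cold_water_mass(sst, std_threshold=0.6, diff_threshold=2.0):
    """Apply both thresholds from the abstract (0.6 and 2 degrees Celsius).

    Assumption: 'standard deviation' is taken over the whole region and
    'average difference' is warmest cluster mean minus coldest cluster mean.
    """
    cluster_means = [mean(c) for c in kmeans_1d(sst) if c]
    spread = pstdev(sst)
    difference = max(cluster_means) - min(cluster_means)
    return spread >= std_threshold and difference >= diff_threshold


# Hypothetical SST samples (degrees Celsius) for one coastal region.
with_cold_patch = [18.0, 18.2, 17.9, 21.0, 21.2, 20.9, 23.0, 23.1, 22.9]
uniform_field = [22.0, 22.1, 21.9, 22.0, 22.05, 21.95, 22.1, 21.9, 22.0]

print(is_cold_water_mass(with_cold_patch))
print(is_cold_water_mass(uniform_field))
```

Requiring both conditions means a region that is uniformly cool, or one with only a small gradient, is not flagged.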

A Sensor Data Management System for USN based Fire Detection Application (USN 기반의 화재감시 응용을 위한 센서 데이터 처리 시스템)

  • Park, Won-Ik;Kim, Young-Kuk
    • Journal of the Korea Society of Computer and Information / v.16 no.5 / pp.135-145 / 2011
  • These days, research on sensor data management systems for USN-based real-time monitoring applications is active thanks to the development and diffusion of sensor technology. Sensor data is rapidly changing, continuous, and massive low-level data. However, end users are interested only in high-level data, so it is essential to process this changing, continuous, and massive low-level data effectively. In this paper, we propose a sensor data management system with a multi-analytical query function using OLAP and an anomaly detection function using a learning-based classifier. In the experimental section, we show that our system is valid through several experimental scenarios. For this, we use a sensor data generator that we implemented ourselves.

Design of Sensor Data's Missing Value Handling Technique for Pet Healthcare Service based on Graph Attention Networks (펫 헬스 케어 서비스를 위한 GATs 기반 센서 데이터 처리 기법 설계)

  • Lee, Jihoon;Moon, Nammee
    • Proceedings of the Korea Information Processing Society Conference / 2021.05a / pp.463-465 / 2021
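  • Sensor data can contain missing values for a variety of reasons, and analysis results may be interpreted differently depending on how those missing values are handled. In a pet healthcare service, this can lead to critical problems. This paper therefore proposes a missing-value handling technique that combines GATs (Graph Attention Networks) and LSTM (Long Short-Term Memory) to handle missing values in the various sensor data collected from pet wearable devices. Based on the fact that the sensor data from a pet wearable device are correlated with one another, the attention values of adjacent nodes and a feature map are derived. A prediction layer then predicts the features of the missing values. Based on the predicted features, the missing values are interpolated together with a decoding process. With modifications to the model, the proposed technique is also expected to be applicable to outlier detection.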

A robust detection algorithm against clutters in active sonar in shallow coastal environment (연안 환경에서 클러터에 강인한 능동소나 탐지 알고리듬)

  • Jang, Eun Jeong;Kwon, Sungchur;Oh, Won Tcheon;Lee, Jung Woo;Shin, Keecheol;Kim, Juho
    • The Journal of the Acoustical Society of Korea / v.38 no.6 / pp.661-669 / 2019
  • High-frequency active sonar is appropriate for detecting small targets such as divers in coastal environments. When high-frequency active sonar is used in a shallow coastal environment, the false alarm rate is high due to clutter caused by marine biological noise, ship noise, wakes, etc. In this paper, we propose a target detection algorithm for active sonar systems that is robust against clutter in shallow coastal environments. The proposed algorithm increases the clutter reduction rate by calculating statistical characteristics of the signal and applying a clustering method. The algorithm is evaluated and analysed with sea trial data, and the results show a clutter reduction rate of 96 % or higher.

Improved Fault Detection Based on One-Class Classification and Feature Selection (단일 클래스 분류와 특징 선택에 기반한 향상된 이상 감지)

  • Cho, Hyun-Woo
    • Journal of the Korea Academia-Industrial cooperation Society / v.20 no.8 / pp.216-223 / 2019
  • Fault detection during production processes is one of the operational tasks required to run production processes both safely and consistently. Unexpected operational events or undetected process faults can have a serious impact on production systems and subsequently on the quality of the final products. In addition, such situations may lead to malfunctions or breakdowns of production processes. To reliably detect such abnormalities, a new one-class classification-based detection scheme has been developed. The proposed method consists of four steps: 1) noise filtering, 2) feature selection, 3) nonlinear representation, and 4) outlier detection. The performance of the proposed scheme was demonstrated using multivariate data obtained from a simulation process. The results showed that the proposed method produced reliable monitoring results and outperformed existing methods, with an average improvement of 25.4%. The use of proper feature selection in the proposed framework yielded better detection performance.

Analysis of cross-borehole pulse radar signatures measured at various tunnel angles (다양한 투과 각도에서 측정된 투과형 펄스 시추공 레이더 신호 분석)

  • Kim, Sang-Wook;Kim, Se-Yun
    • Geophysics and Geophysical Exploration / v.13 no.1 / pp.96-101 / 2010
  • A pulse radar system has been developed recently to detect dormant underground tunnels located at depths of hundreds of metres. To check the ability of the radar system to detect an obliquely oriented tunnel, five different borehole pairs in the tunnel test site were chosen so that the horizontal lines-of-sight cut the tunnel axis obliquely, in $15^{\circ}$ steps. The pulse radar signatures were measured over a depth range of 20 m around the centre of the air-filled tunnel. Three canonical parameters, consisting of the arrival time, attenuation, and dispersion time, were extracted from the first and second peaks of the measured radar signatures. Using those parameters, the radar system can detect obliquely oriented tunnels at various angles up to $45^{\circ}$ from the transmitter-receiver line of sight.

Network based Anomaly Intrusion Detection using Bayesian Network Techniques (네트워크 서비스별 이상 탐지를 위한 베이지안 네트워크 기법의 정상 행위 프로파일링)

  • Cha ByungRae;Park KyoungWoo;Seo JaeHyun
    • Journal of Internet Computing and Services / v.6 no.1 / pp.27-38 / 2005
  • Recently, the rapid development of computing environments and the spread of the Internet have made it easy to obtain and use information. At the same time, as an adverse effect, unlawful intrusions and threats by hackers against network environments have been increasing. In particular, the Internet, which is built on Unix and TCP/IP, has many vulnerabilities. Security techniques such as authentication and access control cannot adequately solve the security problem, so IDSs were developed as a second line of defence. In this paper, an intrusion detection method using Bayesian networks estimates the probability values of behavior contexts based on Bayes' theorem. The contexts of behaviors or events are represented as graph-type Bayesian networks. We concisely profile normal behaviors using behavior contexts, and the method is able to detect new or modified intrusions. We ran simulations using the DARPA 2000 intrusion data.

Notes on identifying source of out-of-control signals in phase II multivariate process monitoring (다변량 공정 모니터링에서 이상신호 발생시 원인 식별에 관한 연구)

  • Lee, Sungim
    • The Korean Journal of Applied Statistics / v.31 no.1 / pp.1-11 / 2018
  • Multivariate process control has become important in various applied fields. For instance, in the manufacturing industry there are many situations in which multivariate quality characteristics must be monitored simultaneously. Despite its importance, it is not convenient to use in practice because it is difficult to identify the source of an out-of-control signal in a multivariate control chart. In this paper, we introduce a way to identify the source of an out-of-control signal by using confidence intervals for new observations, and we discuss the identification and interpretation of the out-of-control variable through simulation studies.

Identification of the out-of-control variable based on Hotelling's T2 statistic (호텔링 T2의 이상신호 원인 식별)

  • Lee, Sungim
    • The Korean Journal of Applied Statistics / v.31 no.6 / pp.811-823 / 2018
  • The multivariate control chart based on Hotelling's $T^2$ statistic is a powerful tool in statistical process control for identifying an out-of-control process. It is used to monitor multiple process characteristics simultaneously. An out-of-control signal on the $T^2$ chart indicates a shift in the mean vector. However, because the signal is multivariate, it is difficult to interpret its cause. In this paper, we review methods of signal interpretation based on the Mason, Young, and Tracy (MYT) decomposition of the $T^2$ statistic. We also provide an example of how to implement it using R software and present simulation studies comparing the performance of these methods.
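For two variables, the MYT decomposition splits $T^2$ into an unconditional term for one variable plus a conditional term for the other, so $T^2 = T_1^2 + T_{2\cdot 1}^2 = T_2^2 + T_{1\cdot 2}^2$. The paper's own example uses R and is not reproduced here; the following is a minimal plain-Python sketch of the bivariate case. The function name, dictionary keys, and reference data are hypothetical.

```python
def myt_decomposition(reference, x):
    """Hotelling's T^2 for a bivariate observation x, with its MYT terms.

    reference: list of (x1, x2) pairs from the in-control period.
    x: the new (x1, x2) observation to be checked.
    """
    n = len(reference)
    m1 = sum(r[0] for r in reference) / n
    m2 = sum(r[1] for r in reference) / n
    # Sample variances and covariance (divisor n - 1).
    s11 = sum((r[0] - m1) ** 2 for r in reference) / (n - 1)
    s22 = sum((r[1] - m2) ** 2 for r in reference) / (n - 1)
    s12 = sum((r[0] - m1) * (r[1] - m2) for r in reference) / (n - 1)

    d1, d2 = x[0] - m1, x[1] - m2
    det = s11 * s22 - s12 ** 2
    # T^2 = d' S^{-1} d, written out for the 2x2 covariance matrix.
    t2 = (s22 * d1 ** 2 - 2 * s12 * d1 * d2 + s11 * d2 ** 2) / det

    return {
        "T2": t2,
        # Unconditional terms: each variable judged on its own.
        "T1": d1 ** 2 / s11,
        "T2_uncond": d2 ** 2 / s22,
        # Conditional terms: one variable after regressing on the other.
        "T2|1": (d2 - s12 / s11 * d1) ** 2 / (s22 - s12 ** 2 / s11),
        "T1|2": (d1 - s12 / s22 * d2) ** 2 / (s11 - s12 ** 2 / s22),
    }


# Hypothetical in-control reference sample and one new observation.
reference = [(10.1, 5.0), (9.8, 4.9), (10.3, 5.2),
             (9.9, 5.1), (10.0, 4.8), (10.2, 5.1)]
terms = myt_decomposition(reference, (10.6, 4.7))

for name, value in terms.items():
    print(f"{name}: {value:.3f}")
```

A large unconditional term points to a variable that is out of range on its own, while a large conditional term points to a break in the usual relationship between the two variables. The control limits for each term are not covered by this sketch.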