• Title/Summary/Keyword: cyber attack dataset (사이버공격데이터셋)


Classification of Cyber Attack Groups Using Scikit Learn and Cyber Threat Datasets (싸이킷런과 사이버위협 데이터셋을 이용한 사이버 공격 그룹의 분류)

  • Kim, Kyungshin;Lee, Hojun;Kim, Sunghee;Kim, Byungik;Na, Wonshik;Kim, Donguk;Lee, Jeongwhan
    • Journal of Convergence for Information Technology / v.8 no.6 / pp.165-171 / 2018
  • The most threatening attack in IT security today, and a recent hot topic, is the APT attack. So far, there is no way to respond to APT attacks other than artificial intelligence techniques. Here, we used Scikit Learn, a big data machine learning framework, to implement a machine learning algorithm that analyzes cyber threat data, training it on a dataset of collected cyber attack cases. The results showed an attack classification accuracy close to 70%. This result can be developed into an algorithm for security control systems in the future.

A Study on Dataset Construction Technique for Intrusion Detection based on Pattern Recognition (패턴인식 기반 침입탐지를 위한 데이터셋 구성 기법에 대한 연구)

  • Gong, Seong-Hyeon;Cho, Min-Jeong;Cho, Jae-ik;Lee, Changhoon
    • Annual Conference of KIPS / 2017.04a / pp.343-345 / 2017
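  • As communication technology has advanced and network environments have diversified, the cyber threats facing communication users have diversified as well. Intrusion detection techniques based on pattern recognition and machine learning emerged to counter the many newly reported cyber attacks. A machine learning-based IDS requires a low false positive rate and high efficiency, and these properties depend heavily on the methodology used to construct the dataset. This paper discusses the key points to consider when constructing a dataset for pattern recognition-based traffic analysis, and explores ways to derive a dataset that reflects real-world cyber threat conditions well.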

SWaT Testbed Dataset and Anomaly Detection Trends (SWaT 테스트베드 데이터 셋 및 비정상행위 탐지 동향)

  • Kwon, Sungmoon;Shon, Taeshik
    • Review of KIISC / v.29 no.2 / pp.29-35 / 2019
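  • As cyber attacks on CPS (Cyber-Physical Systems) become more diverse and sophisticated, signature-based detection of malicious behavior has reached its limits, and anomaly detection techniques that learn normal behavior with machine learning are being actively studied. However, CPS security research faces a validation problem: for security reasons CPS data is rarely released publicly, and it is impossible to carry out real anomalous behavior on a CPS in operation. To address this, in 2015 the iTrust lab at SUTD (Singapore University of Technology and Design) built the SWaT (Secure Water Treatment) testbed and released a dataset in which 36 attacks were carried out. Since then, studies in Korea and abroad have used the SWaT testbed data to validate various security techniques, contributing to CPS security. This paper therefore describes the SWaT testbed data and an analysis of anomaly detection studies based on it, and from this analysis identifies and presents the key elements of CPS anomaly detection design.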

Classification of Malware Families Using Hybrid Datasets (하이브리드 데이터셋을 이용한 악성코드 패밀리 분류)

  • Seo-Woo Choi;Myeong-Jin Han;Yeon-Ji Lee;Il-Gu Lee
    • Journal of the Korea Institute of Information Security & Cryptology / v.33 no.6 / pp.1067-1076 / 2023
  • As malware variants have increased, the scale of hacking incidents has been expanding. To respond to intelligent hacking attacks, machine learning-based research on effectively classifying malware families is actively underway. However, existing classification models suffer degraded performance when the dataset is obfuscated or sparse. In this paper, we propose a hybrid dataset that combines features extracted from ASM files and BYTES files, and evaluate classification performance using an FNN. In the experiments, the proposed method improved performance by about 4% compared to a single dataset, and by about 30% for rare families.

Research on Identifying Manipulated Operation Data of Cyber-Physical System Based on Permutation Entropy (순열 엔트로피 기반 사이버 물리 시스템의 조작된 운영 데이터 식별 방안 연구)

  • Ka-Kyung Kim;Ieck-Chae Euom
    • Convergence Security Journal / v.24 no.3 / pp.67-79 / 2024
  • Attackers targeting critical infrastructure, such as energy plants, conduct intelligent and sophisticated attacks that conceal their traces until their objectives are achieved. Manipulating measurement data of cyber-physical systems, which are connected to the physical environment, directly impacts human safety. Given the unique characteristics of cyber-physical systems, a differentiated approach is necessary, distinct from traditional IT environment anomaly detection and identification methods. This study proposes a methodology that integrates both recursive filtering and an entropy-based approach to identify maliciously manipulated measurement data, considering the characteristics of cyber-physical systems. By applying the proposed approach to synthesized data based on a publicly available industrial control system security dataset in our research environment, the results demonstrate its effectiveness in identifying manipulated operational data.
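
The entropy measure named in this abstract can be illustrated with a minimal sketch of permutation entropy in its standard ordinal-pattern form. The abstract does not give the paper's parameters or its recursive filtering step, so the `order` and `delay` defaults and both sample series below are assumptions chosen for illustration only.

```python
import math
from collections import Counter


def permutation_entropy(series, order=3, delay=1, normalize=True):
    """Shannon entropy of the ordinal patterns found in `series`."""
    n_windows = len(series) - (order - 1) * delay
    if n_windows <= 0:
        raise ValueError("series is too short for the chosen order and delay")

    patterns = Counter()
    for start in range(n_windows):
        window = [series[start + k * delay] for k in range(order)]
        # The ordinal pattern is the index order that sorts the window.
        pattern = tuple(sorted(range(order), key=window.__getitem__))
        patterns[pattern] += 1

    entropy = 0.0
    for count in patterns.values():
        p = count / n_windows
        entropy -= p * math.log2(p)

    if normalize:
        # Divide by the maximum possible entropy, log2(order!).
        entropy /= math.log2(math.factorial(order))
    return entropy


# A monotonic ramp contains a single ordinal pattern.
ramp = list(range(50))
# Hypothetical irregular readings that mix several ordinal patterns.
irregular = [4, 7, 9, 10, 6, 11, 3, 8, 2, 12, 5, 1, 13, 0, 14]

print(permutation_entropy(ramp))
print(permutation_entropy(irregular))
```

A perfectly regular series scores zero and an irregular one scores higher, so a shift in this value over a sliding window is one plausible signal that measurement data has been altered.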

Network Intrusion Detection Using One-Class Models (단일 클래스 모델을 활용한 네트워크 침입 탐지)

  • Byeongjun Min;Daekyeong Park
    • Convergence Security Journal / v.24 no.3 / pp.13-21 / 2024
  • Recently, with the rapid expansion of networks driven by the advancements of the Fourth Industrial Revolution, cybersecurity threats are becoming increasingly severe. Traditional signature-based Network Intrusion Detection Systems (NIDS) are effective in detecting known attacks but show limitations when faced with new threats such as Advanced Persistent Threats (APT). Additionally, deep learning models based on supervised learning can lead to biased decision boundaries due to the imbalanced nature of network traffic data, where normal traffic vastly outnumbers malicious traffic. To address these challenges, this paper proposes a network intrusion detection method based on one-class models that learn only from normal data to identify abnormal traffic. The effectiveness of this approach is validated through experiments using the Deep SVDD and MemAE models on the NSL-KDD dataset. Comparative analysis with supervised learning models demonstrates that the proposed method offers superior adaptability and performance in real-world scenarios.
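
The one-class idea in this abstract, learning only from normal data and flagging whatever falls outside it, can be sketched with a toy hypersphere boundary. This is not Deep SVDD or MemAE: it skips the neural network entirely and fits a centre and radius directly in input space. The function names, the `quantile` parameter, and the two-feature samples are all hypothetical.

```python
import math


def fit_hypersphere(normal_samples, quantile=0.95):
    """Fit a centre and radius that enclose most of the normal samples."""
    dim = len(normal_samples[0])
    count = len(normal_samples)
    centre = [sum(s[i] for s in normal_samples) / count for i in range(dim)]
    distances = sorted(math.dist(s, centre) for s in normal_samples)
    # Radius = distance below which roughly `quantile` of the training samples fall.
    index = min(count - 1, int(quantile * count))
    return centre, distances[index]


def is_anomalous(sample, centre, radius):
    """Flag a sample that falls outside the hypersphere."""
    return math.dist(sample, centre) > radius


# Hypothetical scaled features of normal traffic only (no attack samples).
normal_traffic = [
    (1.0, 1.1), (0.9, 1.0), (1.1, 0.9), (1.0, 1.0),
    (1.2, 1.1), (0.8, 0.9), (1.0, 1.2), (0.9, 1.1),
]

centre, radius = fit_hypersphere(normal_traffic)
print(is_anomalous((1.0, 1.0), centre, radius))
print(is_anomalous((5.0, 0.1), centre, radius))
```

Because training never sees attack samples, the boundary is not skewed by how rare they are, which is the motivation the abstract gives for preferring one-class models on imbalanced traffic.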

A Study on Pre-processing for the Classification of Rare Classes (희소 클래스 분류 문제 해결을 위한 전처리 연구)

  • Ryu, Kyungjoon;Shin, Dongkyoo;Shin, Dongil
    • Annual Conference of KIPS / 2020.05a / pp.472-475 / 2020
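  • Datasets from many fields, generated from real-life cases, are being applied to machine learning problems. In information security too, many studies have analyzed attack traffic data from cyberspace with machine learning. This paper studies how to address the drop in classification performance caused by data imbalance, a problem common in real-world data, when classifying attack data accurately by type. We restructured the data from the perspective of rare classes, removed features that adversely affect learning, and evaluated classification performance using a DNN (Deep Neural Network) model.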

A Study on Improving Precision Rate in Security Events Using Cyber Attack Dictionary and TF-IDF (공격키워드 사전 및 TF-IDF를 적용한 침입탐지 정탐률 향상 연구)

  • Jongkwan Kim;Myongsoo Kim
    • Convergence Security Journal / v.22 no.2 / pp.9-19 / 2022
  • As digital transformation expands, we are more exposed to the threat of cyber attacks, and many institutions and companies operate signature-based intrusion prevention systems at the forefront of the network to block incoming attacks. However, strict blocking rules cannot be applied without disrupting service to the related ICT systems, which causes many false events and lowers operational efficiency. Therefore, many research projects use artificial intelligence to improve attack detection accuracy. Most of this research was performed on specific research datasets that do not resemble real network traffic, so the results could not be used in actual systems. In this paper, we propose a technique that classifies major attack keywords in the security event logs collected from an actual system, assigns a weight to each keyword, and then performs a similarity check using TF-IDF to determine whether an actual attack has occurred.
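
The keyword weighting and TF-IDF similarity check described in this abstract might look roughly like the sketch below. The keyword dictionary, its weights, the smoothed IDF formula, and the sample log lines are all assumptions made for illustration; the abstract does not specify any of them.

```python
import math
from collections import Counter

# Hypothetical attack-keyword dictionary with hand-assigned weights.
ATTACK_KEYWORDS = {"select": 1.5, "union": 2.0, "script": 2.0, "passwd": 2.5}


def tokenize(text):
    """Lowercase the text and split it on non-alphanumeric characters."""
    return "".join(c.lower() if c.isalnum() else " " for c in text).split()


def tfidf_vectors(documents):
    """Return one keyword-weighted TF-IDF vector (a dict) per document."""
    tokenized = [tokenize(doc) for doc in documents]
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    n_docs = len(documents)
    vectors = []
    for tokens in tokenized:
        vector = {}
        for term, count in Counter(tokens).items():
            tf = count / len(tokens)
            idf = math.log((1 + n_docs) / (1 + doc_freq[term])) + 1  # smoothed IDF
            vector[term] = tf * idf * ATTACK_KEYWORDS.get(term, 1.0)
        vectors.append(vector)
    return vectors


def cosine_similarity(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Hypothetical known-attack reference and two incoming security events.
reference = "GET /item.php?id=1 UNION SELECT username, passwd FROM users"
events = [
    "GET /item.php?id=7 union select name, passwd from members",
    "GET /images/logo.png",
]

vectors = tfidf_vectors([reference] + events)
sim_attack = cosine_similarity(vectors[0], vectors[1])
sim_benign = cosine_similarity(vectors[0], vectors[2])
print(sim_attack, sim_benign)
```

An event whose similarity to a known-attack reference exceeds some threshold would be treated as a true attack. Where to set that threshold is left open here because the abstract does not state one.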

Techniques for Improving Host-based Anomaly Detection Performance using Attack Event Types and Occurrence Frequencies

  • Juyeon Lee;Daeseon Choi;Seung-Hyun Kim
    • Journal of the Korea Society of Computer and Information / v.28 no.11 / pp.89-101 / 2023
  • To prevent damage caused by cyber-attacks on nations, businesses, and other entities, anomaly detection techniques for early detection of attackers have been studied consistently. Reducing detection time and false positives is essential to promptly prevent external or internal intrusions. In this study, we hypothesized that the type and frequency of attack events would influence the improvement of anomaly detection true positive rates and the reduction of false positive rates. To validate this hypothesis, we used the 2015 login log dataset from the Los Alamos National Laboratory. Applying the preprocessed data to representative anomaly detection algorithms, we confirmed that using features that simultaneously consider the type and frequency of attack events is highly effective in reducing false positives and execution time for anomaly detection.

Classification Performance Improvement of UNSW-NB15 Dataset Based on Feature Selection (특징선택 기법에 기반한 UNSW-NB15 데이터셋의 분류 성능 개선)

  • Lee, Dae-Bum;Seo, Jae-Hyun
    • Journal of the Korea Convergence Society / v.10 no.5 / pp.35-42 / 2019
  • With the spread of the Internet and various wearable devices, Internet technology has made it more convenient to obtain information and do business. However, as the Internet is used in more areas, the attack surface exposed to attacks is growing, and attempts to invade networks for unfair advantage, such as cyber terrorism, are also increasing. In this paper, we propose a feature selection method to improve the performance of classifying abnormal behavior in network traffic. The UNSW-NB15 dataset has a class imbalance problem in which rare classes have relatively few instances compared to other classes, and an undersampling method is used to address it. We use the SVM, k-NN, and decision tree algorithms and extract feature subsets with superior detection accuracy and RMSE through training and validation. In the wrapper-based experiments, the selected subsets achieved recall values of more than 98%, and DT_PSO showed the best performance.
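
The undersampling step mentioned in this abstract can be sketched as plain random undersampling. The abstract does not say which undersampling variant the authors used, so balancing every class down to the size of the rarest one is an assumption, as are the function name and the toy class counts.

```python
import random
from collections import Counter, defaultdict


def random_undersample(samples, labels, seed=0):
    """Randomly drop rows so every class keeps as many rows as the rarest class."""
    by_class = defaultdict(list)
    for sample, label in zip(samples, labels):
        by_class[label].append(sample)

    target = min(len(rows) for rows in by_class.values())
    rng = random.Random(seed)

    balanced_samples, balanced_labels = [], []
    for label, rows in by_class.items():
        for sample in rng.sample(rows, target):
            balanced_samples.append(sample)
            balanced_labels.append(label)
    return balanced_samples, balanced_labels


# Hypothetical toy data: 90 "Normal" rows, 8 "DoS" rows, 2 "Worms" rows.
labels = ["Normal"] * 90 + ["DoS"] * 8 + ["Worms"] * 2
samples = [[float(i)] for i in range(len(labels))]

x_balanced, y_balanced = random_undersample(samples, labels)
print(Counter(y_balanced))
```

Undersampling discards most of the majority-class rows, so it trades training data for balance. That trade is usually acceptable only when the majority class is large enough to stay representative after sampling.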