• Title/Summary/Keyword: 사이버공격데이터셋

Search Result 22, Processing Time 0.029 seconds

Clasification of Cyber Attack Group using Scikit Learn and Cyber Treat Datasets (싸이킷런과 사이버위협 데이터셋을 이용한 사이버 공격 그룹의 분류)

  • Kim, Kyungshin;Lee, Hojun;Kim, Sunghee;Kim, Byungik;Na, Wonshik;Kim, Donguk;Lee, Jeongwhan
    • Journal of Convergence for Information Technology
    • /
    • v.8 no.6
    • /
    • pp.165-171
    • /
    • 2018
  • The most threatening attack that has become a hot topic of recent IT security is APT Attack.. So far, there is no way to respond to APT attacks except by using artificial intelligence techniques. Here, we have implemented a machine learning algorithm for analyzing cyber threat data using machine learning method, using a data set that collects cyber attack cases using Scikit Learn, a big data machine learning framework. The result showed an attack classification accuracy close to 70%. This result can be developed into the algorithm of the security control system in the future.

A Study on Dataset Construction Technique for Intrusion Detection based on Pattern Recognition (패턴인식 기반 침입탐지를 위한 데이터셋 구성 기법에 대한 연구)

  • Gong, Seong-Hyeon;Cho, Min-Jeong;Cho, Jae-ik;Lee, Changhoon
    • Proceedings of the Korea Information Processing Society Conference
    • /
    • 2017.04a
    • /
    • pp.343-345
    • /
    • 2017
  • 통신 기술이 발달하고, 네트워크 환경 또한 다양해짐에 따라 통신 사용자들에 대한 사이버 위협 또한 다양해졌다. 패턴인식 기술과 기계학습에 기반한 침입탐지 기술은 새롭게 보고되는 수많은 사이버 공격들에 대응하기 위해 등장하였다. 기계학습 기반의 IDS는 낮은 오탐률과 높은 효율성을 요구하며, 이러한 특징은 데이터셋을 구성하는 방법론에 큰 영향을 받는다. 본 논문에서는 패턴인식 기반 트래픽 분석을 수행하기 위한 데이터셋을 구성할 때 고려해야할 주안점에 대해 논하며, 현실의 사이버 위협 상황을 잘 반영할 수 있는 데이터셋을 도출하는 방법을 모색한다.

SWaT 테스트베드 데이터 셋 및 비정상행위 탐지 동향

  • Kwon, Sungmoon;Shon, Taeshik
    • Review of KIISC
    • /
    • v.29 no.2
    • /
    • pp.29-35
    • /
    • 2019
  • CPS(Cyber Physical System)에 대한 사이버 공격이 다양해지고 고도화됨에 따라 시그니쳐에 기반한 악성행위 탐지는 한계가 있어 기계학습 기반의 정상행위 학습을 통한 비정상행위 탐지 기법이 많이 연구되고 있다. 그러나 CPS 보안 연구는 보안상의 이유로 CPS 데이터가 주로 외부에 공개되지 않으며 또한 실제 비정상행위를 가동 중인 CPS에 실험하는 것이 불가능하여 개발 기법의 검증이 어려운 문제가 있다. 이를 해결하기 위해 2015년 SUTD(Singapore University of Technology and Design)의 iTrust 연구소에서 SWaT(Secure Water Treatment) 테스트베드를 구성하고 36가지의 공격을 수행한 데이터셋을 공개하였다. 이후 국 내외에서 SWaT 테스트베드 데이터를 사용하여 다양한 보안 기법을 검증한 연구결과가 발표되고 있으며 CPS 보안에 기여하고 있다. 따라서 본 논문에서는 SWaT 테스트베드 데이터 및 SWaT 테스트베드 데이터에 기반한 비정상행위 탐지 연구를 분석한 내용을 설명하고, 이를 통해 CPS 비정상행위 탐지 설계의 주요 요소를 분석하여 제시하고자 한다.

Classification of Malware Families Using Hybrid Datasets (하이브리드 데이터셋을 이용한 악성코드 패밀리 분류)

  • Seo-Woo Choi;Myeong-Jin Han;Yeon-Ji Lee;Il-Gu Lee
    • Journal of the Korea Institute of Information Security & Cryptology
    • /
    • v.33 no.6
    • /
    • pp.1067-1076
    • /
    • 2023
  • Recently, as variant malware has increased, the scale of cyber hacking incidents is expanding. To respond to intelligent cyberhacking attack, machine learning-based research is actively underway to effectively classify malware families. However, existing classification models have problems where performance deteriorates when the dataset is obfuscated or sparse. In this paper, we propose a hybrid dataset that combines features extracted from ASM files and BYTES files, and evaluate classification performance using FNN. As a result of the experiment, the proposed method showed performance improvement of about 4% compared to a single dataset, and in particular, performance improvement of about 30% for rare families.

A Study on Pre-processing for the Classification of Rare Classes (희소 클래스 분류 문제 해결을 위한 전처리 연구)

  • Ryu, Kyungjoon;Shin, Dongkyoo;Shin, Dongil
    • Proceedings of the Korea Information Processing Society Conference
    • /
    • 2020.05a
    • /
    • pp.472-475
    • /
    • 2020
  • 실생활의 사례를 바탕으로 생성된 여러 분야의 데이터셋을 기계학습 (Machine Learning) 문제에 적용하고 있다. 정보보안 분야에서도 사이버 공간에서의 공격 트래픽 데이터를 기계학습으로 분석하는 많은 연구들이 진행 되어 왔다. 본 논문에서는 공격 데이터를 유형별로 정확히 분류할 때, 실생활 데이터에서 흔하게 발생하는 데이터 불균형 문제로 인한 분류 성능 저하에 대한 해결방안을 연구했다. 희소 클래스 관점에서 데이터를 재구성하고 기계학습에 악영향을 끼치는 특징들을 제거하고 DNN(Deep Neural Network) 모델을 사용해 분류 성능을 평가했다.

A Study on Improving Precision Rate in Security Events Using Cyber Attack Dictionary and TF-IDF (공격키워드 사전 및 TF-IDF를 적용한 침입탐지 정탐률 향상 연구)

  • Jongkwan Kim;Myongsoo Kim
    • Convergence Security Journal
    • /
    • v.22 no.2
    • /
    • pp.9-19
    • /
    • 2022
  • As the expansion of digital transformation, we are more exposed to the threat of cyber attacks, and many institution or company is operating a signature-based intrusion prevention system at the forefront of the network to prevent the inflow of attacks. However, in order to provide appropriate services to the related ICT system, strict blocking rules cannot be applied, causing many false events and lowering operational efficiency. Therefore, many research projects using artificial intelligence are being performed to improve attack detection accuracy. Most researches were performed using a specific research data set which cannot be seen in real network, so it was impossible to use in the actual system. In this paper, we propose a technique for classifying major attack keywords in the security event log collected from the actual system, assigning a weight to each key keyword, and then performing a similarity check using TF-IDF to determine whether an actual attack has occurred.

Techniques for Improving Host-based Anomaly Detection Performance using Attack Event Types and Occurrence Frequencies

  • Juyeon Lee;Daeseon Choi;Seung-Hyun Kim
    • Journal of the Korea Society of Computer and Information
    • /
    • v.28 no.11
    • /
    • pp.89-101
    • /
    • 2023
  • In order to prevent damages caused by cyber-attacks on nations, businesses, and other entities, anomaly detection techniques for early detection of attackers have been consistently researched. Real-time reduction and false positive reduction are essential to promptly prevent external or internal intrusion attacks. In this study, we hypothesized that the type and frequency of attack events would influence the improvement of anomaly detection true positive rates and reduction of false positive rates. To validate this hypothesis, we utilized the 2015 login log dataset from the Los Alamos National Laboratory. Applying the preprocessed data to representative anomaly detection algorithms, we confirmed that using characteristics that simultaneously consider the type and frequency of attack events is highly effective in reducing false positives and execution time for anomaly detection.

Classification Performance Improvement of UNSW-NB15 Dataset Based on Feature Selection (특징선택 기법에 기반한 UNSW-NB15 데이터셋의 분류 성능 개선)

  • Lee, Dae-Bum;Seo, Jae-Hyun
    • Journal of the Korea Convergence Society
    • /
    • v.10 no.5
    • /
    • pp.35-42
    • /
    • 2019
  • Recently, as the Internet and various wearable devices have appeared, Internet technology has contributed to obtaining more convenient information and doing business. However, as the internet is used in various parts, the attack surface points that are exposed to attacks are increasing, Attempts to invade networks aimed at taking unfair advantage, such as cyber terrorism, are also increasing. In this paper, we propose a feature selection method to improve the classification performance of the class to classify the abnormal behavior in the network traffic. The UNSW-NB15 dataset has a rare class imbalance problem with relatively few instances compared to other classes, and an undersampling method is used to eliminate it. We use the SVM, k-NN, and decision tree algorithms and extract a subset of combinations with superior detection accuracy and RMSE through training and verification. The subset has recall values of more than 98% through the wrapper based experiments and the DT_PSO showed the best performance.

A DDoS Attack Detection Technique through CNN Model in Software Define Network (소프트웨어-정의 네트워크에서 CNN 모델을 이용한 DDoS 공격 탐지 기술)

  • Ko, Kwang-Man
    • The Journal of Korea Institute of Information, Electronics, and Communication Technology
    • /
    • v.13 no.6
    • /
    • pp.605-610
    • /
    • 2020
  • Software Defined Networking (SDN) is setting the standard for the management of networks due to its scalability, flexibility and functionality to program the network. The Distributed Denial of Service (DDoS) attack is most widely used to attack the SDN controller to bring down the network. Different methodologies have been utilized to detect DDoS attack previously. In this paper, first the dataset is obtained by Kaggle with 84 features, and then according to the rank, the 20 highest rank features are selected using Permutation Importance Algorithm. Then, the datasets are trained and tested with Convolution Neural Network (CNN) classifier model by utilizing deep learning techniques. Our proposed solution has achieved the best results, which will allow the critical systems which need more security to adopt and take full advantage of the SDN paradigm without compromising their security.

Real-time security Monitroing assessment model for cybersecurity vulnera bilities in network separation situations (망분리 네트워크 상황에서 사이버보안 취약점 실시간 보안관제 평가모델)

  • Lee, DongHwi;Kim, Hong-Ki
    • Convergence Security Journal
    • /
    • v.21 no.1
    • /
    • pp.45-53
    • /
    • 2021
  • When the security monitoring system is performed in a separation network, there is little normal anomaly detection in internal networks or high-risk sections. Therefore, after the establishment of the security network, a model is needed to evaluate state-of-the-art cyber threat anomalies for internal network in separation network to complete the optimized security structure. In this study, We evaluate it by generating datasets of cyber vulnerabilities and malicious code arising from general and separation networks, It prepare for the latest cyber vulnerabilities in internal network cyber attacks to analyze threats, and established a cyber security test evaluation system that fits the characteristics. The study designed an evaluation model that can be applied to actual separation network institutions, and constructed a test data set for each situation and applied a real-time security assessment model.