• Title/Summary/Keyword: Cyber Attack Datasets

Search Result 6, Processing Time 0.021 seconds

Clasification of Cyber Attack Group using Scikit Learn and Cyber Treat Datasets (싸이킷런과 사이버위협 데이터셋을 이용한 사이버 공격 그룹의 분류)

  • Kim, Kyungshin;Lee, Hojun;Kim, Sunghee;Kim, Byungik;Na, Wonshik;Kim, Donguk;Lee, Jeongwhan
    • Journal of Convergence for Information Technology
    • /
    • v.8 no.6
    • /
    • pp.165-171
    • /
    • 2018
  • The most threatening attack that has become a hot topic of recent IT security is APT Attack.. So far, there is no way to respond to APT attacks except by using artificial intelligence techniques. Here, we have implemented a machine learning algorithm for analyzing cyber threat data using machine learning method, using a data set that collects cyber attack cases using Scikit Learn, a big data machine learning framework. The result showed an attack classification accuracy close to 70%. This result can be developed into the algorithm of the security control system in the future.

Optimization of Cyber-Attack Detection Using the Deep Learning Network

  • Duong, Lai Van
    • International Journal of Computer Science & Network Security
    • /
    • v.21 no.7
    • /
    • pp.159-168
    • /
    • 2021
  • Detecting cyber-attacks using machine learning or deep learning is being studied and applied widely in network intrusion detection systems. We noticed that the application of deep learning algorithms yielded many good results. However, because each deep learning model has different architecture and characteristics with certain advantages and disadvantages, so those deep learning models are only suitable for specific datasets or features. In this paper, in order to optimize the process of detecting cyber-attacks, we propose the idea of building a new deep learning network model based on the association and combination of individual deep learning models. In particular, based on the architecture of 2 deep learning models: Convolutional Neural Network (CNN) and Long Short Term Memory (LSTM), we combine them into a combined deep learning network for detecting cyber-attacks based on network traffic. The experimental results in Section IV.D have demonstrated that our proposal using the CNN-LSTM deep learning model for detecting cyber-attacks based on network traffic is completely correct because the results of this model are much better than some individual deep learning models on all measures.

Attack Detection and Classification Method Using PCA and LightGBM in MQTT-based IoT Environment (MQTT 기반 IoT 환경에서의 PCA와 LightGBM을 이용한 공격 탐지 및 분류 방안)

  • Lee Ji Gu;Lee Soo Jin;Kim Young Won
    • Convergence Security Journal
    • /
    • v.22 no.4
    • /
    • pp.17-24
    • /
    • 2022
  • Recently, machine learning-based cyber attack detection and classification research has been actively conducted, achieving a high level of detection accuracy. However, low-spec IoT devices and large-scale network traffic make it difficult to apply machine learning-based detection models in IoT environment. Therefore, In this paper, we propose an efficient IoT attack detection and classification method through PCA(Principal Component Analysis) and LightGBM(Light Gradient Boosting Model) using datasets collected in a MQTT(Message Queuing Telementry Transport) IoT protocol environment that is also used in the defense field. As a result of the experiment, even though the original dataset was reduced to about 15%, the performance was almost similar to that of the original. It also showed the best performance in comparative evaluation with the four dimensional reduction techniques selected in this paper.

Feature Selection Algorithm for Intrusions Detection System using Sequential Forward Search and Random Forest Classifier

  • Lee, Jinlee;Park, Dooho;Lee, Changhoon
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.11 no.10
    • /
    • pp.5132-5148
    • /
    • 2017
  • Cyber attacks are evolving commensurate with recent developments in information security technology. Intrusion detection systems collect various types of data from computers and networks to detect security threats and analyze the attack information. The large amount of data examined make the large number of computations and low detection rates problematic. Feature selection is expected to improve the classification performance and provide faster and more cost-effective results. Despite the various feature selection studies conducted for intrusion detection systems, it is difficult to automate feature selection because it is based on the knowledge of security experts. This paper proposes a feature selection technique to overcome the performance problems of intrusion detection systems. Focusing on feature selection, the first phase of the proposed system aims at constructing a feature subset using a sequential forward floating search (SFFS) to downsize the dimension of the variables. The second phase constructs a classification model with the selected feature subset using a random forest classifier (RFC) and evaluates the classification accuracy. Experiments were conducted with the NSL-KDD dataset using SFFS-RF, and the results indicated that feature selection techniques are a necessary preprocessing step to improve the overall system performance in systems that handle large datasets. They also verified that SFFS-RF could be used for data classification. In conclusion, SFFS-RF could be the key to improving the classification model performance in machine learning.

Classification of Malware Families Using Hybrid Datasets (하이브리드 데이터셋을 이용한 악성코드 패밀리 분류)

  • Seo-Woo Choi;Myeong-Jin Han;Yeon-Ji Lee;Il-Gu Lee
    • Journal of the Korea Institute of Information Security & Cryptology
    • /
    • v.33 no.6
    • /
    • pp.1067-1076
    • /
    • 2023
  • Recently, as variant malware has increased, the scale of cyber hacking incidents is expanding. To respond to intelligent cyberhacking attack, machine learning-based research is actively underway to effectively classify malware families. However, existing classification models have problems where performance deteriorates when the dataset is obfuscated or sparse. In this paper, we propose a hybrid dataset that combines features extracted from ASM files and BYTES files, and evaluate classification performance using FNN. As a result of the experiment, the proposed method showed performance improvement of about 4% compared to a single dataset, and in particular, performance improvement of about 30% for rare families.

Abnormal Data Augmentation Method Using Perturbation Based on Hypersphere for Semi-Supervised Anomaly Detection (준 지도 이상 탐지 기법의 성능 향상을 위한 섭동을 활용한 초구 기반 비정상 데이터 증강 기법)

  • Jung, Byeonggil;Kwon, Junhyung;Min, Dongjun;Lee, Sangkyun
    • Journal of the Korea Institute of Information Security & Cryptology
    • /
    • v.32 no.4
    • /
    • pp.647-660
    • /
    • 2022
  • Recent works demonstrate that the semi-supervised anomaly detection method functions quite well in the environment with normal data and some anomalous data. However, abnormal data shortages can occur in an environment where it is difficult to reserve anomalous data, such as an unknown attack in the cyber security fields. In this paper, we propose ADA-PH(Abnormal Data Augmentation Method using Perturbation based on Hypersphere), a novel anomalous data augmentation method that is applicable in an environment where abnormal data is insufficient to secure the performance of the semi-supervised anomaly detection method. ADA-PH generates abnormal data by perturbing samples located relatively far from the center of the hypersphere. With the network intrusion detection datasets where abnormal data is rare, ADA-PH shows 23.63% higher AUC performance than anomaly detection without data augmentation and even performs better than the other augmentation methods. Also, we further conduct quantitative and qualitative analysis on whether generated abnormal data is anomalous.