• Title/Summary/Keyword: NSL-KDD

Search Result 27, Processing Time 0.03 seconds

A Study on comparison of KDD CUP 99 and NSL-KDD using artificial neural network (인공신경망을 통한 KDD CUP 99와 NSL-KDD 데이터 셋 비교)

  • Ji, Hyunjung;Kim, Yonghyun;Kim, Donghwa;Shin, Dongkyoo;Shin, Dongil
    • Proceedings of the Korea Information Processing Society Conference
    • /
    • 2017.04a
    • /
    • pp.211-213
    • /
    • 2017
  • 최근 컴퓨터 네트워크를 활용하는 다양한 기기들이 개발되고 급격히 확산되면서, 컴퓨터 네크워크는 전보다 많은 보안문제에 직면하게 되었다. 이에 따라 네트워크 보안을 위한 침입탐지시스템의 필요성이 대두된다. 침입탐지시스템을 구현하기 위한 대표적인 데이터 셋으로는 KDD CUP 99(KDD'99)와 이후 KDD'99의 문제점을 보완하여 공개된 NSL-KDD가 있다. 본 논문에서는 KDD'99와 NSL-KDD를 소개하고 인공신경망을 통해 두 데이터 셋을 비교 분석하였다. Multi-Layer Perceptron을 사용해 데이터 셋을 분석해본 결과, KDD'99는 전체 정확도에서 더 높은 결과를 얻은 반면 공격 별 탐지 정확도 면에서는 NSL-KDD에 뒤쳐졌다.

Linear profile monitoring with random covariate (설명변수가 랜덤인 성형 프로파일 연구)

  • Kim, Daeun;Lee, Sungim;Lim, Johan
    • The Korean Journal of Applied Statistics
    • /
    • v.35 no.3
    • /
    • pp.335-346
    • /
    • 2022
  • Profile control chart aims to detect a change in the functional relationship of multivariate characteristics in the statistical process control. In monitoring two variables, a linear profile is of interest composed of the intercept and slope of one variable (response variable) against the other (explanatory variable). The previous studies on monitoring of the linear profile mostly assume that the explanatory variables are the same for all profiles. However, there are also cases where they vary depending on profiles. This paper intends to extend the monitoring method to where explanatory variables are different for each profile. We compare the new method's performance through simulation and apply it to monitoring a network intrusion using NSL-KDD data.

Analyzing Key Variables in Network Attack Classification on NSL-KDD Dataset using SHAP (SHAP 기반 NSL-KDD 네트워크 공격 분류의 주요 변수 분석)

  • Sang-duk Lee;Dae-gyu Kim;Chang Soo Kim
    • Journal of the Society of Disaster Information
    • /
    • v.19 no.4
    • /
    • pp.924-935
    • /
    • 2023
  • Purpose: The central aim of this study is to leverage machine learning techniques for the classification of Intrusion Detection System (IDS) data, with a specific focus on identifying the variables responsible for enhancing overall performance. Method: First, we classified 'R2L(Remote to Local)' and 'U2R (User to Root)' attacks in the NSL-KDD dataset, which are difficult to detect due to class imbalance, using seven machine learning models, including Logistic Regression (LR) and K-Nearest Neighbor (KNN). Next, we use the SHapley Additive exPlanation (SHAP) for two classification models that showed high performance, Random Forest (RF) and Light Gradient-Boosting Machine (LGBM), to check the importance of variables that affect classification for each model. Result: In the case of RF, the 'service' variable and in the case of LGBM, the 'dst_host_srv_count' variable were confirmed to be the most important variables. These pivotal variables serve as key factors capable of enhancing performance in the context of classification for each respective model. Conclusion: In conclusion, this paper successfully identifies the optimal models, RF and LGBM, for classifying 'R2L' and 'U2R' attacks, while elucidating the crucial variables associated with each selected model.

Decision Tree Techniques with Feature Reduction for Network Anomaly Detection (네트워크 비정상 탐지를 위한 속성 축소를 반영한 의사결정나무 기술)

  • Kang, Koohong
    • Journal of the Korea Institute of Information Security & Cryptology
    • /
    • v.29 no.4
    • /
    • pp.795-805
    • /
    • 2019
  • Recently, there is a growing interest in network anomaly detection technology to tackle unknown attacks. For this purpose, diverse studies using data mining, machine learning, and deep learning have been applied to detect network anomalies. In this paper, we evaluate the decision tree to see its feasibility for network anomaly detection on NSL-KDD data set, which is one of the most popular data mining techniques for classification. In order to handle the over-fitting problem of decision tree, we select 13 features from the original 41 features of the data set using chi-square test, and then model the decision tree using TensorFlow and Scik-Learn, yielding 84% and 70% of binary classification accuracies on the KDDTest+ and KDDTest-21 of NSL-KDD test data set. This result shows 3% and 6% improvements compared to the previous 81% and 64% of binary classification accuracies by decision tree technologies, respectively.

An Intrusion Detection System based on the Artificial Neural Network for Real Time Detection (실시간 탐지를 위한 인공신경망 기반의 네트워크 침입탐지 시스템)

  • Kim, Tae Hee;Kang, Seung Ho
    • Convergence Security Journal
    • /
    • v.17 no.1
    • /
    • pp.31-38
    • /
    • 2017
  • As the cyber-attacks through the networks advance, it is difficult for the intrusion detection system based on the simple rules to detect the novel type of attacks such as Advanced Persistent Threat(APT) attack. At present, many types of research have been focused on the application of machine learning techniques to the intrusion detection system in order to detect previously unknown attacks. In the case of using the machine learning techniques, the performance of the intrusion detection system largely depends on the feature set which is used as an input to the system. Generally, more features increase the accuracy of the intrusion detection system whereas they cause a problem when fast responses are required owing to their large elapsed time. In this paper, we present a network intrusion detection system based on artificial neural network, which adopts a multi-objective genetic algorithm to satisfy the both requirements: accuracy, and fast response. The comparison between the proposing approach and previously proposed other approaches is conducted against NSL_KDD data set for the evaluation of the performance of the proposing approach.

A Feature Set Selection Approach Based on Pearson Correlation Coefficient for Real Time Attack Detection (실시간 공격 탐지를 위한 Pearson 상관계수 기반 특징 집합 선택 방법)

  • Kang, Seung-Ho;Jeong, In-Seon;Lim, Hyeong-Seok
    • Convergence Security Journal
    • /
    • v.18 no.5_1
    • /
    • pp.59-66
    • /
    • 2018
  • The performance of a network intrusion detection system using the machine learning method depends heavily on the composition and the size of the feature set. The detection accuracy, such as the detection rate or the false positive rate, of the system relies on the feature composition. And the time it takes to train and detect depends on the size of the feature set. Therefore, in order to enable the system to detect intrusions in real-time, the feature set to beused should have a small size as well as an appropriate composition. In this paper, we show that the size of the feature set can be further reduced without decreasing the detection rate through using Pearson correlation coefficient between features along with the multi-objective genetic algorithm which was used to shorten the size of the feature set in previous work. For the evaluation of the proposed method, the experiments to classify 10 kinds of attacks and benign traffic are performed against NSL_KDD data set.

  • PDF

Network Intrusion Detection with One Class Anomaly Detection Model based on Auto Encoder. (오토 인코더 기반의 단일 클래스 이상 탐지 모델을 통한 네트워크 침입 탐지)

  • Min, Byeoungjun;Yoo, Jihoon;Kim, Sangsoo;Shin, Dongil;Shin, Dongkyoo
    • Journal of Internet Computing and Services
    • /
    • v.22 no.1
    • /
    • pp.13-22
    • /
    • 2021
  • Recently network based attack technologies are rapidly advanced and intelligent, the limitations of existing signature-based intrusion detection systems are becoming clear. The reason is that signature-based detection methods lack generalization capabilities for new attacks such as APT attacks. To solve these problems, research on machine learning-based intrusion detection systems is being actively conducted. However, in the actual network environment, attack samples are collected very little compared to normal samples, resulting in class imbalance problems. When a supervised learning-based anomaly detection model is trained with such data, the result is biased to the normal sample. In this paper, we propose to overcome this imbalance problem through One-Class Anomaly Detection using an auto encoder. The experiment was conducted through the NSL-KDD data set and compares the performance with the supervised learning models for the performance evaluation of the proposed method.

Intrusion Detection System Modeling Based on Learning from Network Traffic Data

  • Midzic, Admir;Avdagic, Zikrija;Omanovic, Samir
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.12 no.11
    • /
    • pp.5568-5587
    • /
    • 2018
  • This research uses artificial intelligence methods for computer network intrusion detection system modeling. Primary classification is done using self-organized maps (SOM) in two levels, while the secondary classification of ambiguous data is done using Sugeno type Fuzzy Inference System (FIS). FIS is created by using Adaptive Neuro-Fuzzy Inference System (ANFIS). The main challenge for this system was to successfully detect attacks that are either unknown or that are represented by very small percentage of samples in training dataset. Improved algorithm for SOMs in second layer and for the FIS creation is developed for this purpose. Number of clusters in the second SOM layer is optimized by using our improved algorithm to minimize amount of ambiguous data forwarded to FIS. FIS is created using ANFIS that was built on ambiguous training dataset clustered by another SOM (which size is determined dynamically). Proposed hybrid model is created and tested using NSL KDD dataset. For our research, NSL KDD is especially interesting in terms of class distribution (overlapping). Objectives of this research were: to successfully detect intrusions represented in data with small percentage of the total traffic during early detection stages, to successfully deal with overlapping data (separate ambiguous data), to maximize detection rate (DR) and minimize false alarm rate (FAR). Proposed hybrid model with test data achieved acceptable DR value 0.8883 and FAR value 0.2415. The objectives were successfully achieved as it is presented (compared with the similar researches on NSL KDD dataset). Proposed model can be used not only in further research related to this domain, but also in other research areas.

Comparative Analysis of Machine Learning Techniques for IoT Anomaly Detection Using the NSL-KDD Dataset

  • Zaryn, Good;Waleed, Farag;Xin-Wen, Wu;Soundararajan, Ezekiel;Maria, Balega;Franklin, May;Alicia, Deak
    • International Journal of Computer Science & Network Security
    • /
    • v.23 no.1
    • /
    • pp.46-52
    • /
    • 2023
  • With billions of IoT (Internet of Things) devices populating various emerging applications across the world, detecting anomalies on these devices has become incredibly important. Advanced Intrusion Detection Systems (IDS) are trained to detect abnormal network traffic, and Machine Learning (ML) algorithms are used to create detection models. In this paper, the NSL-KDD dataset was adopted to comparatively study the performance and efficiency of IoT anomaly detection models. The dataset was developed for various research purposes and is especially useful for anomaly detection. This data was used with typical machine learning algorithms including eXtreme Gradient Boosting (XGBoost), Support Vector Machines (SVM), and Deep Convolutional Neural Networks (DCNN) to identify and classify any anomalies present within the IoT applications. Our research results show that the XGBoost algorithm outperformed both the SVM and DCNN algorithms achieving the highest accuracy. In our research, each algorithm was assessed based on accuracy, precision, recall, and F1 score. Furthermore, we obtained interesting results on the execution time taken for each algorithm when running the anomaly detection. Precisely, the XGBoost algorithm was 425.53% faster when compared to the SVM algorithm and 2,075.49% faster than the DCNN algorithm. According to our experimental testing, XGBoost is the most accurate and efficient method.

A Method to Find Feature Set for Detecting Various Denial Service Attacks in Power Grid (전력망에서의 다양한 서비스 거부 공격 탐지 위한 특징 선택 방법)

  • Lee, DongHwi;Kim, Young-Dae;Park, Woo-Bin;Kim, Joon-Seok;Kang, Seung-Ho
    • KEPCO Journal on Electric Power and Energy
    • /
    • v.2 no.2
    • /
    • pp.311-316
    • /
    • 2016
  • Network intrusion detection system based on machine learning method such as artificial neural network is quite dependent on the selected features in terms of accuracy and efficiency. Nevertheless, choosing the optimal combination of features, which guarantees accuracy and efficienty, from generally used many features to detect network intrusion requires extensive computing resources. In this paper, we deal with a optimal feature selection problem to determine 6 denial service attacks and normal usage provided by NSL-KDD data. We propose a optimal feature selection algorithm. Proposed algorithm is based on the multi-start local search algorithm, one of representative meta-heuristic algorithm for solving optimization problem. In order to evaluate the performance of our proposed algorithm, comparison with a case of all 41 features used against NSL-KDD data is conducted. In addtion, comparisons between 3 well-known machine learning methods (multi-layer perceptron., Bayes classifier, and Support vector machine) are performed to find a machine learning method which shows the best performance combined with the proposed feature selection method.