http://dx.doi.org/10.15207/JKCS.2019.10.5.035

Classification Performance Improvement of UNSW-NB15 Dataset Based on Feature Selection  

Lee, Dae-Bum (Mokwon University)
Seo, Jae-Hyun (Division of Computer Science & Engineering, Wonkwang University)
Publication Information
Journal of the Korea Convergence Society / v.10, no.5, 2019, pp. 35-42
Abstract
With the spread of the Internet and of various wearable devices, Internet technology has made it more convenient to obtain information and conduct business. However, as the Internet is used in more areas, the attack surface exposed to attackers is growing, and attempts to intrude into networks for unfair advantage, such as cyber terrorism, are also increasing. In this paper, we propose a feature selection method that improves the performance of classifying abnormal behavior in network traffic. The UNSW-NB15 dataset has a rare-class imbalance problem, in which some classes have relatively few instances compared to the others, and an undersampling method is used to address it. We use the SVM, k-NN, and decision tree algorithms and, through training and validation, extract feature subsets with superior detection accuracy and RMSE. In the wrapper-based experiments, the selected subsets achieved recall values of more than 98%, and DT_PSO showed the best performance.
Keywords
IDS; Feature selection; Data preprocessing; Rare class; Machine learning
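The pipeline summarized in the abstract (undersample the training data, then score feature subsets with a wrapper around a classifier using recall and RMSE) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration, not the authors' implementation: the toy two-feature data, the function names, balancing every class down to the size of the rarest one, a 1-nearest-neighbor classifier standing in for the k-NN, SVM, and decision tree learners, and an exhaustive subset search in place of whatever search strategy the paper uses.

```python
import random
from collections import defaultdict
from itertools import combinations
from math import sqrt


def undersample(rows, labels, seed=0):
    """Randomly keep, per class, as many rows as the rarest class has."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for row, label in zip(rows, labels):
        by_class[label].append(row)
    target = min(len(group) for group in by_class.values())
    out_rows, out_labels = [], []
    for label in sorted(by_class):
        for row in rng.sample(by_class[label], target):
            out_rows.append(row)
            out_labels.append(label)
    return out_rows, out_labels


def recall(y_true, y_pred, positive=1):
    """TP / (TP + FN) for the positive (attack) class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    return tp / (tp + fn) if tp + fn else 0.0


def rmse(y_true, y_pred):
    """Root mean squared error between 0/1 labels and 0/1 predictions."""
    return sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def predict_1nn(train_rows, train_labels, row, features):
    """Label of the nearest training row, using only the selected feature indices."""
    def dist(other):
        return sum((row[i] - other[i]) ** 2 for i in features)
    nearest = min(range(len(train_rows)), key=lambda j: dist(train_rows[j]))
    return train_labels[nearest]


def wrapper_select(train, val, n_features):
    """Score every non-empty feature subset on the validation split.

    Subsets are ranked by higher recall, then lower RMSE, then fewer features.
    """
    (tr_rows, tr_labels), (va_rows, va_labels) = train, val
    best = None
    for k in range(1, n_features + 1):
        for subset in combinations(range(n_features), k):
            preds = [predict_1nn(tr_rows, tr_labels, r, subset) for r in va_rows]
            score = (recall(va_labels, preds), -rmse(va_labels, preds), -k)
            if best is None or score > best[0]:
                best = (score, subset)
    return best[1]


# Hypothetical toy data: feature 0 separates the classes, feature 1 is noise.
rows = [[0.0, 5.0], [0.1, 9.0], [0.2, 1.0], [0.1, 7.0], [0.0, 3.0], [0.2, 8.0],
        [1.0, 5.0], [0.9, 1.0]]
labels = [0, 0, 0, 0, 0, 0, 1, 1]  # 0 = normal (majority), 1 = attack (rare)
val_rows = [[0.05, 1.0], [0.95, 9.0], [0.15, 5.0], [1.05, 3.0]]
val_labels = [0, 1, 0, 1]

balanced_rows, balanced_labels = undersample(rows, labels)
selected = wrapper_select((balanced_rows, balanced_labels),
                          (val_rows, val_labels), n_features=2)
print("class counts after undersampling:",
      {c: balanced_labels.count(c) for c in set(balanced_labels)})
print("selected feature indices:", selected)
```

Only the training split is undersampled here; the validation split keeps its original rows so that recall and RMSE are measured on data the balancing step did not touch. Exhaustive search is feasible only for a handful of features, which is why wrapper methods on larger feature sets typically rely on a heuristic search instead.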