http://dx.doi.org/10.13089/JKIISC.2017.27.6.1385

Machine Learning Based Intrusion Detection Systems for Class Imbalanced Datasets  

Cheong, Yun-Gyung (Sungkyunkwan University)
Park, Kinam (Sungkyunkwan University)
Kim, Hyunjoo (Electronics and Telecommunications Research Institute)
Kim, Jonghyun (Electronics and Telecommunications Research Institute)
Hyun, Sangwon (Sungkyunkwan University)
Abstract
This paper develops an intrusion detection system (IDS) designed for class-imbalanced datasets. We first built a set of training datasets from the Kyoto 2006+ dataset in which the amounts of normal data and abnormal (intrusion) data are not balanced. We then ran a number of tests to evaluate how effectively machine learning techniques detect intrusions. The evaluation results showed that the Random Forest algorithm achieved the best performance.
Keywords
Intrusion Detection System; Machine Learning; Imbalanced Dataset
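Why class imbalance matters for evaluating an IDS can be illustrated with a small, self-contained sketch. Everything in it is an assumption made for illustration: the 99:1 split, the label names `normal` and `attack`, the helper `binary_metrics`, and the two hypothetical detectors. It does not reproduce the paper's data, metrics, or Random Forest setup. It only shows that plain accuracy can look excellent on imbalanced data while every intrusion is missed.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[str], y_pred: Sequence[str],
                   positive: str = "attack") -> Dict[str, float]:
    """Compute accuracy, precision, recall and F1 for one positive class."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1  # intrusion correctly flagged
            else:
                fp += 1  # false alarm on normal traffic
        else:
            if truth == positive:
                fn += 1  # missed intrusion
            else:
                tn += 1  # normal traffic correctly passed
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical imbalanced test set: 990 normal sessions, 10 attacks.
y_true = ["normal"] * 990 + ["attack"] * 10

# Baseline that never raises an alert.
always_normal = ["normal"] * 1000
baseline = binary_metrics(y_true, always_normal)

# Hypothetical detector: 5 false alarms, 8 of the 10 attacks caught.
detector = ["normal"] * 985 + ["attack"] * 5 + ["attack"] * 8 + ["normal"] * 2
scores = binary_metrics(y_true, detector)

print("baseline:", baseline)
print("detector:", scores)
```

In this toy setting the two accuracies differ by well under one percentage point, while recall and F1 on the attack class separate the two detectors clearly. That is one common reason to report per-class metrics when comparing classifiers on imbalanced intrusion data.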