http://dx.doi.org/10.15207/JKCS.2018.9.7.001

Feature Selection for Anomaly Detection Based on Genetic Algorithm  

Seo, Jae-Hyun (Division of Computer Science & Engineering, WonKwang University)
Publication Information
Journal of the Korea Convergence Society / v.9, no.7, 2018, pp. 1-7
Abstract
Feature selection, a data preprocessing technique, is a major research area in many applications that deal with large datasets. It has been used in pattern recognition, machine learning, and data mining, and is now widely applied in fields such as text classification, image retrieval, intrusion detection, and genome analysis. The proposed method is based on a genetic algorithm, a meta-heuristic algorithm. There are two approaches to finding feature subsets: the filter method and the wrapper method. In this study, we use the wrapper method, which evaluates each candidate feature subset with an actual classifier, to find an optimal feature subset. The training dataset used in the experiments has a severe class imbalance, which makes it difficult to improve classification performance for rare classes. After preprocessing the training dataset with SMOTE, we select features and evaluate them with various machine learning algorithms.
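The wrapper idea described in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the function names (`make_data`, `subset_accuracy`, `select_features`), the synthetic two-class data, the nearest-centroid classifier standing in for the "actual classifier", the GA operators (tournament selection, one-point crossover, bit-flip mutation, elitism), and the small per-feature penalty. The SMOTE preprocessing step is omitted.

```python
import random

def make_data(rng, n=200, n_features=8):
    """Synthetic two-class data (assumption): only features 0 and 1 carry signal."""
    X, y = [], []
    for i in range(n):
        label = i % 2
        row = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
        row[0] += 3.0 * label
        row[1] -= 3.0 * label
        X.append(row)
        y.append(label)
    return X, y

def subset_accuracy(mask, X_tr, y_tr, X_va, y_va):
    """Wrapper evaluation: fit a nearest-centroid classifier on the features
    switched on in mask and return its accuracy on the validation split."""
    cols = [j for j, bit in enumerate(mask) if bit]
    if not cols:
        return 0.0  # an empty subset cannot classify anything
    centroids = {}
    for c in set(y_tr):
        rows = [x for x, t in zip(X_tr, y_tr) if t == c]
        centroids[c] = [sum(r[j] for r in rows) / len(rows) for j in cols]

    def dist(x, c):
        return sum((x[j] - m) ** 2 for j, m in zip(cols, centroids[c]))

    hits = sum(min(centroids, key=lambda c: dist(x, c)) == t
               for x, t in zip(X_va, y_va))
    return hits / len(y_va)

def select_features(X_tr, y_tr, X_va, y_va, rng,
                    pop_size=20, generations=15, p_mut=0.1, penalty=0.01):
    """Genetic search over bit masks; fitness is the wrapper accuracy minus
    a small penalty per selected feature (the penalty is an assumption)."""
    n = len(X_tr[0])
    cache = {}

    def fitness(mask):
        key = tuple(mask)
        if key not in cache:
            acc = subset_accuracy(mask, X_tr, y_tr, X_va, y_va)
            cache[key] = acc - penalty * sum(mask)
        return cache[key]

    pop = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    best = max(pop, key=fitness)
    for _ in range(generations):
        children = [best[:]]  # elitism: carry the best mask forward unchanged
        while len(children) < pop_size:
            # tournament selection of two parents
            p1 = max(rng.sample(pop, 3), key=fitness)
            p2 = max(rng.sample(pop, 3), key=fitness)
            cut = rng.randrange(1, n)  # one-point crossover
            child = p1[:cut] + p2[cut:]
            # bit-flip mutation
            children.append([1 - b if rng.random() < p_mut else b for b in child])
        pop = children
        best = max(pop, key=fitness)
    return best, fitness(best)

rng = random.Random(0)
X, y = make_data(rng)
mask, score = select_features(X[:140], y[:140], X[140:], y[140:], rng)
print("selected feature indices:", [j for j, b in enumerate(mask) if b])
print("penalized validation accuracy:", round(score, 3))
```

Each chromosome is a bit mask over the features, so the classifier is retrained once per distinct candidate subset. That retraining cost is what distinguishes a wrapper from a filter method, which would score features without training a classifier.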
Keywords
Intrusion Detection; Machine Learning; Genetic Algorithm; Feature Selection; PCA
Citations & Related Records
Times Cited By KSCI: 3