• Title/Summary/Keyword: hybrid feature selection


Improved Network Intrusion Detection Model through Hybrid Feature Selection and Data Balancing (Hybrid Feature Selection과 Data Balancing을 통한 효율적인 네트워크 침입 탐지 모델)

  • Min, Byeongjun;Ryu, Jihun;Shin, Dongkyoo;Shin, Dongil
    • KIPS Transactions on Software and Data Engineering / v.10 no.2 / pp.65-72 / 2021
  • Attacks on network environments have recently become rapidly more sophisticated and intelligent, and the limitations of signature-based network intrusion detection systems are becoming clear. To address this, machine learning-based intrusion detection systems are being actively researched, but applying machine learning to intrusion detection raises two problems. The first is finding the important features relevant to learning for real-time detection, and the second is the imbalance of the data used for training. This problem is critical because the performance of machine learning algorithms is data-dependent. In this paper, we propose HFS-DNN, a network intrusion detection model based on a deep neural network, to solve these problems. The proposed HFS-DNN was trained on the NSL-KDD data set and compared against existing classification models. Experiments confirmed that the proposed Hybrid Feature Selection algorithm does not degrade performance, and among the models trained with the imbalance problem addressed, the proposed model showed the best performance.

Hybrid Genetic Algorithms for Feature Selection and Classification Performance Comparisons (특징 선택을 위한 혼합형 유전 알고리즘과 분류 성능 비교)

  • 오일석;이진선;문병로
    • Journal of KIISE:Software and Applications / v.31 no.8 / pp.1113-1120 / 2004
  • This paper proposes a novel hybrid genetic algorithm for feature selection. Local search operations are devised and embedded in hybrid GAs to fine-tune the search. The operations are parameterized in terms of their fine-tuning power, and their effectiveness and timing requirements are analyzed and compared. Experiments performed on various standard datasets revealed that the proposed hybrid GA is superior to a simple GA and to sequential search algorithms.
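
The idea of embedding a local search step inside a GA can be sketched as follows. This is only an illustration under stated assumptions, not the authors' algorithm: the local search here is a plain single-bit hill climb rather than the parameterized operations described in the abstract, and `hybrid_ga`, `local_search`, `toy_fitness`, and all parameter values are hypothetical names chosen for this example.

```python
import random


def local_search(mask, fitness):
    """Hill-climb: keep flipping single bits while a flip improves fitness."""
    mask = mask[:]
    current = fitness(mask)
    improved = True
    while improved:
        improved = False
        for i in range(len(mask)):
            mask[i] ^= 1                      # try adding/removing feature i
            candidate = fitness(mask)
            if candidate > current:
                current, improved = candidate, True
            else:
                mask[i] ^= 1                  # no gain: undo the flip
    return mask


def hybrid_ga(n_features, fitness, pop_size=10, generations=5,
              mutation_rate=0.1, seed=0):
    """Simple GA over 0/1 feature masks; every child is refined by local search."""
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(n_features)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        parents = population[: pop_size // 2]          # truncation selection
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_features)          # one-point crossover
            child = a[:cut] + b[cut:]
            child = [bit ^ (rng.random() < mutation_rate) for bit in child]
            children.append(local_search(child, fitness))  # the "hybrid" step
        population = parents + children
    return max(population, key=fitness)


# Hypothetical toy objective: features 0, 3 and 5 are "relevant";
# each selected feature costs 1, each relevant one is worth 10.
RELEVANT = {0, 3, 5}


def toy_fitness(mask):
    hits = sum(1 for i, bit in enumerate(mask) if bit and i in RELEVANT)
    return 10 * hits - sum(mask)


best = hybrid_ga(8, toy_fitness)
```

In practice the fitness function would be a classifier's validation accuracy on the selected features, which is far more expensive than this toy objective, so the cost of each local search pass matters.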

A Hybrid Efficient Feature Selection Model for High Dimensional Data Set based on KNHNAES (2013~2015) (KNHNAES (2013~2015) 에 기반한 대형 특징 공간 데이터집 혼합형 효율적인 특징 선택 모델)

  • Kwon, Tae il;Li, Dingkun;Park, Hyun Woo;Ryu, Kwang Sun;Kim, Eui Tak;Piao, Minghao
    • Journal of Digital Contents Society / v.19 no.4 / pp.739-747 / 2018
  • With large feature spaces, feature selection has become an extremely important step in the data mining process, but traditional feature selection methods that use a single process may no longer be adequate. In this paper, we propose a hybrid efficient feature selection model for high-dimensional data. We applied the model to the KNHNAES data set, and the results show that it outperforms many existing methods in accuracy by at least 5%.

Hybrid Feature Selection Method Based on a Naïve Bayes Algorithm that Enhances the Learning Speed while Maintaining a Similar Error Rate in Cyber ISR

  • Shin, GyeongIl;Yooun, Hosang;Shin, DongIl;Shin, DongKyoo
    • KSII Transactions on Internet and Information Systems (TIIS) / v.12 no.12 / pp.5685-5700 / 2018
  • Cyber intelligence, surveillance, and reconnaissance (ISR) has become more important than traditional military ISR. An agent used in cyber ISR resides in an enemy's networks and continually collects valuable information. Thus, this agent should be able to determine what is, and is not, useful in a short amount of time. Moreover, the agent should maintain a classification rate that is high enough to select useful data from the enemy's network. Traditional feature selection algorithms cannot comply with these requirements. Consequently, in this paper, we propose an effective hybrid feature selection method derived from the filter and wrapper methods. We illustrate the design of the proposed model and the experimental results of the performance comparison between the proposed model and the existing model.
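
A hybrid of filter and wrapper methods typically uses a cheap filter to shrink the candidate set before running the more expensive wrapper search. The sketch below illustrates that general pattern only; it is not the method from the paper. The filter criterion (absolute Pearson correlation), the nearest-centroid classifier standing in for Naive Bayes, the resubstitution accuracy, and the toy data are all assumptions made for this example.

```python
import math


def pearson_abs(xs, ys):
    """Absolute Pearson correlation between one feature column and the labels."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return abs(cov / (sx * sy)) if sx and sy else 0.0


def filter_stage(X, y, keep):
    """Filter: rank features by correlation with the label, keep the top ones."""
    scores = [pearson_abs([row[j] for row in X], y) for j in range(len(X[0]))]
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return ranked[:keep]


def centroid_accuracy(X, y, features):
    """Training-set accuracy of a nearest-centroid classifier on `features`."""
    centroids = {}
    for label in set(y):
        rows = [row for row, t in zip(X, y) if t == label]
        centroids[label] = [sum(r[j] for r in rows) / len(rows) for j in features]
    correct = 0
    for row, t in zip(X, y):
        point = [row[j] for j in features]
        pred = min(centroids, key=lambda c: math.dist(point, centroids[c]))
        correct += pred == t
    return correct / len(y)


def wrapper_stage(X, y, candidates):
    """Wrapper: greedy forward selection over the filtered candidates."""
    selected, best = [], 0.0
    improved = True
    while improved:
        improved = False
        for j in candidates:
            if j in selected:
                continue
            acc = centroid_accuracy(X, y, selected + [j])
            if acc > best:                    # remember the best addition so far
                best, choice, improved = acc, j, True
        if improved:
            selected.append(choice)
    return selected, best


# Hypothetical toy data: feature 0 is informative, 1 is noise,
# 2 is weakly informative, 3 is constant.
X = [[0.1, 5, 1.0, 7], [0.2, 3, 1.2, 7], [0.0, 4, 0.9, 7], [0.3, 6, 1.4, 7],
     [1.0, 5, 1.3, 7], [0.9, 3, 1.1, 7], [1.1, 4, 1.6, 7], [0.8, 6, 1.5, 7]]
y = [0, 0, 0, 0, 1, 1, 1, 1]

candidates = filter_stage(X, y, keep=2)
selected, accuracy = wrapper_stage(X, y, candidates)
```

The wrapper only ever trains the classifier on the features that survived the filter, which is where the speed-up over a pure wrapper search comes from.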

Hybrid Feature Selection Method Based on Genetic Algorithm for the Diagnosis of Coronary Heart Disease

  • Wiharto, Wiharto;Suryani, Esti;Setyawan, Sigit;Putra, Bintang PE
    • Journal of information and communication convergence engineering / v.20 no.1 / pp.31-40 / 2022
  • Coronary heart disease (CHD) is a comorbidity of COVID-19; therefore, routine early diagnosis is crucial. A large number of examination attributes in the context of diagnosing CHD is a distinct obstacle during the pandemic when the number of health service users is significant. The development of a precise machine learning model for diagnosis with a minimum number of examination attributes can allow examinations and healthcare actions to be undertaken quickly. This study proposes a CHD diagnosis model based on feature selection, data balancing, and ensemble-based classification methods. In the feature selection stage, a hybrid SVM-GA combined with fast correlation-based filter (FCBF) is used. The proposed system achieved an accuracy of 94.60% and area under the curve (AUC) of 97.5% when tested on the z-Alizadeh Sani dataset and used only 8 of 54 inspection attributes. In terms of performance, the proposed model can be placed in the very good category.

Hybrid Case-based Reasoning and Genetic Algorithms Approach for Customer Classification

  • Kim Kyoung-jae;Ahn Hyunchul
    • Journal of information and communication convergence engineering / v.3 no.4 / pp.209-212 / 2005
  • This study proposes a hybrid case-based reasoning and genetic algorithm model for customer classification. The vertical and horizontal dimensions of the research data are reduced through an integrated feature and instance selection process using genetic algorithms. We applied the proposed model to a customer classification model that uses customers' demographic characteristics as inputs to predict their buying behavior for a specific product. Experimental results show that the proposed model may improve classification accuracy and outperform various optimization models of a typical CBR system.

Network intrusion detection Model through Hybrid Feature Selection and Data Balancing (Hybrid Feature Selection과 Data Balancing을 통한 네트워크 침입 탐지 모델)

  • Min, Byeongjun;Shin, Dongkyoo;Shin, Dongil
    • Proceedings of the Korea Information Processing Society Conference / 2020.05a / pp.526-529 / 2020
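  • Attacks on network environments have recently become rapidly more sophisticated and intelligent, so the limitations of existing signature-based intrusion detection systems are becoming clear. To address this, machine learning-based intrusion detection systems are being actively researched, but applying machine learning to intrusion detection raises two problems. The first is selecting the important features relevant to learning for real-time detection. The second is the imbalance of the data used for training, which is critical because machine learning algorithms are data-dependent. To solve these problems, this paper proposes a deep neural network-based network intrusion detection model using Hybrid Feature Selection and Data Balancing. The model was trained on the NSL-KDD data set and evaluated using Accuracy, Precision, Recall, and F1 Score. Compared with Random Forest and a basic deep neural network model, the proposed model improved performance by 7~9% in terms of F1 Score.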
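
The four evaluation metrics named in this abstract can be computed from the confusion-matrix counts of a binary classifier. The sketch below uses the standard definitions; the function name `classification_metrics` and the example labels are hypothetical, and nothing here reproduces the paper's experiments.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall and F1 for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical imbalanced case: 95 normal records (0), 5 attacks (1),
# and a degenerate model that always predicts "normal".
y_true = [0] * 95 + [1] * 5
always_normal = [0] * 100
imbalanced = classification_metrics(y_true, always_normal)
```

On this imbalanced example the degenerate model reaches an accuracy of 0.95 while its F1 Score is 0, which is one reason F1 is a common basis for comparison when the training data is imbalanced.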

Optimal k-Nearest Neighborhood Classifier Using Genetic Algorithm (유전알고리즘을 이용한 최적 k-최근접이웃 분류기)

  • Park, Chong-Sun;Huh, Kyun
    • Communications for Statistical Applications and Methods / v.17 no.1 / pp.17-27 / 2010
  • Feature selection and feature weighting are useful techniques for improving the classification accuracy of the k-Nearest Neighbor (k-NN) classifier. The main purpose of feature selection and feature weighting is to reduce the number of features by eliminating irrelevant and redundant features while maintaining or enhancing classification accuracy. In this paper, a novel hybrid approach based on a Genetic Algorithm is proposed for simultaneous feature selection, feature weighting, and choice of k in the k-NN classifier. The results indicate that the proposed algorithm is comparable with or superior to existing classifiers with or without feature selection and feature weighting capability.

Relevancy contemplation in medical data analytics and ranking of feature selection algorithms

  • P. Antony Seba;J. V. Bibal Benifa
    • ETRI Journal / v.45 no.3 / pp.448-461 / 2023
  • This article performs a detailed data scrutiny on a chronic kidney disease (CKD) dataset to select efficient instances and relevant features. Data relevancy is investigated using feature extraction, hybrid outlier detection, and handling of missing values. Data instances that do not influence the target are removed using data envelopment analysis to enable reduction of rows. Column reduction is achieved by ranking the attributes through feature selection methodologies, namely, extra-trees classifier, recursive feature elimination, chi-squared test, analysis of variance, and mutual information. These methodologies are ranked via Technique for Order of Preference by Similarity to Ideal Solution (TOPSIS) using weight optimization to identify the optimal features for model building from the CKD dataset to facilitate better prediction while diagnosing the severity of the disease. An efficient hybrid ensemble and novel similarity-based classifiers are built using the pruned dataset, and the results are thereafter compared with random forest, AdaBoost, naive Bayes, k-nearest neighbors, and support vector machines. The hybrid ensemble classifier yields a better prediction accuracy of 98.31% for the features selected by extra tree classifier (ETC), which is ranked as the best by TOPSIS.
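
TOPSIS ranks alternatives by their closeness to an ideal solution across several criteria. The sketch below shows the standard procedure with vector normalization and fixed weights; the weight optimization mentioned in the abstract is not reproduced. The selector names, criteria, and all numbers are hypothetical placeholders, not results from the article.

```python
import math


def topsis(matrix, weights, benefit):
    """Closeness score in [0, 1] for each row (alternative) of `matrix`.

    weights[j] is the weight of criterion j; benefit[j] is True when a
    larger value of criterion j is better, False when smaller is better.
    """
    n_crit = len(weights)
    # Vector-normalize each criterion column, then apply the weights.
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) or 1.0
             for j in range(n_crit)]
    v = [[weights[j] * row[j] / norms[j] for j in range(n_crit)]
         for row in matrix]
    columns = list(zip(*v))
    best = [max(col) if benefit[j] else min(col)
            for j, col in enumerate(columns)]
    worst = [min(col) if benefit[j] else max(col)
             for j, col in enumerate(columns)]
    scores = []
    for row in v:
        d_best = math.dist(row, best)     # distance to the ideal solution
        d_worst = math.dist(row, worst)   # distance to the anti-ideal solution
        total = d_best + d_worst
        scores.append(d_worst / total if total else 0.0)
    return scores


# Hypothetical comparison of three feature selectors on
# (accuracy, F1, runtime in seconds); runtime is a cost criterion.
names = ["selector_a", "selector_b", "selector_c"]
matrix = [[0.9, 0.9, 1.0],
          [0.8, 0.8, 2.0],
          [0.7, 0.7, 3.0]]
scores = topsis(matrix, weights=[0.4, 0.4, 0.2], benefit=[True, True, False])
ranking = [name for _, name in sorted(zip(scores, names), reverse=True)]
```

An alternative that is best on every criterion gets a score of 1 and one that is worst on every criterion gets 0, so the scores can be sorted directly to produce the ranking.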

A New Confidence Measure for Eye Detection Using Pixel Selection (눈 검출에서의 픽셀 선택을 이용한 신뢰 척도)

  • Lee, Yonggeol;Choi, Sang-Il
    • KIPS Transactions on Software and Data Engineering / v.4 no.7 / pp.291-296 / 2015
  • In this paper, we propose a new confidence measure using pixel selection for eye detection and design a hybrid eye detector. To do so, we produce sub-images by applying a pixel selection method to the eye patches and construct the BDA (Biased Discriminant Analysis) feature space for measuring the confidence of the eye detection results. For the hybrid eye detector, we select HFED (Haar-like Feature based Eye Detector) and MFED (MCT Feature based Eye Detector), which are complementary to each other, as basic detectors. For a given image, each basic detector conducts eye detection, and the confidence of each result is estimated in the BDA feature space by calculating the distances between the produced eye patches and the mean of the positive samples in the training set. The result with the higher confidence is then adopted as the final eye detection result and is used in the face alignment process for face recognition. Experimental results on various face databases show that the proposed method performs more accurate eye detection and consequently yields better face recognition performance than other methods.