  • Title/Summary/Keyword: Feature selection algorithm

Search Result 345, Processing Time 0.026 seconds

A Novel CNN and GA-Based Algorithm for Intrusion Detection in IoT Devices

  • Ibrahim Darwish;Samih Montser;Mohamed R. Saadi
    • International Journal of Computer Science & Network Security / v.23 no.9 / pp.55-64 / 2023
  • The Internet of Things (IoT) is the combination of the internet and various sensing devices. IoT security has attracted increasing attention, yet malicious attacks still cause significant losses. Intrusion detection, which detects malicious attacks and their behaviors in IoT devices, therefore plays a crucial role in IoT security. An intrusion detection system (IDS) should run efficiently, combining classification with efficient feature extraction techniques. To perform intrusion detection effectively in IoT applications, a novel method based on a Convolutional Neural Network (CNN) for classification and an improved Genetic Algorithm (GA) for feature extraction is proposed and implemented. The method targets existing issues such as the failure to detect rare attacks that have few samples, so the proposed CNN is applied to detect almost all attacks, from small to large samples. Feature selection is essential for that purpose, so the genetic algorithm is improved to identify the best fitness values and perform accurate feature selection. To evaluate performance, the NSL-KDDCUP dataset is used with two test sets, KDDTEST21 and KDDTEST+. The results are compared and analyzed against other existing models. The experimental results show that the proposed algorithm has superior intrusion detection rates to existing models: the accuracy and true positive rate improve and the false positive rate decreases. In addition, the proposed algorithm performs better on KDDTEST+ than on KDDTEST21 because there are few attacks from minor samples in KDDTEST+. The results therefore demonstrate that the proposed CNN with the improved GA can identify almost every intrusion.

An ADHD Diagnostic Approach Based on Binary-Coded Genetic Algorithm and Extreme Learning Machine

  • Sachnev, Vasily;Suresh, Sundaram
    • Journal of Computing Science and Engineering / v.10 no.4 / pp.111-117 / 2016
  • An accurate approach for the diagnosis of attention deficit hyperactivity disorder (ADHD) is presented in this paper. The presented technique efficiently classifies three subtypes of ADHD (ADHD-C, ADHD-H, ADHD-I) and typically developing controls (TDC) using only structural magnetic resonance imaging (MRI). The research examines structural MRI of the hippocampus from the ADHD-200 database. Each available MRI has been processed with a region-of-interest (ROI) approach to build a set of features for further analysis. The presented ADHD diagnostic approach unifies feature selection and classification techniques. The feature selection technique, based on the proposed binary-coded genetic algorithm, searches for an optimal subset of the features extracted from the hippocampus. The classification technique uses the chosen optimal subset of features to classify the three subtypes of ADHD and TDC accurately. In this study, the well-known Extreme Learning Machine is used as the classification technique. Experimental results clearly indicate that the presented BCGA-ELM (binary-coded genetic algorithm coupled with Extreme Learning Machine) efficiently classifies TDC and the three subtypes of ADHD and outperforms existing techniques.
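
A binary-coded genetic algorithm for feature selection, as named in the abstract above, can be illustrated with a minimal sketch. This is not the BCGA from the paper: the operators (tournament selection, one-point crossover, bit-flip mutation, elitism), the function name `binary_ga_select`, its parameters, and the toy fitness function are all assumptions made for illustration. In a real wrapper setting the fitness would be a classifier's validation accuracy on the masked features.

```python
import random

def binary_ga_select(fitness, n_features, pop_size=20, generations=30,
                     mutation_rate=0.05, seed=0):
    """Search for a bit mask (1 = keep feature) that maximizes fitness(mask)."""
    rng = random.Random(seed)
    # Random initial population of bit masks.
    pop = [[rng.randint(0, 1) for _ in range(n_features)]
           for _ in range(pop_size)]
    best = max(pop, key=fitness)
    for _ in range(generations):
        next_pop = [best[:]]  # elitism: carry the best mask forward unchanged
        while len(next_pop) < pop_size:
            # Tournament selection: best of three random individuals, twice.
            p1 = max(rng.sample(pop, 3), key=fitness)
            p2 = max(rng.sample(pop, 3), key=fitness)
            # One-point crossover (requires n_features >= 2).
            cut = rng.randint(1, n_features - 1)
            child = p1[:cut] + p2[cut:]
            # Bit-flip mutation.
            child = [1 - b if rng.random() < mutation_rate else b
                     for b in child]
            next_pop.append(child)
        pop = next_pop
        best = max(pop, key=fitness)
    return best

# Hypothetical toy fitness: features 0, 3 and 5 are informative, and every
# selected feature costs a small penalty, so compact masks are preferred.
USEFUL = {0, 3, 5}

def toy_fitness(mask):
    hits = sum(1 for i, b in enumerate(mask) if b and i in USEFUL)
    return hits - 0.1 * sum(mask)

best_mask = binary_ga_select(toy_fitness, n_features=8)
print(best_mask, toy_fitness(best_mask))
```

Because the best mask is copied into each new generation, the best fitness never decreases from one generation to the next.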

Optimal Band Selection Techniques for Hyperspectral Image Pixel Classification using Pooling Operations & PSNR (초분광 이미지 픽셀 분류를 위한 풀링 연산과 PSNR을 이용한 최적 밴드 선택 기법)

  • Chang, Duhyeuk;Jung, Byeonghyeon;Heo, Junyoung
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.21 no.5 / pp.141-147 / 2021
  • This paper applies a band selection algorithm to each subset in order to make better use of the feature information in large hyperspectral data while reducing the complex computation required in embedded systems, by reducing the dimensionality of the neural network inputs. Among feature extraction and feature selection techniques, the feature selection approach aims to find the optimal number of bands for a dataset, regardless of wavelength range, and to improve on other algorithms in both time and performance. In the experiments, the time required was reduced to roughly 1/3 to 1/9 of that of the other band selection techniques, while performance measured with a K-neighbor classifier improved by more than 4%. Although real-time hyperspectral data analysis is still difficult at present, the results confirm the possibility of improvement.

Model based Facial Expression Recognition using New Feature Space (새로운 얼굴 특징공간을 이용한 모델 기반 얼굴 표정 인식)

  • Kim, Jin-Ok
    • The KIPS Transactions:PartB / v.17B no.4 / pp.309-316 / 2010
  • This paper introduces a new model-based method for facial expression recognition that uses facial grid angles as the feature space. To recognize the six main facial expressions, the proposed method uses a grid approach and establishes a new feature space based on the angles formed by each grid's edges and vertices. The approach is robust against several affine transformations such as translation, rotation, and scaling, which in other approaches are considered very harmful to the overall accuracy of a facial expression recognition algorithm. The paper also demonstrates how the feature space is created from the angles and how a feature subset is selected within this space using a wrapper approach. The selected features are classified by SVM and 3-NN classifiers, and the classification results are validated with two-tier cross-validation. The proposed method achieves a 94% classification result, and the feature selection algorithm improves results by up to 10% over the full set of features.

Development of Heuristic Algorithm Using Data-mining Method (데이터마이닝 방법을 응용한 휴리스틱 알고리즘 개발)

  • Kim, Pan-Soo
    • Journal of Korean Society of Industrial and Systems Engineering / v.28 no.4 / pp.94-101 / 2005
  • This paper presents the development of a data-mining-aided heuristic algorithm. The developed algorithm has three steps: uniform selection, development of feature functions and clustering, and decision tree construction. The algorithm is employed in designing an optimal multi-station fixture layout, where the objective is to minimize the sensitivity function subject to geometric constraints. Its benefit is shown by a comparison with currently available optimization methods.

A Study on Feature Selection and Feature Extraction for Hyperspectral Image Classification Using Canonical Correlation Classifier (정준상관분류에 의한 하이퍼스펙트럴영상 분류에서 유효밴드 선정 및 추출에 관한 연구)

  • Park, Min-Ho
    • KSCE Journal of Civil and Environmental Engineering Research / v.29 no.3D / pp.419-431 / 2009
  • The core of this study is to find an efficient band selection or extraction method that identifies the optimal spectral bands when applying the canonical correlation classifier (CCC) to hyperspectral data. The optimal bands under each separability decision technique are selected using the MultiSpec© software developed by Purdue University (USA). A total of six separability decision techniques are used: Divergence, Transformed Divergence, Bhattacharyya, Mean Bhattacharyya, Covariance Bhattacharyya, and Noncovariance Bhattacharyya. For feature extraction, the PCA and MNF transformations are carried out with the ERDAS Imagine and ENVI software. To compare and assess the effect of feature selection and feature extraction, land cover classification is performed by CCC. The overall accuracy of CCC using the initially selected 60 bands is 71.8%; the highest classification accuracy, 79.0%, is obtained when CCC is run after applying Noncovariance Bhattacharyya. In conclusion, only the Noncovariance Bhattacharyya separability decision method was valuable as a feature selection algorithm for CCC-based hyperspectral image classification. With the other feature selection and extraction algorithms, except Divergence, the classification accuracy of CCC actually declined.

Writer verification using feature selection based on genetic algorithm: A case study on handwritten Bangla dataset

  • Jaya Paul;Kalpita Dutta;Anasua Sarkar;Kaushik Roy;Nibaran Das
    • ETRI Journal / v.46 no.4 / pp.648-659 / 2024
  • Author verification is challenging because of the diversity in writing styles. We propose an enhanced handwriting verification method that combines handcrafted and automatically extracted features. The method uses a genetic algorithm to reduce the dimensionality of the feature set. We consider offline Bangla handwriting content and evaluate the proposed method using handcrafted features with a simple logistic regression, radial basis function network, and sequential minimal optimization as well as automatically extracted features using a convolutional neural network. The handcrafted features outperform the automatically extracted ones, achieving an average verification accuracy of 94.54% for 100 writers. The handcrafted features include Radon transform, histogram of oriented gradients, local phase quantization, and local binary patterns from interwriter and intrawriter content. The genetic algorithm reduces the feature dimensionality and selects salient features using a support vector machine. The top five experimental results are obtained from the optimal feature set selected using a consensus strategy. Comparisons with other methods and features confirm the satisfactory results.

Feature Selection Using Submodular Approach for Financial Big Data

  • Attigeri, Girija;Manohara Pai, M.M.;Pai, Radhika M.
    • Journal of Information Processing Systems / v.15 no.6 / pp.1306-1325 / 2019
  • As the world moves towards digitization, data is generated from various sources at an ever faster rate. Its volume is becoming enormous, and it is termed big data. The financial sector is one domain that needs to leverage the big data being generated to identify financial risks, fraudulent activities, and so on. The design of predictive models for such financial big data is imperative for maintaining the health of a country's economy. Financial data has many features, such as transaction history, repayment data, purchase data, investment data, and so on. The main problem in predictive algorithms is finding the right subset of representative features from which the predictive model can be constructed for a particular task. This paper proposes a correlation-based method that uses submodular optimization to select the optimum number of features, thereby reducing the dimensions of the data for faster and better prediction. The key proposition is that the optimal feature subset should contain features that are highly correlated with the class label but not correlated with each other. Experiments are conducted to understand the effect of the various subsets on different classification algorithms for loan data. The IBM Bluemix BigData platform is used for experimentation along with the Spark notebook. The results indicate that the proposed approach achieves considerable accuracy with optimal subsets in significantly less execution time. The algorithm is also compared with existing feature selection and extraction algorithms.
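
The proposition in the abstract above (prefer features that correlate with the class label but not with each other) can be sketched as a greedy selection. This is a simple relevance-minus-redundancy heuristic swapped in for illustration; the abstract does not specify the paper's submodular objective, so the scoring rule, the function names, the `redundancy_weight` parameter, and the toy loan data are all assumptions.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length numeric lists (0.0 if constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy) if vx and vy else 0.0

def greedy_select(features, label, k, redundancy_weight=1.0):
    """Pick k feature names greedily.

    features: dict mapping feature name -> list of values.
    Each step adds the feature with the largest
    |corr(feature, label)| - weight * max |corr(feature, already selected)|.
    """
    selected = []
    candidates = set(features)
    while candidates and len(selected) < k:
        def gain(name):
            relevance = abs(pearson(features[name], label))
            redundancy = max(
                (abs(pearson(features[name], features[s])) for s in selected),
                default=0.0,
            )
            return relevance - redundancy_weight * redundancy
        # sorted() makes tie-breaking deterministic.
        best = max(sorted(candidates), key=gain)
        selected.append(best)
        candidates.remove(best)
    return selected

# Hypothetical loan data: "income_dup" is nearly a copy of "income".
label = [0, 0, 1, 1, 0, 1]
features = {
    "income": [1, 2, 5, 6, 2, 7],
    "income_dup": [1, 2, 5, 6, 3, 6],
    "late_payments": [2, 1, 3, 2, 1, 2],
}
print(greedy_select(features, label, k=2))
```

In this toy data the near-duplicate column is highly relevant on its own, but once `income` is selected its redundancy penalty outweighs its relevance, so the less correlated `late_payments` is chosen second.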

New Rectangle Feature Type Selection for Real-time Facial Expression Recognition (실시간 얼굴 표정 인식을 위한 새로운 사각 특징 형태 선택기법)

  • Kim Do Hyoung;An Kwang Ho;Chung Myung Jin;Jung Sung Uk
    • Journal of Institute of Control, Robotics and Systems / v.12 no.2 / pp.130-137 / 2006
  • In this paper, we propose a method of selecting new types of rectangle features that are suitable for facial expression recognition. The basic concept is similar to Viola's approach, which is used for face detection. Instead of the previous Haar-like features, we use the AdaBoost algorithm to choose rectangle features for facial expression recognition from among all possible rectangle types in a 3×3 matrix form. The facial expression recognition system built with the proposed rectangle features is also compared, with regard to its capacity, to one built with the previous rectangle features. The simulation and experimental results show that the proposed approach performs better in facial expression recognition.

Feature-selection algorithm based on genetic algorithms using unstructured data for attack mail identification (공격 메일 식별을 위한 비정형 데이터를 사용한 유전자 알고리즘 기반의 특징선택 알고리즘)

  • Hong, Sung-Sam;Kim, Dong-Wook;Han, Myung-Mook
    • Journal of Internet Computing and Services / v.20 no.1 / pp.1-10 / 2019
  • Since big-data text mining extracts many features and much data, clustering and classification can result in high computational complexity and low reliability of the analysis results. In particular, a term-document matrix obtained through text mining represents term-document features but is a sparse matrix. We designed an advanced genetic algorithm (GA) to extract features in text mining for a detection model. Term frequency-inverse document frequency (TF-IDF) is used to reflect the document-term relationships in feature extraction. Through a repetitive process, a predetermined number of features are selected. We also used a sparsity score to improve the performance of the detection model. If a spam mail data set has high sparsity, the detection model performs poorly and it is difficult to find an optimal detection model. In addition, we find a low-sparsity model that also has a high TF-IDF score by using s(F) as the numerator in the fitness function. We also verified its performance by applying the proposed algorithm to text classification. As a result, we found that our algorithm shows higher performance (speed and accuracy) in attack mail classification.
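
The two quantities the abstract above relies on, TF-IDF weights and the sparsity of a term-document matrix, can be sketched as follows. The abstract does not define its sparsity score or s(F), so this uses the common textbook forms as assumptions: tf = count / document length, idf = ln(N / document frequency), and sparsity = the fraction of zero cells in the term-document count matrix. The function names and the toy mail tokens are hypothetical.

```python
from collections import Counter
from math import log

def tf_idf(docs):
    """docs: list of token lists. Returns one {term: weight} dict per document."""
    n_docs = len(docs)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append({
            term: (count / len(doc)) * log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights

def sparsity(docs):
    """Fraction of zero cells in the (documents x vocabulary) count matrix."""
    vocab = {term for doc in docs for term in doc}
    nonzero = sum(len(set(doc)) for doc in docs)
    return 1 - nonzero / (len(docs) * len(vocab))

# Hypothetical tokenized mails.
docs = [
    ["free", "prize", "click"],
    ["meeting", "click"],
    ["free", "free", "money"],
]
weights = tf_idf(docs)
print(weights[0])
print(sparsity(docs))
```

A term that occurs in every document gets an idf of ln(1) = 0, so it contributes nothing to the weights, while terms confined to a few documents are weighted up.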