• Title/Summary/Keyword: Local feature selection

Classifier Selection using Feature Space Attributes in Local Region (국부적 영역에서의 특징 공간 속성을 이용한 다중 인식기 선택)

  • Shin Dong-Kuk;Song Hye-Jeong;Kim Baeksop
    • Journal of KIISE: Software and Applications / v.31 no.12 / pp.1684-1690 / 2004
  • This paper presents a classifier selection method that uses the distribution of the training samples in a small region surrounding a sample. The conventional DCS-LA (Dynamic Classifier Selection - Local Accuracy) selects a classifier dynamically by comparing the local accuracy of each classifier at test time, which inevitably leads to long classification times. In the proposed approach, by contrast, the best classifier for each local region is stored in an FSA (Feature Space Attribute) table during training, and testing is done simply by referring to the table, so no local accuracy estimation is needed at test time and classification is fast. Two feature space attributes are used: the entropy and the density of the k training samples around each sample. Each sample in the feature space is mapped to a point in the attribute space defined by these two attributes. The attribute space is divided into regular rectangular cells, to each of which the local accuracy of every classifier is attached; the cells with their associated local accuracies constitute the FSA table. At test time, the cell to which a test sample belongs is determined first by computing its two attributes, and the most accurate classifier for that cell is then chosen from the FSA table. To show the effectiveness of the proposed algorithm, it is compared with the conventional DCS-LA on the Elena database. The experiments show that the accuracy of the proposed algorithm is almost the same as that of DCS-LA, while its classification time is about four times shorter.
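
The table-lookup scheme described in this abstract can be sketched roughly as follows. This is a minimal illustration that assumes the two attributes are the class-label entropy and an inverse-distance density over the k nearest training neighbours, with the attribute space split into a regular grid; the function names, grid size, and density definition are assumptions for illustration, not the paper's exact design.

```python
# Minimal sketch of an FSA-style lookup table (attribute definitions, grid size,
# and names are assumptions; the classifiers are assumed to be pre-fitted).
import numpy as np
from scipy.stats import entropy
from sklearn.neighbors import NearestNeighbors

def fsa_attributes(X, X_ref, y_ref, k=10):
    """Map each row of X to (label entropy, density) over its k neighbours in X_ref."""
    nn = NearestNeighbors(n_neighbors=k).fit(X_ref)
    dist, idx = nn.kneighbors(X)
    ent = np.array([entropy(np.bincount(y_ref[i]) + 1e-12) for i in idx])
    dens = 1.0 / (dist.mean(axis=1) + 1e-12)          # crude inverse-distance density
    return np.column_stack([ent, dens])

def to_cell(a, edges, n_cells):
    """Locate an attribute pair in the regular rectangular grid."""
    return tuple(int(np.clip(np.searchsorted(edges[j], a[j]) - 1, 0, n_cells - 1))
                 for j in range(2))

def build_fsa_table(X_tr, y_tr, classifiers, k=10, n_cells=8):
    A = fsa_attributes(X_tr, X_tr, y_tr, k)
    edges = [np.quantile(A[:, j], np.linspace(0, 1, n_cells + 1)) for j in range(2)]
    cells = [to_cell(a, edges, n_cells) for a in A]
    preds = np.array([clf.predict(X_tr) for clf in classifiers])   # pre-fitted classifiers
    table = {}
    for c in set(cells):
        mask = np.array([cc == c for cc in cells])
        acc = (preds[:, mask] == y_tr[mask]).mean(axis=1)          # local accuracy per classifier
        table[c] = int(acc.argmax())                               # best classifier in this cell
    return table, edges

def fsa_predict(x, X_tr, y_tr, classifiers, table, edges, k=10, n_cells=8):
    a = fsa_attributes(x.reshape(1, -1), X_tr, y_tr, k)[0]
    best = table.get(to_cell(a, edges, n_cells), 0)                # fall back to classifier 0
    return classifiers[best].predict(x.reshape(1, -1))[0]
```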

Sparse and low-rank feature selection for multi-label learning

  • Lim, Hyunki
    • Journal of the Korea Society of Computer and Information / v.26 no.7 / pp.1-7 / 2021
  • In this paper, we propose a feature selection technique for multi-label classification. Many existing feature selection techniques select features by measuring the relation between features and labels, for example with mutual information. However, since mutual information requires a joint probability distribution, it is difficult to estimate over the full feature set in practice; as a result, only a few features can be evaluated at a time and only local optimization is possible. To move away from this local optimization problem, we propose a feature selection technique that constructs a low-rank space over the entire given feature space and selects features with sparsity. To this end, we design a regression-based objective function using the nuclear norm and propose a gradient descent algorithm to solve the resulting optimization problem. In multi-label classification experiments on four datasets with three multi-label performance measures, the proposed method showed better performance than existing feature selection techniques. In addition, the experimental results show that its performance is insensitive to changes in the parameter values of the proposed objective function.
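
The low-rank regression idea above can be illustrated with a generic proximal-gradient sketch: minimise a least-squares fit to the label matrix plus a nuclear-norm penalty, then rank features by the row norms of the learned coefficient matrix. The objective, step size, and ranking rule below are common textbook choices, not the paper's exact formulation (which additionally enforces sparsity).

```python
# Generic proximal-gradient sketch (not the paper's exact algorithm): nuclear-norm
# regularised multi-output regression, with features scored by coefficient row norms.
import numpy as np

def svt(W, tau):
    """Singular value thresholding: proximal operator of tau * nuclear norm."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    return U @ np.diag(np.maximum(s - tau, 0.0)) @ Vt

def low_rank_feature_scores(X, Y, beta=0.1, n_iter=200):
    """Approximately minimise ||XW - Y||_F^2 + beta*||W||_* and score features."""
    d, m = X.shape[1], Y.shape[1]
    W = np.zeros((d, m))
    L = 2.0 * np.linalg.norm(X, 2) ** 2            # Lipschitz constant of the smooth part
    for _ in range(n_iter):
        grad = 2.0 * X.T @ (X @ W - Y)
        W = svt(W - grad / L, beta / L)            # gradient step + nuclear-norm prox
    return np.linalg.norm(W, axis=1)               # larger row norm -> more relevant feature

# usage (hypothetical): scores = low_rank_feature_scores(X, Y); keep = scores.argsort()[::-1][:k]
```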

Band Selection Using Forward Feature Selection Algorithm for Citrus Huanglongbing Disease Detection

  • Katti, Anurag R.;Lee, W.S.;Ehsani, R.;Yang, C.
    • Journal of Biosystems Engineering / v.40 no.4 / pp.417-427 / 2015
  • Purpose: This study investigated different band selection methods to classify spectrally similar data - obtained from aerial images of healthy citrus canopies and canopies infected with citrus greening disease (Huanglongbing or HLB) - using small spectral differences, without unmixing endmember components and therefore without the need for an endmember library. However, the large number of hyperspectral bands carries high redundancy, which had to be reduced through band selection. The objective, therefore, was first to select the best set of bands and then to detect HLB-infected citrus canopies using these bands in aerial hyperspectral images. Methods: The forward feature selection algorithm (FFSA) was chosen for band selection. The selected bands were used to identify HLB-infected pixels with various classifiers, including K nearest neighbor (KNN), support vector machine (SVM), naïve Bayesian classifier (NBC), and generalized local discriminant bases (LDB). All bands were also utilized for comparison. Results: A few well-chosen bands yielded much better results than using all bands, and brought the classification results on par with standard hyperspectral classification techniques such as spectral angle mapper (SAM) and mixture tuned matched filtering (MTMF). Median detection accuracies ranged from 66% to 80%, which shows great potential for rapid detection of the disease. Conclusions: Among the methods investigated, a support vector machine classifier combined with the forward feature selection algorithm yielded the best results.
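
The forward feature selection step can be sketched as a greedy wrapper: starting from the empty set, repeatedly add the band whose inclusion most improves cross-validated accuracy. The SVM classifier and scoring below are illustrative defaults, not the paper's exact FFSA configuration.

```python
# Greedy forward selection wrapper (illustrative defaults; not the exact FFSA setup).
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

def forward_feature_selection(X, y, n_select=10, cv=5):
    remaining = list(range(X.shape[1]))
    selected = []
    while remaining and len(selected) < n_select:
        # score every candidate band added to the current subset
        scored = [(cross_val_score(SVC(kernel="rbf"), X[:, selected + [f]], y, cv=cv).mean(), f)
                  for f in remaining]
        best_acc, best_f = max(scored)
        selected.append(best_f)
        remaining.remove(best_f)
    return selected
```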

An optimal feature selection algorithm for the network intrusion detection system (네트워크 침입 탐지를 위한 최적 특징 선택 알고리즘)

  • Jung, Seung-Hyun;Moon, Jun-Geol;Kang, Seung-Ho
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2014.10a / pp.342-345 / 2014
  • A network intrusion detection system based on machine learning methods depends heavily on the selected features in terms of both accuracy and efficiency. Nevertheless, choosing the optimal combination of features from the generally used features to detect network intrusion requires extensive computing resources: for instance, the number of possible feature combinations from n given features is $2^n-1$. In this paper, to tackle this problem, we propose an optimal feature selection algorithm based on local search, a representative meta-heuristic for solving optimization problems. The accuracy of the clusters obtained by applying the k-means clustering algorithm to the selected feature components is adopted to evaluate a feature subset. To estimate the performance of the proposed algorithm, it is compared with the case in which all features are used, on the NSL-KDD data set with a multi-layer perceptron.
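
A minimal sketch of the approach described in the abstract is given below: hill-climbing over binary feature masks, with a subset scored by how well k-means clusters recover the class labels. The 1-bit-flip neighbourhood and the cluster-purity score are assumptions for illustration.

```python
# Hedged sketch: local search over feature masks scored by k-means cluster purity.
import numpy as np
from sklearn.cluster import KMeans

def cluster_accuracy(X, y, mask, n_clusters):
    if mask.sum() == 0:
        return 0.0
    labels = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit_predict(X[:, mask])
    correct = 0
    for c in range(n_clusters):                      # each cluster votes for its majority class
        members = y[labels == c]
        if len(members):
            correct += np.bincount(members).max()
    return correct / len(y)

def local_search_features(X, y, n_clusters, n_iter=100, seed=0):
    rng = np.random.default_rng(seed)
    mask = rng.random(X.shape[1]) < 0.5              # random initial subset
    best = cluster_accuracy(X, y, mask, n_clusters)
    for _ in range(n_iter):
        j = rng.integers(X.shape[1])                 # flip one feature in/out (1-bit neighbourhood)
        mask[j] = ~mask[j]
        score = cluster_accuracy(X, y, mask, n_clusters)
        if score >= best:
            best = score
        else:
            mask[j] = ~mask[j]                       # revert the move
    return mask, best
```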

Local Linear Transform and New Features of Histogram Characteristic Functions for Steganalysis of Least Significant Bit Matching Steganography

  • Zheng, Ergong;Ping, Xijian;Zhang, Tao
    • KSII Transactions on Internet and Information Systems (TIIS) / v.5 no.4 / pp.840-855 / 2011
  • In the context of the additive noise steganography model, we propose a method to detect least significant bit (LSB) matching steganography in grayscale images. Images are decomposed into detail sub-bands with local linear transform (LLT) masks that are sensitive to embedding. Novel normalized characteristic function features, weighted by a bank of band-pass filters, are extracted from the detail sub-bands. A suboptimal feature set is then found using a threshold selection algorithm. Extensive experiments are performed on four diverse uncompressed image databases. In comparison with other well-known feature sets, the proposed feature set performs best under most circumstances.
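
The filtering-and-histogram pipeline can be illustrated roughly as below: convolve the image with a few high-pass local linear transform masks and summarise each detail sub-band by a normalised moment of its histogram characteristic function. The specific masks and the moment-based summary stand in for the paper's band-pass-weighted features and are only assumptions.

```python
# Rough illustration of LLT filtering + histogram characteristic function features
# (masks and the moment-based summary are generic choices, not the paper's design).
import numpy as np
from scipy.signal import convolve2d

LLT_MASKS = [
    np.array([[1, -2, 1]]),                                        # horizontal 2nd derivative
    np.array([[1], [-2], [1]]),                                    # vertical 2nd derivative
    np.array([[-1, 2, -1], [2, -4, 2], [-1, 2, -1]]) / 4.0,        # cross high-pass
]

def hcf_features(image, n_bins=256):
    feats = []
    for mask in LLT_MASKS:
        band = convolve2d(image.astype(float), mask, mode="valid")
        hist, _ = np.histogram(band, bins=n_bins)
        hcf = np.abs(np.fft.rfft(hist))                            # histogram characteristic function
        k = np.arange(len(hcf))
        feats.append((k * hcf).sum() / (hcf.sum() + 1e-12))        # normalised first-order HCF moment
    return np.array(feats)
```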

Review on Genetic Algorithms for Pattern Recognition (패턴 인식을 위한 유전 알고리즘의 개관)

  • Oh, Il-Seok
    • The Journal of the Korea Contents Association / v.7 no.1 / pp.58-64 / 2007
  • In the pattern recognition field, there are many optimization problems with exponential search spaces. To solve them, sequential search algorithms that seek sub-optimal solutions have been used, but these algorithms tend to stop at local optima. Recently, many studies have attempted to solve such problems using genetic algorithms. This paper explains the huge search spaces of typical problems such as feature selection, classifier ensemble selection, neural network pruning, and clustering, and reviews the genetic algorithms used to solve them. We also present several topics worth noting for future research.
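
As a concrete instance of the GA-based search the review covers, a small sketch for feature selection is given below: binary chromosomes mark the kept features, fitness is cross-validated accuracy, and tournament selection, one-point crossover, and bit-flip mutation evolve the population. All operator choices and parameters here are textbook defaults, not taken from the paper.

```python
# Textbook GA for feature selection (binary encoding; illustrative operators/parameters).
import numpy as np
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import cross_val_score

def fitness(mask, X, y):
    """Cross-validated accuracy of a k-NN classifier on the selected features."""
    if mask.sum() == 0:
        return 0.0
    return cross_val_score(KNeighborsClassifier(), X[:, mask.astype(bool)], y, cv=3).mean()

def tournament(pop, fit, rng, size=3):
    idx = rng.choice(len(pop), size, replace=False)
    return pop[idx[fit[idx].argmax()]]

def ga_feature_selection(X, y, pop_size=20, n_gen=30, p_mut=0.05, seed=0):
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    pop = rng.integers(0, 2, size=(pop_size, d))             # 1 = keep the feature
    for _ in range(n_gen):
        fit = np.array([fitness(ind, X, y) for ind in pop])
        new_pop = [pop[fit.argmax()].copy()]                  # elitism: keep the best individual
        while len(new_pop) < pop_size:
            a, b = tournament(pop, fit, rng), tournament(pop, fit, rng)
            cut = rng.integers(1, d)                          # one-point crossover
            child = np.concatenate([a[:cut], b[cut:]])
            flip = rng.random(d) < p_mut                      # bit-flip mutation
            child[flip] = 1 - child[flip]
            new_pop.append(child)
        pop = np.array(new_pop)
    fit = np.array([fitness(ind, X, y) for ind in pop])
    return pop[fit.argmax()].astype(bool)
```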

Fuzzy Threshold Inference of a Nonlinear Filter for Color Sketch Feature Extraction (컬러 스케치특징 추출을 위한 비선형 필터의 퍼지임계치 추론)

  • Cho Sung-Mok;Cho Ok-Lae
    • Journal of the Korea Academia-Industrial cooperation Society / v.7 no.3 / pp.398-403 / 2006
  • In this paper, we describe a fuzzy threshold selection technique for feature extraction in digital color images. This is achieved by formulating a fuzzy inference system that evaluates the threshold for feature configurations. The system uses two fuzzy measures that capture desirable characteristics of features, such as the dependency of local intensity and continuity in an image. We give a graphical description of a nonlinear sketch feature extraction filter and design the fuzzy inference system in terms of the characteristics of the features. The design provides a method for choosing a threshold that achieves certain characteristics of the extracted features. Experimental results show the usefulness of our fuzzy threshold inference approach, which is able to extract features without human intervention.
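
A rough Mamdani-style illustration of such a fuzzy threshold inference is sketched below: the two input measures (normalised to [0, 1]) activate rules whose consequents are LOW/MEDIUM/HIGH threshold sets, and the threshold is read off by centroid defuzzification. The membership shapes and rule base are invented for illustration and are not the paper's.

```python
# Invented Mamdani-style sketch (membership functions and rules are illustrative only).
import numpy as np

def tri(x, a, b, c):
    """Triangular membership function peaking at b."""
    return np.maximum(np.minimum((x - a) / (b - a), (c - x) / (c - b)), 0.0)

def infer_threshold(dependency, continuity):
    """dependency, continuity in [0, 1]; returns a normalised threshold in [0, 1]."""
    t = np.linspace(0.0, 1.0, 101)                          # candidate thresholds
    low, med, high = tri(t, -.5, 0, .5), tri(t, 0, .5, 1), tri(t, .5, 1, 1.5)
    dep_hi = tri(dependency, 0.3, 1.0, 1.7)                 # degree to which the measure is high
    cont_hi = tri(continuity, 0.3, 1.0, 1.7)
    dep_lo, cont_lo = 1 - dep_hi, 1 - cont_hi
    # assumed rule base: strong dependency and continuity -> low threshold (keep more detail)
    agg = np.maximum.reduce([
        np.minimum(min(dep_hi, cont_hi), low),
        np.minimum(max(min(dep_hi, cont_lo), min(dep_lo, cont_hi)), med),
        np.minimum(min(dep_lo, cont_lo), high),
    ])
    return float((t * agg).sum() / (agg.sum() + 1e-12))     # centroid defuzzification

# e.g. infer_threshold(0.8, 0.7) returns a relatively low threshold
```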

Improved marine predators algorithm for feature selection and SVM optimization

  • Jia, Heming;Sun, Kangjian;Li, Yao;Cao, Ning
    • KSII Transactions on Internet and Information Systems (TIIS) / v.16 no.4 / pp.1128-1145 / 2022
  • Owing to the rapid development of information science, data analysis based on machine learning has become an interdisciplinary and strategic area. The marine predators algorithm (MPA) is a novel metaheuristic inspired by the foraging strategies of marine organisms. Considering the randomness of these strategies, an improved algorithm called the co-evolutionary cultural mechanism-based marine predators algorithm (CECMPA) is proposed. Through this mechanism, search agents in different spaces can share knowledge and experience to improve the performance of the native algorithm. More specifically, CECMPA has a higher probability of avoiding local optima and can reach the global optimum quickly. This paper is the first to use CECMPA to perform feature subset selection and to optimize the hyperparameters of a support vector machine (SVM) simultaneously. To evaluate its performance, the proposed method is tested on twelve datasets from the University of California Irvine (UCI) repository. Moreover, as a real-world application, CECMPA is also applied to a coronavirus disease 2019 (COVID-19) dataset. The experimental results and statistical analysis demonstrate that CECMPA is superior to other methods compared in the literature in terms of several evaluation metrics. The proposed method has strong competitive ability and promising prospects.
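
The abstract's central design, a candidate solution that jointly encodes the SVM hyperparameters and a feature subset and is scored by cross-validated accuracy, can be sketched independently of CECMPA itself. The encoding, parameter ranges, and names below are assumptions; the sketch is the fitness function a metaheuristic such as CECMPA would optimise, not the optimiser.

```python
# Hedged sketch of a joint (SVM hyperparameters + feature subset) solution encoding
# and its fitness; usable inside any population-based optimiser, not CECMPA itself.
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

def decode(vec, n_features):
    """vec is a NumPy array in [0,1]^(2+d): first two entries -> C and gamma, rest -> mask."""
    C = 10 ** (vec[0] * 4 - 1)                   # assumed range: C in [0.1, 1000]
    gamma = 10 ** (vec[1] * 4 - 3)               # assumed range: gamma in [0.001, 10]
    mask = vec[2:2 + n_features] > 0.5           # continuous positions thresholded to a subset
    return C, gamma, mask

def fitness(vec, X, y):
    C, gamma, mask = decode(vec, X.shape[1])
    if mask.sum() == 0:
        return 0.0
    return cross_val_score(SVC(C=C, gamma=gamma), X[:, mask], y, cv=5).mean()

# usage: maximise fitness over vectors in [0,1]^(2+d) with any metaheuristic
```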

A Method to Find Feature Set for Detecting Various Denial Service Attacks in Power Grid (전력망에서의 다양한 서비스 거부 공격 탐지 위한 특징 선택 방법)

  • Lee, DongHwi;Kim, Young-Dae;Park, Woo-Bin;Kim, Joon-Seok;Kang, Seung-Ho
    • KEPCO Journal on Electric Power and Energy / v.2 no.2 / pp.311-316 / 2016
  • A network intrusion detection system based on a machine learning method such as an artificial neural network depends heavily on the selected features in terms of accuracy and efficiency. Nevertheless, choosing the optimal combination of features, one that guarantees accuracy and efficiency, from the many features commonly used to detect network intrusion requires extensive computing resources. In this paper, we address the optimal feature selection problem for distinguishing six denial-of-service attacks and normal usage in the NSL-KDD data. We propose an optimal feature selection algorithm based on multi-start local search, a representative meta-heuristic for solving optimization problems. To evaluate the performance of the proposed algorithm, it is compared with the case in which all 41 features of the NSL-KDD data are used. In addition, three well-known machine learning methods (multi-layer perceptron, Bayes classifier, and support vector machine) are compared to find the method that performs best when combined with the proposed feature selection method.
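
A minimal sketch of the multi-start idea follows: repeat a simple 1-bit-flip local search from several random initial feature masks and keep the best subset found. The evaluation function (cross-validated accuracy of a user-supplied classifier, e.g. an MLP) and the neighbourhood are generic choices, not the paper's exact procedure.

```python
# Hedged sketch of multi-start local search over feature masks (generic evaluation).
import numpy as np
from sklearn.base import clone
from sklearn.model_selection import cross_val_score

def subset_score(clf, X, y, mask):
    return 0.0 if mask.sum() == 0 else cross_val_score(clone(clf), X[:, mask], y, cv=3).mean()

def multi_start_local_search(clf, X, y, n_starts=5, n_iter=100, seed=0):
    rng = np.random.default_rng(seed)
    best_mask, best_score = None, -1.0
    for _ in range(n_starts):                              # independent random restarts
        mask = rng.random(X.shape[1]) < 0.5
        score = subset_score(clf, X, y, mask)
        for _ in range(n_iter):                            # 1-bit-flip hill climbing
            j = rng.integers(X.shape[1])
            mask[j] = ~mask[j]
            new = subset_score(clf, X, y, mask)
            if new >= score:
                score = new
            else:
                mask[j] = ~mask[j]                         # revert the move
        if score > best_score:
            best_mask, best_score = mask.copy(), score
    return best_mask, best_score
```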

A Decision Tree Induction using Genetic Programming with Sequentially Selected Features (순차적으로 선택된 특성과 유전 프로그래밍을 이용한 결정나무)

  • Kim Hyo-Jung;Park Chong-Sun
    • Korean Management Science Review / v.23 no.1 / pp.63-74 / 2006
  • Decision tree induction is one of the most widely used methods in classification problems. However, when a tree algorithm uses a top-down search, it can be trapped in a local minimum and has no reasonable means to escape from it. Furthermore, if irrelevant or redundant features are included in the data set, tree algorithms produce trees that are less accurate than those built from only the relevant features. We propose a hybrid algorithm that generates decision trees using genetic programming with sequentially selected features. The Correlation-based Feature Selection (CFS) method is adopted to find relevant features, which are fed to genetic programming sequentially to find optimal trees at each iteration. The proposed algorithm produces simpler and more understandable decision trees than other decision tree methods, and it is also effective in producing similar or better trees from a relatively smaller set of features in terms of cross-validation accuracy.
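
The CFS step that feeds features to the genetic-programming stage can be sketched with the standard CFS merit, merit(S) = k*r_cf / sqrt(k + k*(k-1)*r_ff), where r_cf is the mean feature-class correlation and r_ff the mean feature-feature correlation over the subset S; greedy forward search over this merit is one common choice. Pearson correlation is used below for simplicity, which is an assumption rather than the paper's exact measure.

```python
# Hedged sketch of CFS with greedy forward search (Pearson correlation as the measure).
import numpy as np

def cfs_merit(X, y, subset):
    k = len(subset)
    r_cf = np.mean([abs(np.corrcoef(X[:, f], y)[0, 1]) for f in subset])
    if k == 1:
        return r_cf
    r_ff = np.mean([abs(np.corrcoef(X[:, a], X[:, b])[0, 1])
                    for i, a in enumerate(subset) for b in subset[i + 1:]])
    return k * r_cf / np.sqrt(k + k * (k - 1) * r_ff)

def cfs_forward(X, y, max_features=10):
    remaining, selected = list(range(X.shape[1])), []
    best = -np.inf
    while remaining and len(selected) < max_features:
        merit, f = max((cfs_merit(X, y, selected + [c]), c) for c in remaining)
        if merit <= best:
            break                        # stop when the merit no longer improves
        best = merit
        selected.append(f)
        remaining.remove(f)
    return selected
```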