• Title/Summary/Keyword: Filter and Wrapper

Speech Feature Selection of Normal and Autistic children using Filter and Wrapper Approach

  • Akhtar, Muhammed Ali;Ali, Syed Abbas;Siddiqui, Maria Andleeb
    • International Journal of Computer Science & Network Security / v.21 no.5 / pp.129-132 / 2021
  • This study analyzes two feature selection approaches. The first is the filter approach, which uses a correlation technique and yields two reduced feature sets, one from positive correlation and one from negative correlation. The second is the wrapper approach, which uses the Sequential Forward Selection technique. The reduced feature set obtained from positive correlation comprises Rate of Acceleration, Intensity and Formant. The reduced feature set obtained from negative correlation comprises Rasta PLP, Log Energy, Log Power and Zero Crossing Rate. Sequential Forward Selection yields the reduced feature set Pitch, Rate of Acceleration, Log Power, MFCC and LPCC.
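The two approaches named in this abstract can be sketched in a few lines. The following is only an illustrative sketch, not the authors' implementation: the 0.5 correlation threshold, the nearest-centroid scoring function, and the toy data are all assumptions introduced here.

```python
from statistics import mean

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5 if vx and vy else 0.0

def correlation_filter(X, y, threshold=0.5):
    """Filter approach: split feature indices by the sign of their correlation with y.
    The threshold is an assumed value, not taken from the paper."""
    positive, negative = [], []
    for j in range(len(X[0])):
        r = pearson([row[j] for row in X], y)
        if r >= threshold:
            positive.append(j)
        elif r <= -threshold:
            negative.append(j)
    return positive, negative

def centroid_accuracy(X, y, features):
    """Assumed scorer: training accuracy of a nearest-centroid classifier."""
    classes = sorted(set(y))
    centroids = {
        c: [mean(row[j] for row, label in zip(X, y) if label == c) for j in features]
        for c in classes
    }
    def predict(row):
        return min(classes, key=lambda c: sum(
            (row[j] - m) ** 2 for j, m in zip(features, centroids[c])))
    return mean(1.0 if predict(row) == label else 0.0 for row, label in zip(X, y))

def sequential_forward_selection(X, y, score, k):
    """Wrapper approach: greedily add the feature that most improves the score."""
    selected, remaining = [], list(range(len(X[0])))
    while remaining and len(selected) < k:
        best = max(remaining, key=lambda j: score(X, y, selected + [j]))
        selected.append(best)
        remaining.remove(best)
    return selected

# Toy data (hypothetical): feature 0 rises with the label, feature 1 falls, feature 2 is noise.
X = [[1.0, 9.0, 5.0], [2.0, 8.0, 3.0], [3.0, 7.0, 6.0],
     [7.0, 3.0, 4.0], [8.0, 2.0, 5.5], [9.0, 1.0, 3.5]]
y = [0, 0, 0, 1, 1, 1]
positive, negative = correlation_filter(X, y)
chosen = sequential_forward_selection(X, y, centroid_accuracy, k=1)
```

The filter ranks features without training a model, whereas the wrapper calls the scoring function once per candidate feature at every step, which is why wrappers are typically slower.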

On the Performance of Cuckoo Search and Bat Algorithms Based Instance Selection Techniques for SVM Speed Optimization with Application to e-Fraud Detection

  • AKINYELU, Andronicus Ayobami;ADEWUMI, Aderemi Oluyinka
    • KSII Transactions on Internet and Information Systems (TIIS) / v.12 no.3 / pp.1348-1375 / 2018
  • Support Vector Machine (SVM) is a well-known machine learning classification algorithm that has been widely applied to many data mining problems with good accuracy. However, SVM classification speed decreases as dataset size increases. Some applications, like video surveillance and intrusion detection, require a classifier to be trained very quickly and on large datasets. Hence, this paper introduces two filter-based instance selection techniques for optimizing SVM training speed. Fast classification is often achieved at the expense of classification accuracy, and some applications, such as phishing and spam email classifiers, are very sensitive to a slight drop in classification accuracy. Hence, this paper also introduces two wrapper-based instance selection techniques for improving SVM predictive accuracy and training speed. The wrapper-based and filter-based techniques are inspired by the Cuckoo Search Algorithm and the Bat Algorithm. The proposed techniques are validated on three popular e-fraud types: credit card fraud, spam email and phishing email. In addition, the proposed techniques are validated on 20 other datasets provided by the UCI data repository. Moreover, statistical analysis is performed, and experimental results reveal that the filter-based and wrapper-based techniques significantly improved SVM classification speed. The results also reveal that the wrapper-based techniques improved SVM predictive accuracy in most cases.

Hybrid Feature Selection Method Based on a Naïve Bayes Algorithm that Enhances the Learning Speed while Maintaining a Similar Error Rate in Cyber ISR

  • Shin, GyeongIl;Yooun, Hosang;Shin, DongIl;Shin, DongKyoo
    • KSII Transactions on Internet and Information Systems (TIIS) / v.12 no.12 / pp.5685-5700 / 2018
  • Cyber intelligence, surveillance, and reconnaissance (ISR) has become more important than traditional military ISR. An agent used in cyber ISR resides in an enemy's networks and continually collects valuable information. Thus, this agent should be able to determine what is, and is not, useful in a short amount of time. Moreover, the agent should maintain a classification rate that is high enough to select useful data from the enemy's network. Traditional feature selection algorithms cannot comply with these requirements. Consequently, in this paper, we propose an effective hybrid feature selection method derived from the filter and wrapper methods. We illustrate the design of the proposed model and the experimental results of the performance comparison between the proposed model and the existing model.

Feature Selection for Anomaly Detection Based on Genetic Algorithm (유전 알고리즘 기반의 비정상 행위 탐지를 위한 특징선택)

  • Seo, Jae-Hyun
    • Journal of the Korea Convergence Society / v.9 no.7 / pp.1-7 / 2018
  • Feature selection, a data preprocessing technique, is a major research area in many applications that deal with large datasets. It has been used in pattern recognition, machine learning and data mining, and is now widely applied in a variety of fields such as text classification, image retrieval, intrusion detection and genome analysis. The proposed method is based on a genetic algorithm, which is a meta-heuristic algorithm. There are two methods of finding feature subsets: the filter method and the wrapper method. In this study, we use a wrapper method, which evaluates feature subsets using a real classifier, to find an optimal feature subset. The training dataset used in the experiment has a severe class imbalance, which makes it difficult to improve classification performance for rare classes. After preprocessing the training dataset with SMOTE, we select features and evaluate them with various machine learning algorithms.
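The SMOTE preprocessing step mentioned in this abstract oversamples a rare class by interpolating between a minority sample and one of its nearest minority neighbours. The sketch below illustrates that idea only; the function name, the choice of `k`, the seed, and the toy points are assumptions made here, and the paper's actual settings are not given in the abstract.

```python
import random

def smote(minority, n_new, k=2, seed=0):
    """Generate n_new synthetic minority samples by linear interpolation.
    Requires at least two minority samples."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base sample (squared Euclidean distance)
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from base to neighbour
        synthetic.append([a + gap * (b - a) for a, b in zip(base, other)])
    return synthetic

# Hypothetical rare-class samples with two features each.
minority = [[1.0, 1.0], [2.0, 1.5], [1.5, 2.0]]
new_samples = smote(minority, n_new=5)
```

Because every synthetic point lies on a segment between two real minority samples, the new points stay inside the region the rare class already occupies.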

A Design of an Optimized Classifier based on Feature Elimination for Gene Selection (유전자 선택을 위해 속성 삭제에 기반을 둔 최적화된 분류기 설계)

  • Lee, Byung-Kwan;Park, Seok-Gyu;Tifani, Yusrina
    • The Journal of Korea Institute of Information, Electronics, and Communication Technology / v.8 no.5 / pp.384-393 / 2015
  • This paper proposes an optimized classifier based on feature elimination (OCFE) for gene selection that combines two feature elimination methods, ReliefF and SVM-RFE. The ReliefF algorithm is a filter feature selection method that ranks the data by importance. The SVM-RFE algorithm is a wrapper feature selection method that ranks the data based on feature weights. Combining the two methods yields a lower average error rate: 0.3016138 for OCFE versus 0.3096779 for SVM-RFE. The proposed method also achieves better accuracy: 70% for OCFE versus 69% for SVM-RFE.

Feature Selection by Using Distance Histogram (거리 히스토그램을 이용한 특성 추출 기법)

  • 최기석;전성진;양명석
    • Proceedings of the Korean Information Science Society Conference / 2003.04a / pp.713-715 / 2003
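  • Feature selection is a dimensionality reduction technique and an important preprocessing step used to remove noise. It reduces the size of the data and improves the accuracy and comprehensibility of learning. While a variety of feature selection methods exist for classification, methods applicable to clustering are scarce, and most of those that exist adopt a wrapper approach that depends on the clustering algorithm being used and is therefore ill-suited to real-world applications. In this paper, we propose a filter solution that is independent of the clustering algorithm. The method is based on the difference between the point-to-point distance histograms of data that has clusters and data that does not.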

A Pre-processing Study to Solve the Problem of Rare Class Classification of Network Traffic Data (네트워크 트래픽 데이터의 희소 클래스 분류 문제 해결을 위한 전처리 연구)

  • Ryu, Kyung Joon;Shin, DongIl;Shin, DongKyoo;Park, JeongChan;Kim, JinGoog
    • KIPS Transactions on Software and Data Engineering / v.9 no.12 / pp.411-418 / 2020
  • In the field of information security, an IDS (Intrusion Detection System) is normally classified into one of two categories: signature-based IDS and anomaly-based IDS. Many studies of anomaly-based IDS have used machine learning algorithms to analyze network traffic data generated in cyberspace. In this paper, we studied pre-processing methods to overcome the performance degradation caused by rare classes. We evaluated the classification performance of a machine learning algorithm by reconstructing the data set based on rare classes and semi-rare classes. After the data is reconstructed into three different sets, wrapper and filter feature selection methods are applied in succession. Each data set is regularized by a quantile scaler. A deep neural network model is used for learning and validation. The evaluation results are compared by true positive values and false negative values. We obtained improved classification performance on all three data sets.

Design and Verification of Deblocking Filter Circuit Using AMBA-Based Platform (AMBA 기반 플랫폼을 이용한 디블록킹 필터 회로의 설계 및 검증)

  • Park, Kang-Pil;Lee, Seon-Young;Cho, Kyeong-Soon
    • Proceedings of the IEEK Conference / 2005.11a / pp.735-738 / 2005
  • This paper presents an AMBA-based IP that performs the deblocking filtering operations required in H.264 video compression. The deblocking filter circuit was optimized for area and performance. An AHB wrapper was added to the circuit to interface with the AMBA-based platform. The AMBA-compliant operation of the proposed IP was verified on a platform board with a Xilinx Virtex2 XC2V600 FPGA and an ARM9 processor.

Hybrid Feature Selection Using Genetic Algorithm and Information Theory

  • Cho, Jae Hoon;Lee, Dae-Jong;Park, Jin-Il;Chun, Myung-Geun
    • International Journal of Fuzzy Logic and Intelligent Systems / v.13 no.1 / pp.73-82 / 2013
  • In pattern classification, feature selection is an important factor in the performance of classifiers. In particular, when classifying data with a large number of features or variables, the accuracy and computational time of the classifier can be improved by using the relevant feature subset and removing irrelevant, redundant, or noisy data. The proposed method consists of two parts: a wrapper part with an improved genetic algorithm (GA) using a new reproduction method, and a filter part using mutual information. We also considered feature selection methods based on mutual information (MI) to improve computational complexity. Experimental results show that this method can achieve better performance in pattern recognition problems than other conventional solutions.
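The filter part of such a hybrid method scores each feature by its mutual information with the class label. The following minimal sketch shows that scoring for discrete features only. It does not reproduce the paper's improved GA or its reproduction method, and the function names and toy data are assumptions introduced here.

```python
from collections import Counter
from math import log2

def mutual_information(xs, ys):
    """Mutual information I(X; Y) in bits between two discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    # Sum of p(x, y) * log2(p(x, y) / (p(x) * p(y))) over observed pairs
    return sum((c / n) * log2(c * n / (px[x] * py[y])) for (x, y), c in pxy.items())

def rank_by_mutual_information(X, y):
    """Return feature indices sorted from most to least informative about y."""
    scores = [mutual_information(column, y) for column in zip(*X)]
    return sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)

# Hypothetical data: feature 0 equals the label, feature 1 is independent of it.
X_discrete = [[0, 0], [0, 1], [1, 0], [1, 1]]
labels = [0, 0, 1, 1]
ranking = rank_by_mutual_information(X_discrete, labels)
```

A hybrid scheme could use a ranking like this to narrow the candidate features cheaply before a wrapper search evaluates subsets with a real classifier.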

Feature Selection Applied to Recommender Systems for Reverse Logistics Internet Auction (역 물류 환경 인터넷 경매를 위한 요소 선택응용 추천 시스템)

  • Yang, Jae-Kyung;Yu, Woo-Yeon
    • Journal of Korean Society of Industrial and Systems Engineering / v.29 no.1 / pp.76-86 / 2006
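  • Along with the development of various data mining techniques, many feature selection methods have been developed to reduce the dimension of features (attributes). This can improve scalability and make the learning model easier to interpret. This paper presents a new optimization-based feature selection method using Nested Partitions (NP), together with the basic NP framework and numerical results on a variety of test problems, to show how the NP optimization framework contributes to the feature selection process. It also shows how this new intelligent partitioning method performs very efficient partitioning. The new feature selection method can be implemented as either a filter method or a wrapper method. As a case study, we propose a recommender system that can be used effectively in B2B e-business systems. The recommender system uses classification rules and the proposed NP-based feature selection method. It is used to recommend that users participate in Internet auctions, and the proposed feature selection algorithm is used to simplify the model so that the recommendation rules are easy to understand.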