• Title/Summary/Keyword: Feature Subset

Feature Selection for Multi-Class Genre Classification using Gaussian Mixture Model (Gaussian Mixture Model을 이용한 다중 범주 분류를 위한 특징벡터 선택 알고리즘)

  • Moon, Sun-Kuk;Choi, Tack-Sung;Park, Young-Cheol;Youn, Dae-Hee
    • The Journal of Korean Institute of Communications and Information Sciences / v.32 no.10C / pp.965-974 / 2007
  • In this paper, we propose a feature selection algorithm for multi-class genre classification. We develop a GMM separation score, based on the Gaussian mixture model, to measure the separability between two genres. We also improve the feature subset selection algorithm based on sequential forward selection for the multi-class case: instead of using the separability over all genres as the criterion, we use the worst genre separability measure at each sequential selection step. To assess the performance of the proposed algorithm, we extracted various features that represent characteristics such as timbre, rhythm, and pitch. We then evaluated the classification performance of a GMM classifier and a k-NN classifier on features selected by the conventional algorithm and by the proposed algorithm. The proposed algorithm improved classification accuracy by up to 10 percent, especially in experiments with low-dimensional feature vectors.
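
The worst-case selection criterion described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function `sfs_worst_pair`, the callable `pair_score`, and the toy separability table are all hypothetical, and a simple additive score stands in for the GMM separation score, whose definition the abstract does not give.

```python
from itertools import combinations


def sfs_worst_pair(classes, n_features, k, pair_score):
    """Greedy forward selection that maximizes the worst pairwise score."""
    pairs = list(combinations(classes, 2))
    selected = []
    k = min(k, n_features)
    while len(selected) < k:
        remaining = [f for f in range(n_features) if f not in selected]

        def worst_case(f):
            # Separability of the least separable class pair if f is added.
            return min(pair_score(a, b, selected + [f]) for a, b in pairs)

        selected.append(max(remaining, key=worst_case))
    return selected


# Made-up per-feature separability of each genre pair (illustration only).
TOY_SEP = {
    0: {("A", "B"): 9.0, ("A", "C"): 9.0, ("B", "C"): 0.1},
    1: {("A", "B"): 2.0, ("A", "C"): 2.0, ("B", "C"): 2.0},
    2: {("A", "B"): 0.5, ("A", "C"): 0.5, ("B", "C"): 3.0},
}


def toy_score(a, b, subset):
    # Assumption: scores simply add up across the features in the subset.
    return sum(TOY_SEP[f][(a, b)] for f in subset)


print(sfs_worst_pair(["A", "B", "C"], 3, 2, toy_score))
```

In the toy table, feature 0 has the largest total separability but leaves genres B and C almost inseparable, so a criterion based on the worst pair passes over it where a sum over all pairs would pick it first.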

Fuzzy discretization with spatial distribution of data and Its application to feature selection (데이터의 공간적 분포를 고려한 퍼지 이산화와 특징선택에의 응용)

  • Son, Chang-Sik;Shin, A-Mi;Lee, In-Hee;Park, Hee-Joon;Park, Hyoung-Seob;Kim, Yoon-Nyun
    • Journal of the Korean Institute of Intelligent Systems / v.20 no.2 / pp.165-172 / 2010
  • In clinical data mining, choosing the optimal subset of features is important, not only to reduce computational complexity but also to improve the usefulness of the model constructed from the given data. Moreover, the threshold values (i.e., cut-off points) of the selected features are used in experts' clinical decision criteria for the differential diagnosis of diseases. In this paper, we propose a fuzzy discretization approach, based on the spatial distribution of data with continuous attributes, which is evaluated by measuring the degree of separation of redundant attribute values in the overlapping region. The weighted average of the redundant attribute values is then used to determine the threshold value for each feature, and rough set theory is used to select a subset of relevant features from the full feature set. To verify the validity of the proposed method, we compared experimental results on a classification problem using 668 patients with a chief complaint of dyspnea, obtained with three discretization methods (equal-width, equal-frequency, and entropy-based) and with the proposed discretization method. The experimental results confirm that the discretization methods with fuzzy partitions give better results than those with hard partitions on two evaluation measures, average classification accuracy and G-mean.
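
For orientation, two of the hard-partition baselines named in this abstract (equal-width and equal-frequency) can be sketched as below. The function names and the sample values are hypothetical, the equal-frequency cut points use one simple index-based convention among several in use, and the proposed fuzzy method itself is not reproduced here because the abstract does not specify it in enough detail.

```python
def equal_width_cuts(values, k):
    """Cut points that split the value range into k intervals of equal width."""
    lo, hi = min(values), max(values)
    step = (hi - lo) / k
    return [lo + step * i for i in range(1, k)]


def equal_frequency_cuts(values, k):
    """Cut points that put roughly the same number of samples in each bin."""
    ordered = sorted(values)
    n = len(ordered)
    return [ordered[(i * n) // k] for i in range(1, k)]


def discretize(x, cuts):
    """Index of the bin that x falls into (0 for values below the first cut)."""
    return sum(x >= c for c in cuts)


# Made-up skewed measurements (illustration only).
values = [1, 2, 3, 4, 10, 20, 30, 100]
print(equal_width_cuts(values, 4))
print(equal_frequency_cuts(values, 4))
print(discretize(4, equal_frequency_cuts(values, 4)))
```

On skewed data such as the sample above, equal-width bins leave most samples in the first interval, while equal-frequency bins follow the data density; both assign every value to exactly one bin, which is the hard partition the abstract contrasts with fuzzy partitions.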

Maximum Simplex Volume based Landmark Selection for Isomap (최대 부피 Simplex 기반의 Isomap을 위한 랜드마크 추출)

  • Chi, Junhwa
    • Korean Journal of Remote Sensing / v.29 no.5 / pp.509-516 / 2013
  • Since traditional linear feature extraction methods are unable to handle the nonlinear characteristics often exhibited in hyperspectral imagery, nonlinear feature extraction, also known as manifold learning, is receiving increased attention in the hyperspectral remote sensing community as well as in other communities. Isomap, one of the most widely used manifold learning methods, generally gives promising results in classification and spectral unmixing tasks, but its high computational overhead is problematic, especially for large-scale remotely sensed data. Using a small subset of distinguishing points, referred to as landmarks, has been proposed as a solution. This study proposes a new robust and controllable landmark selection method based on the maximum volume of the simplex spanned by the landmarks. Experiments are conducted to compare classification accuracies and their standard deviations across sampling methods and numbers of landmarks, as well as processing time. The proposed method can deliver both classification accuracy and computational efficiency.
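
One way to read "maximum volume of the simplex spanned by landmarks" is a greedy procedure: since the volume of a simplex is proportional to the product of its successive heights, adding the point farthest from the affine hull of the landmarks chosen so far gives the largest volume increase at each step. The sketch below follows that reading only; the function `greedy_landmarks`, the centroid-based seeding, and the sample points are assumptions, and the paper's actual procedure may differ.

```python
import math


def _sub(a, b):
    return [x - y for x, y in zip(a, b)]


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _residual(v, basis):
    # Remove the components of v along each orthonormal basis vector.
    r = list(v)
    for b in basis:
        c = _dot(r, b)
        r = [x - c * y for x, y in zip(r, b)]
    return r


def greedy_landmarks(points, m, tol=1e-12):
    """Pick up to m point indices that greedily enlarge the simplex volume."""
    n, dim = len(points), len(points[0])
    centroid = [sum(p[d] for p in points) / n for d in range(dim)]

    def dist2(i):
        d = _sub(points[i], centroid)
        return _dot(d, d)

    # Assumption: seed with the point farthest from the centroid.
    chosen = [max(range(n), key=dist2)]
    basis = []  # orthonormal basis of the directions spanned so far
    while len(chosen) < m:
        origin = points[chosen[0]]
        best, best_h, best_r = None, tol, None
        for i in range(n):
            if i in chosen:
                continue
            r = _residual(_sub(points[i], origin), basis)
            h = math.sqrt(_dot(r, r))  # height above the current affine hull
            if h > best_h:
                best, best_h, best_r = i, h, r
        if best is None:
            break  # every remaining point lies in the current affine hull
        chosen.append(best)
        basis.append([x / best_h for x in best_r])
    return chosen


# Made-up 2-D points (illustration only).
pts = [(0, 0), (4, 0), (0, 3), (1, 1), (2, 0.5)]
print(greedy_landmarks(pts, 3))
```

In d dimensions this sketch returns at most d + 1 landmarks, because a simplex cannot have more vertices than that; a practical method for Isomap would need a rule for selecting further landmarks, which is not attempted here.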

A Novel Network Anomaly Detection Method based on Data Balancing and Recursive Feature Addition

  • Liu, Xinqian;Ren, Jiadong;He, Haitao;Wang, Qian;Sun, Shengting
    • KSII Transactions on Internet and Information Systems (TIIS) / v.14 no.7 / pp.3093-3115 / 2020
  • A network anomaly detection system plays an essential role in detecting network anomalies and ensuring network security. Anomaly detection systems based on machine learning have become an increasingly popular solution. However, because network traffic is imbalanced and high-dimensional, existing methods are unable to achieve both high accuracy and a low false alarm rate. To address this problem, a new network anomaly detection method based on data balancing and recursive feature addition is proposed. First, a data balancing algorithm based on improved KNN outlier detection is designed to select part of the representative data in each category. The parameters of the improved KNN outlier detection are jointly optimized by a genetic algorithm. Next, a recursive feature addition algorithm based on correlation analysis is proposed to select effective features, in which a cross contingency test is used to analyze correlation and obtain a feature subset with strong correlation. Then, a random forest model is used as the classification model to detect anomalies. Finally, the proposed algorithm is evaluated on the benchmark datasets KDD Cup 1999 and UNSW_NB15. The results show that the proposed strategies increase accuracy and recall and decrease the false alarm rate. Compared with other algorithms, this algorithm still achieves significant improvements, especially in recall for the small categories.

Optimal Band Selection Techniques for Hyperspectral Image Pixel Classification using Pooling Operations & PSNR (초분광 이미지 픽셀 분류를 위한 풀링 연산과 PSNR을 이용한 최적 밴드 선택 기법)

  • Chang, Duhyeuk;Jung, Byeonghyeon;Heo, Junyoung
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.21 no.5 / pp.141-147 / 2021
  • In this paper, a band selection algorithm is applied to each subset in order to make better use of the feature information in large-capacity hyperspectral data by reducing the dimension of neural network inputs, and thereby the computational complexity, in embedded systems. Among feature extraction and feature selection techniques, feature selection aims to find the optimal number of bands suitable for a dataset, regardless of wavelength range, and to improve time and performance over other algorithms. In the experiment, the time required was reduced by 1/3 to 1/9 relative to the other band selection techniques, while performance with the K-neighbor classifier improved by more than 4%. Although real-time hyperspectral data analysis is still difficult at present, the results confirm the possibility of improvement.

Classification of Sides of Neighboring Vehicles and Pillars for Parking Assistance Using Ultrasonic Sensors (주차보조를 위한 초음파 센서 기반의 주변차량의 주차상태 및 기둥 분류)

  • Park, Eunsoo;Yun, Yongji;Kim, Hyoungrae;Lee, Jonghwan;Ki, Hoyong;Lee, Chulhee;Kim, Hakil
    • Journal of Institute of Control, Robotics and Systems / v.19 no.1 / pp.15-26 / 2013
  • This paper proposes a method for classifying parallel and vertical parking states and pillars for a parking assist system using ultrasonic sensors. In a general parking space detection module, the ultrasonic data are received with compressed amplitude, which makes them difficult to analyze. To solve this problem, a symmetric transform and noise removal are performed in the preprocessing stage. In the feature extraction process, four features are proposed: standard deviation of distance, reconstructed peak, standard deviation of the reconstructed signal, and sum of width. A Gaussian fitting model is used to reconstruct the saturated peak signal, and the discriminability of each feature is measured. To find the best combination of these features, a multi-class SVM and a subset generator are used for more accurate and robust classification. The proposed method achieves a 92% classification rate and demonstrates its applicability to parking space detection modules.
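
With only four candidate features, a subset generator can simply enumerate all 15 non-empty combinations and score each one. The sketch below shows that enumeration only: the identifiers in `FEATURES` are hypothetical names for the four features listed in the abstract, and `toy_evaluate` with its made-up gains stands in for training and validating a multi-class SVM on each subset.

```python
from itertools import combinations

# Hypothetical identifiers for the four features named in the abstract.
FEATURES = [
    "std_distance",
    "reconstructed_peak",
    "std_reconstructed_signal",
    "sum_of_width",
]


def all_subsets(features):
    """Yield every non-empty combination of the given features."""
    for size in range(1, len(features) + 1):
        for combo in combinations(features, size):
            yield combo


def best_subset(features, evaluate):
    """Return the subset with the highest score under `evaluate`."""
    return max(all_subsets(features), key=evaluate)


# Made-up per-feature gains (illustration only, not results from the paper).
TOY_GAIN = {
    "std_distance": 0.3,
    "reconstructed_peak": 0.5,
    "std_reconstructed_signal": 0.05,
    "sum_of_width": 0.2,
}


def toy_evaluate(subset):
    # Placeholder for a cross-validated classifier score on this subset;
    # the size penalty mimics weak features adding little useful signal.
    return sum(TOY_GAIN[f] for f in subset) - 0.1 * len(subset)


print(len(list(all_subsets(FEATURES))))
print(best_subset(FEATURES, toy_evaluate))
```

Exhaustive enumeration is affordable here because the count grows as 2^n - 1; for larger feature sets a greedy or heuristic search would be the usual substitute.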

Automatic Detection of Cow's Oestrus in Audio Surveillance System

  • Chung, Y.;Lee, J.;Oh, S.;Park, D.;Chang, H.H.;Kim, S.
    • Asian-Australasian Journal of Animal Sciences / v.26 no.7 / pp.1030-1037 / 2013
  • Early detection of anomalies is an important issue in the management of group-housed livestock. In particular, failure to detect oestrus in a timely and accurate way can become a limiting factor in achieving efficient reproductive performance. Although a rich variety of methods has been introduced for the detection of oestrus, a more accurate and practical method is still required. In this paper, we propose an efficient data mining solution for the detection of oestrus, using the sound data of Korean native cows (Bos taurus coreanea). In this method, we extract the mel frequency cepstrum coefficients from the sound data, apply feature dimension reduction, and use support vector data description as an early anomaly detector. Our experimental results show that this method can detect oestrus both economically (even with a cheap microphone) and accurately (over 94% accuracy), either as a standalone solution or as a complement to known methods.

Multi-Radial Basis Function SVM Classifier: Design and Analysis

  • Wang, Zheng;Yang, Cheng;Oh, Sung-Kwun;Fu, Zunwei
    • Journal of Electrical Engineering and Technology / v.13 no.6 / pp.2511-2520 / 2018
  • In this study, a Multi-Radial Basis Function Support Vector Machine (Multi-RBF SVM) classifier based on a composite kernel function is introduced. In the proposed Multi-RBF SVM classifier, the input space is divided into several local subsets to handle extremely nonlinear classification tasks. Each local subset is expressed as a nonlinear classification subspace and mapped into feature space by a kernel function. The composite kernel function employs a dual RBF structure. By capturing the nonlinear distribution knowledge of the local subsets, the training data are mapped into a higher-dimensional feature space, and the Multi-SVM classifier is then realized with the composite kernel function through an optimization procedure similar to that of a conventional SVM classifier. The original training data set is partitioned by unsupervised learning methods such as clustering. In this study, three clustering methods are considered: Affinity Propagation (AP), Hard C-Means (HCM), and the Iterative Self-Organizing Data Analysis Technique Algorithm (ISODATA). Experimental results on benchmark machine learning datasets show that the proposed method effectively improves classification performance.

Image Retrieval by Important Feature Weighting for Each Class (영상 클레스별 중요 특징 가중에 의한 영상 검색 방법)

  • Yoo, Donggeun;Park, Chaehoon;Choi, Yukyung;Kweon, In So
    • Annual Conference of KIPS / 2012.04a / pp.382-385 / 2012
  • This paper proposes a method that, when describing images for image retrieval and image categorization, applies weights to different key features for each image class. Previously studied approaches to weighting image feature vectors apply the same weights to all image classes, so they cannot exploit the property that different features are important for different image classes. To exploit this property, we learned a different weight vector for the feature vector of each image class. Then, when a query image is given, image subsets corresponding to the predefined number of sub-classes are built from the database using an existing image retrieval framework. The weight vector learned for each class is then applied to the feature vectors of the image subsets, the distances between feature vectors are recomputed, and the results are re-ranked. We applied this method to the UKBench dataset and obtained higher accuracy than before weighting.
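
The re-ranking step in this abstract can be sketched as a class-weighted distance applied to a shortlist of candidates. Everything below is illustrative: the function names, the candidate tuple layout, the weighted Euclidean form of the distance, and the sample weights are assumptions, since the abstract does not state how the learned weight vectors enter the distance.

```python
import math


def weighted_distance(query, candidate, weights):
    """Euclidean distance with a non-negative weight per feature dimension."""
    return math.sqrt(
        sum(w * (q - c) ** 2 for q, c, w in zip(query, candidate, weights))
    )


def rerank(query, candidates, class_weights):
    """Sort shortlisted images by distance under their own class weights.

    candidates: list of (image_id, class_label, feature_vector) tuples.
    class_weights: dict mapping class_label to its weight vector.
    """
    scored = [
        (weighted_distance(query, feats, class_weights[label]), image_id)
        for image_id, label, feats in candidates
    ]
    return [image_id for _, image_id in sorted(scored)]


# Made-up shortlist and weights (illustration only).
query = [0.0, 0.0]
shortlist = [
    ("img1", "cat", [1.0, 0.0]),
    ("img2", "car", [0.0, 1.2]),
]
weights = {"cat": [1.0, 0.1], "car": [1.0, 0.25]}
print(rerank(query, shortlist, weights))
```

In this toy shortlist the unweighted distance ranks `img1` ahead of `img2`, while the class-specific weights reverse the order because the second feature counts for little in the `car` class.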

Porting Window CE Operating System to Arm based board device

  • An, Byung-Chan;Ham, Woon-Chul
    • 제어로봇시스템학회:학술대회논문집 / 2003.10a / pp.2159-2163 / 2003
  • Hand-carried computing devices and tools have developed into embedded systems that contain a small-footprint operating system. Windows CE, one such embedded operating system, is a lightweight, multithreaded operating system with an optional graphical user interface. Its strengths lie in its small size, its Win32 subset API, and its multiplatform support. We therefore chose to port this OS to an ARM-based board that provides high performance, low cost, and low power consumption. In this paper, we describe the architecture of the ARM-based board, the features of Windows CE, and the techniques and steps involved in the porting process.
