http://dx.doi.org/10.3745/KIPSTB.2007.14-B.2.135

A Feature Selection Method Based on Fuzzy Cluster Analysis  

Rhee, Hyun-Sook (Division of Computer and Information Science, Dongyang Technical College)
Abstract
Feature selection is a preprocessing technique commonly applied to high-dimensional data. It studies how to select a subset or list of attributes used to construct models that describe the data. Feature selection methods attempt to explore the intrinsic properties of data by employing statistics or information theory. Recent developments include approaches such as correlation methods, dimensionality reduction, and mutual information techniques. Feature selection has become the focus of much research in application areas with massive and complex data sets. In this paper, we propose a feature selection method that considers data characteristics and generalization capability. It provides a computational approach to feature selection based on fuzzy cluster analysis of attribute values and on performance measures. We apply the method to a system for classifying computer viruses and compare it with a heuristic method that uses the contrast concept. Experimental results show that the proposed approach can produce a feature ranking, select features, and improve system performance.
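The abstract does not give the algorithm itself, so the following is only a minimal sketch of the general idea it describes: cluster the values of each attribute with fuzzy c-means and rank attributes by how well the fuzzy clusters agree with the class labels. The function names (`fuzzy_c_means_1d`, `feature_score`, `rank_features`), the agreement score, and the toy data are all assumptions made for illustration and are not taken from the paper.

```python
from itertools import permutations


def fuzzy_c_means_1d(values, n_clusters=2, m=2.0, n_iter=100, tol=1e-6):
    """Standard fuzzy c-means on a single attribute.

    Returns (centers, memberships); memberships[i][k] is the degree to which
    sample i belongs to cluster k, and each row sums to 1.
    """
    lo, hi = min(values), max(values)
    # Deterministic start: spread the centers evenly over the value range.
    centers = [lo + (hi - lo) * (k + 0.5) / n_clusters for k in range(n_clusters)]
    memberships = []
    for _ in range(n_iter):
        memberships = []
        for x in values:
            dists = [abs(x - c) for c in centers]
            if min(dists) == 0.0:
                # The sample sits exactly on a center: full membership there.
                hit = dists.index(0.0)
                row = [1.0 if k == hit else 0.0 for k in range(n_clusters)]
            else:
                inv = [d ** (-2.0 / (m - 1.0)) for d in dists]
                total = sum(inv)
                row = [v / total for v in inv]
            memberships.append(row)
        new_centers = []
        for k in range(n_clusters):
            weights = [row[k] ** m for row in memberships]
            total = sum(weights)
            if total == 0.0:
                new_centers.append(centers[k])  # empty cluster: keep old center
            else:
                new_centers.append(sum(w * x for w, x in zip(weights, values)) / total)
        shift = max(abs(a - b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:
            break
    return centers, memberships


def feature_score(values, labels, m=2.0):
    """Hypothetical score: mean membership of each sample in the cluster matched
    to its class, maximised over all cluster-to-class assignments."""
    classes = sorted(set(labels))
    _, memberships = fuzzy_c_means_1d(values, n_clusters=len(classes), m=m)
    best = 0.0
    for perm in permutations(range(len(classes))):
        cluster_of = dict(zip(classes, perm))
        agreement = sum(row[cluster_of[y]] for row, y in zip(memberships, labels))
        best = max(best, agreement / len(labels))
    return best


def rank_features(columns, labels):
    """Return (feature name, score) pairs sorted from most to least useful."""
    scores = {name: feature_score(vals, labels) for name, vals in columns.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Toy data (invented): one attribute separates the two classes, one does not.
labels = [0, 0, 0, 1, 1, 1]
columns = {
    "f_noise": [0.10, 0.90, 0.20, 0.15, 1.00, 0.95],
    "f_good": [0.10, 0.20, 0.15, 0.90, 1.00, 0.95],
}
ranking = rank_features(columns, labels)
print(ranking)
```

A threshold on the score, or keeping only the top-ranked attributes, would then turn the ranking into a selected feature subset. The paper additionally uses performance measures, which this sketch does not model.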
Keywords
Feature Selection; Fuzzy Cluster Analysis; Performance Measure; Information Theory