http://dx.doi.org/10.4218/etrij.11.0110.0237

Conditional Mutual Information-Based Feature Selection Analyzing for Synergy and Redundancy  

Cheng, Hongrong (Department of Computer Science, University of Electronic Science and Technology)
Qin, Zhiguang (Department of Computer Science, University of Electronic Science and Technology)
Feng, Chaosheng (Department of Computer Science, University of Electronic Science and Technology)
Wang, Yong (Department of Computer Science, University of Electronic Science and Technology)
Li, Fagen (Department of Computer Science, University of Electronic Science and Technology)
Publication Information
ETRI Journal, vol. 33, no. 2, 2011, pp. 210-218
Abstract
Battiti's mutual information feature selector (MIFS) and its variant algorithms are used in many classification applications. Because they ignore feature synergy, MIFS and its variants can introduce significant bias when features must cooperate to be informative. Moreover, MIFS and its variants estimate feature redundancy without reference to the classification task at hand. In this paper, we propose an automated greedy feature selection algorithm called conditional mutual information-based feature selection (CMIFS). Based on the link between interaction information and conditional mutual information, CMIFS accounts for both redundancy and synergy interactions among features and identifies discriminative features. In addition, CMIFS ties its redundancy evaluation to the classification task, which reduces the probability of discarding important features as redundant during the search. Experimental results show that CMIFS achieves higher best classification accuracy than MIFS and its variants, with the same number of features or fewer (by nearly 50%).
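The abstract describes the idea only at a high level; the paper's exact CMIFS criterion is not reproduced here. The sketch below is an illustrative, simplified version of greedy forward selection driven by conditional mutual information: the first feature maximizes relevance I(X; C), and each subsequent feature maximizes I(X; C | X_last), the information a candidate adds about the class beyond the most recently selected feature. Conditioning on only the last selected feature (rather than a larger selected subset, as CMIFS does) and all function names and the toy dataset are this sketch's own constructions, not the authors'.

```python
import numpy as np
from collections import Counter

def entropy(*cols):
    """Joint Shannon entropy (bits) of one or more discrete columns."""
    joint = list(zip(*cols))
    n = len(joint)
    return -sum((k / n) * np.log2(k / n) for k in Counter(joint).values())

def mutual_info(x, y):
    """I(X; Y) = H(X) + H(Y) - H(X, Y)."""
    return entropy(x) + entropy(y) - entropy(x, y)

def cond_mutual_info(x, y, z):
    """I(X; Y | Z) = H(X, Z) + H(Y, Z) - H(X, Y, Z) - H(Z)."""
    return entropy(x, z) + entropy(y, z) - entropy(x, y, z) - entropy(z)

def greedy_cmi_select(X, c, k):
    """Greedy forward selection (simplified, not the paper's exact rule):
    the first feature maximizes I(X_j; C); later features maximize
    I(X_j; C | X_last). A redundant copy of an already-selected feature
    scores near zero, while a synergistic feature scores high."""
    selected, remaining = [], set(range(X.shape[1]))
    while len(selected) < k and remaining:
        if not selected:
            score = lambda j: mutual_info(X[:, j], c)
        else:
            last = X[:, selected[-1]]
            score = lambda j: cond_mutual_info(X[:, j], c, last)
        best = max(remaining, key=score)
        selected.append(best)
        remaining.discard(best)
    return selected

# Toy data: column 0 determines the class, column 1 is a near-duplicate of
# column 0 (redundant), column 2 adds class information only when x0 = 0
# (synergistic with x0), and column 3 is irrelevant.
X = np.array([[0, 0, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1], [0, 1, 1, 1],
              [1, 1, 0, 0], [1, 1, 1, 0], [1, 1, 0, 1], [1, 1, 1, 1]])
c = np.array([0, 1, 0, 1, 2, 2, 2, 2])
sel = greedy_cmi_select(X, c, 2)  # picks the informative pair, not the duplicate
```

On this toy data the conditional score distinguishes the synergistic column from the redundant near-duplicate, which a plain relevance ranking by I(X; C) alone cannot do.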
Keywords
Classification; feature selection; conditional mutual information; redundancy; interaction;
Citations & Related Records

Times Cited by Web of Science: 4
Times Cited by Scopus: 6
1 N. Kwak and C.H. Choi, "Input Feature Selection for Classification Problems," IEEE Trans. Neural Netw., vol. 13, no. 1, 2002, pp. 143-159.   DOI   ScienceOn
2 J.J. Huang et al., "Feature Selection for Classificatory Analysis Based on Information-Theoretic Criteria," Acta Automatica Sinica, vol. 34, no. 3, 2008, pp. 383-392.   DOI
3 H. Peng, F. Long, and C. Ding, "Feature Selection Based on Mutual Information: Criteria of Max-Dependency, Max-Relevance, and Min-Redundancy," IEEE Trans. Pattern Anal. Machine Intell., vol. 27, no. 8, 2005, pp. 1226-1238.   DOI
4 P.A. Estevez et al., "Normalized Mutual Information Feature Selection," IEEE Trans. Neural Netw., vol. 20, no. 2, 2009, pp. 189-201.   DOI
5 J. Novovicova, "Conditional Mutual Information Based Feature Selection for Classification Task," Progress in Pattern Recognition, Image Analysis and Applications, LNCS, vol. 4756, Springer, 2007, pp. 417-426.
6 W.J. McGill, "Multivariate Information Transmission," Psychometrika, vol. 19, no. 2, 1954, pp. 97-116.   DOI   ScienceOn
7 R.M. Fano, Transmission of Information: A Statistical Theory of Communications, New York, USA: Wiley Press, 1961.
8 C.E. Shannon and W. Weaver, The Mathematical Theory of Communication, Urbana, IL, USA: University of Illinois Press, 1949.
9 T.M. Cover and J.A. Thomas, Elements of Information Theory, New York, USA: Wiley-Interscience, 1991.
10 U.M. Fayyad and K.B. Irani, "Multi-interval Discretization of Continuous-Valued Attributes for Classification Learning," Proc. 13th Int. Joint Conf. Artificial Intell., 1993, pp. 1022-1027.
11 R. Kohavi and G.H. John, "Wrappers for Feature Subset Selection," Artificial Intell., vol. 97, no. 1-2, 1997, pp. 273-324.   DOI   ScienceOn
12 D. Koller and M. Sahami, "Toward Optimal Feature Selection," Proc. 13th Int. Conf. Machine Learning, 1996, pp. 284-292.
13 M. Dash and H. Liu, "Feature Selection for Classification," Intelligent Data Analysis, vol. 1, 1997, pp. 131-156.   DOI   ScienceOn
14 E. Amaldi and V. Kann, "On the Approximation of Minimizing Nonzero Variables or Unsatisfied Relations in Linear Systems," Theoretical Computer Sci., vol. 209, 1998, pp. 237-260.   DOI   ScienceOn
15 I.H. Witten and E. Frank, Data Mining: Practical Machine Learning Tools and Techniques with Java Implementations, 2nd ed., Morgan Kaufmann, 2005.
16 R. Battiti, "Using Mutual Information for Selecting Features in Supervised Neural Net Learning," IEEE Trans. Neural Netw., vol. 5, no. 4, 1994, pp. 537-550.   DOI   ScienceOn
17 A. Jakulin and I. Bratko, "Quantifying and Visualizing Attribute Interactions: An Approach Based on Entropy." Available: http://arxiv.org/abs/cs.AI/0308002v3, 2004.
18 C.J. Merz and P.M. Murphy, "UCI Repository of Machine Learning Databases [Online]." Available: http://www.ics.uci.edu/~mlearn/MLRepository.html.
19 H. Peng, "mRMR Sample Data Sets [Online]." Available: http://penglab.janelia.org/proj/mRMR/test_colon_s3.csv.