Feature Selection Based on Bi-objective Differential Evolution

  • Das, Sunanda (Neotia Institute of Technology, Management and Science) ;
  • Chang, Chi-Chang (School of Medical Informatics, Chung-Shan Medical University) ;
  • Das, Asit Kumar (Indian Institute of Engineering Science and Technology) ;
  • Ghosh, Arka (Indian Institute of Engineering Science and Technology)
  • Received : 2017.05.25
  • Accepted : 2017.11.25
  • Published : 2017.12.30

Abstract

Feature selection is one of the most challenging problems in pattern recognition and data mining. In this paper, a feature selection algorithm based on an improved version of binary differential evolution is proposed. The method simultaneously optimizes two feature selection criteria, the set approximation accuracy of rough set theory and a relational-algebra-based derived score, in order to select the most relevant feature subset from the entire feature set. Experimental results, conducted on seven publicly available benchmark datasets of different characteristics, such as a low number of objects with a high number of features and a high number of objects with a low number of features, confirm the superiority of the proposed method over other state-of-the-art methods.
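To make the idea concrete, the following is a minimal sketch of binary differential evolution for feature selection. It is a generic DE/rand/1/bin loop adapted to bit strings, not the paper's exact bi-objective variant: the `fitness` function here is a hypothetical placeholder standing in for the combined rough-set and relational-algebra criteria described in the abstract.

```python
import random

def binary_de_feature_selection(n_features, fitness, pop_size=20,
                                generations=50, f_scale=0.5, cr=0.9, seed=0):
    """Generic binary DE sketch: evolve 0/1 feature masks to maximize `fitness`.

    This illustrates the overall search scheme only; the paper's actual
    bi-objective update and scoring rules are not reproduced here.
    """
    rng = random.Random(seed)
    # Random initial population of 0/1 feature masks.
    pop = [[rng.randint(0, 1) for _ in range(n_features)] for _ in range(pop_size)]
    scores = [fitness(ind) for ind in pop]
    for _ in range(generations):
        for i in range(pop_size):
            # Pick three distinct donors a, b, c (all different from i).
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            j_rand = rng.randrange(n_features)  # guarantee one inherited donor bit
            trial = []
            for j in range(n_features):
                if rng.random() < cr or j == j_rand:
                    # Binary analogue of mutation: where donors b and c disagree,
                    # flip a's bit with probability f_scale.
                    diff = pop[b][j] ^ pop[c][j]
                    bit = pop[a][j] ^ diff if rng.random() < f_scale else pop[a][j]
                    trial.append(bit)
                else:
                    trial.append(pop[i][j])
            s = fitness(trial)
            # Greedy one-to-one selection: trial replaces target if not worse.
            if s >= scores[i]:
                pop[i], scores[i] = trial, s
    best = max(range(pop_size), key=lambda i: scores[i])
    return pop[best], scores[best]
```

As a usage example, a toy fitness that rewards two informative features and penalizes subset size (mimicking the relevance-versus-compactness trade-off) drives the search toward the small relevant subset:

```python
toy_fitness = lambda mask: 10 * (mask[0] + mask[2]) - sum(mask)
mask, score = binary_de_feature_selection(5, toy_fitness)
```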

References

  1. M. Dash and H. Liu, "Feature selection for classification," Intelligent data analysis, vol. 1, no. 3, pp. 131-156, 1997. https://doi.org/10.1016/S1088-467X(97)00008-5
  2. S. Das, "Filters, wrappers and a boosting-based hybrid for feature selection," in Proceedings of the 18th International Conference on Machine Learning (ICML), Williamstown, MA, 2001, pp. 74-81.
  3. Y. Saeys, I. Inza, and P. Larranaga, "A review of feature selection techniques in bioinformatics," Bioinformatics, vol. 23, no. 19, pp. 2507-2517, 2007. https://doi.org/10.1093/bioinformatics/btm344
  4. R. Bellman, "Dynamic programming and Lagrange multipliers," Proceedings of the National Academy of Sciences of the United States of America, vol. 42, no. 10, p. 767, 1956. https://doi.org/10.1073/pnas.42.10.767
  5. R. Kohavi and G. H. John, "Wrappers for feature subset selection," Artificial Intelligence, vol. 97, no. 1, pp. 273-324, 1997. https://doi.org/10.1016/S0004-3702(97)00043-X
  6. P. Mitra, C. A. Murthy, and S. K. Pal, "Unsupervised feature selection using feature similarity," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 24, no. 3, pp. 301-312, 2002. https://doi.org/10.1109/34.990133
  7. Y. Yang, H. T. Shen, Z. Ma, Z. Huang, and X. Zhou, "l2,1-norm regularized discriminative feature selection for unsupervised learning," in Proceedings of 22nd International Joint Conference on Artificial Intelligence (IJCAI), Barcelona, Spain, 2011, pp. 1589-1594.
  8. D. Cai, C. Zhang, and X. He, "Unsupervised feature selection for multi-cluster data," in Proceedings of the 16th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Washington, DC, 2010, pp. 333-342.
  9. R. Jensen and Q. Shen, Computational Intelligence and Feature Selection: Rough and Fuzzy Approaches. Hoboken, NJ: John Wiley & Sons, 2008.
  10. R. Jensen and Q. Shen, "New approaches to fuzzy-rough feature selection," IEEE Transactions Fuzzy Systems, vol. 17, no. 4, pp. 824-838, 2009. https://doi.org/10.1109/TFUZZ.2008.924209
  11. N. Parthalain, Q. Shen, and R. Jensen, "A distance measure approach to exploring the rough set boundary region for attribute reduction," IEEE Transactions on Knowledge and Data Engineering, vol. 22, no. 3, pp. 305-317, 2010. https://doi.org/10.1109/TKDE.2009.119
  12. A. K. Das, S. K. Pati, and S. Chakrabarty, "Reduct generation of microarray dataset using rough set and graph theory for unsupervised learning," in Proceedings of the 2nd International Conference on Computational Science, Engineering and Information Technology, Coimbatore, India, 2012, pp. 555-561.
  13. S. Sengupta and A. K. Das, "Dimension reduction using clustering algorithm and rough set theory," in Swarm, Evolutionary, and Memetic Computing. Heidelberg: Springer, 2012, pp. 705-712.
  14. M. Dash and H. Liu, "Consistency-based search in feature selection," Artificial Intelligence, vol. 151, no. 1, pp. 155-176, 2003. https://doi.org/10.1016/S0004-3702(03)00079-1
  15. S. K. Pati, A. K. Das, and A. Ghosh, "Gene selection using multi-objective genetic algorithm integrating cellular automata and rough set theory," in Swarm, Evolutionary, and Memetic Computing. Heidelberg: Springer, 2013, pp. 144-155.
  16. J. Wroblewski and M. Szczuka, "Neural network architecture for synthesis of the probabilistic rule based classifiers," Electronic Notes in Theoretical Computer Science, vol. 82, no. 4, pp. 251-262, 2003. https://doi.org/10.1016/S1571-0661(04)80723-0
  17. G. L. Pappa, A. A. Freitas, and C. A. Kaestner, "Attribute selection with a multi-objective genetic algorithm," in Advances in Artificial Intelligence. Heidelberg: Springer, 2002, pp. 280-290.
  18. B. Liu, F. Liu, and X. Cheng, "An adaptive genetic algorithm based on rough set attribute reduction," in Proceedings of 2010 3rd International Conference on Biomedical Engineering and Informatics, Yantai, China, 2010, pp. 2880-2883.
  19. D. P. Muni, N. R. Pal, and J. Das, "Genetic programming for simultaneous feature selection and classifier design," IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, vol. 36, no. 1, pp. 106-117, 2006. https://doi.org/10.1109/TSMCB.2005.854499
  20. X. Wang, J. Yang, X. Teng, W. Xia, and R. Jensen, "Feature selection based on rough sets and particle swarm optimization," Pattern Recognition Letters, vol. 28, no. 4, pp. 459-471, 2007. https://doi.org/10.1016/j.patrec.2006.09.003
  21. R. Storn and K. Price, "Differential evolution: a simple and efficient heuristic for global optimization over continuous spaces," Journal of Global Optimization, vol. 11, no. 4, pp. 341-359, 1997. https://doi.org/10.1023/A:1008202821328
  22. S. Das and P. N. Suganthan, "Differential evolution: a survey of the state-of-the-art," IEEE Transactions on Evolutionary Computation, vol. 15, no. 1, pp. 4-31, 2011. https://doi.org/10.1109/TEVC.2010.2059031
  23. A. K. Qin, V. L. Huang, and P. N. Suganthan, "Differential evolution algorithm with strategy adaptation for global numerical optimization," IEEE Transactions on Evolutionary Computation, vol. 13, no. 2, pp. 398-417, 2009. https://doi.org/10.1109/TEVC.2008.927706
  24. J. Zhang and A. C. Sanderson, "JADE: adaptive differential evolution with optional external archive," IEEE Transactions on Evolutionary Computation, vol. 13, no. 5, pp. 945-958, 2009. https://doi.org/10.1109/TEVC.2009.2014613
  25. M. G. Epitropakis, D. K. Tasoulis, N. G. Pavlidis, V. P. Plagianakos, and M. N. Vrahatis, "Enhancing differential evolution utilizing proximity-based mutation operators," IEEE Transactions on Evolutionary Computation, vol. 15, no. 1, pp. 99-119, 2011. https://doi.org/10.1109/TEVC.2010.2083670
  26. S. M. Islam, S. Das, S. Ghosh, S. Roy, and P. N. Suganthan, "An adaptive differential evolution algorithm with novel mutation and crossover strategies for global numerical optimization," IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, vol. 42, no. 2, pp. 482-500, 2012. https://doi.org/10.1109/TSMCB.2011.2167966
  27. Y. Wang, Z. Cai, and Q. Zhang, "Differential evolution with composite trial vector generation strategies and control parameters," IEEE Transactions on Evolutionary Computation, vol. 15, no. 1, pp. 55-66, 2011. https://doi.org/10.1109/TEVC.2010.2087271
  28. R. Mallipeddi and P. N. Suganthan, "Differential evolution algorithm with ensemble of parameters and mutation and crossover strategies," in Swarm, Evolutionary, and Memetic Computing. Heidelberg: Springer, 2010, pp. 71-78.
  29. Z. Yang, K. Tang, and X. Yao, "Large scale evolutionary optimization using cooperative coevolution," Information Sciences, vol. 178, no. 15, pp. 2985-2999, 2008. https://doi.org/10.1016/j.ins.2008.02.017
  30. A. Zamuda, J. Brest, B. Boskovic, and V. Zumer, "Large scale global optimization using differential evolution with self-adaptation and cooperative co-evolution," in Proceedings of IEEE Congress on Evolutionary Computation (CEC2008), Hong Kong, 2008, pp. 3718-3725.
  31. K. E. Parsopoulos, "Cooperative micro-differential evolution for high-dimensional problems," in Proceedings of the 11th Annual Conference on Genetic and Evolutionary Computation, Montreal, Canada, 2009, pp. 531-538.
  32. S. Z. Zhao, P. N. Suganthan, and S. Das, "Self-adaptive differential evolution with multi-trajectory search for large-scale optimization," Soft Computing, vol. 15, no. 11, pp. 2175-2185, 2011. https://doi.org/10.1007/s00500-010-0645-4
  33. J. Brest and M. S. Maucec, "Self-adaptive differential evolution algorithm using population size reduction and three strategies," Soft Computing, vol. 15, no. 11, pp. 2157-2174, 2011. https://doi.org/10.1007/s00500-010-0644-5
  34. H. Wang, Z. Wu, and S. Rahnamayan, "Enhanced opposition-based differential evolution for solving high-dimensional continuous optimization problems," Soft Computing, vol. 15, no. 11, pp. 2127-2140, 2011. https://doi.org/10.1007/s00500-010-0642-7
  35. M. Weber, F. Neri, and V. Tirronen, "Shuffle or update parallel differential evolution for large-scale optimization," Soft Computing, vol. 15, no. 11, pp. 2089-2107, 2011. https://doi.org/10.1007/s00500-010-0640-9
  36. Z. Pawlak, "Rough set approach to knowledge-based decision support," European Journal of Operational Research, vol. 99, no. 1, pp. 48-57, 1997. https://doi.org/10.1016/S0377-2217(96)00382-7
  37. K. Deb, Multi-Objective Optimization Using Evolutionary Algorithms. Hoboken, NJ: John Wiley & Sons, 2001.
  38. K. Bache and M. Lichman, "UCI machine learning repository," http://archive.ics.uci.edu/ml.
  39. X. He, D. Cai, and P. Niyogi, "Laplacian score for feature selection," Advances in Neural Information Processing Systems, vol. 18, pp. 507-514, 2005.
  40. D. Cai, C. Zhang, and X. He, "Unsupervised feature selection for multi-cluster data," in Proceedings of the 16th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Washington, DC, 2010, pp. 333-342.
  41. S. Bandyopadhyay, T. Bhadra, P. Mitra, and U. Maulik, "Integration of dense subgraph finding with feature clustering for unsupervised feature selection," Pattern Recognition Letters, vol. 40, pp. 104-112, 2014. https://doi.org/10.1016/j.patrec.2013.12.008
  42. M. A. Hall, "Correlation-based feature selection for machine learning," Ph.D. dissertation, The University of Waikato, Hamilton, New Zealand, 1999.
  43. J. Garcia-Nieto, E. Alba, L. Jourdan, and E. Talbi, "Sensitivity and specificity based multiobjective approach for feature selection: application to cancer diagnosis," Information Processing Letters, vol. 109, no. 16, pp. 887-896, 2009. https://doi.org/10.1016/j.ipl.2009.03.029
  44. M. A. Hall and G. Holmes, "Benchmarking attribute selection techniques for discrete class data mining," IEEE Transactions on Knowledge and Data Engineering, vol. 15, no. 6, pp. 1437-1447, 2003. https://doi.org/10.1109/TKDE.2003.1245283
  45. T. Bhadra and S. Bandyopadhyay, "Unsupervised feature selection using an improved version of differential evolution," Expert Systems with Applications, vol. 42, no. 8, pp. 4042-4053, 2014. https://doi.org/10.1016/j.eswa.2014.12.010
  46. M. Hall, E. Frank, G. Holmes, B. Pfahringer, P. Reutemann, and I. H. Witten, "The WEKA data mining software: an update," ACM SIGKDD Explorations Newsletter, vol. 11, no. 1, pp. 10-18, 2009. https://doi.org/10.1145/1656274.1656278
  47. G. Forney, "Generalized minimum distance decoding," IEEE Transactions on Information Theory, vol. 12, no. 2, pp. 125-131, 1966. https://doi.org/10.1109/TIT.1966.1053873
  48. R. O. Duda, P. E. Hart, and D. G. Stork, Pattern Classification, 2nd ed. New York, NY: John Wiley & Sons, 2001.
  49. V. Vapnik, The Nature of Statistical Learning Theory, 1st ed. New York, NY: Springer, 1995.
  50. Y. Freund and R. E. Schapire, "A decision-theoretic generalization of on-line learning and an application to boosting," Journal of Computer and System Sciences, vol. 55, pp. 119-139, 1997. https://doi.org/10.1006/jcss.1997.1504
  51. J. R. Quinlan, C4.5: Programs for Machine Learning, 1st ed. San Mateo, CA: Morgan Kaufmann, 1993.