General Set Covering for Feature Selection in Data Mining

  • Ma, Zhengyu (School of Industrial and Management Engineering, Korea University)
  • Ryoo, Hong Seo (School of Industrial and Management Engineering, Korea University)
  • Received : 2012.05.26
  • Accepted : 2012.06.19
  • Published : 2012.11.30

Abstract

Set covering is widely accepted as a staple tool for feature selection in data mining. We present a generalization of this classical combinatorial optimization model that is better suited to feature selection, and we propose a surrogate relaxation-based meta-heuristic procedure for solving it. We demonstrate the utility of the proposed model and solution method both mathematically and numerically, through experiments on 25 set covering instances.
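
For orientation, the classical set covering view of feature selection can be sketched as follows. This is not the generalized model or the surrogate relaxation-based procedure proposed in the paper, whose details the abstract does not give. It is a minimal illustration that assumes binary features, treats every pair of observations from opposite classes as an element to be covered, and uses the standard greedy heuristic as a stand-in solver. The function name `greedy_feature_cover` and the toy data are hypothetical.

```python
from itertools import product


def greedy_feature_cover(positives, negatives):
    """Pick binary features so that every (positive, negative) pair of
    observations differs on at least one selected feature."""
    n_features = len(positives[0])
    # Each pair of observations from opposite classes is an element to cover.
    pairs = list(product(range(len(positives)), range(len(negatives))))
    # Feature j covers a pair when the two observations disagree on j.
    covers = [
        {(p, q) for (p, q) in pairs if positives[p][j] != negatives[q][j]}
        for j in range(n_features)
    ]
    uncovered = set(pairs)
    selected = []
    while uncovered:
        # Greedy rule: take the feature that separates the most uncovered pairs.
        best = max(range(n_features), key=lambda j: len(covers[j] & uncovered))
        gained = covers[best] & uncovered
        if not gained:
            raise ValueError("a positive and a negative observation are identical")
        selected.append(best)
        uncovered -= gained
    return sorted(selected)


# Toy data (assumed): feature 2 is constant and cannot separate the classes.
positives = [(1, 0, 1), (0, 1, 1)]
negatives = [(0, 0, 1), (1, 1, 1)]
selected = greedy_feature_cover(positives, negatives)
print(selected)  # [0, 1]
```

In this toy instance features 0 and 1 together separate every positive observation from every negative one, so the uninformative feature 2 is dropped. The greedy rule gives no optimality guarantee in general; it only shows how a covering constraint encodes the requirement that the selected features distinguish the two classes.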
