Discretization of Continuous Attributes based on Rough Set Theory and SOM

러프집합이론과 SOM을 이용한 연속형 속성의 이산화

  • 서완석 (Department of Industrial Engineering, Hanyang University)
  • 김재련 (Department of Industrial Engineering, Hanyang University)
  • Published: 2005.03.01

Abstract

In recent years, data mining has been widely used in the information industry to turn huge amounts of data into useful information and knowledge. When analyzing a data set with continuous values in order to extract knowledge through data mining, we often apply a process called discretization, which divides an attribute's range of values into intervals. These intervals form new values for the attribute and reduce the size of the data set. In addition, discretization based on rough set theory has the advantage of being easy to apply. In this paper, we propose a discretization algorithm based on rough set theory and the Self-Organizing Map (SOM) as a means of extracting valuable information from large data sets. The algorithm can be employed even when expert knowledge of the field is lacking.
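
The abstract does not spell out the algorithm itself, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes that a one-dimensional SOM places a few units along a continuous attribute, that cut points are taken as the midpoints between neighbouring unit weights, and that the resulting intervals are judged by the rough set dependency degree (the fraction of objects whose interval contains a single decision class). All function names, parameters, and the sample data are hypothetical.

```python
import math
import random
from bisect import bisect_right
from collections import defaultdict


def train_som_1d(values, n_units=3, epochs=50, lr0=0.5, sigma0=1.0, seed=0):
    """Train a 1-D SOM on scalar values and return the sorted unit weights."""
    rng = random.Random(seed)
    lo, hi = min(values), max(values)
    # Start with weights spread evenly (and in order) over the observed range.
    weights = [lo + (hi - lo) * (i + 0.5) / n_units for i in range(n_units)]
    total = epochs * len(values)
    step = 0
    for _ in range(epochs):
        sample = list(values)
        rng.shuffle(sample)
        for x in sample:
            frac = step / total
            lr = lr0 * (1.0 - frac)               # learning rate decays linearly
            sigma = sigma0 * (1.0 - frac) + 0.1   # neighbourhood width shrinks
            # Best-matching unit: the weight closest to the sample.
            bmu = min(range(n_units), key=lambda i: abs(weights[i] - x))
            for i in range(n_units):
                # Gaussian neighbourhood over the unit index distance.
                h = math.exp(-((i - bmu) ** 2) / (2.0 * sigma ** 2))
                weights[i] += lr * h * (x - weights[i])
            step += 1
    return sorted(weights)


def cut_points(weights):
    """Assumed rule: cut halfway between neighbouring SOM unit weights."""
    return [(a + b) / 2.0 for a, b in zip(weights, weights[1:])]


def discretize(values, cuts):
    """Map each value to the index of the interval it falls into."""
    return [bisect_right(cuts, v) for v in values]


def dependency_degree(bins, decisions):
    """Rough set dependency degree for a single discretized attribute:
    the share of objects whose interval holds only one decision class."""
    classes = defaultdict(set)
    sizes = defaultdict(int)
    for b, d in zip(bins, decisions):
        classes[b].add(d)
        sizes[b] += 1
    consistent = sum(n for b, n in sizes.items() if len(classes[b]) == 1)
    return consistent / len(bins)


if __name__ == "__main__":
    # Hypothetical attribute values and decision labels.
    values = [1.0, 1.1, 0.9, 1.2, 5.0, 5.1, 4.9, 5.2, 9.0, 9.1, 8.9, 9.2]
    decisions = ["low"] * 4 + ["mid"] * 4 + ["high"] * 4
    cuts = cut_points(train_som_1d(values, n_units=3))
    bins = discretize(values, cuts)
    print("cuts:", cuts)
    print("dependency degree:", dependency_degree(bins, decisions))
```

A dependency degree of 1.0 would mean the intervals lose no information about the decision attribute. One plausible use of this measure, again an assumption, is to try several values of `n_units` and keep the smallest one whose dependency degree stays acceptably high.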
