Development of Practical Data Mining Methods for Database Summarization

  • Lee, Do-Heon (Department of Computer Science, Chonnam National University)
  • Published : 1998.03.01

Abstract

Database summarization is the process of obtaining generalized, representative descriptions that express the content of a large database at a glance. We present a top-down summary refinement procedure for discovering database summaries. The procedure exploits attribute concept hierarchies, which represent ISA relationships among domain concepts. It begins with the most generalized summary and finds more specialized ones by stepwise refinement. This top-down paradigm has at least two important advantages over previous bottom-up methods. First, it provides a natural way to reflect the user's own discovery preferences interactively. Second, it does not produce excessively large intermediate results, which make the bottom-up approach hard to apply in practical environments. The proposed procedure can also be easily extended to distributed databases. An information content measure for database summaries is derived in order to identify the more informative summaries among the discovered results.
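
The top-down idea described above can be illustrated with a minimal sketch. It is not the paper's procedure: the concept hierarchies, the sample records, and the use of a simple support threshold to decide which refinements to keep are all assumptions made for illustration, and the paper's information content measure is not reproduced here. A summary is modeled as one concept per attribute, and each refinement step specializes a single attribute by one level of its ISA hierarchy.

```python
from collections import deque

# Hypothetical concept hierarchies: parent concept -> child concepts (ISA links).
HIERARCHIES = [
    {"ANY_MAJOR": ["engineering", "science"],
     "engineering": ["cs", "ee"],
     "science": ["physics", "math"]},
    {"ANY_LEVEL": ["undergrad", "grad"],
     "grad": ["ms", "phd"]},
]
ROOTS = ("ANY_MAJOR", "ANY_LEVEL")  # the most generalized summary

# Hypothetical records, one leaf value per attribute.
RECORDS = [
    ("cs", "ms"), ("cs", "phd"), ("ee", "ms"),
    ("cs", "undergrad"), ("physics", "undergrad"), ("math", "phd"),
]


def covers(hierarchy, concept, value):
    """Return True if `value` equals `concept` or is one of its descendants."""
    if concept == value:
        return True
    return any(covers(hierarchy, child, value)
               for child in hierarchy.get(concept, []))


def support(summary, records):
    """Fraction of records whose every attribute is covered by the summary."""
    hits = sum(
        all(covers(h, c, v) for h, c, v in zip(HIERARCHIES, summary, rec))
        for rec in records
    )
    return hits / len(records)


def refine(summary):
    """Yield summaries that specialize exactly one attribute by one level."""
    for i, concept in enumerate(summary):
        for child in HIERARCHIES[i].get(concept, []):
            yield summary[:i] + (child,) + summary[i + 1:]


def top_down_summaries(records, min_support):
    """Breadth-first refinement starting from the most generalized summary."""
    found = {}
    queue = deque([ROOTS])
    while queue:
        summary = queue.popleft()
        if summary in found:
            continue
        s = support(summary, records)
        if s < min_support:
            continue  # prune: a specialization cannot cover more records
        found[summary] = s
        queue.extend(refine(summary))
    return found


# Assumed threshold: keep summaries that cover at least half of the records.
summaries = top_down_summaries(RECORDS, min_support=0.5)
```

Because a specialized summary never covers more records than the summary it was refined from, a branch that falls below the threshold can be dropped without examining its descendants, which keeps the set of intermediate candidates small. An interactive variant could replace the fixed threshold with a user choice of which refinements to follow.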

Keywords