An Algorithm for Sequential Sampling Method in Data Mining

A Sequential Pattern Algorithm Using a Sampling Technique in Data Mining

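  • 홍지명 (Department of Industrial Engineering, Hanyang University)
  • 김낙현 (Department of Industrial Engineering, Hanyang University)
  • 김성집 (Department of Industrial Engineering, Hanyang University)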
  • Published : 1998.02.01

Abstract

Data mining, also referred to as knowledge discovery in databases, is the nontrivial extraction of implicit, previously unknown, and potentially useful information (such as knowledge rules, constraints, and regularities) from data in databases. The discovered knowledge can be applied to information management, decision making, and many other applications. This paper proposes a new data mining problem, discovering sequential patterns, in which all sequential patterns are found using a sampling method. Because databases are growing exponentially in size and transaction databases are updated frequently, sampling offers a fast approach that reduces time and cost while still extracting trends in customer behavior. The method analyzes only a fraction of the database but can, in general, yield highly accurate results. The relaxation factor and the sample size can be adjusted to improve the accuracy of the result while minimizing the corresponding execution time. The superiority of the proposed algorithm is shown by comparing its accuracy and efficiency with those of the AprioriAll algorithm.
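
The role of the sample size and the relaxation factor can be illustrated with a minimal Python sketch. It is not the algorithm from the paper: it assumes that each customer sequence is a plain list of single items, that candidate patterns are supplied by the caller, and that the relaxation factor is a multiplier in (0, 1] that lowers the minimum support applied to the sample. The names `is_subsequence`, `support`, and `frequent_in_sample` are hypothetical.

```python
import random


def is_subsequence(pattern, sequence):
    """Return True if the items of `pattern` occur in `sequence` in order."""
    it = iter(sequence)
    # `item in it` advances the iterator, so order is respected.
    return all(item in it for item in pattern)


def support(pattern, sequences):
    """Fraction of customer sequences that contain `pattern`."""
    if not sequences:
        return 0.0
    hits = sum(is_subsequence(pattern, s) for s in sequences)
    return hits / len(sequences)


def frequent_in_sample(sequences, candidates, min_support,
                       sample_fraction=0.5, relaxation=0.8, seed=0):
    """Keep candidates whose support in a random sample meets a relaxed threshold.

    Assumption: the relaxed threshold is `relaxation * min_support`; the
    paper's exact definition of the relaxation factor is not given here.
    """
    rng = random.Random(seed)
    k = max(1, int(len(sequences) * sample_fraction))
    sample = rng.sample(sequences, k)  # sample without replacement
    threshold = relaxation * min_support
    return [p for p in candidates if support(p, sample) >= threshold]
```

Under these assumptions, a smaller `relaxation` value admits more patterns from the sample, which lowers the chance of missing a pattern that is frequent in the full database at the cost of more candidates to verify, while a larger `sample_fraction` trades execution time for accuracy.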

Keywords