Learning Multidimensional Sequential Patterns Using Hellinger Entropy Function

Lee, Chang-Hwan

doi:10.3745/KIPSTB.2004.11B.4.477

The KIPS Transactions:PartB (정보처리학회논문지B)

Volume 11B, Issue 4, Pages 477-484, 2004, pISSN 1598-284X

Korea Information Processing Society (한국정보처리학회)

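Korean title: Hellinger 엔트로피를 이용한 다차원 연속패턴의 생성방법 (A Method for Generating Multidimensional Sequential Patterns Using Hellinger Entropy)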

Lee, Chang-Hwan (이창환), Department of Information and Communication, Dongguk University

Published: 2004.08.01

https://doi.org/10.3745/KIPSTB.2004.11B.4.477

Abstract

Sequential pattern mining discovers inter-transaction patterns in time-dependent data, that is, patterns among events that occur at different times. This paper proposes a new method for generating sequential patterns using the Hellinger measure. Whereas existing methods generate one-dimensional sequential patterns within a single attribute, the proposed method detects multidimensional patterns across different attributes. Heuristics based on the characteristics of the Hellinger measure are proposed to reduce the computational complexity of sequential pattern generation. Experimental results are presented.

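In data mining, sequential pattern generation refers to discovering latent patterns among events that occur at different points in time. This study concerns a method that uses information theory to discover sequential patterns automatically from a database. Whereas existing methods generate one-dimensional sequential patterns, detecting patterns only within a single attribute, the method presented here generates multidimensional sequential patterns that capture sequential relationships among all attributes in the database. The Hellinger measure is used to generate the sequential patterns, and it also makes it possible to measure the importance of the discovered patterns. In addition, an analysis of the functional properties of the Hellinger measure yields two rules for reducing the complexity of sequential pattern extraction, and experiments on a number of data sets show that multidimensional sequential patterns can be generated.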
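The abstract does not state the exact form of the Hellinger measure used in the paper. As a rough illustration only, the sketch below computes the standard Hellinger distance between two discrete probability distributions, H(P, Q) = sqrt(1/2 * sum_i (sqrt(p_i) - sqrt(q_i))^2). The function name `hellinger_distance` and the example distributions are assumptions introduced here, not taken from the paper.

```python
import math


def hellinger_distance(p, q):
    """Standard Hellinger distance between two discrete distributions.

    p and q are equal-length sequences of probabilities. The result lies
    in [0, 1]: 0 for identical distributions, 1 for disjoint support.
    This is the textbook definition, not necessarily the paper's measure.
    """
    if len(p) != len(q):
        raise ValueError("distributions must have the same length")
    squared_sum = sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q))
    return math.sqrt(squared_sum / 2.0)


# Hypothetical example: distribution of a target attribute before (prior)
# and after (posterior) observing some earlier event in another attribute.
prior = [0.5, 0.5]
posterior = [0.9, 0.1]
print(hellinger_distance(prior, posterior))
```

One plausible reading, under these assumptions, is that a larger distance between the prior and the conditioned distribution marks a more informative candidate pattern, which would be consistent with the abstract's statement that the measure is used to assess the importance of discovered patterns.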
