Browse > Article

The application for predictive similarity measures of binary data in association rule mining  

Park, Hee-Chang (Department of Statistics, Changwon National University)
Publication Information
Journal of the Korean Data and Information Science Society / v.22, no.3, 2011 , pp. 495-503 More about this Journal
Abstract
The most widely used data mining technique is to find association rules. Association rule mining is the method to quantify the relationship between each set of items in very huge database based on the association thresholds. There are some basic association thresholds to explore meaningful association rules ; support, confidence, lift, etc. Among them, confidence is the most frequently used, but it has the drawback that it can not determine the direction of the association. The net confidence and the attributably pure confidence were developed to compensate for this drawback, but they have other drawbacks.In this paper we consider some predictive similarity measures for binary data in cluster analysis and multi-dimensional analysis as association threshold to compensate for these drawbacks. The comparative studies with net confidence, attributably pure confidence, and some predictive similarity measures are shown by numerical example.
Keywords
Association thresholds; attributably pure confidence; confidence; net confidence; similarity measure;
Citations & Related Records
Times Cited By KSCI : 5  (Citation Analysis)
연도 인용수 순위
1 Park, H. C. and Cho, K. H. (2005). Waste database analysis joined with local information using association rules. Journal of the Korean Data Analysis Society, 7, 763-772.
2 Park, J. S., Chen, M. S. and Philip, S. Y. (1995). An effective hash-based algorithms for mining association rules. Proceedings of ACM SIGMOD Conference on Management of Data, 175-186.
3 Pasquier, N., Bastide, Y., Taouil, R. and Lakhal, L. (1999). Discovering frequent closed itemsets for association rules. Proceedings of the 7th International Conference on Database Theory, 398-416.
4 Pei, J., Han, J. and Mao, R. (2000). CLOSET: An efficient algorithm for mining frequent closed itemsets. Proceedings of ACM SIGMOD Workshop on Research Issues in Data Mining and Knowledge Discovery, 21-30.
5 Romesburg, H. C. (1984). Cluster analysis for researchers, Lifetime Learning Publications, Belmont, California.
6 Srikant, R. and Agrawal, R. (1995). Mining generalized association rules. Proceedings of the 21st VLDB Conference, 407-419.
7 Han, J. and Fu, Y. (1999). Mining multiple-level association rules in large databases. IEEE Transactions on Knowledge and Data Engineering, 11, 68-77.
8 Han, J., Pei, J. and Yin, Y. (2000). Mining frequent patterns without candidate generation. Proceedings of ACM SIGMOD Conference on Management of Data, 1-12.
9 Liu, B., Hsu, W. and Ma, Y. (1999). Mining association rules with multiple minimum supports. Proceedings of the 5th International Conference on Knowledge Discovery and Data Mining, 337-241.
10 Park, H. C. (2008). The proposition of conditionally pure confidence in association rule mining. Journal of the Korean Data & Information Science Society, 19, 1141-1151.
11 Park, H. C. (2009). Proposition of pure association rule for original characteristics grasping. Journal of the Korean Data Analysis Society, 11, 859-869.
12 Cai, C. H., Fu, A. W. C., Cheng, C. H. and Kwong, W. W. (1998). Mining association rules with weighted items. Proceedings of International Database Engineering and Applications Symposium, 68-77.
13 Park, H. C. (2011). The proposition of attributably pure confidence in association rule mining. Journal of the Korean Data & Information Science Society, 22, 235-243.
14 Agrawal, R., Imielinski, R. and Swami, A. (1993). Mining association rules between sets of items in large databases. Proceedings of the ACM SIGMOD Conference on Management of Data, 207-216.
15 Agrawal, R. and Srikant, R. (1994). Fast algorithms for mining association rules. Proceedings of the 20th VLDB Conference, 487-499.
16 Cho, K. H. and Park, H. C. (2008). A study of association rule application using self-organizing map for fused data. Journal of the Korean Data & Information Science Society, 19, 95-104.
17 안광일, 김성집 (2003). 연관규칙 탐색에서의 새로운 흥미도 척도의 제안. <대한산업공학회지>, 29, 41-48.
18 Choi, J. H. and Park, H. C. (2008). Comparative study of quantitative data binning methods in association rule. Journal of the Korean Data & Information Science Society, 19, 903-910.