http://dx.doi.org/10.7465/jkdi.2012.23.6.1127

Exploration of PIM based similarity measures as association rule thresholds  

Park, Hee Chang (Department of Statistics, Changwon National University)
Publication Information
Journal of the Korean Data and Information Science Society, v.23, no.6, 2012, pp. 1127-1135
Abstract
Association rule mining quantifies the relationships between sets of items in a large database, and the exploration of association rules is one of the well-studied problems in data mining. There are three primary quality measures for an association rule: support, confidence, and lift. We generate association rules using confidence, the most important of these measures; however, confidence is asymmetric and takes only non-negative values, so it cannot show whether an association is positive or negative. This can cause difficulties when generating association rules. In this paper we apply similarity measures based on the probabilistic interestingness measure (PIM) to address this problem. A numerical example compares support, two confidences, lift, and several PIM-based similarity measures. The results show that the PIM-based similarity measures indicate the degree of association in the same way as confidence, and that, because their values are signed, they also reveal the direction of the association.
Keywords
Association rule; confidence; lift; probabilistic interestingness measure; similarity measure; support
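
The three basic measures named in the abstract can be illustrated with a minimal Python sketch. The transactions, item names, and function names below are hypothetical and are not taken from the paper. The final function, leverage (the Piatetsky-Shapiro measure P(XY) - P(X)P(Y)), is used here only as a generic stand-in for a signed measure; it is not one of the paper's PIM-based similarity measures, whose formulas are not given in this abstract.

```python
# Hypothetical toy transactions; not data from the paper.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"bread"},
    {"milk"},
]


def support(itemset, db):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= t for t in db) / len(db)


def confidence(x, y, db):
    """Estimate of P(Y | X); asymmetric in X and Y. Assumes support(x) > 0."""
    return support(set(x) | set(y), db) / support(x, db)


def lift(x, y, db):
    """Confidence relative to the baseline P(Y); 1 means independence."""
    return confidence(x, y, db) / support(y, db)


def leverage(x, y, db):
    """Signed stand-in measure: P(XY) - P(X)P(Y). Not a PIM-based measure."""
    return support(set(x) | set(y), db) - support(x, db) * support(y, db)


if __name__ == "__main__":
    x, y = {"bread"}, {"milk"}
    print(round(confidence(x, y, transactions), 4))  # 0.5
    print(round(confidence(y, x, transactions), 4))  # 0.6667
    print(round(lift(x, y, transactions), 4))        # 0.8333
    print(round(leverage(x, y, transactions), 4))    # -0.08
```

In this toy data the two confidences differ (0.5 versus about 0.67), which shows the asymmetry, and both are positive even though lift is below 1 and leverage is negative. A signed measure makes the negative direction of the association visible, which is the property the abstract attributes to the PIM-based similarity measures.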