http://dx.doi.org/10.7465/jkdi.2016.27.2.353

Signed Hellinger measure for directional association  

Park, Hee Chang (Department of Statistics, Changwon National University)
Publication Information
Journal of the Korean Data and Information Science Society, v.27, no.2, 2016, pp. 353-362
Abstract
According to Wikipedia, data mining is the process of discovering patterns in large data sets, involving methods at the intersection of association rules, decision trees, clustering, artificial intelligence, machine learning, and database systems. An association rule is a method for discovering interesting relations between items in large transaction databases by means of interestingness measures. Interestingness measures play a major role in the knowledge discovery process in databases, and many have been developed by researchers. Among them, the Hellinger measure is a good association threshold because it considers both the information content and the generality of a rule. However, it has the drawback that it cannot determine the direction of the association. In this paper we proposed a signed Hellinger measure that can be interpreted operationally, and we checked it against three conditions for an association threshold. Furthermore, we investigated some of its properties through a few examples. The results showed that the signed Hellinger measure was preferable to the Hellinger measure because the signed version was able to identify the correct direction of association.
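The drawback described in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's formulas, so everything below is an assumption: the unsigned measure is taken as sqrt(P(A)) times a Hellinger-type divergence between (P(B|A), 1 - P(B|A)) and (P(B), 1 - P(B)), and the sign is taken from P(AB) - P(A)P(B). The function names are hypothetical and the paper's own definition may differ.

```python
import math


def hellinger_divergence(p_b_given_a: float, p_b: float) -> float:
    """Hellinger-type divergence between the binary distributions
    (P(B|A), 1 - P(B|A)) and (P(B), 1 - P(B)). Assumed form."""
    # Clamp to guard against tiny negative values from rounding.
    q1 = max(0.0, 1.0 - p_b_given_a)
    q0 = max(0.0, 1.0 - p_b)
    return ((math.sqrt(p_b_given_a) - math.sqrt(p_b)) ** 2
            + (math.sqrt(q1) - math.sqrt(q0)) ** 2)


def hellinger_measure(p_a: float, p_b: float, p_ab: float) -> float:
    """Unsigned measure for the rule A -> B (assumed form).
    sqrt(P(A)) weights the divergence by the generality of the rule."""
    return math.sqrt(p_a) * hellinger_divergence(p_ab / p_a, p_b)


def signed_hellinger_measure(p_a: float, p_b: float, p_ab: float) -> float:
    """Signed variant (assumed convention): positive when A and B co-occur
    more often than under independence, negative when less often."""
    diff = p_ab - p_a * p_b
    sign = (diff > 0) - (diff < 0)  # evaluates to +1, 0, or -1
    return sign * hellinger_measure(p_a, p_b, p_ab)


if __name__ == "__main__":
    # Two hypothetical rules with P(A) = P(B) = 0.5:
    # one positively associated (P(B|A) = 0.8), one negatively (P(B|A) = 0.2).
    for p_ab in (0.4, 0.1):
        print(p_ab,
              hellinger_measure(0.5, 0.5, p_ab),
              signed_hellinger_measure(0.5, 0.5, p_ab))
```

Under these assumptions the two hypothetical rules receive the same unsigned value even though one association is positive and the other negative. Only the signed variant tells them apart.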
Keywords
Association rule; Hellinger divergence; Hellinger measure; interestingness measure; signed Hellinger measure
Citations & Related Records
Times Cited By KSCI: 12