• Title/Summary/Keyword: interestingness measure

A New Interestingness Measure in Association Rules Mining (연관규칙 탐색에서 새로운 흥미도 척도의 제안)

  • Ahn, Kwang-Il;Kim, Seong-Jip
    • Journal of Korean Institute of Industrial Engineers / v.29 no.1 / pp.41-48 / 2003
  • In this paper, we present a new measure for evaluating the interestingness of association rules. Ultimately, whether a rule is interesting is a subjective judgment. However, an interestingness measure is useful in that it provides statistical or logical grounds for pruning uninteresting rules. Several interestingness measures have been developed in association rule mining. We present an overview of these measures and propose a new one. A comparative study of several interestingness measures is carried out on an example dataset and a real dataset. Our experiments show that the new measure can avoid the discovery of misleading rules.

The Development of Relative Interestingness Measure for Comparing with Degrees of Association

  • Park, Hee-Chang
    • Journal of the Korean Data and Information Science Society / v.19 no.4 / pp.1269-1279 / 2008
  • Data mining is a technique for finding useful information in huge databases. One of the well-studied problems in data mining is the exploration of association rules. An association rule technique uses interestingness measures to find relations among items in massive databases. An important and useful classification scheme for interestingness measures is based on user involvement, which yields two categories: objective and subjective measures. This paper presents some relative interestingness measures for comparing the degrees of association of two groups. A comparative study of these relative interestingness measures is shown through a numerical example. The results show that the relative net confidence is the best relative interestingness measure.

Exploration of PIM based similarity measures as association rule thresholds (확률적 흥미도를 이용한 유사성 측도의 연관성 평가 기준)

  • Park, Hee Chang
    • Journal of the Korean Data and Information Science Society / v.23 no.6 / pp.1127-1135 / 2012
  • Association rule mining is a method for quantifying the relationship between sets of items in a large database. One of the well-studied problems in data mining is the exploration of association rules. There are three primary quality measures for association rules: support, confidence, and lift. We generate association rules using confidence. Confidence is the most important of these measures, but it is asymmetric and takes only positive values, which can cause difficulties when generating association rules. In this paper we apply similarity measures based on the probabilistic interestingness measure (PIM) to address this problem. Comparative studies of support, two confidences, lift, and several PIM-based similarity measures are shown through a numerical example. The results show that the PIM-based similarity measures indicate the degree of association just as confidence does, and that they also reveal the direction of association because their values carry a sign.
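
The three primary measures named in this abstract can be illustrated with a minimal sketch. The transactions and function names below are hypothetical and are not taken from the paper; the formulas are the standard textbook definitions of support, confidence, and lift, not the PIM-based similarity measures the paper proposes.

```python
from fractions import Fraction

# Hypothetical toy transactions (an assumption for illustration only).
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
    {"bread", "milk"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return Fraction(hits, len(transactions))

def confidence(antecedent, consequent, transactions):
    """Estimate of P(consequent | antecedent): support(A and B) / support(A)."""
    both = set(antecedent) | set(consequent)
    return support(both, transactions) / support(antecedent, transactions)

def lift(antecedent, consequent, transactions):
    """Confidence divided by the support of the consequent."""
    return (confidence(antecedent, consequent, transactions)
            / support(consequent, transactions))

# Confidence depends on the rule direction; lift does not.
print(confidence({"bread"}, {"butter"}, transactions))  # 1/2
print(confidence({"butter"}, {"bread"}, transactions))  # 1
print(lift({"bread"}, {"butter"}, transactions))        # 5/4
```

On this toy data, confidence changes when the rule is reversed while lift stays the same, which is the asymmetry the abstract refers to. Confidence also never goes below zero, so by itself it cannot signal a negative association.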

The Proposition of Conditionally Pure Confidence in Association Rule Mining

  • Park, Hee-Chang
    • Journal of the Korean Data and Information Science Society / v.19 no.4 / pp.1141-1151 / 2008
  • Data mining is the process of sorting through large amounts of data and picking out useful information. One of the well-studied problems in data mining is the exploration of association rules. An association rule technique finds relations among items in massive databases. Several interestingness measures have been developed in association rule mining. Interestingness measures are useful in that they provide statistical or logical grounds for pruning uninteresting rules. This paper proposes a conditional pure confidence for evaluating association rules and then describes some properties of the proposed measure. Comparative studies with confidence and pure confidence are shown through a numerical example. The results show that the conditional pure confidence is better than confidence or pure confidence.

Decision process for right association rule generation (올바른 연관성 규칙 생성을 위한 의사결정과정의 제안)

  • Park, Hee-Chang
    • Journal of the Korean Data and Information Science Society / v.21 no.2 / pp.263-270 / 2010
  • Data mining is the process of sorting through large amounts of data and picking out useful information. An important goal of data mining is to discover, define, and determine the relationships between several variables. Association rule mining is an important research topic in data mining. An association rule technique finds relations among items in massive databases. It consists of two steps: finding frequent itemsets and then extracting interesting rules from the frequent itemsets. Several interestingness measures have been developed in association rule mining. Interestingness measures are useful in that they provide statistical or logical grounds for pruning uninteresting rules. This paper explores some problems with two interestingness measures, confidence and net confidence, and then proposes a decision process for correct association rule generation using these measures.
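
The two-step procedure mentioned in this abstract (find frequent itemsets, then extract rules from them) can be sketched as follows. This is a brute-force illustration under stated assumptions, not the paper's decision process: the transactions, thresholds, and function names are hypothetical, and only plain confidence is used to filter rules (net confidence is not reproduced here).

```python
from fractions import Fraction
from itertools import combinations

# Hypothetical toy transactions (an assumption for illustration only).
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
    {"bread", "milk"},
]

def frequent_itemsets(transactions, min_support):
    """Step 1: enumerate every itemset and keep those meeting min_support.

    Brute force over all item combinations; fine for a toy example,
    far too slow for real data (Apriori-style pruning is not shown).
    """
    items = sorted(set().union(*transactions))
    n = len(transactions)
    frequent = {}
    for size in range(1, len(items) + 1):
        for candidate in combinations(items, size):
            count = sum(1 for t in transactions if set(candidate) <= t)
            supp = Fraction(count, n)
            if supp >= min_support:
                frequent[frozenset(candidate)] = supp
    return frequent

def extract_rules(frequent, min_confidence):
    """Step 2: split each frequent itemset into antecedent -> consequent."""
    rules = []
    for itemset, supp in frequent.items():
        for k in range(1, len(itemset)):
            for antecedent in combinations(sorted(itemset), k):
                antecedent = frozenset(antecedent)
                # Every subset of a frequent itemset is itself frequent,
                # so the lookup below always succeeds.
                conf = supp / frequent[antecedent]
                if conf >= min_confidence:
                    rules.append((antecedent, itemset - antecedent, conf))
    return rules

frequent = frequent_itemsets(transactions, Fraction(2, 5))
rules = extract_rules(frequent, Fraction(7, 10))
for antecedent, consequent, conf in rules:
    print(sorted(antecedent), "->", sorted(consequent), conf)
```

Exact fractions are used instead of floats so that threshold comparisons such as 2/5 >= 2/5 behave predictably.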

Signed Hellinger measure for directional association (연관성 방향을 고려한 부호 헬링거 측도의 제안)

  • Park, Hee Chang
    • Journal of the Korean Data and Information Science Society / v.27 no.2 / pp.353-362 / 2016
  • According to Wikipedia, data mining is the process of discovering patterns in large data sets, involving methods at the intersection of association rules, decision trees, clustering, artificial intelligence, machine learning, and database systems. Association rule mining is a method for discovering interesting relations between items in large transaction sets by means of interestingness measures. Association rule interestingness measures play a major role in the knowledge discovery process in databases and have been developed by many researchers. Among them, the Hellinger measure is a good association threshold that considers both the information content and the generality of a rule. However, it has the drawback that it cannot determine the direction of the association. In this paper we propose a signed Hellinger measure that can be interpreted operationally, and we check three conditions for an association threshold. Furthermore, we investigate some of its properties through a few examples. The results show that the signed Hellinger measure is better than the Hellinger measure because the signed version can estimate the correct direction of association.

A study on the relatively causal strength measures in a viewpoint of interestingness measure (흥미도 측도 관점에서 상대적 인과 강도의 고찰)

  • Park, Hee Chang
    • Journal of the Korean Data and Information Science Society / v.28 no.1 / pp.49-56 / 2017
  • Among the techniques for analyzing big data, association rule mining searches for relationships between items using various relevance evaluation criteria. Association rules are classified by the direction of rule creation into positive, negative, and inverse association rules. The purpose of this paper is to investigate the applicability of various types of relatively causal strength measures to these types of association rules from the point of view of interestingness measures. We also clarify the relationship between various types of confidence measures. As a result, if the rate of occurrence of the posterior item is more than 0.5, the first measure ($RCS_{IJ1}$) proposed by Good (1961) is preferable to the first measure ($RCS_{LR1}$) proposed by Lewis (1986) because its value varies more than that of $RCS_{LR1}$; if the rate is less than 0.5, $RCS_{LR1}$ is preferable to $RCS_{IJ1}$.

Design and Implementation of an Interestingness Analysis System for Web Personalization & Customization

  • Jung, Youn-Hong;Kim, I-I;Park, Kyoo-seok
    • Journal of Korea Multimedia Society / v.6 no.4 / pp.707-713 / 2003
  • The convenience and speed of the internet have driven rapid growth in electronic commerce, and the analysis of users' navigation patterns on websites has likewise driven rapid development of personalization and customization techniques for providing services matched to individual interests. Web personalization and customization draw on various methods, such as web log mining, which uses web log data, and web mining, which uses users' transactions, and in particular on e-CRM, which analyzes users' navigation patterns. In this paper, we measure the exact time users spend on each web page and on the website, compute a weight for each page from that duration, and propose a way to assess e-loyalty through the computed weights.

Negatively attributable and pure confidence for generation of negative association rules (음의 연관성 규칙 생성을 위한 음의 기여 순수 신뢰도의 제안)

  • Park, Hee-Chang
    • Journal of the Korean Data and Information Science Society / v.23 no.5 / pp.939-948 / 2012
  • The most widely used data mining technique is the exploration of association rules. This technique is used to find relationships between items in a massive database based on interestingness measures such as support, confidence, and lift. Association rules are frequently used by retail stores to assist in marketing, advertising, floor placement, and inventory control. In general, an association rule technique generates rules of the form 'If A, then B', whereas a negative association rule technique generates rules of the form 'If A, then not B' or 'If not A, then B'. Only by adding negative association rules to the existing association rules can we decide whether to promote other products in addition to a given product. In this paper, we propose the negatively attributable and pure confidence to overcome the problems faced by the negative association rule technique, and we check three conditions for an interestingness measure. Comparative studies with negative confidence, negatively pure confidence, and negatively attributable and pure confidence are shown through numerical examples. The results show that the negatively attributable and pure confidence is better than negative confidence and negatively pure confidence.
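
The two negative rule forms named in this abstract can be illustrated with a minimal sketch. It computes only the plain conditional frequencies of 'If A, then not B' and 'If not A, then B'; the paper's negatively pure confidence and negatively attributable and pure confidence are not reproduced, because their definitions are not given here. The transactions and function names are hypothetical.

```python
from fractions import Fraction

# Hypothetical toy transactions (an assumption for illustration only).
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
    {"bread", "milk"},
]

def confidence_a_then_not_b(a, b, transactions):
    """Estimate P(not B | A): among transactions containing A,
    the fraction that do not contain all of B."""
    a, b = set(a), set(b)
    with_a = [t for t in transactions if a <= t]
    if not with_a:
        raise ValueError("antecedent never occurs")
    return Fraction(sum(1 for t in with_a if not b <= t), len(with_a))

def confidence_not_a_then_b(a, b, transactions):
    """Estimate P(B | not A): among transactions that do not contain
    all of A, the fraction that contain B."""
    a, b = set(a), set(b)
    without_a = [t for t in transactions if not a <= t]
    if not without_a:
        raise ValueError("antecedent occurs in every transaction")
    return Fraction(sum(1 for t in without_a if b <= t), len(without_a))

print(confidence_a_then_not_b({"bread"}, {"butter"}, transactions))  # 1/2
print(confidence_a_then_not_b({"milk"}, {"butter"}, transactions))   # 3/4
print(confidence_not_a_then_b({"bread"}, {"milk"}, transactions))    # 1
```

For a single rule, the first quantity equals one minus the ordinary confidence of 'If A, then B', so it carries no information beyond confidence on its own.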

Non-linear regression model considering all association thresholds for decision of association rule numbers (기본적인 연관평가기준 전부를 고려한 비선형 회귀모형에 의한 연관성 규칙 수의 결정)

  • Park, Hee Chang
    • Journal of the Korean Data and Information Science Society / v.24 no.2 / pp.267-275 / 2013
  • Among data mining techniques, the association rule is the most recently developed technique, and it finds the relevance between two items in a large database. It is applied directly in the field because it clearly quantifies the relationship between two or more items. To determine whether an association rule is meaningful, we use interestingness measures such as support, confidence, and lift. Interestingness measures are useful in that they provide statistical or logical grounds for pruning uninteresting rules. However, the thresholds for these measures are chosen by experience, and the number of useful rules is hard to estimate. If too many rules are generated, we cannot effectively extract the useful ones. In this paper, we design a variety of non-linear regression equations, considering all association thresholds, that relate the number of rules to the three interestingness measures. We then diagnose multicollinearity and autocorrelation problems, and use analysis of variance results and adjusted coefficients of determination to select the best model through numerical experiments.