• Title/Summary/Keyword: 기준 확인 측도 (criterion, confirmation, measure)

Exploration of relationship between confirmation measures and association thresholds (기준 확인 측도와 연관성 평가기준과의 관계 탐색)

  • Park, Hee Chang
    • Journal of the Korean Data and Information Science Society / v.24 no.4 / pp.835-845 / 2013
  • Association rule mining is a data mining technique that quantifies the relevance between sets of items in a large database, and it has been applied in various fields such as manufacturing, shopping malls, healthcare, insurance, and education. Philosophers of science have proposed interestingness measures for various kinds of patterns, analyzed their theoretical properties, evaluated them empirically, and suggested strategies to select appropriate measures for particular domains and requirements. Such interestingness measures are divided into objective, subjective, and semantic measures. Objective measures are based on data used in the discovery process and are typically motivated by statistical considerations. Subjective measures take into account not only the data but also the knowledge and interests of users who examine the pattern, while semantic measures additionally take into account utility and actionability. In a very different context, researchers have devoted a lot of attention to measures of confirmation or evidential support. This paper focuses on asymmetric confirmation measures, and we compared confirmation measures with basic association thresholds using simulated data. As a result, we could distinguish the direction of an association rule by confirmation measures and interpret the degree of association operationally by them. Furthermore, the results showed that the measure by Rips and that by Kemeny and Oppenheim were better than the other confirmation measures.
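
For reference, the Kemeny and Oppenheim measure named in the abstract above can be sketched in Python. The abstract gives no formulas, so this uses a form common in the confirmation literature, F = (P(A|B) - P(A|not B)) / (P(A|B) + P(A|not B)) for a rule A -> B, treating the antecedent A as evidence and the consequent B as hypothesis. That assignment, the function name, and the count-based arguments are assumptions for illustration, not the paper's notation.

```python
def kemeny_oppenheim(n_ab, n_a, n_b, n):
    """Kemeny-Oppenheim confirmation for a rule A -> B (illustrative sketch).

    n_ab: transactions containing both A and B
    n_a:  transactions containing A
    n_b:  transactions containing B
    n:    total number of transactions
    """
    if not 0 < n_b < n:
        raise ValueError("B must occur in some but not all transactions")
    if n_a == 0:
        raise ValueError("A must occur in at least one transaction")
    p_a_given_b = n_ab / n_b                    # P(A | B)
    p_a_given_not_b = (n_a - n_ab) / (n - n_b)  # P(A | not B)
    return (p_a_given_b - p_a_given_not_b) / (p_a_given_b + p_a_given_not_b)


# Hypothetical counts: 100 transactions, A in 40 of them, B in 50 of them.
positive = kemeny_oppenheim(n_ab=30, n_a=40, n_b=50, n=100)     # A and B co-occur often
negative = kemeny_oppenheim(n_ab=10, n_a=40, n_b=50, n=100)     # A and B co-occur rarely
independent = kemeny_oppenheim(n_ab=20, n_a=40, n_b=50, n=100)  # P(AB) = P(A)P(B)
```

With these made-up counts the value is positive when A and B co-occur more often than independence predicts, negative when they co-occur less often, and zero under independence, which is the sign behaviour the abstract relies on to read off the direction of a rule.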

Proposition of causally confirmed measures in association rule mining (인과적 확인 측도에 의한 연관성 규칙 탐색)

  • Park, Hee Chang
    • Journal of the Korean Data and Information Science Society / v.25 no.4 / pp.857-868 / 2014
  • Data mining is the representative analysis methodology in the era of big data; it is the process of analyzing a massive database and summarizing it into meaningful information. The association rule technique finds relationships among items in a huge database using interestingness measures such as support, confidence, and lift. However, these interestingness measures cannot be used to establish a causal relationship between antecedent and consequent item sets. Moreover, they do not reveal the direction of the association. This paper proposes causally confirmed association thresholds to compensate for these problems, and then checks the three conditions of interestingness measures. Simulation studies compare basic association thresholds, causal association thresholds, and causally confirmed association thresholds. The results show that causally confirmed association thresholds are better than basic and causal association thresholds.

The application for predictive similarity measures of binary data in association rule mining (이분형 예측 유사성 측도의 연관성 평가 기준 적용 방안)

  • Park, Hee-Chang
    • Journal of the Korean Data and Information Science Society / v.22 no.3 / pp.495-503 / 2011
  • The most widely used data mining technique is finding association rules. Association rule mining is a method to quantify the relationship between sets of items in a very large database based on association thresholds. There are some basic association thresholds for exploring meaningful association rules: support, confidence, lift, etc. Among them, confidence is the most frequently used, but it has the drawback that it cannot determine the direction of the association. The net confidence and the attributably pure confidence were developed to compensate for this drawback, but they have other drawbacks. In this paper we consider some predictive similarity measures for binary data, used in cluster analysis and multi-dimensional analysis, as association thresholds to compensate for these drawbacks. A numerical example compares net confidence, attributably pure confidence, and some predictive similarity measures.
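
The basic thresholds named in the abstract above can be illustrated with a minimal Python sketch using the standard textbook definitions of support, confidence, and lift. The function name and the counts are made up for illustration and do not come from the paper.

```python
def basic_thresholds(n_ab, n_a, n_b, n):
    """Support, confidence, and lift of a rule A -> B from transaction counts.

    n_ab: transactions containing both A and B
    n_a:  transactions containing A
    n_b:  transactions containing B
    n:    total number of transactions
    """
    support = n_ab / n             # P(A and B)
    confidence = n_ab / n_a        # P(B | A)
    lift = confidence / (n_b / n)  # P(B | A) / P(B)
    return support, confidence, lift


# Hypothetical counts where A and B co-occur less often than independence predicts.
support, confidence, lift = basic_thresholds(n_ab=10, n_a=40, n_b=50, n=100)
print(support, confidence, lift)  # 0.1 0.25 0.5
```

In this example the lift of 0.5 is below 1, so A and B are negatively associated, yet the confidence of 0.25 is still a positive number. Confidence alone therefore does not show the direction of the association, which is the drawback the abstract describes.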

Utilization of similarity measures by PIM with AMP as association rule thresholds (모든 주변 비율을 고려한 확률적 흥미도 측도 기반 유사성 측도의 연관성 평가 기준 활용 방안)

  • Park, Hee Chang
    • Journal of the Korean Data and Information Science Society / v.24 no.1 / pp.117-124 / 2013
  • Association rule mining is a data mining technique that quantifies the relationship between sets of items in a huge database, and it has been applied in various fields such as internet shopping malls, healthcare, insurance, and education. There are three primary interestingness measures for association rules: support, confidence, and lift. Confidence is the most important of these measures, and we generate association rules using confidence. However, it is an asymmetric measure and takes only positive values, so difficulties can arise when generating association rules. In this paper we apply the similarity measures by probabilistic interestingness measure (PIM) with all marginal proportions (AMP) to solve this problem. A numerical example compares support, confidences, lift, chi-square statistics, and some similarity measures by PIM with AMP. As a result, we found that the similarity measures by PIM with AMP indicate the degree of association in the same way as confidence. We could also confirm the direction of association because their values carry a sign, and select the best similarity measure by PIM with AMP.

Exploration of PIM based similarity measures as association rule thresholds (확률적 흥미도를 이용한 유사성 측도의 연관성 평가 기준)

  • Park, Hee Chang
    • Journal of the Korean Data and Information Science Society / v.23 no.6 / pp.1127-1135 / 2012
  • Association rule mining is a method to quantify the relationship between sets of items in a large database. The exploration of association rules is one of the well-studied problems in data mining. There are three primary quality measures for association rules: support, confidence, and lift. We generate association rules using confidence. Confidence is the most important of these measures, but it is an asymmetric measure and takes only positive values. Thus difficulties can arise when generating association rules. In this paper we apply the similarity measures by probabilistic interestingness measure to solve this problem. A numerical example compares support, two confidences, lift, and some similarity measures by probabilistic interestingness measure. As a result, we found that the similarity measures by probabilistic interestingness measure indicate the degree of association in the same way as confidence. We could also confirm the direction of association because their values carry a sign.

Development of association rule threshold by balancing of relative rule accuracy (상대적 규칙 정확도의 균형화에 의한 연관성 측도의 개발)

  • Park, Hee Chang
    • Journal of the Korean Data and Information Science Society / v.25 no.6 / pp.1345-1352 / 2014
  • Data mining is the representative methodology for obtaining meaningful information in the era of big data. According to Wikipedia, association rule learning is a popular and well-researched method for discovering interesting relationships between itemsets in large databases using association thresholds. It is intended to identify strong rules discovered in databases using different interestingness measures. Unlike general association rule mining, inverse association rule mining finds rules stating that a specific item does not occur if another item does not occur. If the two types of association rule can be considered simultaneously, we can obtain marketing information for related products as well as information for marketing a specific product. In this paper, we propose a balanced attributable relative accuracy applicable to these association rule techniques, and then check the three conditions of interestingness measures by Piatetsky-Shapiro (1991). A numerical example compares rule accuracy, relative accuracy, attributable relative accuracy, and balanced attributable relative accuracy. The results show that balanced attributable relative accuracy is better than the other accuracy measures.
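
The three conditions of Piatetsky-Shapiro (1991) mentioned in the abstract above are commonly summarized as follows: the measure is zero when A and B are independent, it increases with P(AB) when P(A) and P(B) are fixed, and it decreases with P(A) when P(AB) and P(B) are fixed. The Python sketch below spot-checks them at a single point. The helper names, the default probabilities, and the use of the simple difference P(AB) - P(A)P(B) as the example measure are assumptions for illustration; the balanced attributable relative accuracy proposed in the paper is not reproduced here.

```python
def leverage(p_ab, p_a, p_b):
    """Example measure: P(AB) - P(A)P(B)."""
    return p_ab - p_a * p_b


def confidence(p_ab, p_a, p_b):
    """Confidence P(B | A); p_b is unused but kept for a uniform signature."""
    return p_ab / p_a


def spot_check_conditions(measure, p_a=0.4, p_b=0.5, step=0.05, tol=1e-12):
    """Check the three conditions at one point; this is a spot check, not a proof."""
    p_ab = p_a * p_b  # joint probability under independence
    base = measure(p_ab, p_a, p_b)
    cond1 = abs(base) < tol                        # zero under independence
    cond2 = measure(p_ab + step, p_a, p_b) > base  # increases with P(AB)
    cond3 = measure(p_ab, p_a + step, p_b) < base  # decreases with P(A)
    return cond1, cond2, cond3


print(spot_check_conditions(leverage))    # (True, True, True)
print(spot_check_conditions(confidence))  # (False, True, True)
```

At this point the difference measure passes all three checks, while confidence fails the first one because it equals P(B) rather than zero under independence. A single-point check can only falsify a condition, so a proposed measure would still need an analytic argument.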

The proposition of compared and attributably pure confidence in association rule mining (연관 규칙 마이닝에서 비교 기여 순수 신뢰도의 제안)

  • Park, Hee Chang
    • Journal of the Korean Data and Information Science Society / v.24 no.3 / pp.523-532 / 2013
  • Generally, data mining is the process of analyzing big data from different perspectives and summarizing it into useful information. The most widely used data mining technique is generating association rules, which finds the relevance between two items in a huge database. This technique has been used to find the relationship between sets of items based on interestingness measures such as support, confidence, and lift. Among many interestingness measures, confidence is the most frequently used, but it has the drawback that it cannot determine the direction of the association. The attributably pure confidence and compared confidence are able to determine the direction of the association, but their ranges are not [-1, +1], so we cannot interpret the degree of association operationally by their values. This paper proposes a compared and attributably pure confidence to compensate for this drawback, and then describes some properties of the proposed measure. A numerical example compares confidence, compared confidence, attributably pure confidence, and the proposed measure. The results show that the compared and attributably pure confidence is better than the other confidences.

Compensation of Arousal Level Criteria by a Modified KSS Scale (수정된 KSS 측도에 의한 각성도 평가기준 보상법)

  • 고한우;김연호
    • Journal of Biomedical Engineering Research / v.18 no.4 / pp.477-484 / 1997
  • In this paper, we propose a compensation method to evaluate arousal level in different initial arousal states. Arousal level was measured by the relationship between IRI and Nz. Since Nz is affected by BI, which is directly proportional to the initial arousal state of subjects, the arousal level is underestimated. To overcome this problem, we propose a compensation method using a modified Karolinska sleepiness scale, and determine compensation coefficients derived from this scale with five arousal levels. When these coefficients are applied to a portable arousal monitoring system, the proposed method could be useful for real-time evaluation and control of arousal level. As a result, the developed system can detect and control the arousal state starting from an initial drowsy state.

Bayesian Learning based Fuzzy Rule Extraction for Clustering (군집화를 위한 베이지안 학습 기반의 퍼지 규칙 추출)

  • 한진우;전성해;오경환
    • Proceedings of the Korean Information Science Society Conference / 2003.04c / pp.389-391 / 2003
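  • Clustering in machine learning groups given data into several sets of mutually similar items. A wide variety of measures for determining similarity in clustering have been used and studied across many techniques. However, because it is difficult to set objective criteria for measuring the performance of clustering results, the interpretation of those results is often highly subjective and ambiguous. Fuzzy clustering offers a flexible way to determine clusters in such ambiguous clustering problems. Clustering is performed through a similarity matrix whose elements are the fuzzy membership function values with which each object belongs to a particular cluster. In this paper, we obtained the fuzzy membership function values for clustering through Bayesian learning. To perform optimal fuzzy clustering, we extracted fuzzy rules based on Bayesian learning. The performance of the proposed method was confirmed through experiments using artificially generated data and existing machine learning data.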

The proposition of cosine net confidence in association rule mining (연관 규칙 마이닝에서의 코사인 순수 신뢰도의 제안)

  • Park, Hee Chang
    • Journal of the Korean Data and Information Science Society / v.25 no.1 / pp.97-106 / 2014
  • Big data technology was developed to predict a diversified contemporary society more accurately, to operate it more efficiently, and to enable techniques that were impossible in the past. This technology can be utilized at the national level in various fields such as social science, economics, politics, the cultural sector, and science and technology. Finding valuable information with data mining techniques is a prerequisite for analyzing big data. Data mining techniques associated with big data include text mining, opinion mining, cluster analysis, association rule mining, and so on. The most widely used data mining technique is exploring association rules. This technique has been used to find the relationship between sets of items based on association thresholds such as support, confidence, lift, and similarity measures. This paper proposes cosine net confidence as an association threshold, checks the conditions of interestingness measures proposed by Piatetsky-Shapiro, and examines various characteristics. A numerical example compares basic confidence, cosine similarity, and cosine net confidence. The results show that cosine net confidence is better than basic confidence and cosine similarity because it reflects the direction of the association.