• Title/Summary/Keyword: 측도 (measure)

Exploration of relationship between confirmation measures and association thresholds (기준 확인 측도와 연관성 평가기준과의 관계 탐색)

  • Park, Hee Chang
    • Journal of the Korean Data and Information Science Society
    • /
    • v.24 no.4
    • /
    • pp.835-845
    • /
    • 2013
  • Association rule mining is a data mining technique that quantifies the relevance between sets of items in a large database, and it has been applied in fields such as manufacturing, shopping malls, healthcare, insurance, and education. Philosophers of science have proposed interestingness measures for various kinds of patterns, analyzed their theoretical properties, evaluated them empirically, and suggested strategies for selecting appropriate measures for particular domains and requirements. Such interestingness measures are divided into objective, subjective, and semantic measures. Objective measures are based on the data used in the discovery process and are typically motivated by statistical considerations. Subjective measures take into account not only the data but also the knowledge and interests of the users who examine the pattern, while semantic measures additionally take into account utility and actionability. In a very different context, researchers have devoted a lot of attention to measures of confirmation, or evidential support. This paper focuses on asymmetric confirmation measures and compares them with basic association thresholds using simulated data. As a result, the confirmation measures allowed us to distinguish the direction of an association rule and to interpret the degree of association operationally. Furthermore, the results showed that the measure of Rips and that of Kemeny and Oppenheim performed better than the other confirmation measures.

A study on the concept of fuzzy measures and some examples (Fuzzy 측도의 개념과 몇 가지 예에 관한 연구)

  • 염승화
    • The Mathematical Education
    • /
    • v.23 no.1
    • /
    • pp.7-11
    • /
    • 1984
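  • In parallel with the classical approach, we obtain the concept of a T-fuzzy $\sigma$-algebra, introduce a fuzzy measure from it, and give several related examples. As a result, we confirm that the fuzzy measure obtained is very close to the classical one.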

A study on the ordering of similarity measures with negative matches (음의 일치 빈도를 고려한 유사성 측도의 대소 관계 규명에 관한 연구)

  • Park, Hee Chang
    • Journal of the Korean Data and Information Science Society
    • /
    • v.26 no.1
    • /
    • pp.89-99
    • /
    • 2015
  • The World Economic Forum and the Korean Ministry of Knowledge Economy have selected big data as one of the top 10 core information technologies. The key to big data is to analyze effectively the properties that the data have. Cluster analysis, one of the big data techniques, assigns a set of objects to clusters so that objects in the same cluster are more similar to each other than to objects in other clusters. The similarity measures used in cluster analysis may be classified into various types depending on the nature of the data. In this paper, we studied upper and lower bounds for binary similarity measures with negative matches, such as the Russell and Rao measure, the simple matching measure of Sokal and Michener, the Rogers and Tanimoto measure, the Sokal and Sneath measure, the Hamann measure, and the Baroni-Urbani and Buser measures I and II. We also compared these measures using real data and a simulation experiment.
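
The measures named in this abstract can be illustrated with a minimal sketch. It assumes the common textbook definitions over a 2x2 table of binary matches (`a` = both present, `b` and `c` = mismatches, `d` = both absent, i.e. negative matches); the paper's exact definitions may differ, in particular for the Sokal and Sneath variant. The function name and the example counts are hypothetical.

```python
from math import sqrt

def similarity_measures(a, b, c, d):
    """Binary similarity measures that use negative matches (d).

    Assumes textbook definitions and a non-empty table (a + b + c + d > 0).
    """
    n = a + b + c + d
    root_ad = sqrt(a * d)
    return {
        "russell_rao": a / n,
        "simple_matching": (a + d) / n,
        "rogers_tanimoto": (a + d) / (a + d + 2 * (b + c)),
        # Assumption: the Sokal and Sneath variant that doubles the matches.
        "sokal_sneath": 2 * (a + d) / (2 * (a + d) + b + c),
        "hamann": (a + d - b - c) / n,
        "baroni_urbani_buser_1": (root_ad + a) / (root_ad + a + b + c),
        "baroni_urbani_buser_2": (root_ad + a - b - c) / (root_ad + a + b + c),
    }

# Hypothetical counts, chosen only for illustration.
measures = similarity_measures(a=30, b=10, c=5, d=55)
for name, value in sorted(measures.items(), key=lambda kv: kv[1]):
    print(f"{name:24s} {value:.4f}")
```

Under these definitions, the Rogers and Tanimoto value never exceeds the simple matching value, which in turn never exceeds the Sokal and Sneath value, and the Hamann measure equals twice the simple matching measure minus one.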

A study on the ordering of PIM family similarity measures without marginal probability (주변 확률을 고려하지 않는 확률적 흥미도 측도 계열 유사성 측도의 서열화)

  • Park, Hee Chang
    • Journal of the Korean Data and Information Science Society
    • /
    • v.26 no.2
    • /
    • pp.367-376
    • /
    • 2015
  • Big data, which has become a popular keyword, may be defined as a collection of data sets so large and complex that they are difficult to process by traditional methods. Clustering identifies information in a large database by assigning a set of objects to clusters so that objects in the same cluster are more similar to each other than to objects in other clusters. The similarity measures used in cluster analysis may be classified into various types depending on the nature of the data. In this paper, we computed upper and lower limits for similarity measures based on the probabilistic interestingness measure (PIM) without marginal probability, such as the Yule I and II, Michael, Digby, Baulieu, and dispersion measures. We also compared these measures using real data and a simulation experiment. According to Warrens (2008), coefficients that have the same quantities in the numerator and denominator, that are bounded, and that are close to each other in the ordering are likely to be more similar. Thus, results on bounds provide a means of classifying various measures. Knowing which coefficients are similar also provides insight into the stability of a given algorithm.
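
The idea of ordering such coefficients can be sketched for three of them. This example assumes that "Yule I and II" refer to Yule's Q and Yule's Y, and uses the standard definition of Digby's H; all three compare `a*d` with `b*c` and differ only in an exponent. The helper name and the counts are hypothetical.

```python
def odds_ratio_family(a, b, c, d, power):
    """Coefficient of the form ((ad)^p - (bc)^p) / ((ad)^p + (bc)^p).

    power = 1    gives Yule's Q,
    power = 0.75 gives Digby's H,
    power = 0.5  gives Yule's Y.
    Assumes a*d + b*c > 0.
    """
    ad = (a * d) ** power
    bc = (b * c) ** power
    return (ad - bc) / (ad + bc)

# Hypothetical 2x2 counts, chosen only for illustration.
table = dict(a=30, b=10, c=5, d=55)
yule_q = odds_ratio_family(**table, power=1.0)
digby_h = odds_ratio_family(**table, power=0.75)
yule_y = odds_ratio_family(**table, power=0.5)
print(yule_y, digby_h, yule_q)
```

Because the three coefficients differ only in the exponent, their absolute values are ordered |Y| ≤ |H| ≤ |Q| for any table, which is the kind of bound relationship the abstract describes.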

Utilizing Purely Symmetric J Measure for Association Rules (연관성 규칙의 탐색을 위한 순수 대칭적 J 측도의 활용)

  • Park, Hee-Chang
    • Journal of the Korean Data Analysis Society
    • /
    • v.20 no.6
    • /
    • pp.2865-2872
    • /
    • 2018
  • Data mining includes various methods such as association rules, cluster analysis, decision trees, and neural networks. Among them, association rules are defined using association evaluation criteria such as support, confidence, and lift. Agrawal et al. (1993) first proposed association rules, and many scholars have studied them since. Recently, studies related to crossover entropy have been published (Park, 2016b). In this paper, we propose a purely symmetric J measure that incorporates directionality and purity into the previously published J measure, and we examine its usefulness through examples. As a result, we found that the purely symmetric J measure changes more clearly than the conventional J measure, the symmetric J measure, and the pure crossover entropy measure as the frequency of coincidence increases. Its variation with the magnitude of the inconsistency was also larger, so the presence or absence of an association could be identified more clearly.
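
The three basic criteria mentioned in this abstract (support, confidence, and lift) can be computed from transaction counts. The following is a minimal sketch using the usual definitions for a rule X → Y; it does not implement the J measure from the paper, and the function name and counts are hypothetical.

```python
def rule_metrics(n, n_x, n_y, n_xy):
    """Support, confidence, and lift of the rule X -> Y.

    n    : total number of transactions
    n_x  : transactions containing X
    n_y  : transactions containing Y
    n_xy : transactions containing both X and Y
    Assumes n, n_x, and n_y are all positive.
    """
    support = n_xy / n                # P(X and Y)
    confidence = n_xy / n_x           # P(Y | X)
    lift = confidence / (n_y / n)     # P(Y | X) / P(Y)
    return support, confidence, lift

# Hypothetical counts, chosen only for illustration.
support, confidence, lift = rule_metrics(n=100, n_x=40, n_y=50, n_xy=30)
print(support, confidence, lift)
```

A lift above 1 indicates that X and Y occur together more often than expected under independence, and a lift below 1 indicates the opposite. Support and lift are symmetric in X and Y, while confidence is not.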

A measure of slope rotatability over all directions (모든 방향에 걸친 기울기 회전성의 측도)

  • 김혁주
    • The Korean Journal of Applied Statistics
    • /
    • v.6 no.1
    • /
    • pp.105-123
    • /
    • 1993
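  • Desirable properties of experimental designs for estimating the slope of a response surface include "slope rotatability over axial directions," proposed by Hader and Park (1978), and "slope rotatability over all directions," proposed by Park (1987). Park and Kim (1992) also proposed a measure that quantifies the degree of slope rotatability over axial directions for any given design. In this paper, we develop a measure that indicates the degree of slope rotatability over all directions possessed by a response surface design. We also apply this measure to several kinds of designs and observe the results. One advantage of this measure is that it can be applied to any design.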

Bounds of PIM-based similarity measures with partially marginal proportion (부분적 주변 비율에 의한 확률적 흥미도 측도 기반 유사성 측도의 상한 및 하한의 설정)

  • Park, Hee Chang
    • Journal of the Korean Data and Information Science Society
    • /
    • v.26 no.4
    • /
    • pp.857-864
    • /
    • 2015
  • According to Wikipedia, data mining is the computational process of discovering patterns in huge data sets, involving methods at the intersection of association rules, decision trees, clustering, artificial intelligence, and machine learning. Clustering, or cluster analysis, is the task of grouping a set of objects in such a way that objects in the same group are more similar to each other than to those in other groups. The similarity measures used in clustering may be classified into various types depending on the characteristics of the data. In this paper, we computed bounds for similarity measures based on the probabilistic interestingness measure with partially marginal probability, such as the Peirce I, Peirce II, Cole I, Cole II, Loevinger, Park I, and Park II measures. We confirmed that the absolute value of the Loevinger measure is an upper limit for the absolute value of each of the other measures. The ordering of the other measures is determined by the sizes of the concurrence proportion, the non-simultaneous occurrence proportion, and the mismatch proportion.

Signed Hellinger measure for directional association (연관성 방향을 고려한 부호 헬링거 측도의 제안)

  • Park, Hee Chang
    • Journal of the Korean Data and Information Science Society
    • /
    • v.27 no.2
    • /
    • pp.353-362
    • /
    • 2016
  • According to Wikipedia, data mining is the process of discovering patterns in a large data set, involving methods at the intersection of association rules, decision trees, clustering, artificial intelligence, machine learning, and database systems. Association rule mining is a method for discovering interesting relations between items in large transaction data by means of interestingness measures. These measures play a major role in the knowledge discovery process in databases and have been developed by many researchers. Among them, the Hellinger measure is a good association threshold that considers both the information content and the generality of a rule, but it has the drawback that it cannot determine the direction of the association. In this paper, we propose a signed Hellinger measure that can be interpreted operationally, and we check it against three conditions for an association threshold. Furthermore, we investigate some of its properties through a few examples. The results showed that the signed Hellinger measure was better than the Hellinger measure because it was able to identify the correct direction of association.

Index of union and other accuracy measures (Index of Union와 다른 정확도 측도들)

  • Hong, Chong Sun;Choi, So Yeon;Lim, Dong Hui
    • The Korean Journal of Applied Statistics
    • /
    • v.33 no.4
    • /
    • pp.395-407
    • /
    • 2020
  • Most classification accuracy measures for finding an optimal threshold fall into two types: one is expressed with cumulative distribution functions and probability density functions, and the other is based on the ROC curve and AUC. Unal (2017) proposed the index of union (IU) as an accuracy measure that takes both types into account. In this study, ten accuracy measures (including IU) are divided into six categories, and the advantages of the IU are studied by comparing it with the measures in each category. The optimal thresholds of these measures are obtained under various normal mixture distributions; subsequently, the first and second types of error, as well as the error sums corresponding to each threshold, are calculated. The properties and characteristics of the IU statistic are explored by comparing its discriminative power with that of the other accuracy measures on the basis of these error values. The values of the first type of error and the error sum of the IU statistic converge to those of the best accuracy measures of the second category as the mean difference between the two distributions increases. Therefore, IU could be used as an accuracy measure to evaluate the discriminant power of a model.
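
The index of union can be sketched on a small empirical sample. This example assumes the definition usually attributed to Unal (2017), IU(c) = |Se(c) − AUC| + |Sp(c) − AUC|, with the optimal threshold minimizing IU. It uses toy scores rather than the normal mixture distributions studied in the paper, and the function names and data are hypothetical.

```python
def auc(neg, pos):
    """Empirical AUC: share of (pos, neg) pairs ranked correctly, ties count 0.5."""
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

def index_of_union(neg, pos, cutoff, area):
    """IU at a cutoff, classifying a score as positive when score >= cutoff."""
    sensitivity = sum(p >= cutoff for p in pos) / len(pos)
    specificity = sum(q < cutoff for q in neg) / len(neg)
    return abs(sensitivity - area) + abs(specificity - area)

# Hypothetical scores for the negative and positive groups.
neg = [1, 2, 3, 4]
pos = [3, 5, 6, 7]

area = auc(neg, pos)
candidates = sorted(set(neg + pos))
best_cutoff = min(candidates, key=lambda c: index_of_union(neg, pos, c, area))
best_iu = index_of_union(neg, pos, best_cutoff, area)
print(area, best_cutoff, best_iu)
```

The chosen cutoff is the one at which sensitivity and specificity are jointly closest to the AUC, which is how the IU combines a threshold-based view with the ROC-based view described in the abstract.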

Understanding of Degree and Radian by Measuring Arcs (호의 측도로 도(Degree)와 라디안 이해하기)

  • Choi, Eun Ah;Kang, Hyangim
    • School Mathematics
    • /
    • v.17 no.3
    • /
    • pp.447-467
    • /
    • 2015
  • The purpose of this study is to examine how the experience of learning degree and radian as measurements of arc affects the conceptual understanding of radian and of angle measurement. To this end, we investigated pre-service teachers' understanding of measuring an angle by the length of an arc, and then conducted a teaching experiment with two middle school students. The analysis of the pre-service teachers' and students' responses yielded the following results. The students' experience of interpreting the concept of degree as a measurement of arc had a positive effect on their understanding of radian, and the learning process in which they came to see the measurement of an angle as the measurement of an arc enabled a conceptual understanding of 'linear measuring'. In addition, a circle context and a strategy of dividing by arc served as effective strategies for solving various problems about angles. Finally, we confirmed that providing direct manipulative activities, as a chance to explore the relationship between an angle and its arc measure, can help students' conceptual understanding of angle measurement.