• Title/Summary/Keyword: 엔트로피 척도 (entropy measure)


The Study on Information-Theoretic Measures of Incomplete Information based on Rough Sets (러프 집합에 기반한 불완전 정보의 정보 이론적 척도에 관한 연구)

  • 김국보;정구범;박경옥
    • Journal of Korea Multimedia Society / v.3 no.5 / pp.550-556 / 2000
  • This paper derives optimal decision rules from incomplete information using the concepts of the indiscernibility relation and the approximation space in rough set theory. Since errors may occur when the processed information contains multiple or missing values, a method for removing or minimizing such data is required. Entropy, used in the information-processing field to measure uncertainty or information content, is employed to remove incomplete information from a rough relational database. An information system, however, does not always contain incomplete information. This paper therefore proposes object relation entropy and attribute relation entropy, based on rough sets, as information-theoretic measures for removing the incomplete information that may appear in the condition and decision attributes of an information system.

An Evaluation of Narrative Complexity Based on Knowledge Distribution Model and Information Entropy (정보이론에 근거한 지식분배 관점에서의 내러티브 복잡도 평가)

  • Kwon, Hochang;Kwon, Hyuk Tae;Yoon, Wan Chul
    • Proceedings of the Korea Contents Association Conference / 2015.05a / pp.27-28 / 2015
  • Because the complexity of a narrative directly affects the audience's comprehension and interest, it must be managed systematically during the creative process. Focusing on the task of knowledge distribution in narrative construction, this paper develops a complexity evaluation method based on the concept of information entropy. The amount of information generated as the audience's knowledge state changes is computed as entropy and used as a complexity measure. A case study on an actual film narrative confirmed that the method effectively reflects the structural characteristics of the narrative and the changes that occur as it unfolds.
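
The abstract above computes the information generated by changes in the audience's knowledge state as entropy and uses it as a complexity measure. As a minimal sketch of the underlying Shannon entropy (the story beats and their probabilities below are invented for illustration, not taken from the paper):

```python
import math

def shannon_entropy(probs):
    """Shannon entropy H(p) = sum of -p_i * log2(p_i), in bits."""
    return sum(-p * math.log2(p) for p in probs if p > 0)

# Hypothetical knowledge states of an audience at two story beats: the
# probability that a viewer holds each of four candidate interpretations.
beat_opening = [1.0, 0.0, 0.0, 0.0]    # everyone shares one interpretation
beat_twist = [0.25, 0.25, 0.25, 0.25]  # maximal ambiguity after a twist

print(shannon_entropy(beat_opening))  # 0.0 bits
print(shannon_entropy(beat_twist))    # 2.0 bits
```

A beat that scatters viewers across many plausible interpretations carries more entropy, and on this reading a more complex narrative.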

A Big Data Analysis by Between-Cluster Information using k-Modes Clustering Algorithm (k-Modes 분할 알고리즘에 의한 군집의 상관정보 기반 빅데이터 분석)

  • Park, In-Kyoo
    • Journal of Digital Convergence / v.13 no.11 / pp.157-164 / 2015
  • This paper describes subspace clustering of categorical data for convergence and integration. Because conventional evaluation measures are designed for numerical data, they suffer from limitations when applied to categorical data, owing to the absence of an ordering, high dimensionality, and scarcity of frequencies. Hence, a conditional entropy measure is proposed to evaluate the cohesion among attributes within each cluster. We propose a new objective function that reflects the optimal clustering, minimizing the within-cluster dispersion while enhancing the between-cluster separation. We performed experiments on five real-world datasets, comparing the performance of our algorithm against four algorithms using three evaluation metrics: accuracy, f-measure, and adjusted Rand index. According to the experiments, the proposed algorithm outperforms the compared algorithms on all three metrics.
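
The within-cluster cohesion idea above can be illustrated with conditional entropy over a toy categorical cluster. This is a hedged sketch: the `conditional_entropy` helper and the colour/shape records are hypothetical stand-ins, not the paper's implementation.

```python
import math
from collections import Counter

def conditional_entropy(cluster, attr_a, attr_b):
    """H(B | A) over the rows of one cluster: low values mean attribute B
    is nearly determined by attribute A, i.e. the cluster is cohesive."""
    n = len(cluster)
    pairs = Counter((row[attr_a], row[attr_b]) for row in cluster)
    a_counts = Counter(row[attr_a] for row in cluster)
    h = 0.0
    for (a, _b), c in pairs.items():
        p_ab = c / n                 # joint probability of (a, b)
        p_b_given_a = c / a_counts[a]
        h -= p_ab * math.log2(p_b_given_a)
    return h

# Hypothetical cluster in which colour fully determines shape.
cluster = [
    {"colour": "red", "shape": "circle"},
    {"colour": "red", "shape": "circle"},
    {"colour": "blue", "shape": "square"},
    {"colour": "blue", "shape": "square"},
]
print(conditional_entropy(cluster, "colour", "shape"))  # 0.0: fully cohesive
```

Rows that break the colour-to-shape pattern would push the value above zero, signalling weaker cohesion.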

Similarity Analysis between Total RNA and Amplified RNA Using Entropy Measure (엔트로피 척도를 이용한 전체 RNA와 증폭 RNA의 유사성 분석)

  • Park, Chan-Ho;Cho, Sung-Bae;Shin, Ji-Hye;Kim, Sang-Cheol;Seo, Min-Young;Yang, Sang-Hwa;Rha, Sun-Young;Chung, Hyun-Cheol
    • Proceedings of the Korean Society for Bioinformatics Conference / 2003.10a / pp.139-146 / 2003
  • Microarray data can be used for the early detection and prognosis of cancer. Analyzing such data, however, requires at least 40 μg of RNA sample, and obtaining a sufficient amount is difficult when actual clinical specimens are used. RNA amplification methods are therefore being attempted, in which a small RNA sample is collected and the required amount is obtained through PCR amplification; to use the amplified RNA in microarray experiments, similarity before and after amplification must be guaranteed. This paper presents an entropy-based method as a new way to compare the similarity between amplified RNA and total RNA. In addition, entropy values were measured under various conditions, and a systematic analysis was performed on how the entropy values can be used for cell lines and tissues.

Kullback-Leibler Information-Based Tests of Fit for Inverse Gaussian Distribution (역가우스분포에 대한 쿨백-라이블러 정보 기반 적합도 검정)

  • Choi, Byung-Jin
    • The Korean Journal of Applied Statistics / v.24 no.6 / pp.1271-1284 / 2011
  • The entropy-based test of fit for the inverse Gaussian distribution presented by Mudholkar and Tian (2002) can only be applied to the composite hypothesis that a sample is drawn from an inverse Gaussian distribution with both the location and scale parameters unknown. In application, however, a researcher may want a test of fit for an inverse Gaussian distribution with one parameter known, or with both parameters known. In this paper, we introduce tests of fit for the inverse Gaussian distribution based on the Kullback-Leibler information, as an extension of the entropy-based test. A window size must be chosen to implement the proposed tests. By means of Monte Carlo simulations, window sizes are determined for a wide range of sample sizes, and the corresponding critical values of the test statistics are estimated. A power analysis against various alternatives shows that the Kullback-Leibler information-based goodness-of-fit tests have good power.
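
The window size mentioned above is the spacing parameter of Vasicek-type sample entropy estimators, on which such entropy and Kullback-Leibler tests are built. A minimal sketch of that estimator follows; the function name and the uniform-grid example are illustrative assumptions, not the paper's code.

```python
import math

def vasicek_entropy(sample, m):
    """Vasicek (1976) spacing estimator of differential entropy with
    window size m: the average of log(n * (x_(i+m) - x_(i-m)) / (2m)),
    clamping order-statistic indices outside 1..n to the extremes."""
    x = sorted(sample)
    n = len(x)
    total = 0.0
    for i in range(n):
        lo = x[max(i - m, 0)]       # x_(i-m), clamped at the minimum
        hi = x[min(i + m, n - 1)]   # x_(i+m), clamped at the maximum
        total += math.log(n * (hi - lo) / (2 * m))
    return total / n

# Evenly spaced points stand in for a Uniform(0, 1) sample, whose true
# differential entropy is 0; the estimate should land close to it.
grid = [i / 99 for i in range(100)]
print(vasicek_entropy(grid, m=5))
```

A goodness-of-fit statistic then compares this estimate against the entropy (or Kullback-Leibler information) implied by the hypothesized distribution, which is why the window size must be fixed per sample size.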

A Complexity Metric for Web Documentation Based on Entropy (엔트로피를 기반으로한 Web 문서들의 복잡도 척도)

  • Kim, Kap-Su
    • Journal of The Korean Association of Information Education / v.2 no.2 / pp.260-268 / 1998
  • In this paper, I propose a metric model for measuring the complexity of Web documents written in HTML and XML. The complexity of a Web document affects its understandability, which is an important metric for the maintenance and reuse of Web documents. Understandable documents also have a greater effect on WEI. The proposed metric uses entropy to represent the degree of information flow between Web documents, measuring the information flow in a Web document based on the information-passing relationships between its document files. I evaluate the proposed metric against the complexity properties proposed by Weyuker and measure document complexity, showing its effectiveness by analyzing the correlation between the number of document files and document complexity.

Automatic Quality Measurement of Gray-scale Handwriting Based on Extended Average Entropy (확장된 평균 엔트로피에 기반한 명도 영상 필기 데이터의 품질 자동 평가)

  • 박정선
    • Korean Journal of Cognitive Science / v.10 no.3 / pp.77-83 / 1999
  • With the surge of interest in OCR in the 1990s, a large number of handwriting and handprinting databases have been built around the world. One problem researchers encounter today is that the databases differ in various ways, including script quality. This paper proposes a method for measuring handwriting quality that can be used to compare databases and to test character recognizers objectively. The key idea is to classify character samples into a number of groups, each characterizing a set of qualities. To evaluate the proposed method, we carried out experiments on the KU-1 database. The results are meaningful, and the method is helpful for the target tasks.

An Effective Feature Generation Method for Distributed Denial of Service Attack Detection using Entropy (엔트로피를 이용한 분산 서비스 거부 공격 탐지에 효과적인 특징 생성 방법 연구)

  • Kim, Tae-Hun;Seo, Ki-Taek;Lee, Young-Hoon;Lim, Jong-In;Moon, Jong-Sub
    • Journal of the Korea Institute of Information Security & Cryptology / v.20 no.4 / pp.63-73 / 2010
  • Malicious bot programs, the source of distributed denial-of-service (DDoS) attacks, are widespread, and the number of PCs infected by them is increasing geometrically. DDoS attacks are launched continuously through these bot PCs, and some financial incidents have recently been reported. Research on responding to DDoS attacks is therefore necessary, so we propose an effective feature-generation method for DDoS attack detection using entropy. In this paper, we apply the method both to the DARPA 2000 datasets and to DDoS attack datasets that we composed and generated ourselves at a university, and we then evaluate its usefulness through classification with a Bayesian network classifier.
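
A common entropy feature for DDoS detection is the entropy of a packet-header field, such as the source IP address, within a time window: flood traffic from many spoofed sources pushes the entropy up. The sketch below is an illustration under that assumption; the window contents and addresses are invented, not the paper's datasets.

```python
import math
from collections import Counter

def entropy_of(values):
    """Sample entropy (bits) of a categorical field observed in one window."""
    counts = Counter(values)
    n = len(values)
    return -sum(c / n * math.log2(c / n) for c in counts.values())

# Hypothetical packet windows: normal traffic repeats a few client IPs,
# while flood traffic arrives from many distinct spoofed sources.
normal = ["10.0.0.1"] * 6 + ["10.0.0.2"] * 3 + ["10.0.0.3"]
flood = [f"10.0.{i}.{i}" for i in range(10)]

print(entropy_of(normal) < entropy_of(flood))  # True: entropy spikes under attack
```

Per-window entropies like this, computed over several header fields, can then feed a classifier such as the Bayesian network mentioned above.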

The Generation of Control Rules for Data Mining (데이터 마이닝을 위한 제어규칙의 생성)

  • Park, In-Kyoo
    • Journal of Digital Convergence / v.11 no.11 / pp.343-349 / 2013
  • Rough set theory derives optimal rules in data mining through the effective selection of features from largely redundant information, using the concepts of the equivalence relation and the approximation space. Attribute reduction is one of the most important parts of applying rough sets. This paper defines an information-theoretic measure, based on rough entropy, for determining the most important attribute within an association of attributes. The proposed method generates an effective reduct set and formulates the core of the attribute set through the elimination of redundant attributes. The control rules are then generated from a subset of features that retains the accuracy of the original features.
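
The attribute-importance idea above can be sketched with plain conditional entropy H(decision | attribute) over a toy decision table: the attribute leaving the least uncertainty about the decision is the most important, and attributes leaving maximal uncertainty are candidates for elimination. The table and helper are hypothetical illustrations, not the paper's rough-entropy definition.

```python
import math
from collections import Counter

def cond_entropy(table, attrs, decision):
    """H(decision | attrs) over an information-system table (list of dicts)."""
    n = len(table)
    joint = Counter((tuple(r[a] for a in attrs), r[decision]) for r in table)
    marg = Counter(tuple(r[a] for a in attrs) for r in table)
    return -sum(c / n * math.log2(c / marg[key]) for (key, _), c in joint.items())

# Hypothetical decision table: 'humidity' alone determines 'play',
# so it is the more important attribute and 'wind' is redundant.
table = [
    {"humidity": "high", "wind": "weak", "play": "no"},
    {"humidity": "high", "wind": "strong", "play": "no"},
    {"humidity": "low", "wind": "weak", "play": "yes"},
    {"humidity": "low", "wind": "strong", "play": "yes"},
]
ranked = sorted(["humidity", "wind"], key=lambda a: cond_entropy(table, [a], "play"))
print(ranked[0])  # humidity: it leaves no uncertainty about the decision
```

Repeating this ranking while dropping attributes whose removal does not raise H(decision | remaining attrs) yields a reduct in the rough-set sense.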