• Title/Summary/Keyword: maximum entropy measure


Minimum Variance Unbiased Estimation for the Maximum Entropy of the Transformed Inverse Gaussian Random Variable by Y=X-1/2

  • Choi, Byung-Jin
    • Communications for Statistical Applications and Methods
    • /
    • v.13 no.3
    • /
    • pp.657-667
    • /
    • 2006
  • The concept of entropy, introduced into communication theory by Shannon (1948) as a measure of uncertainty, is of prime interest in information-theoretic statistics. This paper considers minimum variance unbiased estimation of the maximum entropy of the inverse Gaussian random variable transformed by $Y=X^{-1/2}$. The properties of the derived UMVU estimator are investigated.
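The quantity being estimated above is a differential entropy. As a hedged illustration only (a Monte Carlo sketch, not the paper's UMVU estimator; the shape parameter `mu=1.0` is an arbitrary choice), the entropy of the transformed variable can be approximated from samples:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# X ~ inverse Gaussian; mu=1.0 is an illustrative shape parameter, not the paper's
x = stats.invgauss.rvs(mu=1.0, size=100_000, random_state=rng)
y = x ** -0.5                      # the transformation Y = X^(-1/2)

# Nonparametric (Vasicek-type) estimate of the differential entropy of Y
h_y = stats.differential_entropy(y)
print(round(h_y, 3))
```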

Maximum entropy test for infinite order autoregressive models

  • Lee, Sangyeol;Lee, Jiyeon;Noh, Jungsik
    • Journal of the Korean Data and Information Science Society
    • /
    • v.24 no.3
    • /
    • pp.637-642
    • /
    • 2013
  • In this paper, we consider the maximum entropy test in infinite order autoregressive models. Its asymptotic distribution is derived under the null hypothesis. A bootstrap version of the test is discussed and its performance is evaluated through Monte Carlo simulations.

Which country's end devices are most sharing vulnerabilities in East Asia? (거시적인 관점에서 바라본 취약점 공유 정도를 측정하는 방법에 대한 연구)

  • Kim, Kwangwon;Won, Yoon Ji
    • Journal of the Korea Institute of Information Security & Cryptology
    • /
    • v.25 no.5
    • /
    • pp.1281-1291
    • /
    • 2015
  • Compared to the past, people can now control end devices via open channels. Although these open channels provide convenience to users, they frequently turn into security holes. In this paper, we propose a new human-centered security risk analysis method that puts weight on the relationships between end devices. The measure derives from the concept of entropy rate, which is known as the uncertainty per node in a network. As there are limitations to using the entropy rate as a measure when comparing networks of different sizes, we divide the entropy rate of a network by the maximum entropy rate of the network. We also show how to avoid violating irreducibility, which is a precondition for the entropy rate of a random walk on a graph.
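The normalized measure described in this abstract can be sketched as follows, assuming the maximum entropy rate of an n-node network is taken to be log n (the paper's exact normalization may differ):

```python
import numpy as np

def normalized_entropy_rate(adj):
    """Entropy rate of a random walk on an undirected graph, divided by
    log(n) as a stand-in for the maximum entropy rate of the network.
    Assumes the graph is connected (so the chain is irreducible)."""
    adj = np.asarray(adj, dtype=float)
    deg = adj.sum(axis=1)
    P = adj / deg[:, None]                    # transition matrix of the walk
    pi = deg / deg.sum()                      # stationary distribution
    with np.errstate(divide="ignore"):
        logP = np.where(P > 0, np.log(np.where(P > 0, P, 1.0)), 0.0)
    h = -np.sum(pi[:, None] * P * logP)       # entropy rate
    return h / np.log(len(adj))

# 4-cycle: every step has 2 equally likely moves, so h = log 2 and
# the normalized value is log(2)/log(4) = 0.5
ring = np.array([[0, 1, 0, 1], [1, 0, 1, 0], [0, 1, 0, 1], [1, 0, 1, 0]])
print(round(normalized_entropy_rate(ring), 6))  # 0.5
```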

Part-Of-Speech Tagging using multiple sources of statistical data (이종의 통계정보를 이용한 품사 부착 기법)

  • Cho, Seh-Yeong
    • Journal of the Korean Institute of Intelligent Systems
    • /
    • v.18 no.4
    • /
    • pp.501-506
    • /
    • 2008
  • Statistical POS tagging is prone to error because of the inherent limitations of statistical data, especially when a single source of data is used. It is therefore widely agreed that the possibility of further enhancement lies in exploiting various knowledge sources. However, these data sources are bound to be inconsistent with each other. This paper shows the possibility of applying the maximum entropy model to Korean POS tagging. We use n-gram data and trigger-pair data as knowledge sources, and show how the perplexity measure varies when the two are combined using the maximum entropy method. The experiment used a trigram model that produced 94.9% accuracy with a Hidden Markov Model; accuracy increased to 95.6% when combined with trigger-pair data using the maximum entropy method. This clearly shows the possibility of further enhancement as various knowledge sources are developed and combined using the ME method.
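As a hedged illustration of the perplexity comparison used in this abstract, with made-up per-token probabilities rather than the paper's Korean-corpus numbers:

```python
import numpy as np

def perplexity(token_probs):
    # Perplexity = exp of the average negative log-probability per token
    return float(np.exp(-np.mean(np.log(token_probs))))

# Hypothetical probabilities a tagger assigns to the correct tags of a
# 4-token test sequence, before and after adding a second knowledge source
p_single   = [0.20, 0.05, 0.10, 0.30]   # n-gram source alone
p_combined = [0.25, 0.09, 0.15, 0.33]   # with trigger-pair evidence added

# Lower perplexity indicates the combined model predicts the data better
print(perplexity(p_single) > perplexity(p_combined))  # True
```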

A Study on the Maximum Velocity and the Surface Velocity (최대유속과 표면유속에 관한 연구)

  • Choo, Tai Ho;Je, Sung Jin
    • Proceedings of the Korea Contents Association Conference
    • /
    • 2006.05a
    • /
    • pp.351-355
    • /
    • 2006
  • The purpose of this study is to develop an efficient and useful discharge-measurement equation that can easily calculate discharge using only the surface velocity in channels and rivers. The results show that: (1) natural rivers have a propensity to establish and maintain an equilibrium state that corresponds to a value of the entropy parameter M; (2) the velocity distribution estimated by the surface-velocity method, compared with actual survey data, shows fairly close agreement between estimated and observed values; (3) equations for calculating discharge using the surface velocity at the location of maximum velocity in a river section were established and shown to be fairly acceptable. An entropy-based method for determining discharge using only the surface velocity in rivers has thus been developed. The method is also efficient and applicable for estimating discharge in high flows during the flood season, which were previously very difficult or impossible to measure for technical or theoretical reasons.
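The entropy parameter M in result (1) is commonly used through Chiu's entropy-based velocity distribution, in which the ratio of mean to maximum velocity is φ(M) = e^M / (e^M − 1) − 1/M. A minimal sketch under that assumption, with hypothetical values for M, the maximum velocity, and the cross-sectional area (none of them from the paper):

```python
import numpy as np

def phi(M):
    """Mean-to-maximum velocity ratio implied by the entropy parameter M
    in Chiu's entropy-based velocity distribution (a standard result in
    this line of work; M = 2.0 below is illustrative, not calibrated)."""
    return np.exp(M) / (np.exp(M) - 1.0) - 1.0 / M

# With a calibrated M for a river section, mean velocity (and hence
# discharge Q = mean velocity * area) follows from one velocity reading.
M = 2.0
u_max = 1.8          # m/s, hypothetical maximum (surface) velocity
area = 25.0          # m^2, hypothetical cross-sectional area
u_mean = phi(M) * u_max
Q = u_mean * area
print(round(Q, 2))
```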


Application of Texture Feature Analysis Algorithm used the Statistical Characteristics in the Computed Tomography (CT): A base on the Hepatocellular Carcinoma (HCC) (전산화단층촬영 영상에서 통계적 특징을 이용한 질감특징분석 알고리즘의 적용: 간세포암 중심으로)

  • Yoo, Jueun;Jun, Taesung;Kwon, Jina;Jeong, Juyoung;Im, Inchul;Lee, Jaeseung;Park, Hyonghu;Kwak, Byungjoon;Yu, Yunsik
    • Journal of the Korean Society of Radiology
    • /
    • v.7 no.1
    • /
    • pp.9-15
    • /
    • 2013
  • In this study, a texture feature analysis (TFA) algorithm for the automatic recognition of liver disease on computed tomography (CT) images is proposed, and a computer-aided diagnosis (CAD) system for hepatocellular carcinoma (HCC) is designed by applying the algorithm. The performance of each parameter was compared and evaluated. In the HCC images, a region of analysis (ROA; window size $40{\times}40$ pixels) was set, and the HCC recognition rate was calculated for six TFA parameters (average gray level, average contrast, measure of smoothness, skewness, measure of uniformity, and entropy). As a result, TFA was found to be significant for HCC recognition. Measure of uniformity gave the highest recognition rate; average contrast, measure of smoothness, and skewness were relatively high; and average gray level and entropy showed relatively low recognition rates. The high-recognition parameters (maximum 97.14%, minimum 82.86%) can be used to identify HCC lesions on imaging and to assist early clinical diagnosis, improving diagnostic efficiency over current practice. Further research on effective, quantitative analysis and on criteria for generalized disease recognition is needed.
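The six parameters named in this abstract are standard first-order statistics of the gray-level histogram. A textbook-style sketch (the paper's exact scaling of each parameter may differ):

```python
import numpy as np

def texture_features(img, levels=256):
    """Six first-order statistical texture parameters computed from the
    gray-level histogram of a region of analysis (textbook formulation;
    the paper's exact normalization may differ)."""
    hist = np.bincount(img.ravel(), minlength=levels).astype(float)
    p = hist / hist.sum()
    z = np.arange(levels, dtype=float)
    mean = (z * p).sum()                         # average gray level
    var = ((z - mean) ** 2 * p).sum()
    contrast = np.sqrt(var)                      # average contrast (std dev)
    smoothness = 1.0 - 1.0 / (1.0 + var)         # R = 1 - 1/(1 + sigma^2)
    skewness = ((z - mean) ** 3 * p).sum()       # third central moment
    uniformity = (p ** 2).sum()                  # energy
    entropy = -(p[p > 0] * np.log2(p[p > 0])).sum()
    return mean, contrast, smoothness, skewness, uniformity, entropy

# Synthetic 40x40 ROA standing in for a CT window
roa = np.random.default_rng(1).integers(0, 256, size=(40, 40))
print(texture_features(roa))
```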

Semi-automatic Construction of Training Data using Active Learning (능동 학습을 이용한 학습 데이터 반자동 구축)

  • Lee, Chang-Ki;Hur, Jeong;Wang, Ji-Hyun;Lee, Chung-Hee;Oh, Hyo-Jung;Jang, Myung-Gil;Lee, Young-Jik
    • Proceedings of the Korean HCI Society Conference
    • /
    • 2006.02a
    • /
    • pp.1252-1255
    • /
    • 2006
  • This paper describes a semi-automatic training-data construction system, and its method, for efficiently building the training data required by statistical approaches to information retrieval, information extraction, machine translation, and natural language processing. To reduce the amount of training data that must be built, we use active learning. We also define a confidence measure based on Conditional Random Fields (CRF), which have recently drawn much attention, so that CRF can be used within active learning.
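A generic sketch of how a confidence measure drives active learning (least-confidence sampling with hypothetical sentence scores, not the CRF-specific measure defined in the paper):

```python
def least_confidence(posterior_best):
    """Least-confidence score for active learning: 1 minus the model's
    probability for its best labeling (a generic confidence measure,
    not the paper's CRF-based definition)."""
    return 1.0 - posterior_best

# Hypothetical best-sequence probabilities from a CRF for three sentences
scores = {s: least_confidence(p) for s, p in
          {"sent1": 0.95, "sent2": 0.40, "sent3": 0.70}.items()}

# Annotate the sentence the model is least confident about first
print(max(scores, key=scores.get))  # sent2
```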


A Study on Decision Tree for Multiple Binary Responses

  • Lee, Seong-Keon
    • Communications for Statistical Applications and Methods
    • /
    • v.10 no.3
    • /
    • pp.971-980
    • /
    • 2003
  • The tree method can be extended to multivariate responses, such as repeated measures and longitudinal data, by modifying the split function to accommodate multiple responses. Recently, decision trees for multiple responses have been constructed by Segal (1992) and Zhang (1998): Segal suggested a tree that analyzes continuous longitudinal responses using the Mahalanobis distance as a within-node homogeneity measure, and Zhang suggested a tree that analyzes multiple binary responses using a generalized entropy criterion, which is proportional to the maximum likelihood of the joint distribution of the multiple binary responses. In this paper, we modify the CART procedure and suggest a new tree-based method that can analyze multiple binary responses using similarity measures.
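Zhang's criterion is proportional to the joint log-likelihood; a simplified stand-in (not Zhang's exact formulation) treats node impurity as the empirical entropy of the joint distribution of the binary response vectors:

```python
from collections import Counter
import numpy as np

def joint_entropy_impurity(Y):
    """Node impurity as the empirical entropy of the joint distribution
    of multiple binary responses (simplified stand-in for Zhang's
    generalized entropy criterion; lower means purer)."""
    counts = Counter(map(tuple, Y))
    n = len(Y)
    return -sum((c / n) * np.log(c / n) for c in counts.values())

# A pure node (identical response vectors) has impurity 0;
# a node with three distinct vectors has impurity log 3
pure = [(1, 0), (1, 0), (1, 0)]
mixed = [(1, 0), (0, 1), (1, 1)]
print(joint_entropy_impurity(pure), joint_entropy_impurity(mixed))
```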

A Novel Iris recognition method robust to noises and translation (잡음과 위치이동에 강인한 새로운 홍채인식 기법)

  • Won, Jung-Woo;Kim, Jae-Min;Cho, Sung-Won;Choi, Kyung-Sam;Choi, Jin-Su
    • Proceedings of the KIEE Conference
    • /
    • 2003.11c
    • /
    • pp.392-395
    • /
    • 2003
  • This paper describes a new iris segmentation and recognition method that is robust to noise. The iris is first segmented by combining statistical classification and elastic boundary fitting. The localized iris image is then smoothed by convolution with a Gaussian function, down-sampled, filtered with a Laplacian operator, and quantized using the Lloyd-Max method. Since the quantized output is sensitive to small shifts of the full-resolution iris image, the outputs of the Laplacian operator are computed for all spatial shifts, and the quantized output with maximum entropy is selected as the final feature representation. An appropriate similarity measure is defined for classifying the quantized output. Experiments show that the proposed method performs very well in iris segmentation and recognition.
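The maximum-entropy shift selection can be sketched in one dimension, with uniform quantization standing in for Lloyd-Max and a random signal standing in for the filtered iris image:

```python
import numpy as np

def entropy_of_labels(q):
    # Shannon entropy (bits) of the label histogram of a quantized signal
    _, counts = np.unique(q, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())

def best_shift(signal, n_shifts=4, n_levels=4):
    """Quantize every circular shift of a 1-D feature signal and keep the
    shift whose quantized output has maximum entropy (simplified sketch:
    uniform quantization in place of Lloyd-Max, 1-D in place of 2-D)."""
    edges = np.linspace(signal.min(), signal.max(), n_levels + 1)[1:-1]
    best = None
    for s in range(n_shifts):
        q = np.digitize(np.roll(signal, s), edges)   # labels 0..n_levels-1
        h = entropy_of_labels(q)
        if best is None or h > best[0]:
            best = (h, s, q)
    return best

sig = np.random.default_rng(2).normal(size=64)
h, s, q = best_shift(sig)
print(h, s)
```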


TAKES: Two-step Approach for Knowledge Extraction in Biomedical Digital Libraries

  • Song, Min
    • Journal of Information Science Theory and Practice
    • /
    • v.2 no.1
    • /
    • pp.6-21
    • /
    • 2014
  • This paper proposes a novel knowledge extraction system, TAKES (Two-step Approach for Knowledge Extraction System), which integrates advanced techniques from Information Retrieval (IR), Information Extraction (IE), and Natural Language Processing (NLP). In particular, TAKES adopts a novel keyphrase extraction-based query expansion technique to collect promising documents, and a Conditional Random Field-based machine learning technique to extract important biological entities and relations. TAKES is applied to biological knowledge extraction, particularly to retrieving promising documents that contain Protein-Protein Interactions (PPI) and extracting PPI pairs. TAKES consists of two major components: DocSpotter, which is used to query and retrieve promising documents for extraction, and a Conditional Random Field (CRF)-based entity extraction component known as FCRF. The present paper investigates the research problems a knowledge extraction system must address and reports a series of experiments testing our hypotheses. The findings are as follows. First, using three different test collections to measure the performance of our query expansion technique, the author verified that DocSpotter is robust and highly accurate compared to Okapi BM25 and SLIPPER. Second, the author verified that our relation extraction algorithm, FCRF, is highly accurate in terms of F-measure compared to four other competitive extraction algorithms: Support Vector Machine, Maximum Entropy, Single POS HMM, and Rapier.