• Title/Summary/Keyword: entropy-based test


Text Categorization Based on the Maximum Entropy Principle (최대 엔트로피 기반 문서 분류기의 학습)

  • 장정호;장병탁;김영택
    • Proceedings of the Korean Information Science Society Conference / 1999.10b / pp.57-59 / 1999
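  • This paper proposes training a text classifier based on the maximum entropy principle. The maximum entropy method is one of the widely used approaches in natural language processing, for tasks such as language modeling and part-of-speech tagging. Feature selection is important for the efficiency of a maximum entropy model. We therefore experiment with the chi-square test, log-likelihood ratio, information gain, and mutual information as criteria for selecting the feature set, and compare the results with those obtained using all candidate features. Reuters-21578 is used as the data set, and binary classification experiments are performed for each class.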

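
Two of the feature-selection criteria named in the abstract above, the chi-square statistic and information gain, can be sketched for a single term and a single binary class. This is a minimal illustration using the standard 2x2 contingency-table formulas, not the authors' implementation; the function names and the cell labels `a`, `b`, `c`, `d` are assumptions made for this example.

```python
import math


def chi_square(a, b, c, d):
    """Chi-square score of a term for a binary class.

    a: in-class docs containing the term      b: out-of-class docs containing it
    c: in-class docs lacking the term         d: out-of-class docs lacking it
    """
    n = a + b + c + d
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    return n * (a * d - c * b) ** 2 / denom if denom else 0.0


def entropy(counts):
    """Shannon entropy in bits of a list of counts (empty cells are skipped)."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum(k / total * math.log2(k / total) for k in counts if k)


def information_gain(a, b, c, d):
    """Reduction in class entropy after observing presence/absence of the term."""
    n = a + b + c + d
    prior = entropy([a + c, b + d])
    with_term = (a + b) / n * entropy([a, b])
    without_term = (c + d) / n * entropy([c, d])
    return prior - with_term - without_term
```

Terms would then be ranked by either score and only the top-scoring ones kept as features; the cutoff is a tuning choice that the abstract does not specify.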

Automatic Quality Measurement of Gray-scale Handwriting Based on Extended Average Entropy (확장된 평균 엔트로피에 기반한 명도 영상 필기 데이터의 품질 자동 평가)

  • 박정선
    • Korean Journal of Cognitive Science / v.10 no.3 / pp.77-83 / 1999
  • With a surge of interest in OCR in the 1990s, a large number of handwriting or handprinting databases have been built one after another around the world. One problem that researchers encounter today is that the databases differ in various ways, including script quality. This paper proposes a method for measuring handwriting quality that can be used for comparing databases and for objective testing of character recognizers. The key idea is to classify character samples into a number of groups, each characterizing a set of qualities. To evaluate the proposed method, we carried out experiments on the KU-1 database. The results are meaningful, and the method is helpful for the target tasks.


Effect of Sintering Condition on Tensile Strength of Fe-based Non-equiatomic High Entropy Alloy (철계 비동일분율 고엔트로피 합금의 인장 강도에 미치는 소결 조건 영향)

  • Seo, Namhyuk;Jeon, Junhyub;Kim, Gwanghun;Park, Jungbin;Son, Seung Bae;Lee, Seok-Jae
    • Journal of Powder Materials / v.28 no.3 / pp.221-226 / 2021
  • We fabricate the non-equiatomic high-entropy alloy (NE-HEA) Fe49.5Mn30Co10Cr10C0.5 (at.%) using spark plasma sintering under various sintering conditions. Each elemental pure powder is milled by high-energy ball milling to prepare NE-HEA powder. The microstructure and mechanical properties of the sintered samples are investigated using various methods. We use the X-ray diffraction (XRD) method to investigate the microstructural characteristics. Quantitative phase analysis is performed by direct comparison of the XRD results. A tensile test is used to compare the mechanical properties of small samples. Next, electron backscatter diffraction analysis is performed to analyze the phase fraction, and the results are compared to those of XRD analysis. By combining different sintering durations and temperature conditions, we attempt to identify suitable spark plasma sintering conditions that yield mechanical properties comparable with previously reported values. The samples sintered at 900 and 1000℃ with no holding time have a tensile strength of over 1000 MPa.

Diversity and Genotypic Structure of ECOR Collection Determined by Repetitive Extragenic Palindromic PCR Genome Fingerprinting

  • HWANG KEUM-OK;JANG HYO-MI;CHO JAE-CHANG
    • Journal of Microbiology and Biotechnology / v.15 no.3 / pp.672-677 / 2005
  • The standard reference collection of strains for E. coli, the ECOR collection, was analyzed by a genome-based typing method. Seventy-one ECOR strains were subjected to repetitive extragenic palindromic PCR genome fingerprinting with BOX primers (BOX-PCR). Using a similarity value of 0.8 or more after cluster analysis of BOX-PCR fingerprinting patterns to define the same genotypes, we identified 28 genotypes in the ECOR collection. Shannon's entropy-based diversity index was 3.07, and the incidence-based coverage estimator indicated potentially 420 genotypes among E. coli populations. A chi-square goodness-of-fit test showed a statistically significant association between the genotypes defined by BOX-PCR fingerprinting and the groups previously defined by multi-locus enzyme electrophoresis. This study suggests that the diversification of E. coli strains in natural populations is actively ongoing, and that rep-PCR fingerprinting is a convenient and reliable method for typing E. coli strains for purposes ranging from ecology to quarantine.
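
As a rough illustration of the Shannon diversity index mentioned in the abstract above, the sketch below computes H' = -Σ p_i ln p_i from the number of strains assigned to each genotype. The natural logarithm is an assumption (the abstract does not state the base), and `shannon_index` and its input are hypothetical names, not taken from the paper.

```python
import math


def shannon_index(counts):
    """Shannon diversity H' = -sum(p_i * ln p_i) over genotype counts."""
    total = sum(counts)
    # Genotypes with zero members contribute nothing to the sum.
    return -sum(c / total * math.log(c / total) for c in counts if c > 0)
```

With this definition the index reaches its maximum, ln(k), when all k genotypes are equally represented, and falls to 0 when every strain shares a single genotype.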

Tests for Exponentiality by Kullback-Leibler Information (지수분포의 검정을 위한 쿨백-레이블러 정보함수)

  • 김종태;이우동;강석복
    • Journal of Korea Society of Industrial Information Systems / v.5 no.2 / pp.39-46 / 2000
  • Recently, van Es (1992) and Correa (1995) proposed estimators of entropy. In this paper, we propose goodness-of-fit test statistics for exponentiality based on Vasicek's estimator and Correa's estimator of Kullback-Leibler information, and we compare the power of the proposed test statistics with that of the Kolmogorov-Smirnov, Kuiper, Cramér-von Mises, Watson, Anderson-Darling, and Finkelstein and Schafer statistics.

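
The idea behind an entropy-based test for exponentiality can be sketched as follows. Vasicek's spacing estimator approximates the sample entropy H, and because an exponential density with mean θ has entropy 1 + ln θ, the Kullback-Leibler information against the exponential fitted by the sample mean reduces to -H + ln(mean) + 1. This is a minimal sketch under those assumptions, not the exact statistics of the paper; the function names and the window parameter `m` are illustrative, and the sample is assumed to contain no ties within a window.

```python
import math


def vasicek_entropy(sample, m):
    """Vasicek's spacing-based entropy estimate with window size m."""
    xs = sorted(sample)
    n = len(xs)
    if not 0 < m < n / 2:
        raise ValueError("window m must satisfy 0 < m < n/2")
    total = 0.0
    for i in range(n):
        # Order statistics beyond the ends are clamped to the min/max.
        lo = xs[max(i - m, 0)]
        hi = xs[min(i + m, n - 1)]
        total += math.log(n / (2 * m) * (hi - lo))
    return total / n


def kl_exponentiality(sample, m):
    """Estimated KL information between the data and a fitted exponential."""
    mean = sum(sample) / len(sample)
    return -vasicek_entropy(sample, m) + math.log(mean) + 1.0
```

Large values of the statistic count as evidence against exponentiality. Critical values depend on the sample size and on `m` and are typically obtained by simulation.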

A design of Context-Based Adaptive Variable Length Coder For H.264 (H.264용 Context-Based Adaptive Variable Length Coder(CAVLC) 설계)

  • Lee, Hong-Sic;Suh, Ki-Bum
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / v.9 no.2 / pp.237-240 / 2005
  • This paper proposes a novel CAVLC architecture for H.264 and presents a CAVLC module that can be used in AMBA-based designs. The module processes one macroblock in 420 cycles and supports both the long start code method of Annex B.1 and RTP. To verify the CAVLC architecture, we developed reference C code from JM8.5 and verified the hardware using test vectors generated by the reference C code. The designed circuit operates with a 54 MHz clock and has a gate count of 14,096 in the Hynix 0.35 um TLM process.


TAKES: Two-step Approach for Knowledge Extraction in Biomedical Digital Libraries

  • Song, Min
    • Journal of Information Science Theory and Practice / v.2 no.1 / pp.6-21 / 2014
  • This paper proposes a novel knowledge extraction system, TAKES (Two-step Approach for Knowledge Extraction System), which integrates advanced techniques from Information Retrieval (IR), Information Extraction (IE), and Natural Language Processing (NLP). In particular, TAKES adopts a novel keyphrase extraction-based query expansion technique to collect promising documents. It also uses a Conditional Random Field-based machine learning technique to extract important biological entities and relations. TAKES is applied to biological knowledge extraction, particularly retrieving promising documents that contain Protein-Protein Interaction (PPI) and extracting PPI pairs. TAKES consists of two major components: DocSpotter, which is used to query and retrieve promising documents for extraction, and a Conditional Random Field (CRF)-based entity extraction component known as FCRF. The present paper investigated research problems addressing the issues with a knowledge extraction system and conducted a series of experiments to test our hypotheses. The findings from the experiments are as follows: First, the author verified, using three different test collections to measure the performance of our query expansion technique, that DocSpotter is robust and highly accurate when compared to Okapi BM25 and SLIPPER. Second, the author verified that our relation extraction algorithm, FCRF, is highly accurate in terms of F-Measure compared to four other competitive extraction algorithms: Support Vector Machine, Maximum Entropy, Single POS HMM, and Rapier.

A Self-Timed Ring based Lightweight TRNG with Feedback Structure (피드백 구조를 갖는 Self-Timed Ring 기반의 경량 TRNG)

  • Choe, Jun-Yeong;Shin, Kyung-Wook
    • Journal of the Korea Institute of Information and Communication Engineering / v.24 no.2 / pp.268-275 / 2020
  • A lightweight hardware design of self-timed ring based true random number generator (TRNG) suitable for information security applications is described. To reduce hardware complexity of TRNG, an entropy extractor with feedback structure was proposed, which minimizes the number of ring stages. The number of ring stages of the FSTR-TRNG was determined to be a multiple of eleven, taking into account operating clock frequency and entropy extraction circuit, and the ratio of tokens to bubbles was determined to operate in evenly-spaced mode. The hardware operation of FSTR-TRNG was verified by FPGA implementation. A set of statistical randomness tests defined by NIST 800-22 were performed by extracting 20 million bits of binary sequences generated by FSTR-TRNG, and all of the fifteen test items were found to meet the criteria. The FSTR-TRNG occupied 46 slices of Spartan-6 FPGA device, and it was implemented with about 2,500 gate equivalents (GEs) when synthesized in 180 nm CMOS standard cell library.
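
To give a sense of what the NIST SP 800-22 test items check, the sketch below implements the simplest of them, the frequency (monobit) test: bits are mapped to ±1 and summed, and the p-value is erfc(|S_n| / sqrt(2n)). It is an illustration only, not the authors' test setup; `monobit_p_value` is a hypothetical name, and the 0.01 significance level mentioned below is the one customarily used with the suite.

```python
import math


def monobit_p_value(bits):
    """p-value of the SP 800-22 frequency (monobit) test for a 0/1 sequence."""
    n = len(bits)
    # Map 1 -> +1 and 0 -> -1, then sum.
    s = sum(1 if b else -1 for b in bits)
    s_obs = abs(s) / math.sqrt(n)
    return math.erfc(s_obs / math.sqrt(2))
```

A sequence is usually considered to pass this item when the p-value is at least 0.01. The remaining fourteen items test other properties, such as runs, spectral peaks, and template matches.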

Randomness Based Fuzzing Test Case Evaluation for Vulnerability Analysis of Industrial Control System (산업제어시스템 취약성 분석을 위한 무작위성 기반 퍼징 테스트 케이스 평가 기법)

  • Kim, SungJin;Shon, Taeshik
    • Journal of the Korea Institute of Information Security & Cryptology / v.28 no.1 / pp.179-186 / 2018
  • The number of devices connected to the Internet is rapidly increasing with the advent of the IoT (Internet of Things). The IoT has improved the convenience of life, but it also raises security issues such as privacy violations. Cybersecurity is therefore the most important issue to be discussed nowadays. In particular, various protocols are used for the same purpose because of the rapid growth of the IoT market. To deal with this security threat, novel vulnerability analysis is needed. In this paper, we contribute to IoT security by proposing a new randomness-based test case evaluation methodology using variance and entropy. Unlike the traditional technique, the proposed method can evaluate test cases at high speed regardless of the test set size.
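
The abstract above does not say how variance and entropy are computed over a test case. One plausible reading, shown here purely as an assumption, is to treat a fuzzing test case as a byte string and measure the Shannon entropy and the variance of its byte values. The function names are hypothetical.

```python
import math
from collections import Counter


def byte_entropy(data):
    """Shannon entropy of a byte string, in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    n = len(data)
    return -sum(c / n * math.log2(c / n) for c in Counter(data).values())


def byte_variance(data):
    """Population variance of the byte values in a non-empty byte string."""
    n = len(data)
    mean = sum(data) / n
    return sum((b - mean) ** 2 for b in data) / n
```

Both measures need only a single pass over each test case, so scoring cost grows linearly with the test case length.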

Multiple Path Based Vehicle Routing in Dynamic and Stochastic Transportation Networks

  • Park, Dong-joo
    • Proceedings of the KOR-KST Conference / 2000.02a / pp.25-47 / 2000
  • In route guidance systems, fastest-path routing has typically been adopted because of its simplicity. However, empirical studies on route choice behavior have shown that drivers use numerous criteria in choosing a route. The objective of this study is to develop computationally efficient algorithms for identifying a manageable subset of the nondominated (i.e., Pareto optimal) paths for real-time vehicle routing that reflect drivers' preferences and route choice behaviors. We propose two pruning algorithms that reduce the search area based on a context-dependent linear utility function and thus reduce the computation time. The basic notion of the proposed approach is that (i) enumerating all nondominated paths is computationally too expensive, (ii) obtaining a stable mathematical representation of the drivers' utility function is theoretically difficult and impractical, and (iii) obtaining the optimal path given a nonlinear utility function is an NP-hard problem. Consequently, a heuristic two-stage strategy that identifies multiple routes and then selects the near-optimal path may be effective and practical. In the first stage, we use a relaxation-based pruning technique built on an entropy model to recognize and discard most of the nondominated paths that do not reflect the drivers' preference and/or the context-dependency of the preference. In addition, to make sure that the paths identified are dissimilar in terms of links used, the number of shared links between routes is limited. We test the proposed algorithms in a large real-life traffic network and show that the algorithms reduce CPU time significantly compared with conventional multi-criteria shortest path algorithms, while the attributes of the routes identified reflect drivers' preferences and generic route choice behaviors well.

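
The notion of nondominated (Pareto optimal) paths used in the abstract above can be illustrated with a brute-force filter over candidate routes, each described by a tuple of costs to be minimized. This is only a definition-level sketch, not the pruning algorithms of the paper; the route names and the cost attributes (travel time, number of turns) are hypothetical.

```python
def dominates(a, b):
    """True if cost vector a is no worse than b everywhere and better somewhere."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def nondominated(paths):
    """Keep only the routes whose cost vector no other route dominates."""
    return {
        name: cost
        for name, cost in paths.items()
        if not any(dominates(other, cost) for other in paths.values())
    }


# Hypothetical routes described by (travel time in minutes, number of turns).
routes = {"A": (10, 3), "B": (12, 2), "C": (13, 4)}
pareto = nondominated(routes)  # C is dropped because A beats it on both criteria
```

The pairwise comparison is quadratic in the number of candidate routes and presumes the candidates are already enumerated, which is exactly the cost the abstract identifies as too high for large networks.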