• Title/Summary/Keyword: probabilistic relatedness

Feature-Based Relation Classification Using Quantified Relatedness Information

  • Huang, Jin-Xia;Choi, Key-Sun;Kim, Chang-Hyun;Kim, Young-Kil
    • ETRI Journal / v.32 no.3 / pp.482-485 / 2010
  • Feature selection is very important for feature-based relation classification tasks. While most existing work on feature selection relies on linguistic information acquired using parsers, this letter proposes new features, including probabilistic and semantic relatedness features, that explicitly express the relatedness between patterns and certain relation types. The impact of each feature set is evaluated using both a chi-square estimator and a performance evaluation. The experiments show that relatedness features have a greater impact than existing well-known linguistic features, and that their contribution cannot be replaced by other commonly used linguistic feature sets.
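
The abstract does not say how the chi-square estimator was set up. As a minimal sketch, assuming the standard 2x2 chi-square statistic commonly used to score a binary feature against a binary class label, the score could be computed as below. The function name `chi_square_2x2` and the cell names are hypothetical and are not taken from the letter.

```python
def chi_square_2x2(a, b, c, d):
    """Chi-square statistic for a 2x2 feature/class contingency table.

    a: instances with the feature that belong to the relation type
    b: instances with the feature that do not belong to it
    c: instances without the feature that belong to it
    d: instances without the feature that do not belong to it
    """
    n = a + b + c + d
    # Product of the four marginal totals (two row sums, two column sums).
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        # A row or column is empty, so the statistic is undefined;
        # returning 0.0 here is a convention chosen for this sketch.
        return 0.0
    return n * (a * d - b * c) ** 2 / denom


if __name__ == "__main__":
    # Feature independent of the class: score is 0.
    print(chi_square_2x2(10, 10, 10, 10))
    # Feature strongly associated with the class: larger score.
    print(chi_square_2x2(30, 10, 10, 30))
```

A higher score suggests a stronger association between the feature and the relation type, which is one way feature sets can be ranked against each other.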

Analysis of Probabilistic Limits of Trait Identity in Inter-Strain Comparison of Genomic Fingerprints of Bacteria (균주간 유전체 지문 비교분석에서 유전형질 일치성의 확률적 한계 분석)

  • Zo, Young-Gun
    • Korean Journal of Microbiology / v.47 no.3 / pp.263-267 / 2011
  • Genomic fingerprinting methods are useful in determining relatedness among bacterial strains. However, two DNA fragments in two different fingerprints may coincide in size by chance, leading to an erroneous interpretation of the relatedness between two bacterial genomes. In this study, I estimated the probability that DNA bands of identical size occur in the fingerprints of two unrelated genomes, so that the significance of fingerprint-based estimates of genome relatedness could be analyzed. The probability could be estimated as the output of a function of three parameters: the number of observed fragments, the number of all possible fragment sizes, and the number of observed fragments common to a given pair of fingerprints. The parameter with the greatest influence on the significance of relatedness estimation was the number of all possible fragment sizes. To keep the number of coincidentally common fragment sizes below 10, about 200 fragments should be distinguishable in the fingerprints.
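
The abstract names the three parameters but does not give the function itself. As a minimal sketch, assuming each fingerprint is a uniform random draw of distinct fragment sizes from a fixed set of possible sizes (a hypergeometric model, which may differ from the author's formulation), the chance of coincidental matches could be computed as follows. The names `match_pmf`, `match_tail`, and `expected_matches` are hypothetical.

```python
from math import comb


def match_pmf(n_sizes, n_a, n_b, k):
    """P(exactly k shared band sizes) between two unrelated fingerprints.

    n_sizes: number of distinguishable fragment sizes (assumed fixed)
    n_a, n_b: number of bands observed in fingerprints A and B
    """
    if k < 0 or k > min(n_a, n_b):
        return 0.0
    # Hypergeometric: choose k of A's sizes and n_b - k of the remaining sizes.
    return comb(n_a, k) * comb(n_sizes - n_a, n_b - k) / comb(n_sizes, n_b)


def match_tail(n_sizes, n_a, n_b, k):
    """P(k or more shared band sizes) purely by coincidence."""
    upper = min(n_a, n_b)
    return sum(match_pmf(n_sizes, n_a, n_b, j) for j in range(k, upper + 1))


def expected_matches(n_sizes, n_a, n_b):
    """Mean number of coincidental matches under the same model."""
    return n_a * n_b / n_sizes


if __name__ == "__main__":
    # Illustrative values only; they are not taken from the paper.
    print(expected_matches(200, 20, 20))
    print(match_tail(200, 20, 20, 5))
```

Under this assumed model, the expected number of coincidental matches falls as the number of distinguishable sizes grows, which is consistent with the abstract's remark that this parameter matters most.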

Automatic Recommendation of Panel Pool Using a Probabilistic Ontology and Researcher Networks (확률적 온톨로지와 연구자 네트워크를 이용한 심사자 자동 추천에 관한 연구)

  • Lee, Jung-Yeoun;Lee, Jae-Yun;Kang, In-Su;Shin, Suk-Kyung;Jung, Han-Min
    • Journal of the Korean Society for Information Management / v.24 no.3 / pp.43-65 / 2007
  • An automatic panel pool recommendation system should be designed to support universality, expertise, fairness, and reasonableness in the proposal review process. In this research, we apply the theory of probabilistic ontology to measure relatedness between terms in the classification of academic domains, enlarge the number of review candidates, and rank recommendable reviewers according to their expertise. In addition, we construct a researcher network that connects researchers according to their various relationships, such as mentor, coauthor, and cooperative research. We use the researcher network to exclude inappropriate reviewers and to support the fairness of the reviewer recommendation process. Our methodology for recommending suitable reviewers was verified by experts in the field of proposal examination. It offers a suitable method for developing a reasonable reviewer recommendation system.

Target Word Selection Disambiguation using Untagged Text Data in English-Korean Machine Translation (영한 기계 번역에서 미가공 텍스트 데이터를 이용한 대역어 선택 중의성 해소)

  • Kim, Yu-Seop;Chang, Jeong-Ho
    • The KIPS Transactions: Part B / v.11B no.6 / pp.749-758 / 2004
  • In this paper, we propose a new method for disambiguating target word selection in English-Korean machine translation that uses only a raw corpus and requires no additional human effort. We use two data-driven techniques: Latent Semantic Analysis (LSA) and Probabilistic Latent Semantic Analysis (PLSA). Both can represent complex semantic structures in given contexts such as text passages. We construct linguistic semantic knowledge with these two techniques and apply it to target word selection in English-Korean machine translation. For target word selection, we utilize grammatical relationships stored in a dictionary. We use the k-nearest neighbor learning algorithm to resolve the data sparseness problem in target word selection, and estimate the distance between instances based on these models. In the experiments, we use TREC AP news data to construct the latent semantic space and the Wall Street Journal corpus to evaluate target word selection. With the latent semantic analysis methods, the accuracy of target word selection improved by more than 10%, and PLSA showed better accuracy than LSA. Finally, using correlation analysis, we show the relationship between accuracy and two important factors: the dimensionality of the latent space and the k value of k-NN learning.
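
The abstract does not specify the distance measure or the voting rule. As a minimal sketch, assuming instances are already represented as vectors in a latent semantic space and that cosine distance with a simple majority vote is used (both are assumptions, not details from the paper), the k-nearest neighbor selection step might look like this. The names `cosine_distance` and `knn_select`, and the toy vectors and labels, are hypothetical.

```python
from collections import Counter
from math import sqrt


def cosine_distance(u, v):
    """1 - cosine similarity; returns 1.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 1.0
    return 1.0 - dot / (norm_u * norm_v)


def knn_select(query, examples, k=3):
    """Pick the target word by majority vote among the k nearest examples.

    examples: list of (latent_vector, target_word) pairs.
    """
    nearest = sorted(examples, key=lambda ex: cosine_distance(query, ex[0]))[:k]
    votes = Counter(word for _, word in nearest)
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    # Toy 2-dimensional latent vectors, for illustration only.
    examples = [
        ([1.0, 0.0], "bank_financial"),
        ([0.8, 0.2], "bank_financial"),
        ([0.0, 1.0], "bank_river"),
        ([0.1, 0.9], "bank_river"),
    ]
    print(knn_select([0.9, 0.1], examples, k=3))
```

In this sketch, both the latent dimensionality (the vector length) and `k` are free parameters, which are the two factors the abstract relates to accuracy.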