http://dx.doi.org/10.3745/KIPSTB.2007.14-B.5.383

Detection of Gene Interactions based on Syntactic Relations  

Kim, Mi-Young (School of Computer Information, Sungshin Women's University)
Abstract
Interactions between proteins and genes are often considered essential to the description of biomolecular phenomena, and networks of interactions are regarded as an entry point to a Systems Biology approach. Recently, many studies have tried to extract this information by analyzing biomolecular text with natural language processing techniques. Previous research suggests that linguistic information helps improve performance in detecting gene interactions. However, previous systems do not show reasonable performance because of low recall. To improve recall without sacrificing precision, this paper proposes a new method for detecting gene interactions based on syntactic relations. Without biomolecular knowledge, our method shows reasonable performance using only a small amount of training data. Using the format of the LLL05 (ICML05 Workshop on Learning Language in Logic) data, we detect the agent gene and the target gene that interact with each other. In the first phase, we detect encapsulation types for each agent and target candidate. In the second phase, we construct verb lists that indicate the interaction information between two genes. In the last phase, we learn direction information to determine which of the two genes is the agent and which is the target. In experiments on the LLL05 data, the proposed method achieved an F-measure of 88% on the training data and 70.4% on the test data, significantly outperforming previous methods. We also describe the contribution of each phase to the performance, and demonstrate that the first phase contributes to the improvement of recall, while the second and last phases contribute to the improvement of precision.
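As a rough illustration of how precision, recall, and F-measure relate for directed agent/target pairs, the following minimal sketch scores a set of predicted interactions against a gold set. It assumes the balanced F-measure (F1) and exact matching of ordered pairs; the abstract states neither, and the paper's actual scoring procedure may differ. The gene names and the `score` function are hypothetical and are not taken from the LLL05 data or from the paper.

```python
from typing import Dict, Set, Tuple

# An interaction is an ordered pair (agent, target); direction matters.
Interaction = Tuple[str, str]


def score(gold: Set[Interaction], predicted: Set[Interaction]) -> Dict[str, float]:
    """Compute precision, recall, and balanced F-measure (F1, an assumption)."""
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0.0:
        f1 = 0.0
    else:
        f1 = 2 * precision * recall / (precision + recall)
    return {"precision": precision, "recall": recall, "f1": f1}


# Hypothetical placeholder data, for illustration only.
gold = {
    ("geneA", "geneB"),
    ("geneC", "geneD"),
    ("geneE", "geneF"),
    ("geneG", "geneH"),
}
predicted = {
    ("geneA", "geneB"),  # correct pair, correct direction
    ("geneD", "geneC"),  # correct pair, reversed direction: counted as wrong
    ("geneE", "geneF"),  # correct pair, correct direction
}

result = score(gold, predicted)
print(result)
```

In this toy case the reversed pair lowers precision and the missed pairs lower recall. This mirrors the abstract's account of detecting candidate pairs (recall) and then using verb lists and direction information to filter and orient them (precision).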
Keywords
Gene Interaction; Bioinformatics; Syntactic Relation; Information Extraction;