
Performance Enhancement of Tree Kernel-based Protein-Protein Interaction Extraction by Parse Tree Pruning and Decay Factor Adjustment  

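Choi, Sung-Pil (Information Technology Research Laboratory, Korea Institute of Science and Technology Information)
Choi, Yun-Soo (Information Technology Research Laboratory, Korea Institute of Science and Technology Information)
Jeong, Chang-Hoo (Information Technology Research Laboratory, Korea Institute of Science and Technology Information)
Myaeng, Sung-Hyon (Department of Computer Science, Korea Advanced Institute of Science and Technology)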
Abstract
This paper introduces a novel way to use the convolution parse tree kernel to extract interaction information between two proteins in a sentence without relying on multiple features, clues, or complicated kernels. Our approach needs only the parse tree of a candidate sentence that contains a pair of protein names and may therefore express an interaction. The contribution of this paper is twofold. First, by comparison with the performance of the latest existing approaches, we show that for protein-protein interaction (PPI) extraction it is imperative to prune the parse tree, removing context that is unnecessary for deciding whether the sentence expresses an interaction between the proteins. Second, we show that the tree kernel decay factor can play a pivotal role in improving extraction performance under identical learning conditions. Consequently, contrary to what previous research has argued, multiple kernels with multiple parsers do not always perform better than a single kernel for PPI extraction: our experimental results outperform the two existing methods by 19.8% and 14%, respectively.
Keywords
Protein-Protein Interaction Extraction; Kernel Methods; Convolution Parse Tree Kernel; Information Extraction; Relation Extraction;