A Maximum Entropy-Based Bio-Molecular Event Extraction Model that Considers Event Generation

  • Received : 2013.11.12
  • Accepted : 2013.12.23
  • Published : 2015.06.30

Abstract

In this paper, we propose a maximum entropy-based model that gives a mathematical account of the bio-molecular event extraction problem. The proposed model generates an event table, which represents the relationship between an event trigger and its arguments. Complex sentences with distinctive event structures can also be represented in the event table. Previous approaches relied on intuitively designed pipeline systems that perform trigger detection and argument recognition sequentially, and therefore did not clearly explain the relationship between the identified triggers and arguments. In contrast, the proposed model generates an event table that represents triggers, their arguments, and the relationships among them, so the desired events can be extracted easily from the table. Experimental results show that the proposed model can cover 91.36% of the events in the training dataset and that it achieves 50.44% recall on the test dataset by using the event table.
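To make the idea of an event table concrete, the following is a minimal Python sketch. It is not the authors' implementation: the class `EventTable`, its methods, the table layout (a diagonal cell holds a trigger's event type, an off-diagonal cell holds an argument role), and the example sentence are all assumptions made for illustration. The labels `Gene_expression`, `Positive_regulation`, `Theme`, and `Cause` follow the naming used in the BioNLP'09 Shared Task.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class EventTable:
    """Hypothetical token-by-token table (an assumed layout, not the paper's).

    cells[(i, i)] holds the event type if token i is a trigger.
    cells[(i, j)] with i != j holds the role token j plays for trigger i.
    """
    tokens: List[str]
    cells: Dict[Tuple[int, int], str] = field(default_factory=dict)

    def set_trigger(self, i: int, event_type: str) -> None:
        self.cells[(i, i)] = event_type

    def set_argument(self, trigger: int, arg: int, role: str) -> None:
        self.cells[(trigger, arg)] = role

    def extract_events(self) -> List[Tuple[str, str, List[Tuple[str, str]]]]:
        """Read events off the table: one event per labeled diagonal cell."""
        events = []
        for (i, j), label in sorted(self.cells.items()):
            if i != j:
                continue  # only diagonal cells mark triggers
            args = [
                (role, self.tokens[k])
                for (t, k), role in sorted(self.cells.items())
                if t == i and k != i
            ]
            events.append((label, self.tokens[i], args))
        return events


# Hypothetical example sentence with a nested event structure.
table = EventTable(tokens=["TNF", "induces", "expression", "of", "IL-2"])
table.set_trigger(2, "Gene_expression")
table.set_argument(2, 4, "Theme")        # expression -> IL-2
table.set_trigger(1, "Positive_regulation")
table.set_argument(1, 0, "Cause")        # induces -> TNF
table.set_argument(1, 2, "Theme")        # induces -> expression (nested event)

events = table.extract_events()
for event in events:
    print(event)
```

In this sketch a single table holds both triggers and their arguments, so a nested event (a regulation whose theme is another event's trigger) is read off in one pass rather than reconstructed from separate pipeline stages. In the proposed model the cell labels would be predicted by the maximum entropy classifier; here they are filled in by hand.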
