Biomedical Ontologies and Text Mining for Biomedicine and Healthcare: A Survey

  • Yoo, Ill-Hoi (Department of Health Management and Informatics, School of Medicine, University of Missouri-Columbia)
  • Song, Min (Information Systems Department, College of Computing Sciences, New Jersey Institute of Technology)
  • Published: 2008.06.30

Abstract

In this survey paper, we discuss biomedical ontologies and the major text mining techniques applied to biomedicine and healthcare. Biomedical ontologies such as UMLS are currently being adopted in text mining because they supply domain knowledge and help resolve many of the linguistic problems that arise when text mining approaches handle biomedical literature. As the first example of text mining, we survey document clustering. Because a document set normally covers multiple topics, text mining approaches use document clustering as a preprocessing step to group similar documents. Document clustering can also support the biomedical literature searches required for the practice of evidence-based medicine. We then introduce Swanson's UnDiscovered Public Knowledge (UDPK) model, which generates biomedical hypotheses from literature databases such as MEDLINE by discovering novel connections among logically related biomedical concepts. Another important area of text mining is document classification, a valuable tool for biomedical tasks that involve large amounts of text; we survey well-known classification techniques in biomedicine. As the last example of text mining in biomedicine and healthcare, we survey information extraction, the process of scanning text for information relevant to some interest, including entities, relations, and events. Finally, we address techniques and issues in evaluating text mining applications in biomedicine and healthcare.
