References
- AGRAWAL, R., ET AL. 2000. Fast Discovery of Association Rules, Advances in Knowledge Discovery and Data Mining, U. Fayyad, et al., Editors. AAAI/MIT Press.
- ANDRADE, M. A. AND BORK, P. 2000. Automated extraction of information in molecular biology. FEBS Letters, 476:12-7. https://doi.org/10.1016/S0014-5793(00)01661-6
- ANDRADE, M., BLASCHKE, C. AND VALENCIA, A. 1999. AbXtract: Automatic Abstract eXtraction of keywords associated to protein function. Bioinformatics, 14(7):600-7.
- AO, H. AND TAKAGI, T. 2005. ALICE: An algorithm to extract abbreviations from MEDLINE, Journal of the American Medical Informatics Association, 12: 576-586. https://doi.org/10.1197/jamia.M1757
- BAEZA-YATES, R. AND RIBEIRO-NETO, B. 1999. Modern Information Retrieval.
- BERGER, A., DELLA PIETRA, S., AND DELLA PIERTA, V. 1999. A maximum entropy approach to natural language processing. Computational Linguistics, Vol. 22, p. 39-71
- BLASCHKE, C., ANDRADE, M. A., OUZOUNIS, C., AND VALENCIA, A. 1999. Automatic Extraction of Biological Information from Scientific Text: Protein-Protein Interactions, In Proceedings of the First International Conference on Intelligent Systems for Molecular Biology, 60-67.
- BODENREIDER, O. 2006. Lexical, Terminological, and Ontological Resources for Biological Text Mining, Text Mining for Biology and Biomedicine, Ananiadou S. and McNaught J. (eds.), Artech House, 43-66.
- BRANK, J., GROBELNIK, M., MILI -FRAYLING, N., AND MLADENI , D. 2002. Interaction of feature selection methods and linear classification models. Proceedings of the ICML-02 Workshop on Text Learning, Sydney, AU.
- BROWNE, A., MCCRAY, A., AND SRINIVASAN, S., The SPECIALIST Lexicon, Lister Hill National Center for Biomedical Communications, National Library of Medicine (NLM), http://lexsrv3.nlm.nih.gov/SPECIALIST/Projects/lexicon/current/release/LEX/DOCS/techrpt.pdf.
- BUCKLEY, C. AND LEWIT, A. F. 1985. Optimization of inverted vector searches. In Proceedings of SIGIR-85, 97-110.
- BUCKLEY, C., SALTON, G., ALLEN, J. AND SINGHAL, A. 1995. Automatic query expansion using SMART: TREC-3. In: D. K. Harman (ed.), The Third Text Retrieval Conference (TREC-3). U.S. Department of Commerce, 69-80.
- CHANG, J. T., SCHÜTZE, H. AND ALTMAN, R. B. 2002. Creating an Online Dictionary of Abbreviations from MEDLINE, The Journal of the American Medical Informatics Association, 9: 612-620. https://doi.org/10.1197/jamia.M1139
- COLLIER, N., NOBATA, C., AND TSUJII, J. 2000. Extracting the Names of Genes and Gene Products with a Hidden Markov Model. Proceedings of the 18th International Conference on Computational Linguistics (COLING2000), 201-207.
- COWIE, J. AND LEHNERT, W. 1996. Information extraction. Communications of ACM, 39:80-91. https://doi.org/10.1145/234173.234209
- CRAMMER, K., AND SINGER, Y. 2001. On the algorithmic implementation of multiclass kernelbased vector machines. Journal of Machine Learning Research, Vol. 2, p. 265-292.
- CRAVEN, M., AND KUMLIEN, J. 1999. Constructing Biological Knowledge Bases by Extracting Information from Text Sources. Proceedings of the 7th International Conference on Intelligent Systems for Molecular Biology, 77-86.
- CUTTING, D., KARGER, D., PEDERSEN, J., AND TUKEY, J. 1992. Scatter/Gather: A Cluster-based Approach to Browsing Large Document Collections, In Proceedings of SIGIR '92, 318-329.
- DE BRUIJN, B., AND MARTIN, J. 2002. Getting to the (C)ore of knowledge: mining biomedical literature. International Journal of Medical Informatics, (67): 7-18.
- DEMETRIOU, G., AND GAIZAUSKAS, R. 2002. Utilizing text mining results: The Pasta Web System. Proceedings of the Workshop on Natural Language Processing in the Biomedical Domain, 77-84.
- DING, J., BERLEANT, D., NETTLETON, D., AND WURTELE, E. 2002. Mining MEDLINE: abstracts, sentences, or phrases? Pacific Symposium on Biocomputing, 326-337.
- DOMINGOS, P., AND PAZZANI, M. 1997. On the Optimality of the Simple Bayesian Classifier under Zero-One Loss, in: Machine Learning, Vol. 29:2-3, p. 103-130. https://doi.org/10.1023/A:1007413511361
- DONALDSON, I., MARTIN, J., DE BRUIJN, B., AND WOLTING, C. 2003. "PreBIND and Textomy-mining the biomedical literature for protein-protein interactions using a support vector machine", BMC Bioinformatics, Vol. 4:11, p. 11-23. https://doi.org/10.1186/1471-2105-4-11
- DUMAIS, S. T., PLATT, J., HECKERMAN, D., AND SAHAMI, M. 1998. Inductive learning algorithms and representations for text categorization. Proceedings of CIKM-98, 7th ACM International Conference on Information and Knowledge Management, eds. G. Gardarin, J.C. French, N. Pissinou, K. Makki & L. Bouganim, ACMPress, New York, US: Bethesda, US, p. 148-155.
- Evidence-Based Medicine Working Group 1992. Evidence-based medicine. A new approach to teaching the practice of medicine. JAMA, Nov 1992; 268: 2420-2425. https://doi.org/10.1001/jama.1992.03490170092032
- FAN, W., WALLACE, L., RICH, S., AND ZHANG, Z. 2005. Tapping into the power of text mining, Communications of ACM, forthcoming.
- FRIEDMAN, C., KRA, P., YU, H., KRAUTHAMMER, M. AND RZHETSKY, A. 2001. GENIES: a natural-language processing system for the extraction of molecular pathways from journal articles. Bioinformatics, 17 Suppl 1, S74−82.
- FUKUDA, K., TAMURA, A., TSUNODA, T., AND TAKAGI, T. 1998. Toward information extraction: identifying protein names from biological papers. Pacific Symposium on Biocomputing, 707-18.
- GRUBER, T. R. A. 1993. Translation Approach to Portable Ontology Specifications. Knowledge Acquisition, Vol. 5, pp. 199-220. https://doi.org/10.1006/knac.1993.1008
- GRUBER, T. R. 1995. Towards Principles for the Design of Ontologies used for Knowledge Sharing. International Journal of Human-Computer Studies, 43, 907-928. https://doi.org/10.1006/ijhc.1995.1081
- GRUNINGER, M., AND LEE, J. 2002. Ontology applications and design, Communications of the ACM, February, Vol. 45, No. 2, 39-41. https://doi.org/10.1145/585597.585599
- HAHN, U., ROMACKER, M., AND SCHULZ, S. 2002. Creating Knowledge Repositories from Biomedical Reports: The MEDSYNDIKATE Text Mining System. Pacific Symposium on Biocomputing, 338-349.
- HEARST, M. A., SCHOELKOPF, B., DUMAIS, S., OSUNA, E., AND PLATT, J. 1998. Trends and Controversies-Support Vector Machines, in: IEEE Intelligent Systems, Vol. 13:4, p. 18-28.
- HERRMANN, K. 2001. Rakesh Agrawal: Athena: Mining-based Interactive Management of Text Databases, URL: http://www3.informatik.tumuenchen.de/lehre/WS2001/HSEM-bayer/textmining.pdf [as of 2002-03-02].
- HIRSCHMAN, L., PARK, J. C., TSUJII, J., WONG, L., AND WU, C. H. 2002. Accomplishments and challenges in literature data mining for biology. Bioinformatics, 18(12): 1553-1561. https://doi.org/10.1093/bioinformatics/18.12.1553
- HOTHO, A., MAEDCHE, A., AND STAAB, S. 2002. Text clustering based on good aggregations. Kunstliche Intelligenz (KI), 16, 4, 48-54.
- HRISTOVSKI, D., STARE, J., PETERLIN, B., AND DZEROSKI, S. 2001. Supporting discovery in medicine by association rule mining in Medline and UMLS, Medinfo, 10, 1344-1348.
- HRISTOVSKI, D., PETERLIN, B., MITCHELL, J. A., AND HUMPHREY, S. M. 2003. Improving literature based discovery support by genetic knowledge integration, Stud. Health Technol. Inform. 95:68-73.
- HUMPHREYS, K., DEMETRIOU, G., AND GAIZAUSKAS, R. 2000. Two applications of information extraction to biological science journal articles: enzyme interactions and protein structures. Pacific Symposium on Biocomputing, 505-16.
- JENSSEN, T. K., et al. 2001. A literature network of human genes for high-throughput analysis of gene expression. Nat. Genet., 28, 21-28.
- JENSSEN, T. K., LAEGREID, A., KOMOROWSKI, J., AND HOVIG, E. 2001. A literature network of human genes for high-throughput analysis of gene expression. Nature Genetics, 28(1):21-8.
- JENSSEN, T. K., LAEGREID, A., KOMOROWSKI, J., AND HOVIG, E. 2001. A literature network of human genes for high-throughput analysis of gene expression, Nature Genetics, Vol. 28, p. 21-28.
- JOACHIMS, T. 1998. Text categorization with support vector machines: learning with many relevant features. Proceedings of ECML-98, 10th European Conference on Machine Learning, eds. C. Nedellec & C. Rouveirol, Springer Verlag, Heidelberg, DE: Chemnitz, DE, p. 137-142.
- JOACHIMS, T. 1999. Transductive inference for text classification using support vector machines. Proceedings of ICML-99, 16th International Conference on Machine Learning, eds. I. Bratko & S. Dzeroski, Morgan Kaufmann Publishers, San Francisco, US: Bled, SL, p. 200-209.
- JOSHI, R., LI, X. L., RAMACHANDARAN, S., AND LEONG, T. Y. 2004. Automatic Model Structuring from Text using BioMedical Ontology, In American Association for Artificial Intelligence (AAAI) Workshop, pp. 74-79, San Jose, California, July.
- KARANIKAS, H., AND THEODOULIDIS, B. 2002. Knowledge discovery in text and text mining software, Technical report, UMIST−CRIM, Manchester.
- KAUFMAN, L., AND ROUSSEEUW, P. J. 1999. Finding Groups in Data: an Introduction to Cluster Analysis, John Wiley & Sons.
- KOLLER, D., AND SAHAMI, M. 1997. Hierarchically classifying documents using very few words. In Proceedings of ICML-97, 170-176.
- KRAUTHAMMER, M., RZHETSKY, A., MOROZOV, P., AND FRIEDMAN, C. 2000. Using BLAST for identifying gene and protein names in journal articles. Gene, 259(1-2): 245-252. https://doi.org/10.1016/S0378-1119(00)00431-5
- LEE, K., HWANG, Y., AND RIM, H. 2003. Two-Phase Biomedical NE Recognition based on SVMs. Proceedings of the ACL 2003 Workshop on Natural Language Processing in Biomedicine, 33-40.
- LEEK, T. R. 1997. Information Extraction Using Hidden Markov Models. MSc Thesis, Department of Computer Science, University of California, San Diego.
- LEOPOLD, E., AND KINDERMANN, J. 2002. Text Categorization with Support Vector Machines. How to Represent Texts in Input Space? Machine Learning, Vol. 46:1-3, p. 423-444. https://doi.org/10.1023/A:1012491419635
- LIN, J., AND DEMNER-FUSHMAN, D. 2007. Semantic Clustering of Answers to Clinical Questions, Proceedings of the 2007 Annual Symposium of the American Medical Informatics Association (AMIA 2007), Chicago, Illinois, pp. 458-462.
- LIU, F., JENSSEN, T. K., NYGAARD, V., SACK, J., AND HOVIG, E. 2004. FigSearch: Using Maximum Entropy Classifier to Categorize Biological Figures. Proceedings of IEEE Computational Systems Bioinformatics Conference, p. 476-477
- LIU, H. AND FRIEDMAN, C. 2003. Mining Terminological Knowledge in Large Biomedical Corpora, Proceedings of the Pacific Symposium on Biocomputing, 8: 415-426.
- MOONEY, R. J., AND NAHM, U. Y. 2003. Text Mining with Information Extraction, Multilingualism and Electronic Language Management: Proceedings of the 4th International MIDP Colloquium, 22-23 September, Bloemfontein, South Africa, pp. 141-160.
- NARAYANASWAMY, M., RAVIKUMAR, K. E. AND VIJAY-SHANKER, K. 2003. A biological named entity recognizer. Pacific Symposium on Biocomputing, 427-438.
- National Library of Medicine (NLM), MEDLINE, http://www.ncbi.nlm.nih.gov/sites/entrez?cmd=File&db=PubMed, 2008.
- National Library of Medicine (NLM), Medical Subject Headings (MeSH) Fact Sheet, http://www.nlm.nih.gov/pubs/factsheets/mesh.html, 2008.
- National Library of Medicine (NLM), Unified Medical Language System (UMLS) Fact Sheet, http://www.nlm.nih.gov/pubs/factsheets/umls.html, 2008.
- NG, S. K., AND WONG, M. 1999. Toward routine automatic pathway discovery from on-line scientific text abstracts. Genome Informatics Series: Workshop on Genome Informatics, 10: 104-112.
- OHTA, Y., YAMAMOTO, Y., OKAZAKI, T., UCHIYAMA, I., AND TAKAGI, T. 1997. Automatic construction of knowledge base from biological papers. Proceedings of International Conference on Intelligent System for Molecular Biology, 5:218-25.
- PAKHOMOV, S. V., RUGGIERI, A., AND CHUTE, C. G. 2002. Maximum entropy modeling for mining patient medication status from free text, Proc AMIA Symp, p. 587-91
- PANT, G., AND SRINIVASAN, P. 2005. Learning to crawl: Comparing classification schemes. ACM Transactions on Information Systems, Vol. 23, p. 430-462. https://doi.org/10.1145/1095872.1095875
- PANTEL, P., AND LIN, D. 2002. Document clustering with committees. In Proceedings of the 2002 ACM SIGMOD International Conference on Management of data, 199-206.
- PARK, J. C., KIM, H. S., AND KIM, J. J. 2001. Bidirectional Incremental Parsing for Automatic Pathway Identification with Combinatory Categorical Grammar, Pacific Symposium on Biocomputing, 396-407.
- PEREZ-IRATXETA, C., BORK, P., AND ANDRADE, M. A. 2002. Association of genes to genetically inherited diseases using data mining. Nat. Genet., 31, 316-319.
- PRATT, WANDA AND YETISGEN-YILDIZ, Meliha, 2003. LitLinker: capturing connections across the biomedical literature, K-CAP'03, pp. 105-112, Sanibel Island, FL, Oct. 23-25.
- PRATT, W., AND FAGAN, L. 2000. The Usefulness of Dynamically Categorizing Search Results, Journal of the American Medical Informatics Association, 7(6), pp. 605-617. https://doi.org/10.1136/jamia.2000.0070605
- PRATT, W., HEARST, M., AND FAGAN, L. 1999. A knowledge-based approach to organizing retrieved documents, AAAI '99: Proceedings of the 16th National Conference on Artificial Intelligence, Orlando, Florida, pp. 80-85.
- PROUX, D., RECHENMANN, F., AND JULLIARD, L. 2000. A pragmatic information extraction strategy for gathering data on genetic interactions. Proceedings of International Conference on Intelligent System for Molecular Biology, 8:279-85.
- PUSTEJOVSKY, J., CASTANO, J., ZHANG, J., KOTECKI, M., AND COCHRAN, B. 2002. Robust relational parsing over biomedical literature: extracting inhibit relations. Pacific Symposium on Biocomputing, 362-73.
- RAY, S. AND CRAVEN, M. 2001. Representing Sentence Structure in Hidden Markov Models for Information Extraction. Proceedings of the 17th International Joint Conference on Artificial Intelligence, Seattle, WA. Morgan Kaufmann.
- RAYCHAUDHURI, S., CHANG, J. T., SUTPHIN, P. D., AND ALTMAN, R. B. 2002. Associating genes with Gene Ontology codes using a maximum entropy analysis of biomedical literature, Genome Research, Vol. 12, p. 203-14. https://doi.org/10.1101/gr.199701
- RINDFLESCH, T. C., TANABE, L., WEINSTEIN, J. N., AND HUNTER, L. 2000. EDGAR: extraction of drugs, genes and relations from the biomedical literature. Pacific Symposium on Biocomputing, 517-28.
- RISH, I., An empirical study of naïve Bayes classifier. In IJCAI 2001 Workshop on Empirical Methods in Artificial Intelligence, p. 41-46.
- SCHWARTZ, A. S., AND HEARST, M. A. 2003. A simple algorithm for identifying abbreviation definitions in biomedical text, Proceedings of the Pacific Symposium on Biocomputing, 8: 451-462.
- SEBASTIANI, F. 2002. Machine Learning in automated text categorization. ACM Computing Surveys, Vol. 34, p. 1-47. https://doi.org/10.1145/505282.505283
- SHATKAY, H., AND FELDMAN, R. 2003. Mining the biomedical literature in the genomic era: An overview. Journal of Computational Biology, 10(6): 821-855. https://doi.org/10.1089/106652703322756104
- SIOLAS, G., AND D'ALCHÉ-BUC, F. 2000. Support Vector Machines based on a semantic kernel for text categorization, in: Proceedings of the IEEE-INNS-ENNS International Joint Conference on Neural Networks (IJCNN'00), p. 205-209.
- SONG, M., AND YOO, I. 2007. A Hybrid Abbreviation Extraction Technique for Biomedical Literature, accepted in 2007 IEEE International Conference on Bioinformatics and Biomedicine (IEEE BIBM 2007), San Jose, CA, USA, Nov. 2-4.
- SRINIVASAN, P. 2004. Text mining: Generating hypotheses from MEDLINE, Journal of the American Society for Information Science, Vol. 55, No. 4, pp. 396-413. https://doi.org/10.1002/asi.10389
- STAPLEY, B. J., AND BENOIT, G. 2000. Biobibliometrics: information retrieval and visualization from co-occurrences of gene names in MEDLINE abstracts. Pacific Symposium on Biocomputing, 529-40.
- STAPLEY, B. J., KELLEY, L. A., AND STERNBERG, M. J. E. 2002. Predicting the sub-cellular location of proteins from text using support vector machines, Pacific Symposia in Biocomputing, p. 374-85.
- STEINBACH, M., KARYPIS, G., AND KUMAR, V. 2000. A comparison of document clustering techniques. Technical Report #00-034. Department of Computer Science and Engineering, University of Minnesota.
- SWANSON, D. R. 1986. Undiscovered public knowledge. Libr. Q. 56(2):103-118. https://doi.org/10.1086/601720
- SWANSON, D. R. 1987. Two medical literatures that are logically but not bibliographically connected. JASIS, 38(4):228-233. https://doi.org/10.1002/(SICI)1097-4571(198707)38:4<228::AID-ASI2>3.0.CO;2-G
- SWANSON, D. R., AND SMALHEISER, N. R. 1999. Implicit text linkages between Medline records: Using Arrowsmith as an aid to scientific discovery. Library Trends, 48(1):48-59.
- TANABE, L., SCHERF, U., SMITH, L. H., LEE, J. K., HUNTER, L., AND WEINSTEIN, J. N. 1999. MedMiner: an Internet text-mining tool for biomedical information, with application to gene expression profiling. Biotechniques, 27(6), 1210-4, 1216-7.
- THOMAS, J., MILWARD, D., OUZOUNIS, C., PULMAN, S., AND CARROLL, M. 2000. Automatic extraction of protein interactions from scientific abstracts. Pacific Symposium on Biocomputing, 541-52.
- VAN RIJSBERGEN, C. J. 1979. Information Retrieval, 2nd edition, London: Buttersworth.
- VAPNIK, V. N. 1995. The nature of statistical learning theory. Springer Verlag: Heidelberg, DE.
- WANG, B. B., MCKAY, R. I., ABBASS, H. A., AND BARLOW, M. 2002. Learning Text Classifier using the Domain Concept Hierarchy. In Proceedings of International Conference on Communications, Circuits and Systems 2002, China.
- WITTEN, I. H., AND FRANK, E. 2000. Data Mining - Practical Machine Learning Tools and Techniques with Java Implementations, Morgan Kaufmann Publishers: San Francisco.
- YAKUSHIJI, A., TATEISI, Y., MIYAO, Y., AND TSUJII, J. 2001. Event extraction from biomedical papers using afull parser. Pacific Symposium on Biocomputing, 408-19.
- YAMAMOTO, K., KUDO, T., KONAGAYA, A., AND MATSUMOTO, Y. 2003. Protein Name Tagging for Biomedical Annotation in Text. Proceedings of the ACL 2003 Workshop on Natural Language Processing in Biomedicine, 65-72.
- YANG, Y., AND LIU, X. 1999. A Re-Examination of Text Categorization Methods, in: Proceedings of the 22nd Annual International ACM-SIGIR Conference on Research and Development in Information Retrieval, p. 42-49.
- YOO, I., HU, X., AND SONG, I.-Y. 2007. A Coherent Graph-based Semantic Clustering and Summarization Approach for Bi/omedical Literature and a New Summarization Evaluation Methods, BMC Bioinformatics, 8(Suppl 9):S4.
- YOO, I., HU, X., AND SONG, I.-Y. 2006. Integration of Semantic-based Bipartite Graph Representation and Mutual Refinement Strategy for Biomedical Literature Clustering, in the 12th SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 791-796, Philadelphia, USA, August 20-23.
- YU, H., HRIPCSAK, G., AND FRIEDMAN, C. 2002. Mapping abbreviations to full forms in biomedical articles, Journal of the American Medical Informatics Association, 9: 162-172.
- ZAMIR, O., AND ETZIONI, O. 1998. Web Document Clustering: A Feasibility Demonstration, In Proceedings of SIGIR 98, 46-54.
- ZU EISSEN, S. M., STEIN, B., AND POTTHAST, M. 2005. The suffix tree document model revisited, In Proceedings of the 5th International Conference on Knowledge Management, 596-603.
Cited by
- Towards Smart Homes Using Low Level Sensory Data vol.11, pp.12, 2011, https://doi.org/10.3390/s111211581
- Social relation extraction from texts using a support-vector-machine-based dependency trigram kernel vol.49, pp.1, 2013, https://doi.org/10.1016/j.ipm.2012.04.002
- Integrating unified medical language system and association mining techniques into relevance feedback for biomedical literature search vol.17, pp.S9, 2016, https://doi.org/10.1186/s12859-016-1129-z
- CoMAGC: a corpus with multi-faceted annotations of gene-cancer relations vol.14, pp.1, 2013, https://doi.org/10.1186/1471-2105-14-323
- Data Mining in Healthcare and Biomedicine: A Survey of the Literature vol.36, pp.4, 2012, https://doi.org/10.1007/s10916-011-9710-5
- Introducing semantic variables in mixed distance measures: Impact on hierarchical clustering vol.40, pp.3, 2014, https://doi.org/10.1007/s10115-013-0663-5
- A Systematic Review on Healthcare Analytics: Application and Theoretical Perspective of Data Mining vol.6, pp.2, 2018, https://doi.org/10.3390/healthcare6020054
- Enabling multi-level relevance feedback on PubMed by integrating rank learning into DBMS vol.11, pp.S2, 2010, https://doi.org/10.1186/1471-2105-11-S2-S6