http://dx.doi.org/10.3743/KOSIM.2014.31.1.231

Inferring Undiscovered Public Knowledge by Using Text Mining-driven Graph Model  

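Heo, Go Eun (Graduate School, Department of Library and Information Science, Yonsei University)
Song, Min (Department of Library and Information Science, Yonsei University)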
Publication Information
Journal of the Korean Society for Information Management / v.31, no.1, 2014, pp. 231-250
Abstract
Owing to recent advances in Information and Communication Technologies (ICT), the number of research publications has increased exponentially. In response to this rapid growth, demand has risen for automated text processing methods that can handle massive amounts of text data. Biomedical text mining, which discovers hidden biological meanings and treatments in the biomedical literature, has become a pivotal methodology that helps medical disciplines reduce time and cost. Many researchers have conducted literature-based discovery studies to generate new hypotheses. However, existing approaches require either an intensive manual process or a semi-automatic procedure to find and select biomedical entities. In addition, they are limited to showing a single dimension, namely the cause-and-effect relationship between two concepts. Thus, this study proposes a novel approach that discovers various relationships among source and target concepts and their intermediate concepts by expanding the intermediate concepts to multiple levels. The study offers a distinct perspective on literature-based discovery: it not only discovers meaningful relationships among concepts in the biomedical literature through graph-based path inference but also generates feasible new hypotheses.
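The idea of connecting a source concept to a target concept through one or more levels of intermediate concepts can be illustrated with a small path search. The following is only a minimal sketch, not the authors' implementation: the function name `find_paths`, the parameter `max_intermediates`, and the toy concept graph (loosely modeled on Swanson's fish oil and Raynaud's syndrome example) are assumptions made for illustration, and the edges are hand-written rather than mined from text.

```python
from collections import deque

# Hypothetical toy graph: each key is a concept, each value is the set of
# concepts it is linked to. These edges are illustrative, not extracted data.
graph = {
    "fish oil": {"blood viscosity", "platelet aggregation"},
    "blood viscosity": {"Raynaud's syndrome"},
    "platelet aggregation": {"vascular reactivity"},
    "vascular reactivity": {"Raynaud's syndrome"},
}


def find_paths(graph, source, target, max_intermediates):
    """Return simple paths from source to target that pass through
    between 1 and max_intermediates intermediate concepts."""
    paths = []
    queue = deque([[source]])  # breadth-first search over partial paths
    while queue:
        path = queue.popleft()
        for neighbor in sorted(graph.get(path[-1], ())):
            if neighbor in path:
                continue  # keep paths simple (no repeated concepts)
            if neighbor == target:
                # Skip a direct source-target link: it is already known.
                if len(path) >= 2:
                    paths.append(path + [neighbor])
                continue
            # Extend only while the intermediate-concept budget allows it.
            if len(path) - 1 < max_intermediates:
                queue.append(path + [neighbor])
    return paths


for p in find_paths(graph, "fish oil", "Raynaud's syndrome", max_intermediates=2):
    print(" -> ".join(p))
```

Raising `max_intermediates` from 1 to 2 is what "expanding intermediate concepts to multi-levels" amounts to in this sketch: longer chains become candidate hypotheses alongside the classic single-intermediate ones.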
Keywords
biotext mining; literature-based discovery; undiscovered public knowledge; graph model