http://dx.doi.org/10.3743/KOSIM.2019.36.2.175

Knowledge Trend Analysis of Uncertainty in Biomedical Scientific Literature  

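Heo, Go Eun (Department of Library and Information Science, Yonsei University)
Song, Min (Department of Library and Information Science, Yonsei University)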
Publication Information
Journal of the Korean Society for Information Management / v.36, no.2, 2019, pp. 175-199
Abstract
Uncertainty refers to an incomplete state of knowledge about a proposition, due to a lack of consensus among available information and existing knowledge. The volume of academic literature grows exponentially over time, and new knowledge is discovered as research develops. Although the flow of time may be an important factor in identifying patterns of uncertainty in scientific knowledge, existing studies have only characterized uncertainty by its frequency within a particular discipline and have not taken the flow of time into consideration. In this study, we therefore identify and analyze the cue words that indicate uncertainty in the scientific literature and investigate the stream of knowledge. We examine patterns of biomedical knowledge over time in terms of representative entity pairs, predicate types, and entities. We also test significance using linear regression analysis. Seven of the 17 entity pairs show a statistically significant decreasing pattern, and all 10 representative predicates decrease significantly over time. We analyze the relative importance of representative entities by year and identify entities that display a significant rising or falling pattern.
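The significance test mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact procedure, so what follows is an assumption: an ordinary least-squares fit of a yearly share against the year, with a two-sided t-test on the slope. The function names (`trend_test`, `two_sided_p`, `t_pdf`) and the yearly values are hypothetical and are not taken from the paper.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def two_sided_p(t_stat, df, steps=2000):
    """Two-sided p-value, integrating the t density with Simpson's rule."""
    a = abs(t_stat)
    if a == 0:
        return 1.0
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3  # P(0 < T < |t|)
    return max(0.0, 1.0 - 2.0 * area)

def trend_test(years, values):
    """Fit values = intercept + slope * year; test H0: slope == 0.

    Assumes at least three points and a fit that is not perfect
    (a zero residual would make the standard error zero).
    """
    n = len(years)
    mx = sum(years) / n
    my = sum(values) / n
    sxx = sum((x - mx) ** 2 for x in years)
    sxy = sum((x - mx) * (y - my) for x, y in zip(years, values))
    slope = sxy / sxx
    intercept = my - slope * mx
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(years, values))
    df = n - 2
    se = math.sqrt(sse / df / sxx)  # standard error of the slope
    return slope, two_sided_p(slope / se, df)

# Hypothetical yearly share of uncertain statements for one entity pair.
years = list(range(2000, 2010))
share = [0.30, 0.29, 0.27, 0.28, 0.25, 0.24, 0.22, 0.23, 0.20, 0.19]

slope, p_value = trend_test(years, share)
direction = "decreasing" if slope < 0 else "increasing"
verdict = "significant" if p_value < 0.05 else "not significant"
print(f"slope={slope:.4f}, p={p_value:.4g}: {direction}, {verdict} at 0.05")
```

Under this reading, an entity pair or predicate would count as significantly decreasing when the slope is negative and the p-value falls below the chosen threshold. The 0.05 level used here is also an assumption.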
Keywords
text mining; uncertainty; semantic predication; burstiness; trend analysis