http://dx.doi.org/10.4275/KSLIS.2019.53.3.247

Network Analysis between Uncertainty Words based on Word2Vec and WordNet  

Heo, Go Eun (Department of Library and Information Science, Yonsei University)
Publication Information
Journal of the Korean Society for Library and Information Science / v.53, no.3, 2019, pp. 247-271
Abstract
Uncertainty in scientific knowledge refers to a state in which a proposition is, at present, neither true nor false. Existing studies have analyzed propositions written in the academic literature and have evaluated rule-based and machine-learning-based approaches on a corpus. Although these studies recognized the importance of constructing the word list, few have attempted to expand it by analyzing the meaning of uncertainty words. Meanwhile, network analysis using bibliometrics and text mining techniques is widely used to understand intellectual structures and relationships in various disciplines. In this study, therefore, semantic relations were analyzed by applying Word2Vec to existing uncertainty words. In addition, WordNet, an English lexical database and thesaurus, was used to perform a network analysis based on the hypernym, hyponym, and synonym relations linked to uncertainty words. In this way, the semantic and lexical relationships of uncertainty words were identified structurally. The results indicate that uncertainty words could be expanded automatically.
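The two steps named in the abstract, finding semantically close words from embeddings and building a network from lexical relations, can be illustrated with a minimal sketch. It does not reproduce the study's pipeline: the vectors in `TOY_VECTORS` and the relations in `TOY_RELATIONS` are hypothetical, hand-written stand-ins for trained Word2Vec embeddings and WordNet lookups, and the use of normalized degree centrality as the network measure is an assumption made here for illustration.

```python
import math
from collections import defaultdict

# Hypothetical 3-dimensional vectors standing in for trained Word2Vec embeddings.
TOY_VECTORS = {
    "may": (0.9, 0.1, 0.2),
    "might": (0.8, 0.2, 0.1),
    "suggest": (0.2, 0.9, 0.1),
    "indicate": (0.1, 0.8, 0.3),
}

# Hypothetical (word, related word, relation type) triples standing in for
# synonym and hypernym lookups in WordNet.
TOY_RELATIONS = [
    ("suggest", "indicate", "synonym"),
    ("suggest", "propose", "synonym"),
    ("suggest", "express", "hypernym"),
    ("may", "might", "synonym"),
]


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def most_similar(word, vectors, topn=1):
    """Rank the other words by cosine similarity to `word`."""
    scores = [
        (other, cosine(vectors[word], vec))
        for other, vec in vectors.items()
        if other != word
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)[:topn]


def build_network(relations):
    """Build an undirected adjacency map from relation triples."""
    graph = defaultdict(set)
    for source, target, _relation in relations:
        graph[source].add(target)
        graph[target].add(source)
    return graph


def degree_centrality(graph):
    """Degree of each node divided by the number of other nodes."""
    n = len(graph)
    return {node: len(neighbors) / (n - 1) for node, neighbors in graph.items()}


nearest = most_similar("may", TOY_VECTORS)
network = build_network(TOY_RELATIONS)
centrality = degree_centrality(network)
hub = max(centrality, key=centrality.get)
print(nearest[0][0], hub)
```

In this toy data the nearest neighbor of "may" is a candidate for expanding the word list, and the node with the highest centrality is the word with the most lexical links. With real data, the vectors would come from a trained embedding model and the triples from a lexical database.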
Keywords
Text Mining; Bibliometrics; Uncertainty; Word2Vec; WordNet; Network Analysis
Citations & Related Records
Times Cited By KSCI: 3