http://dx.doi.org/10.7472/jksii.2020.21.3.123

Korean Contextual Information Extraction System using BERT and Knowledge Graph  

Yoo, SoYeop (Dept. of AI·Software, Gachon University)
Jeong, OkRan (Dept. of AI·Software, Gachon University)
Publication Information
Journal of Internet Computing and Services, Vol. 21, No. 3, 2020, pp. 123-131
Abstract
Along with the rapid development of artificial intelligence technology, natural language processing, which deals with human language, is also being actively studied. In particular, BERT, a language model recently proposed by Google, has performed well in many areas of natural language processing by providing models pre-trained on large corpora. Although BERT offers a multilingual model, applying the original pre-trained BERT directly to Korean has limitations, so a model pre-trained on a large Korean corpus should be used. Moreover, text carries not only lexical and grammatical meaning but also contextual meaning, such as the relations between preceding and following passages and the surrounding situation. Existing natural language processing research has focused mainly on lexical or grammatical meaning, yet accurately identifying the contextual information embedded in text plays an important role in understanding it. Knowledge graphs, which link words through their relationships, have the advantage that computers can learn context from them easily. In this paper, we propose a system that extracts Korean contextual information using a BERT model pre-trained on a Korean corpus together with a knowledge graph. We build models that extract the person, relationship, emotion, space, and time information that matters in a text, and we validate the proposed system through experiments.
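To make the described pipeline concrete, the following is a minimal sketch of the kind of fine-tuning setup the abstract outlines, written against the Hugging Face transformers API. The checkpoint name, the BIO label set, and the example sentence are illustrative assumptions, not the authors' exact configuration (the paper uses SKTBrain's KoBERT, reference 9):

import torch
from transformers import AutoTokenizer, AutoModelForTokenClassification

# BIO tags for the five kinds of contextual information the paper targets.
# The exact tag inventory is an assumption for illustration.
LABELS = ["O",
          "B-PERSON", "I-PERSON", "B-RELATION", "I-RELATION",
          "B-EMOTION", "I-EMOTION", "B-SPACE", "I-SPACE",
          "B-TIME", "I-TIME"]

MODEL_NAME = "skt/kobert-base-v1"  # assumed KoBERT checkpoint id; any Korean BERT encoder works

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForTokenClassification.from_pretrained(
    MODEL_NAME,
    num_labels=len(LABELS),
    id2label=dict(enumerate(LABELS)),
)

# One forward pass over a Korean sentence ("Yesterday in Seoul, Younghee was
# happy to meet Cheolsu."). Fine-tuning wraps this in an ordinary
# cross-entropy training loop over sentences labelled with the tags above.
inputs = tokenizer("어제 서울에서 영희는 철수를 만나서 기뻤다.", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits  # shape: (1, sequence_length, num_labels)

predictions = logits.argmax(dim=-1)[0].tolist()
tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])
for token, label_id in zip(tokens, predictions):
    print(token, LABELS[label_id])

On the knowledge-graph side, entities tagged this way can then be linked through resources such as ConceptNet or YAGO (references 10 and 11), making the relations between words available as explicit context.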
Keywords
contextual information extraction; person extraction; relation extraction; sentiment extraction; BERT; knowledge graph;
Citations & Related Records
Times Cited By KSCI: 1
1 Z. Yang, Z. Dai, Y. Yang, J. Carbonell, R. Salakhutdinov, and Q. V. Le, "XLNet: Generalized Autoregressive Pretraining for Language Understanding," arXiv preprint arXiv:1906.08237, 2019. https://arxiv.org/abs/1906.08237
2 K. H. Park, S. H. Na, J. H. Shin, and Y. K. Kim, "BERT for Korean Natural Language Processing: Named Entity Tagging, Sentiment Analysis, Dependency Parsing and Semantic Role Labeling," Korea Computer Congress 2019, pp. 584-586, 2019. https://www.dbpia.co.kr/journal/articleDetail?nodeId=NODE08763261
3 S. Kwon, Y. Ko, and J. Seo, "Effective vector representation for the Korean named-entity recognition," Pattern Recognition Letters, Vol. 117, pp. 52-57, 2019. http://dx.doi.org/10.1016/j.patrec.2018.11.019
4 S. I. Lee, "Contextualism and a Reflection on the Notions of 'Context'," Journal of Language Sciences, Vol. 17, No. 3, pp. 67-86, 2010. http://dx.doi.org/G704-001077.2010.17.3.003
5 M. W. Lee, "Semantic Relations from the Contextual Perspective," Korean Semantics, Vol. 66, pp. 101-120, 2019. http://dx.doi.org/10.19033/sks.2019.12.66.101
6 M. S. Shin, "The Characteristics of the Contextual Meaning Evaluation Items of Words - Focusing on the Korean Language Subject of the College Scholastic Ability Test," KOED, No. 116, pp. 143-185, 2018. http://dx.doi.org/10.15734/koed..116.201809.143
7 A. Radford, K. Narasimhan, T. Salimans, and I. Sutskever, "Improving language understanding by generative pre-training," OpenAI Technical Report, 2018. https://s3-us-west-2.amazonaws.com/openai-assets/research-covers/language-unsupervised/language_understanding_paper.pdf
8 A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, and I. Polosukhin, "Attention is all you need," in Proc. of the 31st International Conference on Neural Information Processing Systems, pp. 6000-6010, 2017. http://dx.doi.org/10.5555/3295222.3295349
9 SKTBrain, "Korean BERT pre-trained cased (KoBERT)," https://github.com/SKTBrain/KoBERT
10 T. Rebele, F. M. Suchanek, J. Hoffart, J. Biega, and G. Weikum, "YAGO: A Multilingual Knowledge Base from Wikipedia, WordNet, and GeoNames," in Proc. of the 15th International Semantic Web Conference, pp. 177-185, 2016. https://doi.org/10.1007/978-3-319-46547-0_19
11 R. Speer, J. Chin, and C. Havasi, "ConceptNet 5.5: An Open Multilingual Graph of General Knowledge," In Thirty-First AAAI Conference on Artificial Intelligence, 2017. https://dl.acm.org/doi/10.5555/3298023.3298212
12 S. S. Lee, "A Study on the Analysis of Semantic Relation and Category of the Korean Emotion Words," Journal of Korean Library and Information Science Society, Vol. 47, No. 2, pp. 51-70, 2016. http://dx.doi.org/10.16981/kliss.47.201606.51
13 E. Cambria, S. Poria, D. Hazarika, and K. Kwok, "SenticNet 5: Discovering Conceptual Primitives for Sentiment Analysis by Means of Context Embeddings," In Thirty-Second AAAI Conference on Artificial Intelligence, 2018. https://www.aaai.org/ocs/index.php/AAAI/AAAI18/paper/view/16839
14 Kmounlp, "Definition of Korean Named-Entity Task," https://github.com/kmounlp/NER
15 KAIST, "Korean Relation Extraction Gold Standard," https://github.com/machinereading/kor-re-gold
16 P. Ekman, "Are there basic emotions?" Psychological Review, Vol. 99, No. 3, pp. 550-553, 1992. http://dx.doi.org/10.1037/0033-295X.99.3.550
17 A. Chatterjee, K. N. Narahari, M. Joshi, and P. Agrawal, "SemEval-2019 Task 3: EmoContext Contextual Emotion Detection in Text," in Proc. of the 13th International Workshop on Semantic Evaluation, pp. 39-48, 2019. http://dx.doi.org/10.18653/v1/S19-2005
18 Naver Developers, "Papago NMT API Reference," https://developers.naver.com/docs/nmt/reference/
19 C. Fellbaum (Ed.), WordNet: An Electronic Lexical Database, MIT Press, Cambridge, MA, 1998. http://dx.doi.org/10.1017/S0142716401221079
20 Google, "Google Colab," https://colab.research.google.com
21 J. Devlin, M. W. Chang, K. Lee, and K. Toutanova, "BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding," in Proc. of NAACL, 2019. http://dx.doi.org/10.18653/v1/N19-1423
22 L. Deng and Y. Liu (Eds.), Deep Learning in Natural Language Processing, Springer, 2018. http://dx.doi.org/10.1007/978-981-10-5209-5
23 P. Goyal, S. Pandey, and K. Jain, Deep Learning for Natural Language Processing, Apress, 2018. http://dx.doi.org/10.1007/978-1-4842-3685-7
24 M. E. Peters, M. Neumann, M. Iyyer, M. Gardner, C. Clark, K. Lee, and L. Zettlemoyer, "Deep contextualized word representations," in Proc. of NAACL, 2018. http://dx.doi.org/10.18653/v1/N18-1202