http://dx.doi.org/10.7472/jksii.2020.21.3.123

Korean Contextual Information Extraction System using BERT and Knowledge Graph  

Yoo, SoYeop (Dept. of AI·Software, Gachon University)
Jeong, OkRan (Dept. of AI·Software, Gachon University)
Publication Information
Journal of Internet Computing and Services, Vol. 21, No. 3, 2020, pp. 123-131
Abstract
Along with the rapid development of artificial intelligence technology, natural language processing, which deals with human language, is also being actively studied. In particular, BERT, a language model recently proposed by Google, has performed well in many areas of natural language processing by providing models pre-trained on large corpora. Although BERT offers a multilingual model, applying the original pre-trained BERT directly to Korean has limitations, so a model pre-trained on a large Korean corpus should be used. Moreover, text carries not only lexical and grammatical meaning but also contextual meaning, such as the relations between preceding and following passages and the surrounding situation. Existing natural language processing research has focused mainly on lexical or grammatical meaning, yet accurately identifying the contextual information embedded in text plays an important role in understanding it. Knowledge graphs, which link words through their relationships, have the advantage that computers can learn context from them easily. In this paper, we propose a system that extracts Korean contextual information using a BERT model pre-trained on a Korean corpus together with a knowledge graph. We build models that extract the person, relationship, emotion, space, and time information that matters in a text, and we validate the proposed system through experiments.
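To make the described pipeline concrete, the following is a minimal sketch of the kind of fine-tuning setup the abstract outlines, written against the Hugging Face transformers API. The checkpoint name, the BIO label set, and the example sentence are illustrative assumptions, not the authors' exact configuration (the paper uses SKTBrain's KoBERT, reference 9):

import torch
from transformers import AutoTokenizer, AutoModelForTokenClassification

# BIO tags for the five kinds of contextual information the paper targets.
# The exact tag inventory is an assumption for illustration.
LABELS = ["O",
          "B-PERSON", "I-PERSON", "B-RELATION", "I-RELATION",
          "B-EMOTION", "I-EMOTION", "B-SPACE", "I-SPACE",
          "B-TIME", "I-TIME"]

MODEL_NAME = "skt/kobert-base-v1"  # assumed KoBERT checkpoint id; any Korean BERT encoder works

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForTokenClassification.from_pretrained(
    MODEL_NAME,
    num_labels=len(LABELS),
    id2label=dict(enumerate(LABELS)),
)

# One forward pass over a Korean sentence ("Yesterday in Seoul, Younghee was
# happy to meet Cheolsu."). Fine-tuning wraps this in an ordinary
# cross-entropy training loop over sentences labelled with the tags above.
inputs = tokenizer("어제 서울에서 영희는 철수를 만나서 기뻤다.", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits  # shape: (1, sequence_length, num_labels)

predictions = logits.argmax(dim=-1)[0].tolist()
tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])
for token, label_id in zip(tokens, predictions):
    print(token, LABELS[label_id])

On the knowledge-graph side, entities tagged this way can then be linked through resources such as ConceptNet or YAGO (references 10 and 11), making the relations between words available as explicit context.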
Keywords
contextual information extraction; person extraction; relation extraction; sentiment extraction; BERT; knowledge graph;
Citations & Related Records
Times Cited By KSCI: 1
1 Z. Yang, Z. Dai, Y. Yang, J. Carbonell, R. Salakhutdinov, and Q. V. Le, "XLNet: Generalized Autoregressive Pretraining for Language Understanding," arXiv preprint arXiv:1906.08237, 2019. https://arxiv.org/abs/1906.08237
2 K. H. Park, S. H. Na, J. H. Shin, and Y. K. Kim, "BERT for Korean Natural Language Processing: Named Entity Tagging, Sentiment Analysis, Dependency Parsing and Semantic Role Labeling," Korea Computer Congress 2019, pp. 584-586, 2019. https://www.dbpia.co.kr/journal/articleDetail?nodeId=NODE08763261
3 S. Kwon, Y. Ko, and J. Seo, "Effective vector representation for the Korean named-entity recognition," Pattern Recognition Letters, Vol. 117, pp. 52-57, 2019. http://dx.doi.org/10.1016/j.patrec.2018.11.019
4 S. I. Lee, "Contextualism and a Reflection on the Notions of 'Context'," Journal of Language Sciences, Vol. 17, No. 3, pp. 67-86, 2010. http://dx.doi.org/G704-001077.2010.17.3.003
5 M. W. Lee, "Semantic Relations from the Contextual Perspective," Korean Semantics, Vol. 66, pp. 101-120, 2019. http://dx.doi.org/10.19033/sks.2019.12.66.101
6 M. S. Shin, "The Characteristics of the Contextual Meaning Evaluation Items of Words - Focusing on the Korean Language Subject of the College Scholastic Ability Test," KOED, No. 116, pp. 143-185, 2018. http://dx.doi.org/10.15734/koed..116.201809.143
7 A. Radford, K. Narasimhan, T. Salimans, and I. Sutskever, "Improving language understanding by generative pre-training," OpenAI Technical Report, 2018. https://s3-us-west-2.amazonaws.com/openai-assets/research-covers/language-unsupervised/language_understanding_paper.pdf
8 A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, and I. Polosukhin, "Attention is all you need," in Proc. of the 31st International Conference on Neural Information Processing Systems, pp. 6000-6010, 2017. http://dx.doi.org/10.5555/3295222.3295349
9 SKTBrain, "Korean BERT pre-trained cased (KoBERT)," https://github.com/SKTBrain/KoBERT
10 T. Rebele, F. M. Suchanek, J. Hoffart, J. Biega, and G. Weikum, "YAGO: A Multilingual Knowledge Base from Wikipedia, WordNet, and GeoNames," in Proc. of the 15th International Semantic Web Conference, pp. 177-185, 2016. https://doi.org/10.1007/978-3-319-46547-0_19
11 R. Speer, J. Chin, and C. Havasi, "ConceptNet 5.5: An Open Multilingual Graph of General Knowledge," In Thirty-First AAAI Conference on Artificial Intelligence, 2017. https://dl.acm.org/doi/10.5555/3298023.3298212
12 S. S. Lee, "A Study on the Analysis of Semantic Relation and Category of the Korean Emotion Words," Journal of Korean Library and Information Science Society, Vol. 47, No. 2, pp. 51-70, 2016. http://dx.doi.org/10.16981/kliss.47.201606.51
13 E. Cambria, S. Poria, D. Hazarika, and K. Kwok, "SenticNet 5: Discovering Conceptual Primitives for Sentiment Analysis by Means of Context Embeddings," In Thirty-Second AAAI Conference on Artificial Intelligence, 2018. https://www.aaai.org/ocs/index.php/AAAI/AAAI18/paper/view/16839
14 Kmounlp, "Definition of Korean Named-Entity Task," https://github.com/kmounlp/NER
15 KAIST, "Korean Relation Extraction Gold Standard," https://github.com/machinereading/kor-re-gold
16 P. Ekman, "Are there basic emotions?" Psychological Review, Vol. 99, No. 3, pp. 550-553, 1992. http://dx.doi.org/10.1037/0033-295X.99.3.550
17 A. Chatterjee, K. N. Narahari, M. Joshi, and P. Agrawal, "SemEval-2019 Task 3: EmoContext Contextual Emotion Detection in Text," in Proc. of the 13th International Workshop on Semantic Evaluation, pp. 39-48, 2019. http://dx.doi.org/10.18653/v1/S19-2005
18 Naver Developers, "Papago NMT API Reference," https://developers.naver.com/docs/nmt/reference/
19 C. Fellbaum (Ed.), WordNet: An Electronic Lexical Database, MIT Press, Cambridge, MA, 1998. http://dx.doi.org/10.1017/S0142716401221079
20 Google, "Google Colab," https://colab.research.google.com
21 J. Devlin, M. W. Chang, K. Lee, and K. Toutanova, "BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding," in Proc. of NAACL, 2019. http://dx.doi.org/10.18653/v1/N19-1423
22 L. Deng and Y. Liu (Eds.), Deep Learning in Natural Language Processing, Springer, 2018. http://dx.doi.org/10.1007/978-981-10-5209-5
23 P. Goyal, S. Pandey, and K. Jain, Deep Learning for Natural Language Processing, Apress, 2018. http://dx.doi.org/10.1007/978-1-4842-3685-7
24 M. E. Peters, M. Neumann, M. Iyyer, M. Gardner, C. Clark, K. Lee, and L. Zettlemoyer, "Deep contextualized word representations," in Proc. of NAACL, 2018. http://dx.doi.org/10.18653/v1/N18-1202