Browse > Article
http://dx.doi.org/10.15207/JKCS.2018.9.12.047

Constructing for Korean Traditional culture Corpus and Development of Named Entity Recognition Model using Bi-LSTM-CNN-CRFs  

Kim, GyeongMin (Department of Computer Science and Engineering, Korea University)
Kim, Kuekyeng (Department of Computer Science and Engineering, Korea University)
Jo, Jaechoon (Department of Computer Science and Engineering, Korea University)
Lim, HeuiSeok (Department of Computer Science and Engineering, Korea University)
Publication Information
Journal of the Korea Convergence Society / v.9, no.12, 2018 , pp. 47-52 More about this Journal
Abstract
Named Entity Recognition is a system that extracts entity names such as Persons(PS), Locations(LC), and Organizations(OG) that can have a unique meaning from a document and determines the categories of extracted entity names. Recently, Bi-LSTM-CRF, which is a combination of CRF using the transition probability between output data from LSTM-based Bi-LSTM model considering forward and backward directions of input data, showed excellent performance in the study of object name recognition using deep-learning, and it has a good performance on the efficient embedding vector creation by character and word unit and the model using CNN and LSTM. In this research, we describe the Bi-LSTM-CNN-CRF model that enhances the features of the Korean named entity recognition system and propose a method for constructing the traditional culture corpus. We also present the results of learning the constructed corpus with the feature augmentation model for the recognition of Korean object names.
Keywords
Named Entity Recognition; Traditional culture; Corpus; Deep Learning; feature augmentation;
Citations & Related Records
Times Cited By KSCI : 1  (Citation Analysis)
연도 인용수 순위
1 D. Y. Lee, W. H. Yu, & H. S. Lim. (2017). Bi-directional LSTM-CNN-CRF for Korean Named Entity Recognition System with Feature Augmentation. Journal of the Korea Convergence Society[KCI], 8(12).
2 Lample, G., Ballesteros, M., Subramanian, S., Kawakami, K., & Dyer, C. (2016). Neural architectures for named entity recognition. arXiv preprint arXiv:1603.01360.
3 Ma, X., & Hovy, E. (2016). End-to-end sequence labeling via bi-directional lstm-cnns-crf. arXiv preprint arXiv:1603.01354.
4 Ling, W., Trancoso, I., Dyer, C., & Black, A. W. (2015). Character-based neural machine trans- lation. arXiv preprint arXiv:1511.04586.
5 Chiu, J. P., & Nichols, E. (2015). Named entity recognition with bidirectional LSTM-CNNs. arXiv preprint arXiv:1511.08308.
6 Nadeau D., Turney, P. D., & Matwin, S. (2006). Unsupervised named-entity recognition: Generating gazetteers and resolving ambiguity. In Conference of the Canadian Society for Computational Studies of Intelligence (pp. 266-277). Springer, Berlin, Heidelberg. DOI : 10.12811/JKCS.201.11.2.129
7 Zhu, X. (2006). Semi-supervised learning literature survey. Computer Science, University of Wisconsin-Madison, 2(3), 4. DOI : 10.22156/JKCS.2018.7.1.001
8 Derczynski, L., Maynard, D., Rizzo, G., van Erp, M., Gorrell, G., Troncy, R., ... & Bontcheva, K. (2015). Analysis of named entity recognition and linking for tweets. Information Processing & Management, 51(2), 32-49.   DOI
9 Cho, K., Van Merrienboer, B., Bahdanau, D., & Bengio, Y. (2014). On the properties of neural machine translation: Encoder-decoder approaches. arXiv preprint arXiv:1409.1259.
10 Graves, A., Mohamed, A. R., & Hinton, G. (2013, May). Speech recognition with deep recurrent neural networks. In Acoustics, speech and signal processing (icassp), 2013 ieee inter- national conference on (pp. 6645-6649). IEEE.
11 Santos, C. D., & Zadrozny, B. (2014). Learning character-level representations for part-of-speech tagging. In Proceedings of the 31st International Conference on Machine Learning (ICML-14) (pp. 1818-1826).
12 Pennington, J., Socher, R., & Manning, C. (2014). Glove: Global vectors for word representation. In Proceedings of the 2014 conference on empirical methods in natural language processing (EMNLP) (pp. 1532-1543).
13 Bojanowski, P., Grave, E., Joulin, A., & Mikolov, T. (2016). Enriching word vectors with subword information. arXiv preprint arXiv:1607.04606.
14 Mikolov, T., Chen, K., Corrado, G., & Dean, J. (2013). Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781.
15 S. H. Na & M. W. Min. (2016). Character Based LSTM CRFs for Named Entity Recognition, Korea Computer Congress (KCC).