http://dx.doi.org/10.14400/JDC.2019.17.12.243

Performance Comparison Analysis on Named Entity Recognition system with Bi-LSTM based Multi-task Learning  

Kim, GyeongMin (Department of Computer Science and Engineering, Korea University)
Han, Seunggnyu (Department of Computer Science and Engineering, Korea University)
Oh, Dongsuk (Department of Computer Science and Engineering, Korea University)
Lim, HeuiSeok (Department of Computer Science and Engineering, Korea University)
Publication Information
Journal of Digital Convergence, v.17, no.12, 2019, pp. 243-248
Abstract
Multi-Task Learning (MTL) is a training method in which a single neural network is trained on multiple tasks that influence each other. In this paper, we compare the performance of an MTL-based named entity recognition (NER) model trained on a Korean traditional culture corpus with that of other NER models. During training, the task-specific Bi-LSTM layers for part-of-speech (POS) tagging and NER each receive the output propagated from a shared Bi-LSTM layer, and their losses are combined into a joint loss. As a result, the MTL-based Bi-LSTM model shows a 1.1%~4.6% performance improvement over single-task Bi-LSTM models.
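
The architecture described in the abstract can be illustrated with a short sketch. The following is a minimal, hypothetical PyTorch example, not the authors' implementation; the class name MultiTaskBiLSTM, the layer sizes, and the tag-set sizes are assumptions made for illustration only. It shows a shared Bi-LSTM layer whose output is propagated to two task-specific Bi-LSTM layers (POS tagging and NER), with the per-task cross-entropy losses summed into a joint loss.

import torch
import torch.nn as nn

class MultiTaskBiLSTM(nn.Module):
    """Illustrative sketch: shared Bi-LSTM encoder feeding task-specific
    Bi-LSTM layers for POS tagging and NER, trained with a joint loss."""

    def __init__(self, vocab_size, embed_dim=100, hidden_dim=128,
                 num_pos_tags=45, num_ner_tags=9):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim)
        # Shared Bi-LSTM layer; its output is propagated to both tasks
        self.shared_bilstm = nn.LSTM(embed_dim, hidden_dim,
                                     batch_first=True, bidirectional=True)
        # Task-specific Bi-LSTM layers and output classifiers
        self.pos_bilstm = nn.LSTM(2 * hidden_dim, hidden_dim,
                                  batch_first=True, bidirectional=True)
        self.ner_bilstm = nn.LSTM(2 * hidden_dim, hidden_dim,
                                  batch_first=True, bidirectional=True)
        self.pos_out = nn.Linear(2 * hidden_dim, num_pos_tags)
        self.ner_out = nn.Linear(2 * hidden_dim, num_ner_tags)

    def forward(self, token_ids):
        embedded = self.embedding(token_ids)        # (batch, seq, embed_dim)
        shared, _ = self.shared_bilstm(embedded)    # (batch, seq, 2*hidden_dim)
        pos_h, _ = self.pos_bilstm(shared)
        ner_h, _ = self.ner_bilstm(shared)
        return self.pos_out(pos_h), self.ner_out(ner_h)

# Joint loss: per-task cross-entropy losses are summed so that gradients
# from both POS tagging and NER update the shared encoder.
model = MultiTaskBiLSTM(vocab_size=10000)
criterion = nn.CrossEntropyLoss()

tokens = torch.randint(0, 10000, (2, 20))    # dummy batch: 2 sentences x 20 tokens
pos_gold = torch.randint(0, 45, (2, 20))
ner_gold = torch.randint(0, 9, (2, 20))

pos_logits, ner_logits = model(tokens)
joint_loss = (criterion(pos_logits.view(-1, 45), pos_gold.view(-1)) +
              criterion(ner_logits.view(-1, 9), ner_gold.view(-1)))
joint_loss.backward()

In this sketch the joint loss is an unweighted sum of the two task losses; the abstract does not specify the exact layer configuration or loss weighting used in the paper.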
Keywords
Deep Learning; Multi-task Learning; Part of speech tagging; Named entity Recognition; Traditional culture;