
Expansion of Word Representation for Named Entity Recognition Based on Bidirectional LSTM CRFs


  • Hongyeon Yu (Dept. of Computer Engineering, Dong-A University)
  • Youngjoong Ko (Dept. of Computer Engineering, Dong-A University)
  • Received : 2016.11.15
  • Accepted : 2017.01.13
  • Published : 2017.03.15

Abstract

Named entity recognition (NER) seeks to locate named entities in text and classify them into pre-defined categories such as persons, organizations, locations, and expressions of time. Recently, many state-of-the-art NER systems have been built on bidirectional LSTM CRFs. Deep learning models based on long short-term memory (LSTM) depend heavily on the word representations they receive as input. In this paper, we propose an approach that expands the word representation by combining a pre-trained word embedding, a part-of-speech (POS) tag embedding, a syllable-based word embedding, and a named-entity dictionary feature vector. Our experiments show that the proposed approach produces a useful word representation as input to a bidirectional LSTM CRF: the final expanded representation improves performance by 8.05%p over a baseline NER system that uses only the pre-trained word embedding vector.
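The expansion described above amounts to concatenating four per-token feature vectors into one input vector for the tagger. The following sketch illustrates the idea only; the embedding tables, dimensions, entity types, and function names are toy assumptions, not the paper's implementation.

```python
import random

random.seed(0)

# Toy dimensions for each component (assumptions, not the paper's values).
WORD_DIM, POS_DIM, SYL_DIM = 4, 2, 3
NE_TYPES = ["PER", "ORG", "LOC", "DAT"]  # example entity categories

def rand_vec(dim):
    return [random.uniform(-1, 1) for _ in range(dim)]

# Toy lookup tables standing in for pre-trained / learned embeddings.
word_emb = {"Seoul": rand_vec(WORD_DIM)}   # pre-trained word embedding
pos_emb  = {"NNP": rand_vec(POS_DIM)}      # POS tag embedding
syl_emb  = {"Seoul": rand_vec(SYL_DIM)}    # syllable-based word embedding
ne_dict  = {"Seoul": "LOC"}                # named-entity dictionary

def dict_feature(token):
    # One-hot vector over entity types when the token appears in the
    # named-entity dictionary; all zeros otherwise.
    ne_type = ne_dict.get(token)
    return [1.0 if t == ne_type else 0.0 for t in NE_TYPES]

def expand(token, pos):
    # Expanded word representation: concatenation of the four vectors.
    # This concatenated vector is what would be fed to the BiLSTM-CRF.
    return word_emb[token] + pos_emb[pos] + syl_emb[token] + dict_feature(token)

rep = expand("Seoul", "NNP")
print(len(rep))  # 4 + 2 + 3 + 4 = 13
```

In a real system each component would be a learned or pre-trained embedding matrix and the dictionary feature might be more elaborate, but the input to the BiLSTM-CRF is this same kind of concatenated vector.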


Acknowledgement

Supported by: National Research Foundation of Korea (NRF)
