Integrated Char-Word Embedding on Chinese NER using Transformer

  • 김춘광 (Department of Computer Software, Hanyang University) ;
  • 조인휘 (Department of Computer Software, Hanyang University)
  • Published : 2021.05.12

Abstract

Because words in Chinese sentences are written continuously, without delimiters, and the vocabulary is very large, Chinese NER (Named Entity Recognition) has usually been based on character representations. In recent years, many studies have reconsidered how to integrate word information into Chinese NER models. However, traditional sequence models have complex structures and slow inference, and they require additional dictionary information, which makes them difficult to deploy in industry. The approach in this paper achieves state-of-the-art results and is parallelizable; it integrates character and word embeddings so that the model learns word information. The proposed model is easy to implement, outperforms traditional models in speed and efficiency, and improves the F1-score on two datasets.
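
As a rough illustration of what integrating character and word embeddings can look like, the following minimal Python sketch gives each character a feature vector formed by concatenating its own embedding with the embedding of the word that contains it. The abstract does not specify how the two embeddings are combined, so the concatenation, the pre-segmented example sentence, the random lookup tables, and the helper names `make_table` and `char_word_embeddings` are all assumptions made for illustration, not the authors' implementation.

```python
import random


def make_table(vocab, dim, seed):
    """Build a hypothetical embedding table with random values (stand-in for trained embeddings)."""
    rng = random.Random(seed)
    return {tok: [rng.uniform(-1, 1) for _ in range(dim)] for tok in vocab}


def char_word_embeddings(words, char_table, word_table):
    """Return one vector per character: [char embedding ; embedding of the containing word].

    Concatenation is an assumed way of combining the two embeddings.
    """
    rows = []
    for word in words:
        for ch in word:
            # List concatenation joins the character vector and the word vector.
            rows.append(char_table[ch] + word_table[word])
    return rows


# Hypothetical pre-segmented sentence: Nanjing City / Yangtze River / Bridge
words = ["南京市", "长江", "大桥"]
chars = sorted({ch for w in words for ch in w})

char_table = make_table(chars, dim=4, seed=0)
word_table = make_table(words, dim=3, seed=1)

features = char_word_embeddings(words, char_table, word_table)
print(len(features), len(features[0]))  # 7 characters, each with 4 + 3 = 7 dimensions
```

In this sketch, the per-character vectors would then be fed to a Transformer encoder, which processes all positions in parallel; characters belonging to the same word share the same word-level half of their vector.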

Keywords