• Title/Summary/Keyword: 10-등분 교차 검증

Search Result 1, Processing Time 0.014 seconds

Named Entity Recognition for Patent Documents Based on Conditional Random Fields (조건부 랜덤 필드를 이용한 특허 문서의 개체명 인식)

  • Lee, Tae Seok;Shin, Su Mi;Kang, Seung Shik
    • KIPS Transactions on Software and Data Engineering
    • /
    • v.5 no.9
    • /
    • pp.419-424
    • /
    • 2016
  • Named entity recognition is required to improve the retrieval accuracy of patent documents or similar patents in the claims and patent descriptions. In this paper, we proposed an automatic named entity recognition for patents by using a conditional random field that is one of the best methods in machine learning research. Named entity recognition system has been constructed from the training set of tagged corpus with 660,000 words and 70,000 words are used as a test set for evaluation. The experiment shows that the accuracy is 93.6% and the Kappa coefficient is 0.67 between manual tagging and automatic tagging system. This figure is better than the Kappa coefficient 0.6 for manually tagged results and it shows that automatic named entity tagging system can be used as a practical tagging for patent documents in replacement of a manual tagging.