Keyphrase Extraction Using Active Learning and Clustering

Active Learning과 군집화를 이용한 고정키어구 추출

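  • 이현우 (Natural Language Processing Laboratory, Department of Computer Engineering, Changwon National University)
  • 차정원 (Department of Computer Engineering, Changwon National University)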
  • Published : 2008.06.30

Abstract

We describe a new active learning method for keyphrase extraction within the conditional random fields (CRFs) framework. To reduce annotation effort, we use two measures: diversity and representativeness. We select high-diversity training candidates according to their sentence confidence values, and highly representative candidates by clustering the part-of-speech patterns of their contexts. In experiments on a dialog corpus, our method achieves 86.80% while requiring 88% less training data than the supervised method. The experimental results show that the proposed method outperforms previous methods. In addition, the proposed method can be applied easily to other applications because its implementation is application-independent.
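
The selection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several assumptions the abstract does not confirm: each candidate sentence already carries a confidence score from a trained CRF (here a plain float), low confidence is treated as the informative case, and "clustering" is reduced to grouping sentences whose context part-of-speech patterns are identical. The names Candidate and select_for_annotation are hypothetical.

```python
from collections import defaultdict
from typing import NamedTuple


class Candidate(NamedTuple):
    sent_id: int
    confidence: float   # assumed: model confidence for the sentence's labeling
    pos_pattern: tuple  # assumed: POS tags of the context around the keyphrase


def select_for_annotation(candidates, budget):
    """Pick up to `budget` sentences to annotate next.

    Assumption: identical POS patterns form one cluster. This is a
    stand-in for the clustering step, whose algorithm the abstract
    does not specify.
    """
    clusters = defaultdict(list)
    for cand in candidates:
        clusters[cand.pos_pattern].append(cand)

    # Representativeness: visit larger clusters first
    # (ties broken by pattern so the result is deterministic).
    ordered = sorted(clusters.values(),
                     key=lambda c: (-len(c), c[0].pos_pattern))

    # Within each cluster, put the least confident sentence first
    # (assumption: low confidence marks the most useful sentence).
    for members in ordered:
        members.sort(key=lambda c: (c.confidence, c.sent_id))

    # Round-robin over clusters so the batch spans many patterns.
    selected = []
    rank = 0
    while len(selected) < budget:
        picked_any = False
        for members in ordered:
            if rank < len(members) and len(selected) < budget:
                selected.append(members[rank])
                picked_any = True
        if not picked_any:
            break  # pool exhausted before the budget was reached
        rank += 1
    return selected
```

In an active learning loop, the selected sentences would be annotated, added to the training set, and the CRF retrained before the next round of selection. The round-robin order is one simple way to combine the two measures; a weighted score would be another.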

Keywords