Single Document Extractive Summarization Based on Deep Neural Networks Using Linguistic Analysis Features

  • 이경호 (Department of Electronics, Radio and Information Communications Engineering, Chungnam National University)
  • 이공주 (Department of Radio and Information Communications Engineering, Chungnam National University)
  • Received : 2019.04.15
  • Accepted : 2019.05.19
  • Published : 2019.08.31

Abstract

In recent years, extractive summarization systems based on end-to-end deep learning models have become popular. These systems require no hand-crafted features and adopt purely data-driven approaches. However, previous studies have shown that linguistic analysis features such as part-of-speech tags, named entities, and word frequencies are useful for selecting the important sentences of a document to form a summary. In this paper, we propose an extractive summarization system for single documents that combines a deep neural network with conventional linguistic analysis features. To demonstrate the usefulness of these features, we compare models trained with and without them. The experimental results show that the model with the linguistic analysis features improves the ROUGE-2 F1 score by about 0.5 points over the model without them.

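The abstract's core idea, concatenating embeddings of linguistic analysis features to word embeddings before sentence encoding, can be sketched as follows in PyTorch. Everything here is an illustrative assumption (class name, feature inventories, and dimensions); the paper's exact configuration is not shown on this page.

```python
import torch
import torch.nn as nn

class FeatureAugmentedWordEncoder(nn.Module):
    """Encode one sentence's words with a BiGRU after concatenating
    word embeddings with embeddings of POS, NER, and frequency-bucket
    features. All sizes are illustrative, not the paper's settings."""

    def __init__(self, vocab=50000, n_pos=45, n_ner=8, n_freq=10,
                 d_word=100, d_feat=10, d_hidden=200):
        super().__init__()
        self.word_emb = nn.Embedding(vocab, d_word)
        self.pos_emb = nn.Embedding(n_pos, d_feat)
        self.ner_emb = nn.Embedding(n_ner, d_feat)
        self.freq_emb = nn.Embedding(n_freq, d_feat)
        self.rnn = nn.GRU(d_word + 3 * d_feat, d_hidden,
                          batch_first=True, bidirectional=True)

    def forward(self, words, pos, ner, freq):
        # Each input: LongTensor of shape (batch, seq_len).
        x = torch.cat([self.word_emb(words), self.pos_emb(pos),
                       self.ner_emb(ner), self.freq_emb(freq)], dim=-1)
        out, _ = self.rnn(x)
        # Mean-pool the word states into a fixed-size sentence vector.
        return out.mean(dim=1)
```

The resulting sentence vectors would then feed a document-level encoder and sentence classifier as in Fig. 1; dropping the three feature embeddings gives the baseline model used for comparison.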

Fig. 1. Architecture of SummaRuNNer
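Fig. 1 refers to the SummaRuNNer model of Nallapati et al. [2], which scores each sentence with content, salience, novelty, and position terms. Below is a minimal PyTorch sketch of that scoring layer with illustrative dimensions; it follows the published formula rather than this paper's exact implementation.

```python
import torch
import torch.nn as nn

class SummaRuNNerScorer(nn.Module):
    """Sentence extraction probabilities in the style of SummaRuNNer:
    content + salience - novelty + position terms, applied over the
    document's sentence states. Dimensions are illustrative."""

    def __init__(self, d_sent=400, n_abs_pos=100, n_rel_pos=10):
        super().__init__()
        self.content = nn.Linear(d_sent, 1)
        self.salience = nn.Bilinear(d_sent, d_sent, 1)
        self.novelty = nn.Bilinear(d_sent, d_sent, 1)
        self.abs_pos = nn.Embedding(n_abs_pos, 1)    # absolute position
        self.rel_pos = nn.Embedding(n_rel_pos, 1)    # relative position
        self.doc = nn.Linear(d_sent, d_sent)
        self.bias = nn.Parameter(torch.zeros(1))

    def forward(self, h, abs_pos, rel_pos):
        # h: (n_sents, d_sent) sentence states; positions: (n_sents,) ids.
        d = torch.tanh(self.doc(h.mean(dim=0, keepdim=True)))  # doc vector
        s = torch.zeros_like(h[:1])                  # running summary state
        probs = []
        for j in range(h.size(0)):
            hj = h[j:j + 1]
            logit = (self.content(hj)
                     + self.salience(hj, d)             # agreement with doc
                     - self.novelty(hj, torch.tanh(s))  # redundancy penalty
                     + self.abs_pos(abs_pos[j:j + 1])
                     + self.rel_pos(rel_pos[j:j + 1])
                     + self.bias)
            p = torch.sigmoid(logit)
            s = s + p * hj               # probability-weighted summary update
            probs.append(p)
        return torch.cat(probs).squeeze(1)           # (n_sents,)
```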

Fig. 2. Validation Results

Table 1. Linguistic Analysis Features
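Table 1's feature inventory is not visible on this page, but the abstract names part-of-speech tags, named entities, and word frequencies. A hedged sketch of turning CoreNLP-style token annotations into discrete feature ids follows; the tag inventories and the frequency bucketing are assumptions for illustration.

```python
from collections import Counter
from math import log2

# Illustrative tag inventories; the paper's exact feature set may differ.
POS_TAGS = {tag: i for i, tag in enumerate(["NN", "NNP", "VB", "JJ", "OTHER"])}
NER_TAGS = {tag: i for i, tag in enumerate(["PERSON", "LOCATION", "ORGANIZATION", "O"])}

def featurize(doc):
    """doc: list of sentences, each a list of (word, pos, ner) triples,
    e.g. produced by a tagger such as Stanford CoreNLP [21]."""
    counts = Counter(w.lower() for sent in doc for (w, _, _) in sent)
    features = []
    for sent in doc:
        feats = []
        for word, pos, ner in sent:
            feats.append((
                POS_TAGS.get(pos, POS_TAGS["OTHER"]),
                NER_TAGS.get(ner, NER_TAGS["O"]),
                min(int(log2(counts[word.lower()] + 1)), 9),  # freq bucket
            ))
        features.append(feats)
    return features

doc = [[("John", "NNP", "PERSON"), ("lives", "VB", "O"),
        ("in", "IN", "O"), ("Seoul", "NNP", "LOCATION")]]
print(featurize(doc))  # per-token (pos_id, ner_id, freq_bucket) ids
```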

Table 2. Number of Documents in CNN/DailyMail

Table 3. Results of Full-Length F1
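Table 3 reports full-length ROUGE F1 scores [23]. At its core, ROUGE-2 F1 reduces to clipped bigram overlap between a candidate summary and a reference; here is a minimal pure-Python sketch (the official toolkit adds stemming and other options not shown here).

```python
from collections import Counter

def bigrams(tokens):
    # Multiset of adjacent token pairs.
    return Counter(zip(tokens, tokens[1:]))

def rouge2_f1(candidate, reference):
    """Full-length ROUGE-2 F1 from clipped bigram overlap."""
    cand, ref = bigrams(candidate), bigrams(reference)
    if not cand or not ref:
        return 0.0
    overlap = sum((cand & ref).values())   # clipped match counts
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

summary = "the model selects important sentences".split()
reference = "the model extracts the important sentences".split()
print(round(rouge2_f1(summary, reference), 3))  # 0.444
```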

Table 4. Example of a Summary

References

  1. R. Nallapati, et al., "Abstractive text summarization using sequence-to-sequence RNNs and beyond," arXiv preprint arXiv:1602.06023, 2016.
  2. R. Nallapati, F. Zhai, and B. Zhou, "SummaRuNNer: A recurrent neural network based sequence model for extractive summarization of documents," in Thirty-First AAAI Conference on Artificial Intelligence, 2017.
  3. I. Sutskever, O. Vinyals, and Q. V. Le, "Sequence to sequence learning with neural networks," in Advances in Neural Information Processing Systems, 2014.
  4. A. Jadhav and V. Rajan, "Extractive Summarization with SWAP-NET: Sentences and Words from Alternating Pointer Networks," in Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2018.
  5. A. Nenkova and L. Vanderwende, "The impact of frequency on summarization," Microsoft Research, Redmond, Washington, Tech. Rep. MSR-TR-2005-101, 2005.
  6. E. Filatova and V. Hatzivassiloglou, "Event-based extractive summarization," in Text Summarization Branches Out, 2004.
  7. G. Erkan and D. R. Radev, "LexRank: Graph-based lexical centrality as salience in text summarization," Journal of Artificial Intelligence Research, Vol.22, pp.457-479, 2004. https://doi.org/10.1613/jair.1523
  8. D. R. Radev, et al., "Centroid-based summarization of multiple documents," Information Processing & Management, Vol.40, No.6, pp.919-938, 2004. https://doi.org/10.1016/j.ipm.2003.10.006
  9. R. McDonald, "A study of global inference algorithms in multi-document summarization," in European Conference on Information Retrieval, Springer, 2007.
  10. D. Shen, et al., "Document summarization using conditional random fields," in IJCAI, Vol.7, pp.2862-2867, 2007.
  11. H. P. Edmundson, "New methods in automatic extracting," Journal of the ACM (JACM), Vol.16, No.2, pp.264-285, 1969. https://doi.org/10.1145/321510.321519
  12. J. Cheng and M. Lapata, "Neural summarization by extracting sentences and words," arXiv preprint arXiv:1603.07252, 2016.
  13. K. Cho, et al., "Learning phrase representations using RNN encoder-decoder for statistical machine translation," arXiv preprint arXiv:1406.1078, 2014.
  14. Y. Bengio, et al., "A neural probabilistic language model," Journal of Machine Learning Research, Vol.3, pp.1137-1155, 2003.
  15. T. Mikolov, et al., "Efficient estimation of word representations in vector space," arXiv preprint arXiv:1301.3781, 2013.
  16. S. Menaka and N. Radha, "Text classification using keyword extraction technique," International Journal of Advanced Research in Computer Science and Software Engineering, Vol.3, No.12, 2013.
  17. A. Hulth, "Improved automatic keyword extraction given more linguistic knowledge," in Proceedings of the 2003 Conference on Empirical Methods in Natural Language Processing, Association for Computational Linguistics, 2003.
  18. M. Wu, et al., "Event-based summarization using time features," in International Conference on Intelligent Text Processing and Computational Linguistics, Springer, 2007.
  19. W. Li, et al., "Extractive summarization using inter- and intra-event relevance," in Proceedings of the 21st International Conference on Computational Linguistics and the 44th Annual Meeting of the Association for Computational Linguistics, Association for Computational Linguistics, 2006.
  20. K. M. Hermann, et al., "Teaching machines to read and comprehend," in Advances in Neural Information Processing Systems, 2015.
  21. C. Manning, et al., "The Stanford CoreNLP natural language processing toolkit," in Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics: System Demonstrations, 2014.
  22. D. P. Kingma and J. Ba, "Adam: A method for stochastic optimization," arXiv preprint arXiv:1412.6980, 2014.
  23. C.-Y. Lin, "ROUGE: A package for automatic evaluation of summaries," in Text Summarization Branches Out, pp.74-81, 2004.