http://dx.doi.org/10.9708/jksci.2022.27.08.041

Text summarization of dialogue based on BERT  

Nam, Wongyung (Graduate School of Information, Yonsei University)
Lee, Jisoo (Graduate School of Information, Yonsei University)
Jang, Beakcheol (Graduate School of Information, Yonsei University)
Abstract
In this paper, we propose a method for summarizing colloquial text, which lacks the clear organization of written documents. We use the SAMSum corpus, a colloquial dialogue dataset, and apply the BERTSumExtAbs model proposed in prior work on automatic summarization. More than 70% of the SAMSum dataset consists of conversations between two people; the remaining 30% involve three or more participants. Applying the automatic text summarization model to this colloquial data yielded a ROUGE-1 score of 42.43 or higher. Furthermore, fine-tuning the BERTSum model, previously proposed for text summarization, raised the score to 45.81. This study demonstrates the feasibility of abstractive summarization of dialogue, and we hope it serves as a foundation for systems that understand human natural language as it is actually used and for solving a variety of downstream tasks.
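The ROUGE-1 scores reported above measure unigram overlap between a generated summary and a human reference. As a minimal illustrative sketch (the study presumably used the standard ROUGE toolkit; this toy version tokenizes by whitespace only), ROUGE-1 precision, recall, and F1 can be computed as follows:

```python
from collections import Counter

def rouge_1(candidate: str, reference: str) -> dict:
    """Compute ROUGE-1 precision, recall, and F1 from clipped unigram overlap."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    # Clipped overlap: each unigram counts at most as often as it
    # appears in the shorter-count side (Counter intersection).
    overlap = sum((cand & ref).values())
    precision = overlap / max(sum(cand.values()), 1)
    recall = overlap / max(sum(ref.values()), 1)
    f1 = 2 * precision * recall / (precision + recall) if overlap else 0.0
    return {"p": precision, "r": recall, "f1": f1}

# Hypothetical SAMSum-style reference/candidate pair (not from the paper).
ref = "Amanda baked cookies and will bring Jerry some tomorrow."
cand = "Amanda baked cookies and will bring some to Jerry tomorrow."
scores = rouge_1(cand, ref)
```

Here all nine reference unigrams appear in the ten-token candidate, giving recall 1.0, precision 0.9, and F1 of about 0.947; the scores in the abstract (42.43 and 45.81) follow the common convention of reporting ROUGE F1 × 100.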
Keywords
Text Mining; BERT; Text Summarization; Abstractive Summarization; Dialogue Data;
References
1 Bogdan Gliwa, Iwona Mochol, Maciej Biesek, Aleksander Wawer, "SAMSum Corpus: A Human-annotated Dialogue Dataset for Abstractive Summarization," Proceedings of the 2nd Workshop on New Frontiers in Summarization, Association for Computational Linguistics, November 2019.
2 Luhn, H. P., "The automatic creation of literature abstracts," IBM Journal of Research and Development, Vol. 2, pp. 159-165, 1958.   DOI
3 Gambhir, Mahak, and Vishal Gupta, "Recent automatic text summarization techniques: a survey," Artificial Intelligence Review, Vol. 47, No. 1, pp. 1-66, 2017.   DOI
4 Nallapati, Ramesh, Feifei Zhai, and Bowen Zhou, "SummaRuNNer: A recurrent neural network based sequence model for extractive summarization of documents," Thirty-First AAAI Conference on Artificial Intelligence, 2017.
5 Liu, Yang, and Mirella Lapata, "Text summarization with pretrained encoders," EMNLP, 2019.
6 Feigenblat, G., Gunasekara, C., Sznajder, B., Joshi, S., Konopnicki, D., and Aharonov, R., "TWEETSUMM: A dialog summarization dataset for customer service," EMNLP, 2021.
7 Devlin, J., Chang, M. W., Lee, K., and Toutanova, K., "BERT: Pre-training of deep bidirectional transformers for language understanding," NAACL-HLT, 2019.
8 Vaswani, Ashish, et al. "Attention is all you need." Advances in neural information processing systems 30, 2017.
9 Lin, Chin-Yew, "ROUGE: A package for automatic evaluation of summaries," Text Summarization Branches Out, pp. 74-81, Barcelona, Spain, 2004.
10 Oh, Dongsuk, et al., "KoDialoGPT2: Modeling chit-chat dialog in Korean," Annual Conference on Human and Language Technology, 2021.
11 https://github.com/OpenNMT/OpenNMT-py
12 See, Abigail, Peter J. Liu, and Christopher D. Manning, "Get to the point: Summarization with pointer-generator networks," Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 1073-1083, Vancouver, Canada, 2017.
13 Liu, Yang, "Fine-tune BERT for extractive summarization," arXiv preprint arXiv:1903.10318, 2019.
14 Radev, Dragomir R., Eduard Hovy, and Kathleen McKeown, "Introduction to the special issue on summarization," Computational Linguistics, Vol. 28, No. 4, pp. 399-408, 2002.   DOI
15 Zhou, Qingyu, et al., "Neural document summarization by jointly learning to score and select sentences," arXiv preprint arXiv:1807.02305, 2018.