http://dx.doi.org/10.13088/jiis.2019.25.3.201

A Study on the Effect of the Document Summarization Technique on the Fake News Detection Model  

Shim, Jae-Seung (Graduate School of Business IT, Kookmin University)
Won, Ha-Ram (Graduate School of Business IT, Kookmin University)
Ahn, Hyunchul (Graduate School of Business IT, Kookmin University)
Publication Information
Journal of Intelligence and Information Systems, v.25, no.3, 2019, pp. 201-220
Abstract
Fake news has emerged as a significant issue over the last few years, igniting discussion and research on how to address it. In particular, studies on automated fact-checking and fake news detection using artificial intelligence and text analysis techniques have drawn attention. Fake news detection is a form of document classification, so document classification techniques have been widely used in this research. Document summarization techniques, however, have received little attention in this field. At the same time, automatic news summarization services have become popular, and a recent study found that using news summarized through abstractive summarization strengthened the predictive performance of fake news detection models. This points to a need to study how document summarization technology can be integrated in the domestic news data environment. To examine the effect of extractive summarization on the fake news detection model, we first summarized news articles through extractive summarization. Second, we built a detection model based on the summarized news. Finally, we compared our model with the full-text-based detection model. With BPN (Back Propagation Neural Network) and SVM (Support Vector Machine), the two models showed no large difference in performance. With DT (Decision Tree), the full-text-based model performed somewhat better. With LR (Logistic Regression), our summary-based model performed better. Nonetheless, the differences between our model and the full-text-based model were not statistically significant. The results therefore suggest that applying the summary preserves at least the core information of the fake news, and the LR-based model points to the possibility of a performance improvement.
This study contributes an experimental application of extractive summarization to fake news detection research, using several machine-learning algorithms. Its main limitations are the relatively small amount of data and the lack of a comparison across summarization techniques. Future work should therefore apply a wider range of analytical techniques to a larger volume of data.
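The first step described above, extractive summarization, can be illustrated with a minimal sketch. The abstract does not say which summarizer or tokenizer was used, so everything here is an assumption for illustration: sentences are split on end punctuation, each sentence is scored by its total cosine similarity to the other sentences (a simple centrality measure), and the function name `extractive_summary` and the parameter `k` are hypothetical. Real news text, Korean text in particular, would likely need a proper sentence splitter and morphological analysis.

```python
import math
import re
from collections import Counter


def split_sentences(text):
    """Split on sentence-ending punctuation followed by whitespace (an assumption)."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def cosine(a, b):
    """Cosine similarity between two term-count vectors stored as Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def extractive_summary(text, k=3):
    """Return the k most central sentences, kept in their original order."""
    sentences = split_sentences(text)
    if len(sentences) <= k:
        return sentences
    vectors = [Counter(re.findall(r"\w+", s.lower())) for s in sentences]
    # Centrality of a sentence = summed similarity to every other sentence.
    scores = [
        sum(cosine(v, w) for j, w in enumerate(vectors) if j != i)
        for i, v in enumerate(vectors)
    ]
    top = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:k]
    return [sentences[i] for i in sorted(top)]
```

Under these assumptions, the summary-based model would be trained on `" ".join(extractive_summary(article))` in place of the full article text, with the same classifier and features used for the full-text baseline so that the two can be compared directly.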
Keywords
Fake News Detection; Document Summarization; Automated Fact Checking; Machine Learning; Domestic News