Analysis of Shipping and Logistics News Articles using Topic Modeling

  • Hee-Young Yoon (Department of Tax Accounting, Soongeui Women's College)
  • Il-Youp Kwak (Department of Applied Statistics, Chung-Ang University)
  • Received : 2021.07.30
  • Accepted : 2021.08.29
  • Published : 2021.08.30

Abstract

This study examines three logistics-related news outlets (Logistics Newspaper, Korea Shipping Gazette, and Korea Shipping Newspaper) to trace changes in logistics issues, centering on COVID-19, which has recently had the greatest impact worldwide. For data collection, news articles from the two years 2019 and 2020 (title, article content, date, article classification, and article URL) were gathered by web crawling the homepages of these three representative logistics-related media companies with Python's BeautifulSoup and requests modules. The data were analyzed with basic statistical analysis, Latent Dirichlet Allocation (LDA) for topic modeling, and Scattertext. The results were as follows. First, among the three outlets, the Korea Shipping Newspaper was the most active in its coverage. Second, topic modeling with LDA identified eight logistics-related topics, and the keywords and significant issues of each topic were presented. Third, the keywords were visualized with Scattertext. This is the first study to present changes in the logistics field based on articles from representative logistics-related media in 2019 and 2020. These two years correspond to the periods before and after the outbreak of COVID-19, which has greatly affected not only logistics but daily life as a whole. Future work calls for a multi-faceted approach, such as comparative studies of logistics issues between countries or implications drawn from long-term time series of articles.
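As a rough illustration of the LDA step mentioned in the abstract, the sketch below fits a two-topic model to a handful of made-up, pre-tokenized headlines using collapsed Gibbs sampling. It is a minimal, standard-library-only sketch and not the authors' implementation: the function names `lda_gibbs` and `top_words`, the toy corpus, and the hyperparameters (`alpha`, `beta`, `n_iter`) are all assumptions chosen for readability. The study itself reports eight topics on crawled news articles and does not say which LDA library or inference method was used.

```python
import random


def lda_gibbs(docs, n_topics, n_iter=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA by collapsed Gibbs sampling (hypothetical helper, not from the paper).

    docs is a list of token lists. Returns (topic_word, doc_topic, vocab).
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    n_words = len(vocab)

    # Count tables: topic-word counts, document-topic counts, topic totals.
    n_kw = [[0] * n_words for _ in range(n_topics)]
    n_dk = [[0] * n_topics for _ in docs]
    n_k = [0] * n_topics

    # Assign every token to a random topic to start.
    z = []
    for d, doc in enumerate(docs):
        z_d = []
        for w in doc:
            k = rng.randrange(n_topics)
            z_d.append(k)
            n_kw[k][word_id[w]] += 1
            n_dk[d][k] += 1
            n_k[k] += 1
        z.append(z_d)

    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                v = word_id[w]
                k = z[d][i]
                # Remove the token's current assignment from the counts.
                n_kw[k][v] -= 1
                n_dk[d][k] -= 1
                n_k[k] -= 1
                # Full conditional for the token's topic, up to a constant.
                weights = [
                    (n_dk[d][t] + alpha) * (n_kw[t][v] + beta)
                    / (n_k[t] + n_words * beta)
                    for t in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights)[0]
                z[d][i] = k
                n_kw[k][v] += 1
                n_dk[d][k] += 1
                n_k[k] += 1

    # Smoothed estimates of the topic-word and document-topic distributions.
    topic_word = [
        [(n_kw[k][v] + beta) / (n_k[k] + n_words * beta) for v in range(n_words)]
        for k in range(n_topics)
    ]
    doc_topic = [
        [(n_dk[d][k] + alpha) / (len(doc) + n_topics * alpha) for k in range(n_topics)]
        for d, doc in enumerate(docs)
    ]
    return topic_word, doc_topic, vocab


def top_words(topic_word, vocab, n=3):
    """Return the n most probable words for each topic."""
    return [
        [vocab[v] for v in sorted(range(len(vocab)), key=lambda v: -row[v])[:n]]
        for row in topic_word
    ]


# Toy corpus: invented headlines, already tokenized (not data from the study).
toy_docs = [
    "container freight rates rise port congestion".split(),
    "port congestion delays container vessels".split(),
    "courier parcel delivery demand surges online".split(),
    "online shopping boosts parcel delivery volume".split(),
]

topic_word, doc_topic, vocab = lda_gibbs(toy_docs, n_topics=2)
for k, words in enumerate(top_words(topic_word, vocab)):
    print(f"topic {k}: {words}")
```

On a real corpus the articles would first need tokenization (for Korean text, morphological analysis) and stop-word removal, and a tested library implementation of LDA would normally be preferable to a hand-written sampler.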

Acknowledgement

This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Ministry of Science and ICT (2020R1C1C1A01013020).
