Predicting the Unemployment Rate Using Social Media Analysis

  • Ryu, Pum-Mo (Dept. of ICT & Language Processing, School of Southeast Asian Studies, Busan University of Foreign Studies)
  • Received : 2017.08.25
  • Reviewed : 2017.12.12
  • Published : 2018.08.31

Abstract

We demonstrate how social media content can be used to predict the unemployment rate, a real-world indicator. We present a novel method that predicts the unemployment rate through social media analysis, combining natural language processing with statistical modeling. The system collects Korean-language social media content, including news articles, blogs, and tweets, and extracts data for modeling using part-of-speech tagging and sentiment analysis. The analyzed data are then used to fit two prediction models: an autoregressive integrated moving average model with exogenous variables (ARIMAX) and an autoregressive model with exogenous variables (ARX). Whereas existing methods simply present social tendencies, the proposed method quantifies the social moods expressed in social media content. Our model achieved a 27.9% reduction in error compared with a Google Index-based model, as measured by mean absolute percentage error (MAPE).
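
The reported result is stated in terms of the mean absolute percentage error. The following is a minimal sketch of how such a comparison could be computed, assuming the improvement is a relative reduction in MAPE between a baseline and a proposed model (the abstract does not give the exact formula). The function names and all numbers are hypothetical and chosen only for illustration; they are not the paper's data.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, expressed in percent."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


def relative_error_reduction(baseline_error, model_error):
    """Percentage by which model_error is lower than baseline_error.

    Assumption: "improvement" means (baseline - model) / baseline.
    """
    return 100.0 * (baseline_error - model_error) / baseline_error


# Made-up monthly unemployment rates (percent), for illustration only.
actual = [4.0, 5.0, 4.0, 5.0]
baseline_forecast = [4.4, 4.5, 4.4, 5.5]    # hypothetical baseline model
proposed_forecast = [4.2, 4.75, 4.2, 5.25]  # hypothetical proposed model

baseline_mape = mape(actual, baseline_forecast)
proposed_mape = mape(actual, proposed_forecast)
improvement = relative_error_reduction(baseline_mape, proposed_mape)

print(f"baseline MAPE: {baseline_mape:.2f}%")
print(f"proposed MAPE: {proposed_mape:.2f}%")
print(f"relative error reduction: {improvement:.1f}%")
```

Because MAPE divides by the actual value, it is undefined when an actual value is zero; that is not a concern for unemployment rates, which are strictly positive in practice.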
