http://dx.doi.org/10.5392/JKCA.2020.20.05.118

Quantitative Text Mining for Social Science: Analysis of Immigrant in the Articles  

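Yi, Soo-Jeong (Department of Arabic Interpretation and Translation, Hankuk University of Foreign Studies)
Choi, Doo-Young (Department of Middle East and African Studies, Hankuk University of Foreign Studies)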
Abstract
This paper introduces trends and methodological challenges in quantitative Korean text analysis, using case studies of academic and news media articles on "migration" and "immigration" published between 2017 and 2019. Quantitative text analysis is based on natural language processing (NLP) technology and has become an essential tool for social science. It is a part of data science that converts documents into structured data, performs hypothesis discovery and verification on that data, and visualizes it. Furthermore, we examined the social-scientific statistical models commonly applied in quantitative text analysis, using NLP with R programming and Quanteda.
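As a rough illustration of what "converting documents into structured data" can mean, the sketch below builds a small document-feature matrix (one row per document, one column per word, cells holding counts). It is not the authors' code: the paper works in R with Quanteda, while this sketch uses plain Python, and the function names, the two sample sentences, and the stopword set are all assumptions made up for the example.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split on runs of non-word characters.

    This is a crude stand-in for a real tokenizer (assumption for illustration).
    """
    return [tok for tok in re.split(r"\W+", text.lower()) if tok]


def document_feature_matrix(docs, stopwords=frozenset()):
    """Return (features, rows): a sorted vocabulary and one count row per document."""
    counts = [
        Counter(tok for tok in tokenize(doc) if tok not in stopwords)
        for doc in docs
    ]
    # The vocabulary is the union of all words seen in any document.
    features = sorted(set().union(*counts))
    # A Counter returns 0 for words a document does not contain.
    rows = [[c[f] for f in features] for c in counts]
    return features, rows


# Hypothetical sample documents, not taken from the paper's corpus.
docs = [
    "Migration policy and migration trends",
    "News coverage of immigration policy",
]
features, rows = document_feature_matrix(docs, stopwords={"and", "of"})
for name, row in zip(["doc1", "doc2"], rows):
    print(name, dict(zip(features, row)))
```

Splitting on whitespace and punctuation is only adequate for a toy English example. Korean text would normally need a morphological analyzer (the reference list includes MeCab) before counting, and the stopword list would have to be chosen for Korean as well.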
Keywords
Data Mining; Quantitative Analysis; R Text Mining; Immigrant; Migration;
  • Reference
1 R Core Team, "R: A language and environment for statistical computing," 2013, http://www.R-project.org
2 K. Benoit, K. Watanabe, H. Wang, P. Nulty, A. Obeng, S. Müller, and A. Matsuo, "Quanteda: An R Package for the Quantitative Analysis of Textual Data," Journal of Open Source Software, Vol.3, No.30, p.774, 2018.
3 Taku Kudo, "MeCab," Source Forge: http://sourceforge.net/projects/mecab, 2008.
4 Borsboom, Denny, Gideon J. Mellenbergh, and Jaap Van Heerden, "The theoretical status of latent variables," Psychological Review, Vol.110, No.2, p.203, 2003.
5 A. Frigessi, P. Bühlmann, I. Glad, M. Langaas, S. Richardson, and M. E. Vannucci, "Statistical Analysis for High-Dimensional Data," Springer, 2016.
6 K. M. Quinn, B. L. Monroe, M. Colaresi, H. M. Crespin, and D. R. Radev, "How to analyze political attention with minimal assumptions and costs," American Journal of Political Science, Vol.54, No.1, pp.209-228, 2010.
7 Baker, Paul, Costas Gabrielatos, and Tony McEnery, "Sketching Muslims: A corpus driven analysis of representations around the word 'Muslim' in the British press 1998-2009," Applied Linguistics, Vol.34, No.3, pp.255-278, 2013.
8 H. Kluver, "Europeanization of lobbying activities: When national interest groups spill over to the European level," European Integration, Vol.32, No.2, pp.175-191, 2010.
9 Wilkerson, John, David Smith, and Nicholas Stramp, "Tracing the Flow of Policy Ideas in Legislatures: A Text Reuse Approach," American Journal of Political Science, Vol.59, No.4, pp.943-956, 2015.
10 Jansa, Joshua M., Eric R. Hansen, and Virginia H. Gray, "Copy and Paste Lawmaking: Legislative Professionalism and Policy Reinvention in the States," forthcoming, American Politics Research, published online May 31, 2018.
11 J. Grimmer, "A Bayesian Hierarchical Topic Model for Political Texts: Measuring Expressed Agendas in Senate Press Releases," Political Analysis, Vol.18, No.1, pp.1-35, 2010.
12 B. Grun and K. Hornik, "topicmodels: an R package for fitting topic models," Journal of Statistical Software, Vol.40, No.13, pp.1-30, 2011.
13 Rozenas, Arturas and Denis Stukal, "How Autocrats Manipulate Economic News: Evidence from Russia's State-Controlled Television," forthcoming, Journal of Politics, Vol.81, No.3, pp.982-996, 2018.
14 S. R. Baker, "Measuring Economic Policy Uncertainty," The Quarterly Journal of Economics, Vol.131, No.4, pp.1593-1636, 2016.
15 길호현, "텍스트마이닝을 위한 한국어 불용어 목록연구" [A study of a Korean stopword list for text mining], 우리말글, Vol.78, pp.1-25, 2018.
16 Higuchi Koichi, 社会調査のための計量テキスト分析 [Quantitative Text Analysis for Social Research], ナカニシヤ出版, 2014.
17 Ithiel de Sola Pool, Trends in Content Analysis, University of Illinois Press, 1959.
18 Kulkarni, Parag, Sarang Joshi, and Meta S. Brown, Big data analytics, PHI Learning Pvt. Ltd., 2016.
19 W. H. Inmon, Daniel Linst, and Mary Levins, Data Architecture: A Primer for the Data Scientist, London: Academic Press, 2019.