http://dx.doi.org/10.13088/jiis.2018.24.3.199

Issue tracking and voting rate prediction for 19th Korean president election candidates  

Seo, Dae-Ho (Graduate School of Information, Yonsei University)
Kim, Ji-Ho (Division of Industrial Management Engineering, Korea University)
Kim, Chang-Ki (Graduate School of Information, Yonsei University)
Publication Information
Journal of Intelligence and Information Systems / v.24, no.3, 2018, pp. 199-219
Abstract
With the everyday use of the Internet and the spread of smart devices, users can now communicate in real time, and existing communication styles have changed. As the Internet changed who produces information, the volume of data grew enormously, giving rise to what is called big data. Big data is seen as a new opportunity to understand social issues. In particular, text mining explores patterns in unstructured text data to find meaningful information. Because text data exists in many sources, such as newspapers, books, and the web, it is both diverse and plentiful, which makes it suitable for understanding social reality. In recent years, there have been an increasing number of attempts to analyze text from the web, such as SNS and blogs, where the public can communicate freely. Such analysis is recognized as a useful way to grasp public opinion immediately, so it can be used for research on political, social, and cultural issues. Text mining has received much attention as a way to investigate the public's view of candidates and to predict the voting rate in place of polling. This is because many people question the credibility of surveys, and people tend to refuse to answer or to withhold their real intentions when asked to respond to a poll. This study collected comments from the largest Internet portal site in Korea and examined the 19th Korean presidential election in 2017. We collected 226,447 comments from April 29, 2017 to May 7, 2017, a span that includes the period just before the presidential election day in which public opinion polls are prohibited. We analyzed frequencies, associative emotional words, topic emotions, and candidate voting rates. Through frequency analysis, we identified the words that represent the most important issues on each day. In particular, following the presidential debate, the candidate who became an issue ranked at the top of the frequency analysis.
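The per-day frequency analysis mentioned above might be sketched roughly as follows. This is only an illustration: the sample comments, the function name `daily_top_words`, and the whitespace tokenizer are all assumptions, and Korean comments would need a morphological analyzer for noun extraction rather than a simple split.

```python
from collections import Counter, defaultdict

# Hypothetical sample of (date, comment text) pairs; not data from the study
comments = [
    ("2017-04-29", "debate policy debate"),
    ("2017-04-29", "policy economy"),
    ("2017-04-30", "economy jobs economy"),
]

def daily_top_words(comments, k=2):
    """Return the k most frequent tokens for each day."""
    counts = defaultdict(Counter)
    for day, text in comments:
        # Whitespace tokenization is a placeholder assumption
        counts[day].update(text.lower().split())
    return {day: counter.most_common(k) for day, counter in counts.items()}

top = daily_top_words(comments)
for day, words in sorted(top.items()):
    print(day, words)
```

Grouping counts by day before ranking is what lets a word that spikes on one day, for example after a debate, surface as that day's top issue.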
Through the analysis of associative emotional words, we identified the issues most relevant to each candidate. Topic emotion analysis was used to identify each candidate's topics and to express the public's emotions about those topics. Finally, we estimated the voting rate by combining the volume of comments with a sentiment score. In this way, we explored the issues for each candidate and predicted the voting rate. The analysis showed that news comments are an effective tool for tracking issues around presidential candidates and for predicting the voting rate. In particular, this study presented the issues for each day and a quantitative index of sentiment. It also predicted the voting rate for each candidate and exactly matched the ranking of the top five candidates. Each candidate can thus grasp public opinion objectively and reflect it in election strategy. Candidates can use positive issues more actively in their election strategies and try to correct negative issues. In particular, candidates should be aware that a moral problem can severely damage their reputation. Voters can look objectively at the issues and public opinion about each candidate and make more informed decisions when voting. If they refer to the results of this study before voting, they can see the opinions of the public through big data and vote for a candidate from a more objective perspective. If candidates campaign with reference to big data analysis, the public will be more active on the web, recognizing that their wants are being reflected. Political views can be expressed in many places on the web, which can contribute to political participation by the people.
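The abstract does not give the formula used to combine comment volume and sentiment, so the following is only a minimal sketch under an assumed weighting: each candidate's comment count is scaled by the fraction of positive comments, and the results are normalized to sum to one. The function name `estimate_vote_share`, the candidate labels, and all numbers are hypothetical.

```python
def estimate_vote_share(stats):
    """stats maps candidate -> (comment_count, positive_ratio in [0, 1])."""
    # Assumed weighting (not the paper's stated formula):
    # comment volume scaled by the share of positive comments
    raw = {name: count * pos for name, (count, pos) in stats.items()}
    total = sum(raw.values())
    if total == 0:
        raise ValueError("no positive signal to normalize")
    return {name: score / total for name, score in raw.items()}

# Hypothetical numbers, not taken from the study
stats = {"A": (1000, 0.5), "B": (600, 0.5), "C": (400, 0.5)}
shares = estimate_vote_share(stats)

# Rank candidates by estimated share, highest first
ranking = sorted(shares, key=shares.get, reverse=True)
print(shares, ranking)
```

Normalizing makes the estimates comparable to reported vote percentages, and the ranking step corresponds to the candidate-order comparison described in the abstract.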
Keywords
19th president election; comments; text mining; Big data analysis