http://dx.doi.org/10.13088/jiis.2016.22.3.045

Public Sentiment Analysis of Korean Top-10 Companies: Big Data Approach Using Multi-categorical Sentiment Lexicon  

Kim, Seo In (School of Business, Hanyang University)
Kim, Dong Sung (School of Business, Hanyang University)
Kim, Jong Woo (School of Business, Hanyang University)
Publication Information
Journal of Intelligence and Information Systems / v.22, no.3, 2016, pp. 45-69
Abstract
Sentiment analysis of open Internet data is now actively performed for a variety of purposes. As online communication channels have become popular, companies try to capture public sentiment toward themselves from open online information sources. This research analyzes public sentiment toward the Korean Top-10 companies using a multi-categorical sentiment lexicon. Whereas existing big-data studies of public sentiment classify sentiment along dimensions, this research classifies public sentiment into multiple categories. The dimensional sentiment structure has been widely applied in sentiment analysis across applications because it is academically established and has the clear advantage of capturing the degree of sentiment and the interrelations among dimensions. However, the dimensional structure is not effective for measuring public sentiment, because human sentiment is too complex to be divided into a few dimensions. In addition, ordinary people need special training to express their feelings in a dimensional structure. People do not divide their sentiment into dimensions, nor do they need psychological training in order to feel. They do not express their feelings in dimensional terms such as positive/negative or active/passive; rather, they express them as categorical sentiments such as sadness, rage, and happiness. That is, the categorical approach to sentiment analysis is more natural than the dimensional approach. Accordingly, this research proposes a multi-categorical sentiment structure as an alternative way to measure social sentiment from the public's point of view. The multi-categorical sentiment structure classifies sentiments the way ordinary people do, although it may involve some subjectivity.
In this research, nine categories are used as the multi-categorical sentiment structure: 'Sadness', 'Anger', 'Happiness', 'Disgust', 'Surprise', 'Fear', 'Interest', 'Boredom', and 'Pain'. To capture public sentiment toward the Korean Top-10 companies, Internet news data about the companies were collected over the preceding 25 months from a representative Korean portal site. Based on sentiment words extracted from previous research, we created a sentiment lexicon and analyzed the frequency with which those words appear in the news data. The frequency of each sentiment category was calculated as a ratio of the total number of sentiment words, and the categories were ranked by these distributions. Sentiment comparisons among the top-4 companies, 'Samsung', 'Hyundai', 'SK', and 'LG', were visualized separately. As a next step, the research tested hypotheses to demonstrate the usefulness of the multi-categorical sentiment lexicon, examining how effectively categorical sentiment can serve as a relative comparison index in cross-sectional and time-series analysis. To test the lexicon as a cross-sectional comparison index, a pair-wise t-test and the Duncan test were conducted. Two pairs of companies, 'Samsung' and 'Hanjin', and 'SK' and 'Hanjin', were chosen to test with the pair-wise t-test whether each categorical sentiment differs significantly. Because the 'Sadness' category has the largest vocabulary, it was chosen for the Duncan test to determine whether subgroups of the companies differ significantly. The results show that five sentiment categories differ significantly between Samsung and Hanjin, and four between SK and Hanjin. In the 'Sadness' category, six significantly different subgroups were found. To test the lexicon as a time-series comparison index, the 'nut rage' incident at Hanjin was selected as an example case.
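The category-ratio step described above can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the lexicon entries, the English tokens, and the function names `category_ratios` and `rank_categories` are all hypothetical (the study's lexicon is Korean and much larger), and tokenization is assumed to have been done already.

```python
from collections import Counter

# Hypothetical toy lexicon: sentiment word -> category.
# Placeholder entries only; not taken from the study's lexicon.
LEXICON = {
    "sad": "Sadness",
    "grief": "Sadness",
    "angry": "Anger",
    "furious": "Anger",
    "glad": "Happiness",
    "shocked": "Surprise",
}


def category_ratios(tokens, lexicon=LEXICON):
    """Return each category's share of all sentiment-word occurrences."""
    counts = Counter(lexicon[t] for t in tokens if t in lexicon)
    total = sum(counts.values())
    if total == 0:
        return {}  # no sentiment words found in the text
    return {cat: n / total for cat, n in counts.items()}


def rank_categories(ratios):
    """Order categories from largest to smallest share (ties by name)."""
    return sorted(ratios, key=lambda c: (-ratios[c], c))


# Made-up token list standing in for pre-tokenized news text.
tokens = ["angry", "crowd", "sad", "sad", "glad", "furious", "report", "sad"]
ratios = category_ratios(tokens)
ranking = rank_categories(ratios)
print(ranking)
```

Normalizing by the total number of sentiment words, rather than by document length, makes the shares comparable across companies that receive very different amounts of news coverage.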
The term frequency of sentiment words in the month when the incident occurred was compared with that of the month before. The sentiment categories were regrouped into positive and negative sentiment to determine whether the event actually had a negative impact on public sentiment toward the company. The difference in each category was visualized, and the change in the word list for the sentiment 'Rage' was shown to make the result more concrete. As a result, there was a large before-and-after difference in the sentiment that ordinary people felt toward the company. Both hypotheses turned out to be statistically significant, which suggests that sentiment analysis using multi-categorical sentiment lexicons is persuasive in the business domain. This research implies that categorical sentiment analysis can be used as an alternative method to supplement dimensional sentiment analysis when assessing public sentiment in a business environment.
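The before-and-after comparison can be illustrated with a small sketch. It rests on assumptions that should be read as such: the abstract does not say which of the nine categories were grouped as positive or negative, so the grouping below is a guess; the counts are made up for illustration and are not results from the study; and the function names are hypothetical.

```python
# Assumed polarity grouping. The abstract does not specify the mapping,
# and 'Surprise' is left out here as neither positive nor negative.
POSITIVE = {"Happiness", "Interest"}
NEGATIVE = {"Sadness", "Anger", "Disgust", "Fear", "Boredom", "Pain"}


def polarity_shares(category_counts):
    """Collapse per-category term counts into positive/negative shares."""
    pos = sum(n for c, n in category_counts.items() if c in POSITIVE)
    neg = sum(n for c, n in category_counts.items() if c in NEGATIVE)
    total = pos + neg
    if total == 0:
        return {"positive": 0.0, "negative": 0.0}
    return {"positive": pos / total, "negative": neg / total}


def share_change(before, after):
    """Change in each polarity share from the earlier to the later month."""
    b, a = polarity_shares(before), polarity_shares(after)
    return {k: a[k] - b[k] for k in b}


# Made-up term counts for illustration only; not data from the study.
month_before = {"Happiness": 30, "Interest": 20, "Sadness": 25, "Anger": 25}
month_of_event = {"Happiness": 10, "Interest": 10, "Sadness": 30, "Anger": 50}

change = share_change(month_before, month_of_event)
print(change)
```

A positive change in the negative share would be consistent with the event having hurt public sentiment, although a statistical test would still be needed to judge whether the shift is significant.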
Keywords
Sentiment analysis; dimensional sentiment structure; categorical sentiment structure; multi-categorical sentiment lexicon