http://dx.doi.org/10.9717/kmms.2014.17.2.232

An Empirical Comparison of Machine Learning Models for Classifying Emotions in Korean Twitter  

Lim, Joa-Sang (Department of Media Software, Sangmyung University)
Kim, Jin-Man (Department of Computer Science, Graduate School, Sangmyung University)
Abstract
As online text grows rapidly, its automatic classification with machine learning methods is attracting more interest. Nevertheless, comparatively little research has addressed Korean texts, and studies that evaluate such methods statistically are also rare. This study took a sample of tweets and used machine learning methods to classify emotions using morpheme and n-gram features. As a result, about 76% of the emotions contained in the tweets were correctly classified. Of the two methods compared in this study, Support Vector Machines were found to be more accurate than Naïve Bayes. The linear SVM model was not inferior to the non-linear one. Morphological features did not contribute more to accuracy than n-grams did.
Keywords
Machine Learning; Support Vector Machine; Naïve Bayes; Twitter emotion classification
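
As a rough illustration of the kind of pipeline the abstract describes (n-gram features fed to a Naïve Bayes classifier), the following is a minimal sketch in plain Python. It is not the authors' implementation: the helper `char_ngrams`, the class `MultinomialNB`, the choice of character bigrams, add-one smoothing, and the four toy sentences are all assumptions made for illustration, and the study's morpheme features and SVM models are not covered here.

```python
import math
from collections import Counter, defaultdict


def char_ngrams(text, n=2):
    """Return overlapping character n-grams of a string (spaces removed)."""
    s = text.replace(" ", "")
    return [s[i:i + n] for i in range(len(s) - n + 1)]


class MultinomialNB:
    """Hypothetical multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, docs, labels, n=2):
        self.n = n
        self.class_counts = Counter(labels)          # documents per class
        self.feature_counts = defaultdict(Counter)   # n-gram counts per class
        self.vocab = set()
        for doc, label in zip(docs, labels):
            grams = char_ngrams(doc, n)
            self.feature_counts[label].update(grams)
            self.vocab.update(grams)
        return self

    def predict(self, doc):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, count in self.class_counts.items():
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(count / total_docs)
            denom = sum(self.feature_counts[label].values()) + len(self.vocab)
            for gram in char_ngrams(doc, self.n):
                if gram in self.vocab:  # skip n-grams never seen in training
                    score += math.log((self.feature_counts[label][gram] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy, made-up sentences (not taken from the paper's tweet sample):
# "really glad", "so happy", "really sad", "so depressed".
train_docs = ["정말 기쁘다", "너무 행복하다", "정말 슬프다", "너무 우울하다"]
train_labels = ["positive", "positive", "negative", "negative"]

model = MultinomialNB().fit(train_docs, train_labels, n=2)
prediction = model.predict("기쁘다")  # "glad"
print(prediction)
```

Character n-grams are used here because they need no morphological analyzer. A morpheme-based variant would replace `char_ngrams` with the output of a tokenizer, which this sketch does not attempt.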