http://dx.doi.org/10.7465/jkdi.2016.27.4.845

Classification of ratings in online reviews  

Choi, Dongjun (Department of Statistics, University of Seoul)
Choi, Hosik (Applied Information Statistics, Kyonggi University)
Park, Changyi (Department of Statistics, University of Seoul)
Publication Information
Journal of the Korean Data and Information Science Society / v.27, no.4, 2016, pp. 845-854
Abstract
Sentiment analysis, or opinion mining, is a text mining technique used to identify subjective information or individual opinions in documents such as blogs, reviews, articles, or social network posts. In the literature, only the binary classification of ratings based on online review texts has been considered. However, because reviews can be neutral as well as positive or negative, multi-class classification is more appropriate than binary classification. To this end, we consider the multi-class classification of ratings based on review texts. In the preprocessing stage, we extract words related to ratings using the chi-square statistic. The extracted words are then used as input variables to multi-class classifiers, such as support vector machines and the proportional odds model, and the predictive performances of the classifiers are compared.
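The word-extraction step can be illustrated with a minimal sketch. It scores each word by the Pearson chi-square statistic of a contingency table (word present or absent, by rating class) and keeps the highest-scoring words as candidate input variables. The toy reviews, the three rating labels, the whitespace tokenization, and the function name `chi_square_scores` are all assumptions made for illustration; the paper's actual preprocessing and data may differ, and the downstream classifiers are not shown.

```python
from collections import Counter


def chi_square_scores(docs, labels):
    """Return {word: chi-square statistic} for word presence vs. rating class."""
    n = len(docs)
    class_counts = Counter(labels)

    # For each word, count the documents of each class that contain it.
    word_class = {}
    for text, label in zip(docs, labels):
        for word in set(text.lower().split()):  # hypothetical tokenizer
            word_class.setdefault(word, Counter())[label] += 1

    scores = {}
    for word, present in word_class.items():
        n_present = sum(present.values())
        n_absent = n - n_present
        stat = 0.0
        for label, n_class in class_counts.items():
            # Two cells per class: documents with the word, documents without.
            cells = (
                (present[label], n_present),
                (n_class - present[label], n_absent),
            )
            for observed, row_total in cells:
                expected = row_total * n_class / n
                if expected > 0:
                    stat += (observed - expected) ** 2 / expected
        scores[word] = stat
    return scores


# Toy data (assumed): two reviews per rating class.
docs = [
    "great great product",
    "terrible product",
    "okay product",
    "great value",
    "terrible quality",
    "okay value",
]
labels = ["pos", "neg", "neu", "pos", "neg", "neu"]

scores = chi_square_scores(docs, labels)

# Keep the k words most strongly associated with the rating.
k = 3
selected = sorted(scores, key=scores.get, reverse=True)[:k]
print(selected)
```

In this toy data, "product" appears once in every class and so carries no information about the rating, while "great", "terrible", and "okay" each appear in only one class. The selected words would then serve as input variables (for example, as presence indicators or counts) to a multi-class classifier.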
Keywords
Multi-class classification; opinion mining; sentiment analysis; word cloud