http://dx.doi.org/10.11627/jkise.2018.41.4.131

Unstructured Data Quantification Scheme Based on Text Mining for User Feedback Extraction  

Jo, Jung-Heum (Industrial Engineering, Hongik University)
Chung, Yong-Taek (Industrial Engineering, Hongik University)
Choi, Seong-Wook (Industrial Engineering, Hongik University)
Ok, Changsoo (Industrial Engineering, Hongik University)
Publication Information
Journal of Korean Society of Industrial and Systems Engineering / v.41, no.4, 2018, pp. 131-137
Abstract
People write reviews of numerous products and services on the Internet, in blogs and on community bulletin boards. These unstructured data contain the authors' emotions and opinions about a product or service, which can provide important information for future product design or marketing. However, this text-based information cannot be directly evaluated quantitatively, so it is difficult to apply to mathematical models or optimization problems for product design and improvement. Therefore, this study proposes a method to quantitatively extract users' opinions or preferences about a specific product or service from the large volume of text-based information available online. The extracted unstructured text is decomposed into basic unit words, and a positive rate is evaluated using existing sentiment dictionaries and additional word lists proposed in this study. This offers a way to make effective use of unstructured text data, which is being generated and stored in vast quantities, in product or service design. Finally, to verify the effectiveness of the proposed method, a case study was conducted using movie review data retrieved from a portal website. By comparing the positive rates calculated by the proposed framework with user ratings for the movies, the study provides a guideline for text mining based evaluation of unstructured data.
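The dictionary-based scoring step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the exact formula, so the positive rate is assumed here to be the share of positive words among all matched sentiment words. The word lists, the regex tokenizer (a stand-in for the morphological analysis a Korean corpus would need), and the sample reviews are all hypothetical.

```python
import re

# Hypothetical toy lexicons. The paper uses existing sentiment dictionaries
# plus additional word lists, which are not reproduced here.
POSITIVE_WORDS = {"good", "great", "fun", "moving"}
NEGATIVE_WORDS = {"bad", "boring", "dull", "waste"}


def tokenize(text):
    """Split text into lowercase word tokens (assumed stand-in for morphological analysis)."""
    return re.findall(r"[a-z']+", text.lower())


def positive_rate(reviews):
    """Return positive / (positive + negative) matched word counts.

    Returns None when no word in the reviews matches either lexicon.
    This ratio is an assumed definition, not one confirmed by the abstract.
    """
    pos = neg = 0
    for review in reviews:
        for token in tokenize(review):
            if token in POSITIVE_WORDS:
                pos += 1
            elif token in NEGATIVE_WORDS:
                neg += 1
    total = pos + neg
    return pos / total if total else None


# Illustrative reviews, invented for this sketch.
reviews = [
    "Great cast and a moving story",
    "Fun at first, but the ending was boring",
]
print(positive_rate(reviews))
```

Under these assumptions, a rate computed per movie could then be set beside that movie's average user rating, which is the kind of comparison the case study reports.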
Keywords
Text Mining; Sentiment Analysis; Unstructured Data; Movie Review; Evaluation Framework