http://dx.doi.org/10.13088/jiis.2013.19.1.095

Predicting the Direction of the Stock Index by Using a Domain-Specific Sentiment Dictionary  

Yu, Eunji (Graduate School of Business IT, Kookmin University)
Kim, Yoosin (Graduate School of Business IT, Kookmin University)
Kim, Namgyu (Graduate School of Business IT, Kookmin University)
Jeong, Seung Ryul (Graduate School of Business IT, Kookmin University)
Publication Information
Journal of Intelligence and Information Systems / v.19, no.1, 2013, pp. 95-110
Abstract
Recently, the amount of unstructured data generated through social media has been increasing rapidly, and with it the need to collect, store, search, analyze, and visualize this data. Because of its vast volume and unstructured nature, such data cannot be handled appropriately by the traditional methodologies used for analyzing structured data. Accordingly, many attempts are being made to analyze unstructured data such as text files and log files with various commercial and noncommercial analytical tools. Among the contemporary issues addressed in the literature on unstructured text data analysis, the concepts and techniques of opinion mining have attracted much attention from pioneering researchers and business practitioners. Opinion mining, or sentiment analysis, refers to a series of processes that analyze participants' opinions, sentiments, evaluations, attitudes, and emotions about selected products, services, organizations, social issues, and so on. Opinion mining techniques are increasingly being applied to complicated issues that could not otherwise be solved by traditional approaches. One of the most representative attempts is recent research that proposed an intelligent model for predicting the direction of the stock index. This model works mainly on the basis of opinions extracted from an overwhelming number of economic news reports. News content published in various media is a typical example of unstructured text data. Every day, a large volume of new content is created, digitized, and distributed via online or offline channels. Many studies have reported that analyzing news and other related information leads to better decisions on political, economic, and social issues.
In this sense, we expect to predict the fluctuation of stock markets partly by analyzing the relationship between economic news reports and the pattern of stock prices. So far, most studies in the opinion mining literature, including ours, have used a sentiment dictionary to elicit sentiment polarity or sentiment value from a large number of documents. A sentiment dictionary consists of pairs of selected words and their sentiment values. Sentiment classifiers refer to the dictionary to determine the sentiment polarity of words, of sentences in a document, and of the whole document. However, most traditional approaches share a common limitation: they do not consider the flexibility of sentiment polarity. That is, the sentiment polarity or sentiment value of a word is fixed and cannot be changed in a traditional sentiment dictionary. In the real world, however, the sentiment polarity of a word can vary depending on the time, situation, and purpose of the analysis, and it can even be contradictory. This flexibility of sentiment polarity motivated us to conduct this study. In this paper, we argue that sentiment polarity should be assigned not merely on the basis of the inherent meaning of a word but on the basis of its ad hoc meaning within a particular context. To implement this idea, we present an intelligent investment decision-support model based on opinion mining that scrapes and parses massive volumes of economic news on the web, tags sentiment words, classifies the sentiment polarity of the news, and finally predicts the direction of the next day's stock index. In addition, we apply a domain-specific sentiment dictionary instead of a general-purpose one to classify each piece of news as either positive or negative. For performance evaluation, we performed intensive experiments and investigated the prediction accuracy of our model.
For the experiments to predict the direction of the stock index, we gathered and analyzed 1,072 articles about stock markets published by "M" and "E" media between July 2011 and September 2011.
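The dictionary-based pipeline described in the abstract (tag sentiment words, classify each article, predict the index direction) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The word/value pairs, the sample headlines, the rule that a score of zero counts as negative, and the majority-vote threshold in `predict_direction` are all assumptions made here for illustration; the abstract does not specify any of them.

```python
import re
from typing import Dict, List

# Hypothetical word/value pairs for illustration only; not taken from the paper.
GENERAL_DICT: Dict[str, float] = {
    "rally": 1.0,
    "gain": 1.0,
    "loss": -1.0,
    "cut": -1.0,
}

# Assumed domain-specific override: in stock-market news, a "rate cut"
# is treated here as good news, so the polarity of "cut" is flipped.
DOMAIN_DICT: Dict[str, float] = {**GENERAL_DICT, "cut": 1.0}


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def score_article(text: str, dictionary: Dict[str, float]) -> float:
    """Sum the sentiment values of all dictionary words found in the text."""
    return sum(dictionary.get(token, 0.0) for token in tokenize(text))


def classify_article(text: str, dictionary: Dict[str, float]) -> str:
    """Label an article positive or negative (assumption: zero is negative)."""
    return "positive" if score_article(text, dictionary) > 0 else "negative"


def predict_direction(
    articles: List[str], dictionary: Dict[str, float], threshold: float = 0.5
) -> str:
    """Predict "up" when the share of positive articles exceeds the threshold.

    The majority-vote rule and the 0.5 threshold are assumptions of this sketch.
    """
    if not articles:
        raise ValueError("at least one article is required")
    positives = sum(
        classify_article(article, dictionary) == "positive" for article in articles
    )
    return "up" if positives / len(articles) > threshold else "down"


# Made-up headlines standing in for one day's economic news.
ARTICLES = [
    "Central bank announces rate cut, shares rally",
    "Exporters report heavy loss as demand falls",
    "Tech stocks gain after strong earnings",
]

print(predict_direction(ARTICLES, GENERAL_DICT))  # down
print(predict_direction(ARTICLES, DOMAIN_DICT))   # up
```

With the general-purpose values, "cut" and "rally" cancel out in the first headline, so only one of the three articles is positive. Flipping the polarity of "cut" in the domain dictionary makes two of the three positive and changes the predicted direction, which is the kind of context dependence the abstract argues for.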
Keywords
Big Data Analysis; Opinion Mining; Sentiment Dictionary Construction; Stock Index Prediction; Text Mining;