http://dx.doi.org/10.13088/jiis.2019.25.3.001

Predicting stock movements based on financial news with systematic group identification  

Seong, NohYoon (College of Business, KAIST)
Nam, Kihwan (College of Business, Korea Advanced Institute of Science and Technology (KAIST))
Publication Information
Journal of Intelligence and Information Systems / v.25, no.3, 2019, pp. 1-17
Abstract
Because stock price forecasting is important both academically and practically, it has been studied actively. Research on stock price forecasting falls into two groups: studies that use structured data and studies that use unstructured data. With structured data such as historical stock prices and financial statements, past studies have usually relied on technical analysis and fundamental analysis. In the big data era, the amount of information has grown rapidly, and artificial intelligence methods that extract meaning by quantifying text, an unstructured form of data that accounts for a large share of that information, have developed quickly. Building on these developments, many studies now apply text mining to online news in order to predict stock prices. The methodology adopted in many papers forecasts a company's stock price from news about that company alone. However, according to previous research, a company's stock price is affected not only by its own news but also by news about related companies. Finding highly relevant companies is not easy, though, because of market-wide effects and random signals. Existing studies have therefore identified relevant companies primarily through predetermined international industry classification standards. According to recent research, however, the degree of homogeneity within sectors of the Global Industry Classification Standard varies, so forecasting stock prices by pooling all companies in a sector, rather than considering only the relevant ones, can harm predictive performance. To overcome this limitation, we first apply random matrix theory together with text mining to stock prediction. When the dimension of the data is large, the classical limit theorems are no longer suitable because statistical efficiency is reduced.
Therefore, a simple correlation computed from financial market data does not reflect the true correlation. To solve this issue, we adopt random matrix theory, which is mainly used in econophysics, to remove market-wide effects and random signals and find the true correlation between companies. Using the true correlation, we perform cluster analysis to find relevant companies. Based on the clustering results, we then use a multiple kernel learning algorithm, an ensemble of support vector machines, to incorporate the effects of the target firm and its relevant firms simultaneously. Each kernel is assigned to predict stock prices from features of the financial news of the target firm and its relevant firms. The results of this study are as follows. (1) In line with existing research, we confirmed that using news from relevant companies is an effective way to forecast stock prices. (2) Identifying relevant companies in the wrong way can lower AI prediction performance. (3) The proposed approach with random matrix theory performs better than the approaches of previous studies when cluster analysis is based on the true correlation obtained by removing market-wide effects and random signals. The contributions of this study are as follows. First, this study shows that random matrix theory, which is used mainly in econophysics, can be combined with artificial intelligence to produce effective methodologies. This suggests that it is important not only to develop AI algorithms but also to adopt theories from physics. It extends existing research that integrated artificial intelligence with complex systems theory through transfer entropy. Second, this study stresses that finding the right companies in the stock market is an important issue. This suggests that it is important to study not only artificial intelligence algorithms but also how to adjust the input values on theoretical grounds.
Third, we confirmed that firms grouped together under the Global Industry Classification Standard (GICS) may have low relevance to one another, which suggests that relevance should be defined theoretically rather than simply taken from the GICS.
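
The filtering and clustering steps described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact procedure, so the code below rests on stated assumptions: eigenvalues of the return correlation matrix below the Marchenko-Pastur upper edge are treated as random signals, the single largest eigenvalue is treated as the market-wide effect, and the remaining modes are clustered with average-linkage hierarchical clustering. The function names (`group_correlation`, `cluster_stocks`, `simulate_returns`) and the synthetic data are hypothetical and are not taken from the paper.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform


def group_correlation(returns):
    """Filter a (T, N) return matrix down to its group-level correlations.

    Assumption: eigenvalues below the Marchenko-Pastur upper edge are noise,
    and the single largest eigenvalue is the market-wide mode.
    """
    n_obs, n_stocks = returns.shape
    # Standardize each stock so that z.T @ z / T is the correlation matrix.
    z = (returns - returns.mean(axis=0)) / returns.std(axis=0)
    corr = z.T @ z / n_obs
    eigvals, eigvecs = np.linalg.eigh(corr)  # eigenvalues in ascending order
    # Upper edge of the Marchenko-Pastur spectrum for unit-variance noise.
    noise_edge = (1.0 + np.sqrt(n_stocks / n_obs)) ** 2
    keep = eigvals > noise_edge
    keep[-1] = False  # drop the largest eigenvalue (assumed market mode)
    vecs = eigvecs[:, keep]
    # Rebuild the matrix from the retained eigenmodes only.
    return (vecs * eigvals[keep]) @ vecs.T


def cluster_stocks(group_corr, n_clusters):
    """Average-linkage clustering on a distance derived from group_corr."""
    # Larger filtered correlation means smaller distance.
    dist = group_corr.max() - group_corr
    dist = (dist + dist.T) / 2.0  # remove floating-point asymmetry
    np.fill_diagonal(dist, 0.0)
    tree = linkage(squareform(dist, checks=False), method="average")
    return fcluster(tree, t=n_clusters, criterion="maxclust")


def simulate_returns(n_obs=1000, n_groups=3, group_size=4, seed=0):
    """Hypothetical data: one market factor, one factor per group, noise."""
    rng = np.random.default_rng(seed)
    market = rng.normal(size=(n_obs, 1))
    groups = np.repeat(rng.normal(size=(n_obs, n_groups)), group_size, axis=1)
    noise = rng.normal(size=(n_obs, n_groups * group_size))
    return market + 1.5 * groups + noise


returns = simulate_returns()
labels = cluster_stocks(group_correlation(returns), n_clusters=3)
print(labels)  # one cluster label per simulated stock
```

In this sketch the market factor inflates every pairwise correlation, so clustering the raw correlation matrix would mix market-wide co-movement with group structure. Removing the largest eigenmode and the noise band leaves only the group-level co-movement for the clustering step. The resulting clusters could then define which firms' news feeds the additional kernels in a multiple kernel learning model.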
Keywords
Online news; Stock prediction; Random matrix theory; Hierarchical clustering