A domain-specific sentiment lexicon construction method for stock index directionality

  • Kim, Jae-Bong (Department of Big Data Application and Security, Korea University)
  • Kim, Hyoung-Joong (Department of Big Data Application and Security, Korea University)
  • Received : 2017.05.18
  • Accepted : 2017.06.25
  • Published : 2017.06.30

Abstract

The spread of personal devices has made internet access easy, and finding and sharing information through social media has become commonplace. In particular, communities specialized in individual fields have grown influential enough that businesses and governments pay attention to reflecting their opinions in their strategies. Reading public opinion from the varied text available online is called opinion mining, and the sentiment lexicon, one of its tools, is used in many fields to grasp large volumes of unstructured data quickly. The stock market fluctuates in response to many social factors. To account for social trends, recent stock market studies have applied big-data-based opinion mining, such as buzz-volume analysis; representative examples analyze text data such as news articles. In this paper, to supplement earlier studies that relied on the refined format and limited vocabulary of news, we analyze posts from 'Paxnet', a site specializing in securities, and build a domain-specific sentiment lexicon for the stock market, which helps researchers analyze investor sentiment.
