A Method for Evaluating News Value Based on Supply and Demand of Information Using Text Analysis

  • Lee, Donghoon (이동훈; Graduate School of Business IT, Kookmin University) ;
  • Choi, Hochang (최호창; School of Business Administration, Kookmin University) ;
  • Kim, Namgyu (김남규; School of Management Information Systems, Kookmin University)
  • Received : 2016.11.19
  • Reviewed : 2016.12.18
  • Published : 2016.12.31


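Although many studies have examined how Internet news and SNSs, the main channels of information distribution today, differ in their characteristics, relatively few have examined the difference between the two media from the perspective of information supply and demand. In general, new information is exposed to the public through news articles from media companies, and the public accepts and simultaneously spreads that information by sharing opinions or additional information about those articles through SNSs. In this respect, a media company's provision of news can be regarded as the supply of information, and the public, by actively expressing interest in that news through SNSs, can be understood as expressing consumption demand for the information. This is analogous to the way the price of goods and services is determined by the relationship between supply and demand, and it suggests that the value of information can be measured on the basis of the relationship between information demand and information supply. In this study, we select Internet news articles as the representative medium of information supply and Twitter as the representative medium of information demand, and we devise and present the News Value Index (NVI), which evaluates the informational value of news on a specific issue by the volume of related tweets. Specifically, the proposed methodology derives an NVI for each issue and uses it to visualize the change in information value over time. To evaluate the practical applicability of the proposed methodology, we also conducted an experiment on 387,018 Internet news articles and 31,674,795 tweets. The results show that most issues change in a way that converges to the average value of the overall information market, while some interesting issues show distinctive patterns, such as consistently holding above-average value and dominating the information market.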

With the recent spread of smart devices, users produce, share, and acquire a wide variety of information via the Internet and social network services (SNSs). Because users tend to use multiple media simultaneously according to their goals and preferences, domestic SNS users use about 2.09 media concurrently on average. Since the information these media provide is usually represented as text, recent studies have actively applied text analysis to understand users more deeply. Earlier text analysis studies focused on a document's contents without substantive consideration of the characteristics of the source medium. More recent studies, however, argue that analytical and interpretive approaches should differ according to the characteristics of a document's source. Documents can be classified into four types: informative documents, which deliver information; expressive documents, which express emotions and aesthetics; operational documents, which induce behavior in the recipient; and audiovisual media documents, which supplement the other three functions through images and music. Further, documents can be classified according to their contents, which comprise facts, concepts, procedures, principles, rules, stories, opinions, and descriptions. Documents also have distinct characteristics depending on the medium through which they are distributed. Newspaper articles tend to be written by highly trained people for public dissemination. On SNSs, by contrast, any user can freely write any message, and such messages are distributed in unpredictable ways. Moreover, each newspaper article exists independently and tends to have no relation to other articles, whereas messages (original tweets) on Twitter, for example, are highly organized and regularly duplicated and repeated through replies and retweets.
Many studies have focused on the differing characteristics of newspapers and SNSs. However, it is difficult to find a study that examines the difference between the two media from the perspective of supply and demand. Newspaper articles can be regarded as a kind of information supply, whereas messages on SNSs represent a demand for information. Investigating traditional newspapers and SNSs from the perspective of the supply and demand of information lets us explore and explain the information dilemma more clearly. For example, some superfluous issues are heavily reported in newspaper articles even though users have little interest in them. Such overproduced information not only wastes media resources but also makes valuable, in-demand information harder to find. Conversely, some issues covered by only a few newspapers may be of high interest to SNS users. To alleviate the harmful effects of such information asymmetries, it is necessary to analyze the supply and demand of each information source and to provide information flexibly in response. Such an approach would allow the value of information to be explored and approximated on the basis of the supply-demand balance. Conceptually, this is very similar to the way the price of goods or services is determined by the supply-demand relationship. Adopting this concept, media companies could focus on covering issues that are in high demand but short supply. In this study, we selected Internet news sites and Twitter as the representative media for information supply and demand, respectively. We present the News Value Index (NVI), which evaluates the value of news information in terms of the volume of Twitter messages associated with it. In addition, we use the NVI to visualize the change in information value over time. We conducted an analysis using 387,014 news articles and 31,674,795 Twitter messages. The results revealed interesting patterns: most issues show an NVI lower than the average across all issues, whereas a few issues show an NVI steadily higher than the average.
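
The abstract does not give the NVI formula, so the following Python sketch is only one plausible reading of a supply-demand index, not the authors' definition. It assumes that each article and tweet has already been assigned to an issue and a time period, and it divides an issue's share of tweets (demand) by its share of articles (supply) within each period. The function name `relative_value_index`, the input layout, and the toy counts are all hypothetical.

```python
from collections import defaultdict


def relative_value_index(article_counts, tweet_counts):
    """Illustrative demand/supply ratio per (period, issue).

    NOTE: an assumed formulation for illustration, not the paper's NVI.
    article_counts, tweet_counts: dicts mapping (period, issue) -> count.
    Returns a dict mapping (period, issue) -> demand share / supply share.
    Issues with no articles in a period are skipped (supply is zero).
    """
    total_articles = defaultdict(int)
    total_tweets = defaultdict(int)
    for (period, _issue), n in article_counts.items():
        total_articles[period] += n
    for (period, _issue), n in tweet_counts.items():
        total_tweets[period] += n

    index = {}
    for (period, issue), n_articles in article_counts.items():
        if n_articles == 0 or total_tweets[period] == 0:
            continue
        supply_share = n_articles / total_articles[period]
        demand_share = tweet_counts.get((period, issue), 0) / total_tweets[period]
        index[(period, issue)] = demand_share / supply_share
    return index


# Toy counts, invented purely for illustration.
articles = {("week1", "A"): 80, ("week1", "B"): 20}
tweets = {("week1", "A"): 500, ("week1", "B"): 500}
scores = relative_value_index(articles, tweets)
```

Under this assumed definition, a score of 1 means an issue receives attention in proportion to its coverage, a score above 1 marks an issue whose demand exceeds its supply, and a score below 1 marks an oversupplied issue. Plotting the score of each issue across consecutive periods would give a time series of the kind the abstract describes.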


