http://dx.doi.org/10.4275/KSLIS.2013.47.4.315

A Study on Opinion Mining of Newspaper Texts based on Topic Modeling  

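Kang, Beomil (Institute of Language and Information Studies, Yonsei University)
Song, Min (Department of Library and Information Science, Yonsei University)
Jho, Whasun (Department of Political Science and International Studies, Yonsei University)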
Publication Information
Journal of the Korean Society for Library and Information Science / v.47, no.4, 2013, pp. 315-334
Abstract
This study performs opinion mining on newspaper articles, based on topics extracted by topic modeling. Treating newspaper partisanship as a kind of opinion, we analyze the attitudes of the news media toward a major issue, the presidential election. We first extract topics from a large collection of newspaper texts and examine how those topics are distributed over the entire dataset. We then investigate the structure and content of each topic by means of network analysis. Finally, we trace the chronological distribution of the topics in each newspaper through time series analysis. The results show that both the liberal and the conservative newspapers tend to report in line with their adopted ideology. This confirms that topic-based opinion mining can be used to analyze opinion reliably.
Keywords
Topic Modeling; Opinion Mining; Network Analysis; Newspaper Partisanship