http://dx.doi.org/10.9728/dcs.2018.19.1.157

Proposal of keyword extraction method based on morphological analysis and PageRank in Twitter

Lee, Won-Hyung (Kangwon National University, IT College, Electronic Electronics Engineering)
Cho, Sung-Il (Kangwon National University, IT College, Electronic Electronics Engineering)
Kim, Dong-Hoi (Kangwon National University, IT College, Electronic Electronics Engineering)
Publication Information
Journal of Digital Contents Society / v.19, no.1, 2018, pp. 157-163
Abstract
SNS users publish diverse ideas every day, and the data posted on SNS contains many people's thoughts and opinions. In particular, the popular keywords served on Twitter are produced by counting the words that appear frequently in user posts and ranking them. However, because this method simply tallies repeated words, it is sensitive to unnecessary data. The proposed method instead ranks each word by its topical importance, using a relationship graph between words, so that unnecessary data has less influence and the main words can be extracted stably. We compare the proposed scheme, which is based on morphological analysis and PageRank, with the existing scheme, which is based on the number of appearances, in terms of the descending keyword rank and the ratio of meaningless keywords among the top 20 keywords. As a result, meaningless keywords made up 55% of the top 20 keywords under the proposed scheme and 70% under the existing scheme, an improvement of about 15 percentage points over the existing scheme.
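The contrast between the two schemes can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how the word relationship graph is built, so linking words that co-occur in the same post is an assumption, as are the whitespace `tokenize` function (a stand-in for a real morphological analyzer), the damping factor of 0.85, and all function names.

```python
from collections import Counter, defaultdict
from itertools import combinations


def tokenize(post):
    # Assumption: a lowercase whitespace split stands in for
    # morphological analysis, which would keep only content words.
    return post.lower().split()


def frequency_rank(posts):
    # Existing scheme: rank words by raw number of appearances.
    counts = Counter(w for p in posts for w in tokenize(p))
    return [w for w, _ in counts.most_common()]


def build_graph(posts):
    # Assumption: two distinct words are related when they appear
    # in the same post (undirected co-occurrence graph).
    graph = defaultdict(set)
    for p in posts:
        words = set(tokenize(p))
        for w in words:
            graph[w]  # make sure isolated words still become nodes
        for a, b in combinations(sorted(words), 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph


def pagerank(graph, damping=0.85, iterations=50):
    # Power iteration for PageRank on the undirected word graph.
    nodes = list(graph)
    n = len(nodes)
    if n == 0:
        return {}
    rank = {w: 1.0 / n for w in nodes}
    for _ in range(iterations):
        # Rank held by isolated words is spread evenly over all nodes.
        dangling = sum(rank[w] for w in nodes if not graph[w])
        new_rank = {}
        for w in nodes:
            incoming = sum(rank[v] / len(graph[v]) for v in graph[w])
            new_rank[w] = (1 - damping) / n + damping * (incoming + dangling / n)
        rank = new_rank
    return rank


if __name__ == "__main__":
    posts = [
        "spam spam spam spam spam",
        "election debate candidate",
        "election debate tonight",
        "election candidate policy",
    ]
    print("by frequency:", frequency_rank(posts))
    scores = pagerank(build_graph(posts))
    print("by PageRank: ", sorted(scores, key=scores.get, reverse=True))
```

In this toy input, one post repeats a single word many times. Counting appearances rewards that repetition, whereas the graph-based score depends on how many other words a word is connected to, so a word repeated in isolation gains nothing from being repeated.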
Keywords
Twitter; Trend keyword; PageRank algorithm; Morphological analysis