A Method for Detecting Event-Location based on Similar Keyword Extraction in Tweet Text

트윗 텍스트의 유사 키워드 추출을 통한 이벤트 지역 탐지 기법

  • Yim, Junyeob (Dept. of Computer Science and Engineering, The Catholic University of Korea)
  • Ha, Hyunsoo (Dept. of Computer Science and Engineering, The Catholic University of Korea)
  • Hwang, Byung-Yeon (Dept. of Computer Science and Engineering, The Catholic University of Korea)
  • Received : 2015.01.25
  • Accepted : 2015.10.13
  • Published : 2015.10.31

Abstract

Twitter propagates and diffuses information faster than other social networking services (SNS), and many studies therefore use it to detect real-world events in real time. Such systems treat each Twitter user as a sensor and analyze the text of their tweets to detect events. This line of research has produced good results but has run into limits because of several problems. In particular, most existing studies trace an event's location using GPS coordinates, which is a clear limitation given that Twitter users have recently become reluctant to make their location information public. This paper therefore proposes a method that infers location from the tweet text itself, without using the location information provided by Twitter. Keywords are extracted from the tweet text, and associated words are clustered by considering the relationships among the keywords. Experiments applying the proposed algorithm confirmed both the region where an event occurred and whether a real event was detected. The proposed method also detected events faster than existing media, demonstrating its advantage.
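The idea of locating an event from tweet text alone can be illustrated with a minimal Python sketch. This is not the authors' algorithm, which the abstract does not specify in detail. It assumes a small hypothetical gazetteer (`PLACES`), a simple stopword list in place of morphological analysis, and plain co-occurrence counting as a stand-in for the keyword clustering step. All names and sample tweets below are invented for illustration.

```python
from collections import Counter
from itertools import combinations

# Hypothetical gazetteer and stopword list (assumptions, not from the paper).
PLACES = {"seoul", "busan", "incheon"}
STOPWORDS = {"a", "an", "the", "in", "at", "is", "now", "near", "just", "on", "of"}


def extract_keywords(tweet):
    """Lowercase, strip punctuation, drop stopwords.

    A crude stand-in for the morphological analysis a real system would use.
    """
    tokens = (w.strip(".,!?#@:;\"'").lower() for w in tweet.split())
    return {t for t in tokens if t and t not in STOPWORDS}


def cooccurrence(tweets):
    """Count how often each unordered keyword pair appears in the same tweet."""
    counts = Counter()
    for tweet in tweets:
        for a, b in combinations(sorted(extract_keywords(tweet)), 2):
            counts[(a, b)] += 1
    return counts


def locate_event(tweets, event_word, min_count=2):
    """Return (place, count) pairs for places that co-occur with event_word
    in at least min_count tweets, most frequent first."""
    scores = Counter()
    for (a, b), n in cooccurrence(tweets).items():
        if event_word in (a, b):
            other = b if a == event_word else a
            if other in PLACES:
                scores[other] += n
    return [(p, n) for p, n in scores.most_common() if n >= min_count]


# Invented sample tweets, for illustration only.
tweets = [
    "Huge fire in Busan right now!",
    "Fire near Busan station, smoke everywhere",
    "Busan fire is spreading",
    "Traffic jam in Seoul",
    "Fire drill at school in Seoul",
]
result = locate_event(tweets, "fire")
print(result)
```

The `min_count` threshold is one simple way to separate a place mentioned repeatedly alongside the event keyword from a place that appears with it only once. A real system would need a far larger gazetteer and a more careful notion of keyword similarity.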

