http://dx.doi.org/10.3745/KTSDE.2015.4.10.447

Geographical Name Denoising by Machine Learning of Event Detection Based on Twitter  

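Woo, Seungmin (School of Computer Science and Information Engineering, The Catholic University of Korea)
Hwang, Byung-Yeon (School of Computer Science and Information Engineering, The Catholic University of Korea)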
Publication Information
KIPS Transactions on Software and Data Engineering, v.4, no.10, 2015, pp. 447-454
Abstract
This paper proposes a geographical name denoising technique, based on machine learning, for Twitter-based event detection. The growing number of smartphone users has recently driven growth in the use of social networking services (SNS). Twitter in particular, with its short messages (140 characters or fewer) and its follow feature, conveys and spreads information quickly. These characteristics, together with its mobile-optimised design, make Twitter a fast channel for reporting disasters and events. Related research has treated individual Twitter users as sensors for detecting events that occur in the real world. That research used geographical names as keywords, exploiting the fact that an event occurs in a specific place. However, it did not remove the noise caused by homographs of geographical names, and this became an important factor lowering the accuracy of event detection. In this paper, we apply denoising using two methods: removal and forecasting. First, a filtering step is performed using a purpose-built database of noise; then Naive Bayesian classification determines whether a geographical name is present. Finally, we obtained the machine learning probability values from the experimental data. With the forecasting technique proposed in this paper, the reliability of determining the need for denoising was 89.6%.
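The two steps summarized in the abstract (removal with a noise database, then forecasting with Naive Bayesian classification) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the noise phrases, the training tweets, the labels `place` and `other`, and the names `is_noise`, `NaiveBayes`, and `detect_place_mentions` are all hypothetical, and English toy data stands in for the Korean geographical names the paper works with.

```python
import math
from collections import Counter, defaultdict

# Hypothetical noise database: phrases in which the keyword is known
# NOT to refer to a place (illustrative only, not from the paper).
NOISE_PHRASES = {"michael jordan", "air jordan"}

def is_noise(tweet):
    """Removal step: flag tweets that contain a known noise phrase."""
    text = tweet.lower()
    return any(phrase in text for phrase in NOISE_PHRASES)

class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for doc, label in zip(docs, labels):
            self.word_counts[label].update(doc.lower().split())
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def log_scores(self, doc):
        total_docs = sum(self.class_counts.values())
        scores = {}
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)  # log prior
            for word in doc.lower().split():
                if word in self.vocab:  # skip words never seen in training
                    score += math.log((counts[word] + 1) / denom)
            scores[label] = score
        return scores

    def predict(self, doc):
        scores = self.log_scores(doc)
        return max(scores, key=scores.get)

# Hypothetical labeled tweets: is "jordan" used as a geographical name?
TRAIN = [
    ("flooding reported near jordan river bridge", "place"),
    ("traffic accident in jordan this morning", "place"),
    ("earthquake felt across jordan tonight", "place"),
    ("jordan scored thirty points last game", "other"),
    ("my friend jordan loves this song", "other"),
    ("jordan posted a new video today", "other"),
]

def detect_place_mentions(tweets, model):
    """Apply removal first, then keep tweets classified as place mentions."""
    kept = [t for t in tweets if not is_noise(t)]
    return [t for t in kept if model.predict(t) == "place"]

model = NaiveBayes().fit([d for d, _ in TRAIN], [l for _, l in TRAIN])
tweets = [
    "new air jordan shoes released",
    "fire reported in jordan near the bridge",
    "jordan scored again last night",
]
print(detect_place_mentions(tweets, model))
```

In this sketch the first tweet is discarded by the noise database before classification, and the classifier then separates the remaining two by the words that surround the ambiguous name. A realistic system would need a far larger labeled corpus and language-appropriate tokenization.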
Keywords
SNS; Twitter; Realtime Event Detect; Geographical Name Denoising; Machine Learning;
Citations & Related Records
Times Cited By KSCI: 3