http://dx.doi.org/10.13089/JKIISC.2013.23.6.1103

A Method of Identifying Ownership of Personal Information exposed in Social Network Service  

Kim, Seok-Hyun (Electronics and Telecommunications Research Institute)
Cho, Jin-Man (Electronics and Telecommunications Research Institute)
Jin, Seung-Hun (Electronics and Telecommunications Research Institute)
Choi, Dae-Seon (Electronics and Telecommunications Research Institute)
Abstract
This paper proposes a method of identifying the ownership of personal information exposed in social network services. Specifically, the proposed method automatically decides whether a location mentioned in a tweet indicates the publisher's area of residence. Identifying the ownership of personal information is a necessary part of evaluating the risk posed by personal information disclosed online. The proposed method uses a set of decision rules over 13 features that capture lexicographic and syntactic characteristics of tweet sentences. In an experiment using real Twitter data, the proposed method performs better (F1-score: 0.876) than conventional document classification models, such as naive Bayes, that use n-grams as the feature set.
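As a reading aid for the reported F1-score, the following is a minimal sketch of how F1 is computed for a binary decision such as "this location is the publisher's residence". The labels and predictions are made-up illustration data, not the paper's data, and the function name `f1_score` is chosen here for illustration only.

```python
def f1_score(y_true, y_pred, positive=1):
    """F1 for the positive class: harmonic mean of precision and recall."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No true positives: precision and recall are both zero (or undefined).
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical labels (assumption, not from the paper):
# 1 = the mentioned location is the publisher's residence, 0 = it is not.
y_true = [1, 1, 1, 0, 0, 1, 0, 1]
y_pred = [1, 1, 0, 0, 1, 1, 0, 1]
score = f1_score(y_true, y_pred)
print(f"F1 = {score:.3f}")
```

Because F1 is the harmonic mean of precision and recall, a classifier needs both to be high to score well, which is why it is a common choice for comparing a rule-based method with a naive Bayes baseline.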
Keywords
Big Data; Privacy; Personal Information; Social Network Service; Location Information