A Model for Privacy Preserving Publication of Social Network Data

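  • Sung Min-Kyung (성민경), Department of Computer and Radio Communications Engineering, Korea University
  • Chung Yon-Dohn (정연돈), Department of Computer Science, Korea University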
  • Received : 2010.05.25
  • Accepted : 2010.07.07
  • Published : 2010.08.15

Abstract

Online social network services, which have grown rapidly in recent years, store vast amounts of data that are analyzed in many research areas. To make this information more useful, companies and public institutions publish their data so that it can be used for a variety of purposes. However, a published social network contains information about individuals and can therefore disclose private information. Removing identifiers such as names is not sufficient to protect privacy, because private information can still be inferred from the structural information of the network. In this paper, we consider a new composite attack that uses both content and structural information, and we propose a model, $\ell$-degree diversity, for privacy-preserving publication of social network data that resists such attacks. $\ell$-degree diversity is the first model to apply $\ell$-diversity to social network data publication, and experiments show that it achieves a high data preservation rate.
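The abstract does not give the formal definition of $\ell$-degree diversity, so the following is only a minimal sketch under an assumed reading: nodes that share the same degree form a group, and every such group must contain at least $\ell$ distinct sensitive values. The function names, the edge-list representation, and the sample labels are all hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def degree_groups(edges, sensitive):
    """Group nodes by degree and collect the sensitive values in each group.

    edges     : iterable of (u, v) pairs of an undirected graph
    sensitive : dict mapping each node to its sensitive value
    Assumption: the degree is the only structural knowledge an attacker has.
    """
    degree = defaultdict(int)
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1

    groups = defaultdict(set)
    for node, value in sensitive.items():
        # Nodes that appear in no edge get degree 0.
        groups[degree[node]].add(value)
    return dict(groups)


def satisfies_l_degree_diversity(edges, sensitive, l):
    """Return True if every degree group holds at least l distinct values.

    This is the assumed reading described above, not the paper's definition.
    """
    groups = degree_groups(edges, sensitive)
    return all(len(values) >= l for values in groups.values())


# A 4-cycle: every node has degree 2, so all nodes fall into one group.
cycle_edges = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "a")]
cycle_labels = {"a": "flu", "b": "cold", "c": "flu", "d": "asthma"}

# A star: the hub has a unique degree, so its value is exposed by degree alone.
star_edges = [("h", "x"), ("h", "y"), ("h", "z")]
star_labels = {"h": "flu", "x": "cold", "y": "asthma", "z": "flu"}

print(satisfies_l_degree_diversity(cycle_edges, cycle_labels, 3))
print(satisfies_l_degree_diversity(star_edges, star_labels, 2))
```

The star graph shows why deleting names alone does not help: an attacker who knows that the target has three neighbours finds exactly one node with that degree and can read off its sensitive value.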
