A Distributed Privacy-Utility Tradeoff Method Using Distributed Lossy Source Coding with Side Information

  • Gu, Yonghao (Beijing Key Laboratory of Intelligent Telecommunication Software and Multimedia, School of Computer Science, Beijing University of Posts and Telecommunications)
  • Wang, Yongfei (Beijing Key Laboratory of Intelligent Telecommunication Software and Multimedia, School of Computer Science, Beijing University of Posts and Telecommunications)
  • Yang, Zhen (Beijing Key Laboratory of Intelligent Telecommunication Software and Multimedia, School of Computer Science, Beijing University of Posts and Telecommunications)
  • Gao, Yimu (Department of Computer and Information Sciences, University of Delaware)
  • Received : 2016.10.22
  • Accepted : 2017.03.16
  • Published : 2017.05.31

Abstract

In the age of big data, distributed data providers need to protect privacy, while data analysts need to mine the value of the data. Finding the privacy-utility tradeoff has therefore become a research hotspot. Moreover, the adversary may have background knowledge of the data source, so it is important to solve the privacy-utility tradeoff problem in a distributed environment with side information. This paper proposes a distributed privacy-utility tradeoff method using distributed lossy source coding with side information, and quantitatively gives the privacy-utility tradeoff region and the rate-distortion-leakage region. The simulation analysis shows four results. First, both the source rate and the privacy leakage decrease as the source distortion increases. Second, the finer the relevance between the public and private data of the source, the finer the perturbation of the source needed to obtain the same privacy protection. Third, the greater the variance of the data source, the smaller the distortion chosen to ensure more data utility. Fourth, under the same privacy restriction, the smaller the variance of the side information, the less distortion of the data source is chosen to ensure more data utility. Finally, the proposed method is compared with existing ones in five aspects to show its advantages.
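The first result can be illustrated with textbook Gaussian formulas. The sketch below is not the paper's distributed region. It assumes a single scalar Gaussian public source X with variance `var_x`, a private variable Y jointly Gaussian with X with correlation `rho`, mean-squared-error distortion `d`, and the standard Gaussian test channel. The function names, the parameter names, and the side-information model Z = X + noise are illustrative assumptions.

```python
import math


def gaussian_rate(var_x: float, d: float) -> float:
    """Classical Gaussian rate-distortion function, in bits per sample."""
    if var_x <= 0 or d <= 0:
        raise ValueError("var_x and d must be positive")
    return max(0.0, 0.5 * math.log2(var_x / d))


def gaussian_leakage(var_x: float, rho: float, d: float) -> float:
    """I(Y; X_hat) in bits when X_hat is X revealed through the Gaussian
    test channel at distortion d, and Y is correlated with X (assumed model)."""
    if var_x <= 0 or d <= 0:
        raise ValueError("var_x and d must be positive")
    d = min(d, var_x)  # beyond var_x nothing is revealed
    return 0.5 * math.log2(1.0 / (1.0 - rho ** 2 * (1.0 - d / var_x)))


def wyner_ziv_rate(var_x: float, var_noise: float, d: float) -> float:
    """Gaussian rate with decoder side information Z = X + noise (assumed model)."""
    if var_x <= 0 or var_noise <= 0 or d <= 0:
        raise ValueError("variances and d must be positive")
    cond_var = var_x * var_noise / (var_x + var_noise)  # Var(X | Z)
    return max(0.0, 0.5 * math.log2(cond_var / d))


# Rate and leakage both shrink as the allowed distortion grows.
for d in (0.05, 0.1, 0.25, 0.5, 1.0):
    print(f"D={d:<5} rate={gaussian_rate(1.0, d):.3f} "
          f"leakage={gaussian_leakage(1.0, 0.8, d):.3f} "
          f"rate_with_side_info={wyner_ziv_rate(1.0, 1.0, d):.3f}")
```

Under these assumptions, the rate and the leakage both fall to zero as `d` approaches `var_x`, and side information at the decoder can only lower the required rate.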
