Differential Privacy in Practice

  • Nguyen, Hiep H. (Division of IT Convergence Engineering, Pohang University of Science and Technology (POSTECH))
  • Kim, Jong (Division of IT Convergence Engineering, Pohang University of Science and Technology (POSTECH))
  • Kim, Yoonho (Division of Computer Science, Sangmyung University)
  • Received : 2013.06.14
  • Accepted : 2013.07.03
  • Published : 2013.09.30

Abstract

We briefly review the problem of statistical disclosure control under the differential privacy model, which entails a formal and ad omnia privacy guarantee separating the utility of the database from the risk due to individual participation. The model has borne fruitful results over the past ten years, both in theoretical connections to other fields and in practical applications to real-life datasets. The promises of differential privacy help to relieve the concerns about privacy loss that hinder the release of community-valuable data. This paper covers the main ideas behind differential privacy, its interactive versus non-interactive settings, perturbation mechanisms, and typical applications found in recent research.
