An Analysis of Privacy and Accuracy for Privacy-Preserving Techniques by Matrix-based Randomization

  • Published : 2008.08.30

Abstract

We analyze matrix-based randomization, a practical privacy-preserving technique. To find the optimal transition matrix, we establish theoretically how the two parameters of the privacy-breach measure relate to the condition number of the matrix, which has been proposed as a measure of accuracy. For an efficient implementation of the random substitution algorithm, a representative matrix-based scheme, we give a simple formula for the inverse of the transition matrix that the data reconstruction step requires. We also derive relations between the standard error and the other parameters by computing the condition number of the transition matrix under different matrix norms, together with the expectation and variance of the transformed distribution. Finally, we implement the random substitution algorithm and run a range of simulations to verify the theoretical results experimentally.
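The abstract does not state the transition matrix or the inverse formula, so the following is only a minimal sketch under stated assumptions, not the paper's own construction. It assumes the gamma-diagonal transition matrix from the matrix-based randomization literature (diagonal entries γx, off-diagonal entries x, with x = 1/(γ + n − 1)), and the amplification-style bound γ ≤ ρ2(1 − ρ1)/(ρ1(1 − ρ2)) for the two privacy-breach parameters ρ1 and ρ2. For that assumed matrix the inverse has diagonal entries (γ + n − 2)/(γ − 1) and off-diagonal entries −1/(γ − 1), and the 2-norm condition number is (γ + n − 1)/(γ − 1). All function names are hypothetical.

```python
import random


def max_gamma(rho1, rho2):
    """Assumed bound: largest gamma that avoids a (rho1, rho2) privacy breach."""
    return rho2 * (1 - rho1) / (rho1 * (1 - rho2))


def gamma_diagonal(n, gamma):
    """Assumed transition matrix: A[v][u] = P(perturbed = v | original = u)."""
    x = 1.0 / (gamma + n - 1)
    return [[gamma * x if v == u else x for u in range(n)] for v in range(n)]


def gamma_diagonal_inverse(n, gamma):
    """Closed-form inverse of the gamma-diagonal matrix (no elimination needed)."""
    off = -1.0 / (gamma - 1)
    diag = (gamma + n - 2) / (gamma - 1)
    return [[diag if v == u else off for u in range(n)] for v in range(n)]


def condition_number_2(n, gamma):
    """2-norm condition number: eigenvalues are 1 and (gamma - 1) / (gamma + n - 1)."""
    return (gamma + n - 1) / (gamma - 1)


def perturb(value, n, gamma, rng):
    """Keep the value with probability (gamma - 1) * x, else draw uniformly from 0..n-1.

    The net effect is P(same) = gamma * x and P(any other value) = x.
    """
    x = 1.0 / (gamma + n - 1)
    if rng.random() < (gamma - 1) * x:
        return value
    return rng.randrange(n)


def reconstruct(perturbed_counts, gamma):
    """Estimate original counts as A^{-1} Y in O(n) using the closed-form inverse."""
    n = len(perturbed_counts)
    total = sum(perturbed_counts)
    scale = (gamma + n - 1) / (gamma - 1)
    return [scale * y - total / (gamma - 1) for y in perturbed_counts]


if __name__ == "__main__":
    rng = random.Random(0)
    n, gamma = 4, max_gamma(0.05, 0.5)
    original = [0] * 5000 + [1] * 3000 + [2] * 1500 + [3] * 500
    counts = [0] * n
    for value in original:
        counts[perturb(value, n, gamma, rng)] += 1
    print("perturbed counts:    ", counts)
    print("reconstructed counts:", [round(c) for c in reconstruct(counts, gamma)])
    print("2-norm condition number:", condition_number_2(n, gamma))
```

In this sketch a smaller γ gives stronger privacy but a larger condition number, so sampling noise in the perturbed counts is amplified more during reconstruction. This is the privacy and accuracy trade-off the abstract refers to.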
