http://dx.doi.org/10.3745/KIPSTC.2007.14-C.5.439

On the Privacy Preserving Mining Association Rules by using Randomization  

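Kang, Ju-Sung (Department of Mathematics, Kookmin University)
Cho, Sung-Hoon (Nuri Solution)
Yi, Ok-Yeon (Department of Mathematics, Kookmin University)
Hong, Do-Won (Electronics and Telecommunications Research Institute)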
Abstract
We study privacy-preserving data mining (PPDM) based on randomization. Theoretical PPDM built on secure multi-party computation techniques is impractical because of its computational inefficiency, so we concentrate on practical PPDM, in particular the randomization technique. We survey various privacy measures and study privacy-preserving mining of association rules by randomization. We propose a new randomization operator, the binomial selector, for privacy-preserving association rule mining. A binomial selector is a special case of the select-a-size operator of Evfimievski et al. [3]. We also present simulation results on finding an appropriate parameter for a binomial selector. Randomization by the so-called cut-and-paste method in [3] is inefficient and yields high variance in the recovered support values for large itemsets. Our randomization by a binomial selector compensates for these defects of the cut-and-paste method.
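As an illustration of the operator family named in the abstract, the following Python sketch implements a select-a-size operator in the form described by Evfimievski et al., together with a binomial special case. The abstract does not define the binomial selector, so the choice made here (for a transaction of size m, the number of retained items is drawn from Binomial(m, p)) is an assumption, as are all function and parameter names (`select_a_size`, `binomial_size_probs`, `binomial_selector`, `rho`, `p`). It is a minimal sketch, not the authors' implementation.

```python
import random
from math import comb


def select_a_size(transaction, universe, rho, size_probs, rng):
    """Randomize one transaction with a select-a-size style operator.

    size_probs[j] is the probability of keeping exactly j true items;
    rho is the probability of inserting each item not in the transaction.
    """
    items = sorted(transaction)
    m = len(items)
    # Step 1: draw how many of the m true items to keep.
    j = rng.choices(range(m + 1), weights=size_probs, k=1)[0]
    # Step 2: keep j true items chosen uniformly at random.
    kept = set(rng.sample(items, j))
    # Step 3: add every other item independently with probability rho.
    others = sorted(set(universe) - set(items))
    noise = {a for a in others if rng.random() < rho}
    return kept | noise


def binomial_size_probs(m, p):
    """Binomial(m, p) probabilities for j = 0..m (assumed size distribution)."""
    return [comb(m, j) * p**j * (1 - p) ** (m - j) for j in range(m + 1)]


def binomial_selector(transaction, universe, rho, p, rng=None):
    """Select-a-size operator whose size distribution is assumed binomial."""
    rng = rng or random.Random()
    probs = binomial_size_probs(len(transaction), p)
    return select_a_size(transaction, universe, rho, probs, rng)


if __name__ == "__main__":
    universe = set(range(20))
    t = {1, 4, 7, 9}
    print(binomial_selector(t, universe, rho=0.05, p=0.8, rng=random.Random(0)))
```

Under this assumption, drawing the size from Binomial(m, p) and then choosing that many items uniformly is equivalent to keeping each true item independently with probability p, so the operator is governed by just the two parameters `p` and `rho`.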
Keywords
Data mining; PPDM; Randomization; Association rule mining; Privacy measure;
References
1 A. Evfimievski, R. Srikant, R. Agrawal, and J. Gehrke, 'Privacy preserving mining of association rules', Proc. ACM SIGKDD Intl. Conf. on Knowledge Discovery and Data Mining, 2002, pp. 217-228
2 R. Agrawal, R. Srikant, 'Privacy preserving data mining', ACM SIGMOD Conference on Management of Data, Dallas, TX, 2000, pp. 439-450
3 J. R. Quinlan, 'Induction of decision trees', Machine Learning, Vol. 1, No. 1, 1986, pp. 81-106
4 G. T. Duncan, D. Lambert, 'Disclosure limited data dissemination', Journal of the American Statistical Association, Vol. 81, 1986, pp. 10-18
5 R. Agrawal, T. Imielinski, A. Swami, 'Mining association rules between sets of items in large databases', Proceedings of the ACM SIGMOD Conference on Management of Data, 1993, pp. 207-216
6 T. Dalenius, 'Towards a methodology for statistical disclosure control', Statistisk Tidskrift, Vol. 5, 1977, pp. 429-444
7 O. Goldreich, 'Secure Multi-Party Computation (Final Draft, Version 1.4)', http://www.wisdom.weizmann.ac.il/home/oded/public_html/foc.html, 2002
8 J. R. Quinlan, 'Discovering rules by induction from large collection of examples', Expert Systems in the Micro Electronic Age, Edinburgh University Press, pp. 168-201
9 K. Muralidhar, R. Sarathy, 'A theoretical basis for perturbation methods', Statistics and Computing, Vol. 13, 2003, pp. 329-335
10 J. Vaidya, C. Clifton, 'Privacy-Preserving Data Mining: Why, How, and When', IEEE Security & Privacy, November/December 2004, www.computer.org/security/
11 N. Zhang, S. Wang, W. Zhao, 'A new scheme on privacy preserving association rule mining', PKDD 2004, LNAI 3202, 2004, pp. 484-495
12 D. Agrawal, C. C. Aggarwal, 'On the design and quantification of privacy preserving data mining algorithms', Proceedings of the 20th Symposium on Principles of Database Systems, May 2001
13 A. Evfimievski, R. Srikant, R. Agrawal, and J. Gehrke, 'Privacy preserving mining of association rules', Information Systems, Vol. 29, 2004, pp. 343-364
14 Y. Lindell, B. Pinkas, 'Privacy preserving data mining', CRYPTO 2000, pp. 36-54