http://dx.doi.org/10.13089/JKIISC.2017.27.6.1519

A Study on a Differentially Private Model for Financial Data  

Kim, Hyun-il (Kongju National University)
Park, Cheolhee (Kongju National University)
Hong, Dowon (Kongju National University)
Choi, Daeseon (Kongju National University)
Abstract
Data de-identification is one of the techniques that preserve individual privacy while still providing useful information to the analyst. However, conventional de-identification techniques such as k-anonymity are vulnerable to background knowledge attacks. In contrast, differential privacy has been studied extensively in recent years because it offers both strong privacy guarantees and good utility. In this paper, we analyze various models based on differential privacy and, building on that analysis, formalize a differentially private model for financial data. We show that the model provides both security guarantees and good utility.
Keywords
De-identification; Differential privacy; Financial data