A Study on Performing Join Queries over K-anonymous Tables

Kim, Dae-Ho;Kim, Jong Wook;

doi:10.9708/jksci.2017.22.07.055

Journal of the Korea Society of Computer and Information (한국컴퓨터정보학회논문지)

Volume 22, Issue 7 / Pages 55-62 / 2017 / 1598-849X (pISSN) / 2383-9945 (eISSN)

Korean Society of Computer Information (한국컴퓨터정보학회)

Kim, Dae-Ho (Dept. of Computer Science, Sangmyung University) ;
Kim, Jong Wook (Dept. of Computer Science, Sangmyung University)

Received : 2017.04.28
Accepted : 2017.06.18
Published : 2017.07.31

https://doi.org/10.9708/jksci.2017.22.07.055

Abstract

Recently, there has been an increasing need to share microdata, that is, records containing information about individual entities. Because microdata usually contains sensitive information about individuals, releasing it directly for public use may violate privacy requirements. To avoid these privacy problems, extensive studies have been conducted in the area of privacy-preserving data publishing (PPDP). k-anonymity, the most popular method, guarantees that, for each record in the released data, at least k-1 other records have the same values for a set of quasi-identifier attributes. Given an original table, the corresponding k-anonymous table is obtained by generalizing the records into indistinguishable groups, called equivalence classes, by replacing the specific values of the quasi-identifier attributes with more general values. However, query processing over the anonymized data is challenging because of the generalized attribute values. The problem is especially hard for an equi-join query (the most common type of query in data analysis tasks) over k-anonymous tables, since with generalized attribute values it is hard to determine whether two records are joinable. To address this challenge, in this paper we develop a novel scheme that effectively performs an equi-join between k-anonymous tables. The experimental results show that the proposed method achieves significant gains in accuracy over a naive scheme.
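The two ideas in the abstract, the k-anonymity guarantee and the difficulty of joining on generalized values, can be illustrated with a minimal sketch. It is not the scheme proposed in the paper. The table contents, the attribute names, the representation of a generalized age as a closed interval, and the helper names `is_k_anonymous` and `may_join` are all assumptions made for illustration. The overlap test is one simple way to decide joinability and is not necessarily the naive baseline used in the experiments.

```python
from collections import Counter


def is_k_anonymous(records, quasi_identifiers, k):
    """Return True if every combination of quasi-identifier values
    is shared by at least k records (each record has k-1 look-alikes)."""
    groups = Counter(
        tuple(record[attr] for attr in quasi_identifiers) for record in records
    )
    return all(count >= k for count in groups.values())


def may_join(range_a, range_b):
    """Assumed joinability test: two generalized values, given as closed
    intervals (low, high), can hide equal original values only if they overlap."""
    low_a, high_a = range_a
    low_b, high_b = range_b
    return low_a <= high_b and low_b <= high_a


# Hypothetical released table: age generalized to ranges, ZIP code to prefixes.
table = [
    {"age": (20, 29), "zip": "130**", "disease": "flu"},
    {"age": (20, 29), "zip": "130**", "disease": "cold"},
    {"age": (30, 39), "zip": "148**", "disease": "asthma"},
    {"age": (30, 39), "zip": "148**", "disease": "flu"},
]

two_anonymous = is_k_anonymous(table, ["age", "zip"], k=2)
three_anonymous = is_k_anonymous(table, ["age", "zip"], k=3)
overlapping = may_join((20, 29), (25, 34))
disjoint = may_join((20, 29), (30, 39))
```

An overlap only says that two records might share an original value, not that they do, so a join built on this test alone can return many false matches. This is the kind of accuracy loss that the abstract attributes to generalized attribute values.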
