http://dx.doi.org/10.9708/jksci.2022.27.06.011

A study on the method of measuring the usefulness of De-Identified Information using Personal Information  

Kim, Dong-Hyun (Korea Internet & Security Agency)
Abstract
Interest in de-identification measures for the safe use of personal information is growing in Korea and abroad, yet de-identified information continues to be re-identified, either because the de-identification measures were insufficient or through inference. To address these problems and to discover new de-identification techniques, competitions that evaluate the safety and usefulness of de-identified information are being held in Korea and Japan. This paper analyzes the safety and usefulness indicators used in these competitions, and proposes and verifies new indicators that can measure usefulness more efficiently. Because experts in mathematics and statistics are in significantly short supply in the field of de-identification processing, the new indicators could not be verified with a large population; even so, the results were very positive regarding their necessity and validity. To safely utilize the vast amount of public data in Korea as de-identified information, research on such usefulness metrics should continue, and it is expected that this paper will serve as a starting point for more active research.
Keywords
Big data; Personal Information; De-Identification; De-Identified Information usefulness measurement; De-Identified Information usefulness indicators