• Title/Summary/Keyword: Pseudonymization

Pseudonymization's effect on data quality: A study under personal information protection act (개인정보보호법에 따른 가명처리로 인한 데이터 손실이 데이터 분석의 정확도에 미치는 영향)

  • Minjeong Kim;Jae Keun Yoo
    • The Korean Journal of Applied Statistics
    • /
    • v.37 no.3
    • /
    • pp.381-393
    • /
    • 2024
  • This study investigates how pseudonymization of personal information affects the accuracy of data analysis. We quantitatively evaluated the relationship between the degree of pseudonymization and analysis accuracy using logistic regression models, decision trees, and random forests. The results confirm that pseudonymizing sensitive information can protect personal information without significantly damaging data quality. The study has limitations, however: it relies on a single sample dataset and applies the same pseudonymization ratio throughout. Overcoming them requires additional research on diverse datasets to strengthen the generalizability of the results. We also propose developing and applying methodologies for finding the optimal pseudonymization ratio for each variable. The results provide new insights into maintaining data usability while achieving regulatory compliance and personal information protection.
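
    The kind of experiment described in this abstract can be sketched roughly as follows. This is not the authors' code or data: it assumes a synthetic one-feature dataset, uses generalization of age into bands as the pseudonymization step, and replaces the logistic regression, decision tree, and random forest models with a depth-1 decision tree (a stump). All function names and parameters are hypothetical.

```python
import random

def make_data(n, rng):
    """Synthetic records (age, label); label is a noisy version of 'age >= 45'."""
    rows = []
    for _ in range(n):
        age = rng.randint(20, 69)
        label = int(age >= 45)
        if rng.random() < 0.1:  # flip 10% of labels as noise
            label = 1 - label
        rows.append((age, label))
    return rows

def generalize(rows, ratio, width, rng):
    """Replace age by the midpoint of its `width`-year band for a share `ratio` of rows."""
    out = []
    for age, label in rows:
        if rng.random() < ratio:
            age = (age // width) * width + width // 2
        out.append((age, label))
    return out

def accuracy(rows, threshold):
    """Share of rows for which the rule 'age >= threshold' predicts the label."""
    return sum(int(age >= threshold) == label for age, label in rows) / len(rows)

def fit_stump(rows):
    """Depth-1 decision tree: choose the threshold with the best training accuracy."""
    candidates = sorted({age for age, _ in rows})
    return max(candidates, key=lambda t: accuracy(rows, t))

def run(ratios=(0.0, 0.5, 1.0), width=20, seed=0):
    """Train on data pseudonymized at each ratio, evaluate on untouched test data."""
    rng = random.Random(seed)
    train, test = make_data(2000, rng), make_data(1000, rng)
    return {r: accuracy(test, fit_stump(generalize(train, r, width, rng))) for r in ratios}

if __name__ == "__main__":
    for ratio, acc in run().items():
        print(f"pseudonymization ratio {ratio:.1f}: test accuracy {acc:.3f}")
```

    Sweeping the ratio per variable rather than globally would correspond to the per-variable optimization the abstract proposes as future work.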

A Study on Efficient Data De-Identification Method for Blockchain DID

  • Min, Youn-A
    • International Journal of Internet, Broadcasting and Communication
    • /
    • v.13 no.2
    • /
    • pp.60-66
    • /
    • 2021
  • Blockchain is a technology that enables trust-based consensus and verification over a decentralized network. Distributed ID (DID) is built on a decentralized structure in which users have the right to manage their own IDs, and interest in self-sovereign identity authentication has recently been increasing. As a means of transparent and safe sovereignty management of data, this paper examines several data encryption methods among the data pseudonymization techniques available for blockchain use. The public key technique (homomorphic encryption) offers high flexibility and security because different algorithms are applied to the entire sentence for encryption and decryption, but its computational efficiency is lower as a result. The hash function method (MD5) maintains flexibility and offers higher security than two-way encryption methods, but it carries a risk of collision. Zero-knowledge proof builds on public key encryption with a mutual proof method, and complex formulas are applied to processes such as personal identification, key distribution, and digital signature. Because it requires consensus and verification steps, its operational efficiency falls to the range of O(log N) to O(N^2). This paper proposes data encryption processing for blockchain DID based on zero-knowledge proof, together with a one-way encryption method that takes the range and frequency of data use into account. With the approach presented in the paper, a corrected zero-knowledge proof can be processed and data can be handled efficiently.
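
    A one-way (hash-based) pseudonym of the kind this abstract discusses might look like the sketch below. It is an illustration only, not the paper's method: because the abstract itself notes the collision risk of MD5, the sketch swaps in HMAC-SHA-256 from the Python standard library, and the identifier `did:example:alice` and the function name are hypothetical.

```python
import hashlib
import hmac
import secrets

def keyed_pseudonym(value: str, key: bytes) -> str:
    """One-way pseudonym: HMAC-SHA-256 of the value under a secret key (hex string)."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()

# The key must stay secret and separate from the pseudonymized data;
# anyone holding it can recompute pseudonyms for guessed identifiers.
key = secrets.token_bytes(32)

identifier = "did:example:alice"  # hypothetical identifier
pseudonym = keyed_pseudonym(identifier, key)

# Same input and key give the same pseudonym, so records remain linkable;
# a different key gives an unrelated pseudonym.
same_again = keyed_pseudonym(identifier, key)
other_key = keyed_pseudonym(identifier, secrets.token_bytes(32))
```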

A New Method of Registering the XML-based Clinical Document Architecture Supporting Pseudonymization in Clinical Document Registry Framework (익명화 방법을 적용한 임상진료문서 등록 기법 연구)

  • Kim, Il-Kwang;Lee, Jae-Young;Kim, Il-Kon;Kwak, Yun-Sik
    • Journal of KIISE:Software and Applications
    • /
    • v.34 no.10
    • /
    • pp.918-928
    • /
    • 2007
  • This paper proposes a new way to register CDA documents in the CDR (Clinical Document Repository) previously proposed by the authors. One method uses manifest archiving for seamless reference to and visualization of CDA-related files. The other raises the CDA security level to support pseudonymization of CDA documents. The former supports bundled registration of CDA-related files as a set and, once the bundle is downloaded, gives end users a seamless presentation view without a separate HTTP connection for each file. The latter is a new CDA registration method that supports de-identification of the patient. The CDA header usually contains patient identification information, and the CDA body contains diagnosis or treatment data, so detaching one from the other benefits privacy protection: someone who obtains only a separated CDA body cannot tell whose clinical data it is, and someone who obtains only a separated CDA header cannot tell what treatment was given. Privacy is thus protected by breaking the association between related pieces of information, which reduces the possibility of leaking private information. To achieve this, the proposed method separates each CDA document into two parts and stores them in different repositories.
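
    The header/body separation described above can be illustrated with a small sketch. It assumes plain dictionaries in place of real repositories and a CDA document already parsed into `header` and `body` parts; the field names, the `link_registry`, and the function name are hypothetical and not taken from the paper.

```python
import secrets

def register_cda(document, header_repo, body_repo, link_registry):
    """Store the identifying header and the clinical body in different repositories.

    Each part gets its own random identifier. Only `link_registry`, which would
    be kept under separate access control, can reconnect the two parts.
    """
    header_id = secrets.token_hex(16)
    body_id = secrets.token_hex(16)
    header_repo[header_id] = dict(document["header"])
    body_repo[body_id] = dict(document["body"])
    link_registry[header_id] = body_id
    return header_id, body_id

header_repo, body_repo, link_registry = {}, {}, {}
document = {
    "header": {"patient_name": "Hong Gildong", "patient_id": "P-0001"},
    "body": {"diagnosis": "hypertension", "treatment": "medication"},
}
header_id, body_id = register_cda(document, header_repo, body_repo, link_registry)

# An authorized party holding the registry can re-link the parts.
relinked_body = body_repo[link_registry[header_id]]
```

    Someone who obtains only `body_repo` sees clinical data without identity, and someone who obtains only `header_repo` sees identity without clinical data, which is the property the abstract describes.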

Integrated Data Safe Zone Prototype for Efficient Processing and Utilization of Pseudonymous Information in the Transportation Sector (교통분야 가명정보의 효율적 처리 및 활용을 위한 통합데이터안심구역 프로토타입)

  • Hyoungkun Lee;Keedong Yoo
    • The Journal of The Korea Institute of Intelligent Transport Systems
    • /
    • v.23 no.3
    • /
    • pp.48-66
    • /
    • 2024
  • Under Korea's three amended Laws of the Data Economy and the Data Industry Act, systems for pseudonymous data integration and Data Safe Zones have been operated separately by selected agencies, which burdens SMEs, startups, and general users with complicated and ineffective procedures. An over-stringent pseudonymization policy aimed at preventing data breaches has also compromised data quality. These shortcomings should be addressed to ensure convenience of use and data quality. This paper proposes a prototype system of the Integrated Data Safe Zone based on redesigned and optimized pseudonymization workflows. Conventional pseudonymization workflows were redesigned by applying the amended guidelines and selectively revising existing guidelines for business process redesign. The proposed prototype has been shown quantitatively to outperform the conventional one: a 6-fold increase in time efficiency, a 1.28-fold cost reduction, and a 1.3-fold improvement in data quality.

A Study on the Principle of Application of Privacy by Design According to the Life Cycle of Pseudonymization Information (가명정보 생명주기에 따른 개인정보보호 중심 설계 적용 원칙에 관한 연구)

  • Kim, Dong-hyun
    • Journal of the Korea Institute of Information Security & Cryptology
    • /
    • v.32 no.2
    • /
    • pp.329-339
    • /
    • 2022
  • Recently, as personal information has come to be used as data, various new industries have emerged, but cases of personal information leakage and misuse have occurred one after another because systematic management systems have not been adequately established. In addition, services that use personal information in pseudonymized and anonymized form have emerged since the enforcement of the Data 3 Act in August 2020, but personal information issues have arisen due to insufficient pseudonymization, insufficient safety measures for pseudonymous information processing, and insufficient hate expression. To support the safe use of personal information, this study therefore proposes a new PbD principle applicable to the pseudonymous information life cycle, based on the Privacy by Design (PbD) principle proposed by Ann Cavoukian [1] of Canada. The significance of the proposed method was confirmed through a survey of 30 experts in personal information protection.

Combination Key Generation Scheme Robust to Updates of Personal Information (결합키 생성항목의 갱신에 강건한 결합키 생성 기법)

  • Jang, Hobin;Noh, Geontae;Jeong, Ik Rae;Chun, Ji Young
    • Journal of the Korea Institute of Information Security & Cryptology
    • /
    • v.32 no.5
    • /
    • pp.915-932
    • /
    • 2022
  • Under the Personal Information Protection Act and the Pseudonymization Guidelines, when different combination applicants wish to combine their data, records are mapped by the hash value of the combination key generation items together with a salt value. Combination key generation items may include personal information such as name, phone number, date of birth, and address. Owing to the properties of hash functions, the combination proceeds without problems when the applicants store their items in exactly the same form. However, this method fails to combine records in scenarios such as a change of address or name, which arise because the applicants update their databases at different times. We therefore propose a privacy-preserving combination key generation scheme, based on thresholds obtained through probabilistic record linkage, that is robust to updates of the items used to generate the combination key even in such scenarios. The scheme can contribute to the development of the domestic Big Data and Artificial Intelligence business.
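
    The problem this abstract describes can be made concrete with a short sketch. The first function hashes all items together with a salt, as in the baseline the abstract refers to; the second part is a deliberately simplified stand-in for threshold-based linkage that hashes each item separately and counts agreeing fields. It is not the authors' scheme, and field-wise hashes reveal more than a single combined hash would. The field list, salt, threshold, and sample records are all hypothetical.

```python
import hashlib

FIELDS = ("name", "phone", "birth", "address")

def normalize(value: str) -> str:
    """Lower-case and drop whitespace so trivially different forms hash alike."""
    return "".join(value.lower().split())

def combination_key(record: dict, salt: str) -> str:
    """Baseline: one salted SHA-256 hash over all combination key generation items."""
    joined = "|".join(normalize(record[f]) for f in FIELDS)
    return hashlib.sha256((salt + "|" + joined).encode("utf-8")).hexdigest()

def field_keys(record: dict, salt: str) -> dict:
    """Simplified alternative: one salted hash per item."""
    return {
        f: hashlib.sha256(f"{salt}|{f}|{normalize(record[f])}".encode("utf-8")).hexdigest()
        for f in FIELDS
    }

def match_score(keys_a: dict, keys_b: dict) -> float:
    """Share of items whose hashes agree."""
    return sum(keys_a[f] == keys_b[f] for f in FIELDS) / len(FIELDS)

salt = "shared-salt"  # hypothetical; agreed between applicants in advance
old = {"name": "Kim Minsu", "phone": "010-1234-5678",
       "birth": "1990-01-01", "address": "Seoul Jongno-gu"}
new = dict(old, address="Busan Haeundae-gu")  # same person after moving

exact_match = combination_key(old, salt) == combination_key(new, salt)
score = match_score(field_keys(old, salt), field_keys(new, salt))
linked = score >= 0.7  # hypothetical threshold
```

    Here the single combined hash no longer matches after the address change, while three of the four per-item hashes still agree, so a threshold on the agreement score can still link the two records.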

Study for the Pseudonymization Technique of Medical Image Data (의료 이미지 데이터의 비식별화 방안에 관한 연구)

  • Baek, Jongil;Song, Kyoungtaek;Choi, Wonkyun;Yu, Khiguen;Lee, Pilwoo;In, Hanjin;Kim, Cheoljung;Yeo, Kwangsoo;Kim, Soonseok
    • Asia-pacific Journal of Multimedia Services Convergent with Art, Humanities, and Sociology
    • /
    • v.6 no.6
    • /
    • pp.103-110
    • /
    • 2016
  • Cases of damage caused by leakage of medical data have become frequent, and concern for the privacy of patients is increasing day by day. The government has established privacy regulations intended to prevent such harm, but for medical data only the 'national medical privacy guidelines' have been released. When image data are exchanged between institutions, they are exchanged in common image file formats (JPG, JPEG, TIFF), so if a file leaks outside there is a risk of exposing the patient's key identifying information. These medical image files have no protection such as encryption. This paper introduces a masking technique that applies optical character recognition to the image file together with mosaic processing and encryption. We propose it as a pseudonymization technique for personal information in image data.
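
    Mosaic masking of a region of an image can be sketched as follows. The sketch assumes a grayscale image held as a list of rows of integers and assumes the bounding box of the identifying text is already known; in the paper that location would come from optical character recognition, which is not reproduced here. The function name and parameters are hypothetical.

```python
def mosaic(image, box, block=4):
    """Return a copy of `image` with the region `box` pixelated.

    image: list of rows, each a list of ints in 0..255 (grayscale).
    box:   (top, left, bottom, right), bottom and right exclusive.
    block: side length of each mosaic tile in pixels.
    """
    top, left, bottom, right = box
    out = [row[:] for row in image]  # leave the original untouched
    for y0 in range(top, bottom, block):
        for x0 in range(left, right, block):
            ys = range(y0, min(y0 + block, bottom))
            xs = range(x0, min(x0 + block, right))
            # Replace every pixel of the tile by the tile's mean intensity.
            mean = sum(image[y][x] for y in ys for x in xs) // (len(ys) * len(xs))
            for y in ys:
                for x in xs:
                    out[y][x] = mean
    return out

# 8x8 test image; mask the top-left 4x4 region with 2x2 tiles.
image = [[y * 8 + x for x in range(8)] for y in range(8)]
masked = mosaic(image, box=(0, 0, 4, 4), block=2)
```

    Larger tiles discard more detail; whether a given tile size makes burned-in text unrecoverable depends on the font size and resolution and would need to be checked on real data.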

A Study on the Data Pseudonymization Methodology for Defense Training Data as Artificial Intelligence Technology is applied to the Defense Field (국방 분야 인공지능 기술 접목에 따른 교육훈련 데이터 가명처리 방법론에 관한 연구)

  • Hyunsuk Cho;Sujin Kang;Dongrae Cho;Yeongseop Shin
    • Journal of The Korean Institute of Defense Technology
    • /
    • v.5 no.3
    • /
    • pp.1-7
    • /
    • 2023
  • Recently, in the defense field, efforts are being made to collect data by building data centers to incorporate artificial intelligence technology. Weapon system training data can be used as input data for artificial intelligence models and can be used as high-quality data to maximize training performance and develop military strategies. However, training data contains personal information such as the names and military numbers of the personnel who operated the equipment, and training records that reveal the characteristics of the weapon system. If such data is passed on to the enemy, not only the specifications and performance of the weapon system but also the proficiency of each operator may be exposed. In this paper, we propose a pseudonym processing methodology for education and training data security and also suggest a direction for revising related laws.

Enhancing Privacy Protection in Steppy Applications through Pseudonymization

  • Nugroho, Heri Arum;Prihatmanto, Ary Setijadi;Rhee, Kyung Hyune
    • Proceedings of the Korea Information Processing Society Conference
    • /
    • 2015.10a
    • /
    • pp.763-766
    • /
    • 2015
  • Smart Healthcare System as an Open Platform (Shesop) is an integrated healthcare system with several features, one of which is the Steppy Application. Steppy counts the user's steps and displays them on the Shesop website. Security issues are not properly addressed in this system, so patients' Personal Health Records (PHR) stored in the cloud platform could be at risk. Given the huge amount of electronic information available online, people need reliable and effective techniques for preserving privacy. The security of the data displayed on the Shesop website needs to be improved so that no one who accesses it can tamper with it without permission. Recently, Xu et al. presented a pseudonym scheme using a smart card as a solution for e-health systems, based on the discrete logarithm problem in a cyclic group. In this paper, we adopt their scheme and apply it to smartphones using Near Field Communication (NFC) to build security into the Steppy app.

Research of Specific Domestic De-identification Technique for Protection of Personal Health Medical Information in Review & Analysis of Overseas and Domestic De-Identification Technique (국내외 비식별화 기술에 관한 검토 분석에 따른 개인건강의료정보 보호를 위한 국내 특화 비식별화 기술 제안에 관한 연구)

  • Lee, Pilwoo;In, Hanjin;Kim, Cheoljung;Yeo, Kwangsoo;Song, Kyoungtaek;Yu, Khigeun;Baek, Jongil;Kim, Soonseok
    • Asia-pacific Journal of Multimedia Services Convergent with Art, Humanities, and Sociology
    • /
    • v.6 no.7
    • /
    • pp.9-16
    • /
    • 2016
  • In the rapidly changing Internet age, at home and abroad, large amounts of information are being used in medical, financial, and other services. Accordingly, invasions of privacy caused by leakage of and intrusion on personal information occur in the systems of medical institutions, including hospitals and clinics. To protect personal health and medical information, medical institutions at home and abroad follow national policies and de-identification processing technology standards set out in legislation. Through a comparative analysis of existing domestic and foreign privacy regimes and de-identification techniques, we derive improved pseudonymization and anonymization techniques for the data items on which domestic laws and regulations fall short. De-identification processing technology for personal health information is compared with the situation in foreign countries. We propose a new de-identification technique that reduces the risk of re-identification in order to enable secondary use of domestic medical information.