• Title/Summary/Keyword: probabilistic record linkage

A study on the probabilistic record linkage and its application (확률적 자료연계의 이론과 적용에 관한 연구)

  • Choi, Yeonok;Lee, Sangin
    • The Korean Journal of Applied Statistics / v.34 no.5 / pp.849-861 / 2021
  • This paper introduces the basic concept of probabilistic record linkage and its statistical framework, and describes the specific process and principles of performing it using a real example from Statistics Korea. First, we briefly describe deterministic record linkage and compare it with probabilistic record linkage. We then introduce the Fellegi-Sunter model framework for record linkage and its related parameters: the m-probability, the u-probability, the match weight, and the decision rule. Finally, we show the detailed process of record linkage under the Fellegi-Sunter model framework and evaluate the linkage results, using sample data from the register-based census and the Population and Housing Census survey of Statistics Korea.
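
The Fellegi-Sunter parameters named in the abstract can be illustrated with a minimal Python sketch. It is not taken from the paper: the field names, the m- and u-probabilities, and the two thresholds are hypothetical values chosen only for illustration. For each field, m is the probability that the field agrees when two records truly match, and u is the probability that it agrees when they do not; the match weight sums the log likelihood ratios, and the decision rule compares that sum with an upper and a lower threshold.

```python
import math

# Hypothetical m- and u-probabilities per comparison field (illustrative only).
M_PROB = {"name": 0.95, "birth_date": 0.98, "address": 0.80}
U_PROB = {"name": 0.01, "birth_date": 0.005, "address": 0.05}


def match_weight(agreements):
    """Sum the log2 likelihood ratios over all compared fields.

    `agreements` maps each field name to True (values agree) or False.
    """
    total = 0.0
    for field, agrees in agreements.items():
        m, u = M_PROB[field], U_PROB[field]
        if agrees:
            total += math.log2(m / u)              # agreement weight
        else:
            total += math.log2((1 - m) / (1 - u))  # disagreement weight
    return total


def classify(weight, upper=8.0, lower=0.0):
    """Decision rule with two assumed thresholds."""
    if weight >= upper:
        return "link"
    if weight <= lower:
        return "non-link"
    return "possible link"  # left for clerical review


pair = {"name": True, "birth_date": False, "address": True}
print(classify(match_weight(pair)))
```

In practice the m- and u-probabilities are estimated from the data rather than fixed by hand, and the thresholds are set from the error rates one is willing to tolerate.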

Combination Key Generation Scheme Robust to Updates of Personal Information (결합키 생성항목의 갱신에 강건한 결합키 생성 기법)

  • Jang, Hobin;Noh, Geontae;Jeong, Ik Rae;Chun, Ji Young
    • Journal of the Korea Institute of Information Security & Cryptology / v.32 no.5 / pp.915-932 / 2022
  • Under the Personal Information Protection Act and the Pseudonymization Guidelines, when different combination applicants wish to combine their data, records are mapped using the hash value of the combination key generation items together with a salt value. Combination key generation items may include personal information such as name, phone number, date of birth, and address. Because of the properties of hash functions, when different applicants store their items in exactly the same form, the combination proceeds without any problems. However, this method fails in scenarios such as a change of address or name, where the applicants' databases are updated at different times and the stored items no longer agree. We therefore propose a privacy-preserving combination key generation scheme, based on thresholds from probabilistic record linkage, that remains robust to updates of the items used to generate the combination key, even in scenarios such as a change of address or name. The scheme can contribute to the development of domestic Big Data and Artificial Intelligence businesses.
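
The fragility of a single salted hash, and the general idea of a threshold on field agreement, can be sketched in Python as follows. This is not the authors' scheme: the abstract does not give their construction, so the per-field keyed hashes, the field list, the salt, and the threshold of 3 are all assumptions made for illustration. A real design would also need to address guessing attacks on low-entropy fields, which this sketch ignores.

```python
import hashlib
import hmac

SALT = b"shared-secret-salt"  # hypothetical salt shared by both applicants
FIELDS = ("name", "phone", "birth_date", "address")  # assumed key items


def combination_key(record):
    """Single salted SHA-256 hash over all items (the baseline approach)."""
    joined = "|".join(record[f] for f in FIELDS)
    return hashlib.sha256(SALT + joined.encode("utf-8")).hexdigest()


def field_tokens(record):
    """One keyed hash per item, so items can be compared individually."""
    return {
        f: hmac.new(SALT, f"{f}={record[f]}".encode("utf-8"),
                    hashlib.sha256).hexdigest()
        for f in FIELDS
    }


def tokens_match(tokens_a, tokens_b, threshold=3):
    """Accept a pair when at least `threshold` hashed items agree."""
    agreeing = sum(hmac.compare_digest(tokens_a[f], tokens_b[f]) for f in FIELDS)
    return agreeing >= threshold


old = {"name": "Kim Minsu", "phone": "010-0000-0000",
       "birth_date": "1990-01-01", "address": "Seoul"}
new = dict(old, address="Busan")  # same person after an address change

print(combination_key(old) == combination_key(new))        # False
print(tokens_match(field_tokens(old), field_tokens(new)))  # True
```

A single changed item alters the whole baseline hash, so the two records no longer combine, whereas the threshold comparison still accepts the pair because three of the four hashed items agree.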