• Title/Summary/Keyword: De-Identification

Plasticity-damage model parameters identification for structural connections

  • Imamovic, Ismar;Ibrahimbegovic, Adnan;Knopf-Lenoir, Catherine;Mesic, Esad
    • Coupled systems mechanics
    • /
    • v.4 no.4
    • /
    • pp.337-364
    • /
    • 2015
  • In this paper we present a methodology for identifying the parameters of a constitutive model that can represent the behavior of a connection between two members in a structure. Such a constitutive model for frame connections can be cast in the most general form of the Timoshenko beam, which can represent three failure modes. The first failure mode pertains to bending in the connection, which is defined as a coupled plasticity-damage model with nonlinear softening. The second failure mode captures shearing of the connection, which is defined as plasticity with linear hardening and nonlinear softening. The third failure mode pertains to diffuse failure in the members; excluding it leads to a linear elastic constitutive law. The theoretical formulation of this Timoshenko beam model and its finite element implementation are presented in the second section. The parameter identification procedure that allows us to define the eighteen unknown parameters is given in Section 3. The proposed methodology splits identification into three phases, with all details presented in Section 4 through three different examples. We also present real experimental results. The conclusions are stated in the last section of the paper.

Re-defining Named Entity Type for Personal Information De-identification and A Generation method of Training Data (개인정보 비식별화를 위한 개체명 유형 재정의와 학습데이터 생성 방법)

  • Choi, Jae-hoon;Cho, Sang-hyun;Kim, Min-ho;Kwon, Hyuk-chul
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference
    • /
    • 2022.05a
    • /
    • pp.206-208
    • /
    • 2022
  • As the big data industry has recently grown significantly, concern about privacy violations caused by personal information leakage has increased. There have been attempts to automate de-identification through named entity recognition in natural language processing. In this paper, named entity recognition data is constructed semi-automatically by identifying sentences that contain de-identification information in Korean Wikipedia. Compared to using general named entity recognition data, this can reduce the cost of learning about information that is not subject to de-identification. In addition, it has the advantage of minimizing the additional rule-based and statistical systems needed to classify de-identification information in the output. The named entity recognition data proposed in this paper is classified into twelve categories, which include de-identification information such as medical records and family relationships. In the experiment using the generated dataset, KoELECTRA showed performance of 0.87796 and RoBERTa of 0.88.
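
The de-identification step that follows named entity recognition can be sketched as below. This is a minimal illustration, not the authors' system: the character spans are written by hand where a trained model would normally supply them, the label names `PERSON`, `MEDICAL_ORG`, and `FAMILY` are hypothetical (the abstract does not list its twelve categories), and spans are assumed not to overlap.

```python
from typing import List, Tuple

# (start, end, label) character span, end exclusive.
# In practice these would come from an NER model; here they are hand-written.
Span = Tuple[int, int, str]


def mask_entities(text: str, spans: List[Span]) -> str:
    """Replace each labeled span with a [LABEL] placeholder.

    Assumes the spans do not overlap.
    """
    # Work from right to left so that earlier offsets stay valid
    # after each replacement changes the string length.
    for start, end, label in sorted(spans, key=lambda s: s[0], reverse=True):
        text = text[:start] + f"[{label}]" + text[end:]
    return text


# Hypothetical example sentence and labels, for illustration only.
text = "Kim visited Busan Hospital with his mother."
spans = [(0, 3, "PERSON"), (12, 26, "MEDICAL_ORG"), (36, 42, "FAMILY")]
masked = mask_entities(text, spans)
print(masked)
```

Replacing spans with their type label rather than deleting them keeps the sentence readable, which matters when the masked text is reused for analysis.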

De-identifying Unstructured Medical Text and Attribute-based Utility Measurement (의료 비정형 텍스트 비식별화 및 속성기반 유용도 측정 기법)

  • Ro, Gun;Chun, Jonghoon
    • The Journal of Society for e-Business Studies
    • /
    • v.24 no.1
    • /
    • pp.121-137
    • /
    • 2019
  • De-identification removes personal information from a data set so that the remaining information cannot be linked to a specific individual. As a result, de-identification can lower the risk of exposing personal information in the process of collecting, processing, storing, and distributing information. Although there have been many studies on de-identification algorithms and protection models, most of them are limited to structured data, and relatively few consider de-identification of unstructured data. In the medical field in particular, where unstructured text is frequently used, many people simply remove all personally identifiable information in order to lower the exposure risk, while accepting that the data utility is lowered accordingly. This study proposes a new method that performs de-identification by applying the k-anonymity protection model to unstructured text in the medical field, where de-identification is mandatory because privacy protection issues are more critical than in other fields. The study also proposes a new utility metric so that people can intuitively understand the utility of a de-identified data set. If the results of this research are applied to the various industrial fields where unstructured text is used, we expect that the utility of unstructured text containing personal information can be increased.

Data Quality Measurement on a De-identified Data Set Based on Statistical Modeling (통계모형의 정확도에 기반한 비식별화 데이터의 품질 측정)

  • Chun, Heuiju;Yi, Hyun Jee;Yeon, Kyupil;Kim, Dongrae
    • The Journal of the Korea Contents Association
    • /
    • v.19 no.5
    • /
    • pp.553-561
    • /
    • 2019
  • In this study, a method for measuring the statistical usefulness of de-identified data was examined in terms of the prediction accuracy of statistical models. In the era of the 4th industrial revolution, effective use of big data is essential to innovation through information and communication technology, but personal information issues constrain the active use of big data. To solve this problem, de-identification guidelines have been established, and the possibility of actual re-identification of personal information has become very low due to the use of various de-identification methods. On the other hand, strong de-identification can have the side effect of degrading the usefulness of the data. We studied the statistical usefulness of data de-identified by the KLT model, which is a representative de-identification method. A case study was conducted to see how the statistical accuracy of prediction is degraded by de-identification. We also proposed a new measure of the usefulness of de-identified data that quantifies how much data must be added to the de-identified data to restore the accuracy of the predictive model.

De-identification Policy Comparison and Activation Plan for Big Data Industry (비식별화 정책 비교 및 빅데이터 산업 활성화 방안)

  • Lee, So-Jin;Jin, Chae-Eun;Jeon, Min-Ji;Lee, Jo-Eun;Kim, Su-Jeong;Lee, Sang-Hyun
    • The Journal of the Convergence on Culture Technology
    • /
    • v.2 no.4
    • /
    • pp.71-76
    • /
    • 2016
  • In this study, the de-identification policies of the US, the UK, Japan, China, and Korea are compared to suggest a future direction for de-identification regulations and a method for vitalizing the big data industry. Efficient use of de-identification technology and of the standard for adequacy evaluation allows industry to use personal information to develop services and technology without violating the right to privacy and without running into the restrictions specified in the Personal Information Protection Act. As a side effect, the re-identification issue may arise, in which individuals are re-identified from a collection of de-identified data. From a business perspective, some regulations need to be relaxed so that big data can be used; from an information security perspective, security needs to be strengthened and regulations refined.

A Study on De-Identification Methods to Create a Basis for Safety Report Text Mining Analysis (항공안전 보고 데이터 텍스트 분석 기반 조성을 위한 비식별 처리 기술 적용 연구)

  • Hwang, Do-bin;Kim, Young-gon;Sim, Yeong-min
    • Journal of the Korean Society for Aviation and Aeronautics
    • /
    • v.29 no.4
    • /
    • pp.160-165
    • /
    • 2021
  • In order to identify and analyze potential aviation safety hazards, aviation safety report data must first be analyzed. Therefore, in consideration of the provisions of the Aviation Safety Act and the recommendations of ICAO Doc 9859 SMM, 4th Edition, this study identifies the scope of de-identification targets, such as personal information in the report data and sensitive information about the reporter, and suggests a method for applying de-identification processing technology to personal and sensitive information, including unstructured text data.

Identification of Volterra Kernels of Nonlinear Van de Vusse Reactor

  • Kashiwagi, Hiroshi;Rong, Li
    • Institute of Control, Robotics and Systems: Conference Proceedings (제어로봇시스템학회:학술대회논문집)
    • /
    • 2001.10a
    • /
    • pp.26.3-26
    • /
    • 2001
  • The Van de Vusse reactor is known as a highly nonlinear chemical process and has been considered by a number of researchers as a benchmark problem for nonlinear chemical processes. Various identification methods for nonlinear systems have also been verified by applying them to the Van de Vusse reactor. From the point of view of identification, only the Volterra kernel of second order has been obtained until now. In this paper, the authors show that Volterra kernels of the nonlinear Van de Vusse reactor of up to 3rd order can be obtained by use of the M-sequence correlation method. A pseudo-random M-sequence is applied to the Van de Vusse reactor as an input and its output is measured. Taking the cross-correlation function between the input and the output, we obtain up to 3rd order Volterra kernels, which is ...
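
The correlation idea can be illustrated in its simplest form, the first-order kernel of a linear system. The sketch below is not the authors' procedure: it replaces the Van de Vusse reactor with a made-up three-tap linear filter (`g_true`), uses a 127-sample M-sequence generated from the primitive polynomial x^7 + x + 1, and does not attempt the second- and third-order kernels. Because the periodic autocorrelation of a ±1 M-sequence is close to an impulse, the input-output cross-correlation approximates the kernel, with a small bias of order 1/127.

```python
def m_sequence():
    """One period (127 samples) of a +/-1 M-sequence.

    Generated by the recurrence s[n+7] = s[n+1] XOR s[n], which corresponds
    to the primitive polynomial x^7 + x + 1.
    """
    state = [1, 0, 0, 0, 0, 0, 0]  # any nonzero seed gives the full period
    bits = []
    for _ in range(127):
        bits.append(state[0])
        new_bit = state[0] ^ state[1]
        state = state[1:] + [new_bit]
    return [1.0 if b else -1.0 for b in bits]


def simulate_linear(u, g):
    """Periodic steady-state response y[t] = sum_k g[k] * u[t - k]."""
    n = len(u)
    return [sum(g[k] * u[(t - k) % n] for k in range(len(g))) for t in range(n)]


def cross_correlation(u, y, max_lag):
    """Periodic cross-correlation (1/N) * sum_t u[t - lag] * y[t] for lag < max_lag."""
    n = len(u)
    return [sum(u[(t - lag) % n] * y[t] for t in range(n)) / n
            for lag in range(max_lag)]


u = m_sequence()
g_true = [1.0, 0.5, 0.25]           # hypothetical first-order kernel (toy system)
y = simulate_linear(u, g_true)      # "measured" output of the toy system
g_est = cross_correlation(u, y, 5)  # estimate of the kernel at lags 0..4
print(g_est)
```

A longer M-sequence shrinks the bias, since the off-peak autocorrelation scales as 1/N.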

The De-identification Technique Using Data Grouping in Relational Database (관계형 데이터베이스에서 데이터 그룹화를 이용한 익명화 처리 기법)

  • Park, Jun-Bum;Jin, Seung-Hun;Choi, Daeseon
    • Journal of the Korea Institute of Information Security & Cryptology
    • /
    • v.25 no.3
    • /
    • pp.493-500
    • /
    • 2015
  • The amount of personal information exposed on the Internet is increasing due to the opening and sharing of public data, the growth of SNS (Social Network Service), and the growth of information shared between users. Personal information exposed on the Internet can be used against targeted users through linkage attacks or background attacks. To prevent these attacks, de-identification models appeared a few years ago. 'k-anonymity' was introduced first, followed by 'ℓ-diversity' and 't-closeness', and diverse algorithms are now being suggested to improve performance. However, industry and the public sector actually need a complete system for the de-identification process rather than better performance of the de-identification algorithm. This paper explains a de-identification technique for the 'k-anonymity', 'ℓ-diversity', and 't-closeness' algorithms using a QI (Quasi-Identifier) grouping method in a relational database.
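
Grouping by quasi-identifiers, the operation this abstract builds on, can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's technique: rows are plain dictionaries rather than relational tables, the column names and values are invented, only the distinct-values form of ℓ-diversity is computed, and t-closeness is not shown.

```python
from collections import defaultdict


def group_by_quasi_identifiers(rows, qi_columns):
    """Group rows that share the same values in all quasi-identifier columns."""
    groups = defaultdict(list)
    for row in rows:
        groups[tuple(row[c] for c in qi_columns)].append(row)
    return groups


def k_anonymity(rows, qi_columns):
    """Size of the smallest quasi-identifier group (the k the data satisfies)."""
    groups = group_by_quasi_identifiers(rows, qi_columns)
    return min(len(g) for g in groups.values())


def l_diversity(rows, qi_columns, sensitive):
    """Smallest number of distinct sensitive values within any group."""
    groups = group_by_quasi_identifiers(rows, qi_columns)
    return min(len({r[sensitive] for r in g}) for g in groups.values())


# Invented, already-generalized records for illustration only.
rows = [
    {"age": "30-39", "zip": "130**", "disease": "flu"},
    {"age": "30-39", "zip": "130**", "disease": "asthma"},
    {"age": "40-49", "zip": "148**", "disease": "flu"},
    {"age": "40-49", "zip": "148**", "disease": "flu"},
    {"age": "40-49", "zip": "148**", "disease": "diabetes"},
]
qi = ["age", "zip"]
k = k_anonymity(rows, qi)
l = l_diversity(rows, qi, "disease")
print(k, l)
```

In a relational database the same grouping is a `GROUP BY` over the quasi-identifier columns, with `COUNT(*)` and `COUNT(DISTINCT ...)` per group.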

Secure De-identification and Data Sovereignty Management of Decentralized SSI using Restructured ZKP (재구성된 영지식 증명을 활용한 탈중앙형 자기 주권 신원의 안전한 비식별화 및 데이터 주권 관리)

  • Cho, Kang-Woo;Jeon, Mi-Hyeon;Shin, Sang Uk
    • Journal of Digital Convergence
    • /
    • v.19 no.8
    • /
    • pp.205-217
    • /
    • 2021
  • Decentralized SSI (Self Sovereign Identity) has emerged as an alternative digital identity solution, but an efficient de-identification technique has not been proposed due to the unique algorithmic characteristics of its data transactions. In this study, to preserve the decentralized operation of SSI, we propose a de-identification technique that does not remove identifiers; instead, it restructures the verification results of ZKP (Zero Knowledge Proof) into a form that the verifier can provide externally. In addition, by proposing the concept of differential sovereignty management for each entity participating in verification, restructured de-identified data can be provided without the consent of the data subject. As a result, the proposed model satisfies the domestic personal information protection law in a decentralized SSI and provides secure and efficient de-identification processing and sovereignty management.

A pilot study of an automated personal identification process: Applying machine learning to panoramic radiographs

  • Ortiz, Adrielly Garcia;Soares, Gustavo Hermes;da Rosa, Gabriela Cauduro;Biazevic, Maria Gabriela Haye;Michel-Crosato, Edgard
    • Imaging Science in Dentistry
    • /
    • v.51 no.2
    • /
    • pp.187-193
    • /
    • 2021
  • Purpose: This study aimed to assess the usefulness of machine learning and automation techniques to match pairs of panoramic radiographs for personal identification. Materials and Methods: Two hundred panoramic radiographs from 100 patients (50 males and 50 females) were randomly selected from a private radiological service database. Initially, 14 linear and angular measurements of the radiographs were made by an expert. Eight ratio indices derived from the original measurements were applied to a statistical algorithm to match radiographs from the same patients, simulating a semi-automated personal identification process. Subsequently, measurements were automatically generated using a deep neural network for image recognition, simulating a fully automated personal identification process. Results: Approximately 85% of the radiographs were correctly matched by the automated personal identification process. In a limited number of cases, the image recognition algorithm identified 2 potential matches for the same individual. No statistically significant differences were found between measurements performed by the expert on panoramic radiographs from the same patients. Conclusion: Personal identification might be performed with the aid of image recognition algorithms and machine learning techniques. This approach will likely facilitate the complex task of personal identification by performing an initial screening of radiographs and matching ante-mortem and post-mortem images from the same individuals.