• Title/Summary/Keyword: causal inference

Estimating Average Causal Effect in Latent Class Analysis (잠재범주분석을 이용한 원인적 영향력 추론에 관한 연구)

  • Park, Gayoung;Chung, Hwan
    • The Korean Journal of Applied Statistics
    • /
    • v.27 no.7
    • /
    • pp.1077-1095
    • /
    • 2014
  • Unlike in randomized trials, observational studies require statistical strategies for inferring unbiased causal relationships. Recently, new methods for causal inference in observational studies have been proposed, such as matching on the propensity score and inverse probability of treatment weighting. They focus on how to control for confounders and how to evaluate the effect of the treatment on the outcome variable. However, these conventional methods are valid only when the treatment variable is categorical and both the treatment and the outcome variables are directly observable. Causal inference can be challenging in part because it may not be possible to observe the treatment and/or the outcome variable directly. To address this difficulty, we propose a method for estimating the average causal effect when both the treatment and the outcome variables are latent. Latent class analysis is applied to calculate the propensity score for the latent treatment variable in order to estimate the causal effect on the latent outcome variable. In this work, we investigate the causal effect of adolescent delinquency on substance use using data from the 'National Longitudinal Study of Adolescent Health'.
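
The inverse probability of treatment weighting mentioned in this abstract can be sketched for the simpler case of an observed binary treatment and a single discrete confounder. This is only an illustration under those assumptions, not the latent-class method the paper proposes; the function name `iptw_ace` and the toy data are hypothetical.

```python
from collections import defaultdict

def iptw_ace(rows):
    """Normalized IPTW estimate of E[Y(1)] - E[Y(0)].

    rows: iterable of (x, t, y) with a discrete confounder x,
    a binary treatment t (0 or 1), and a numeric outcome y.
    Assumes positivity: every stratum of x has treated and untreated units.
    """
    rows = list(rows)
    # Propensity score e(x) = P(T=1 | X=x), estimated by the treated
    # fraction within each stratum of x.
    n_x = defaultdict(int)
    n_treated = defaultdict(int)
    for x, t, _ in rows:
        n_x[x] += 1
        n_treated[x] += t
    e = {x: n_treated[x] / n_x[x] for x in n_x}

    # Weighted outcome means: treated units get weight 1/e(x),
    # untreated units get weight 1/(1 - e(x)).
    num = {0: 0.0, 1: 0.0}
    den = {0: 0.0, 1: 0.0}
    for x, t, y in rows:
        w = 1 / e[x] if t == 1 else 1 / (1 - e[x])
        num[t] += w * y
        den[t] += w
    return num[1] / den[1] - num[0] / den[0]

# Toy data (assumed): Y = T + 2*X, so the true effect is 1,
# but treatment is more common when X = 1 (confounding).
rows = ([(0, 1, 1.0)] * 2 + [(0, 0, 0.0)] * 6
        + [(1, 1, 3.0)] * 6 + [(1, 0, 2.0)] * 2)

treated = [y for _, t, y in rows if t == 1]
control = [y for _, t, y in rows if t == 0]
naive = sum(treated) / len(treated) - sum(control) / len(control)
weighted = iptw_ace(rows)
print(naive, weighted)
```

On this toy data the unadjusted difference in means overstates the effect, while the weighted estimate recovers the effect built into the data.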

Exploring modern machine learning methods to improve causal-effect estimation

  • Kim, Yeji;Choi, Taehwa;Choi, Sangbum
    • Communications for Statistical Applications and Methods
    • /
    • v.29 no.2
    • /
    • pp.177-191
    • /
    • 2022
  • This paper addresses the use of machine learning methods for causal estimation of treatment effects from observational data. Although randomized experimental trials are the gold standard for revealing potential causal relationships, observational studies are another rich source for investigating exposure effects, for example, in research on the comparative effectiveness and safety of treatments, where the causal effect can be identified if the covariates contain all confounding variables. In this context, statistical regression models for the expected outcome and the probability of treatment are often imposed, which can be combined in a clever way to yield more efficient and robust causal estimators. Recently, targeted maximum likelihood estimation and causal random forests have been proposed and extensively studied for the use of data-adaptive regression in the estimation of causal inference parameters. Machine learning methods are a natural choice in these settings to improve the quality of the final estimate of the treatment effect. We explore how we can adapt the design and training of several machine learning algorithms for causal inference and study their finite-sample performance through simulation experiments under various scenarios. Application to the percutaneous coronary intervention (PCI) data shows that these adaptations can improve on simple linear regression-based methods.
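
One classical way to combine an outcome model with a treatment-probability model is the augmented inverse probability weighting (AIPW) estimator, sketched below. It is a stand-in chosen for brevity, not the targeted maximum likelihood or causal forest estimators studied in the paper; the function `aipw_ace`, the two plugged-in models, and the toy data are all assumptions for illustration.

```python
def aipw_ace(rows, propensity, outcome):
    """AIPW (doubly robust) estimate of the average treatment effect.

    rows: iterable of (x, t, y) with covariate x, binary treatment t, outcome y.
    propensity(x): model for P(T=1 | X=x), strictly between 0 and 1.
    outcome(x, t): model for E[Y | X=x, T=t].
    """
    total = 0.0
    n = 0
    for x, t, y in rows:
        e = propensity(x)
        m1, m0 = outcome(x, 1), outcome(x, 0)
        # Outcome-model contrast plus inverse-probability-weighted residuals.
        total += (m1 - m0) + t * (y - m1) / e - (1 - t) * (y - m0) / (1 - e)
        n += 1
    return total / n

# Toy data (assumed): Y = T + 2*X, true effect 1;
# P(T=1 | X=0) = 0.25 and P(T=1 | X=1) = 0.75.
rows = ([(0, 1, 1.0)] * 2 + [(0, 0, 0.0)] * 6
        + [(1, 1, 3.0)] * 6 + [(1, 0, 2.0)] * 2)

def true_propensity(x):
    return 0.25 if x == 0 else 0.75

def wrong_propensity(x):
    return 0.5

def true_outcome(x, t):
    return t + 2 * x

def wrong_outcome(x, t):
    return 0.0

# Correct propensity model, misspecified outcome model.
est_a = aipw_ace(rows, true_propensity, wrong_outcome)
# Misspecified propensity model, correct outcome model.
est_b = aipw_ace(rows, wrong_propensity, true_outcome)
print(est_a, est_b)
```

Both calls recover the effect built into the toy data even though one of the two models is wrong in each, which is the sense in which such estimators are described as robust.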

Deep Analysis of Causal AI-Based Data Analysis Techniques for the Status Evaluation of Causal AI Technology (인과적 인공지능 기반 데이터 분석 기법의 심층 분석을 통한 인과적 AI 기술의 현황 분석)

  • Cha Jooho;Ryu Minwoo
    • Journal of Korea Society of Digital Industry and Information Management
    • /
    • v.19 no.4
    • /
    • pp.45-52
    • /
    • 2023
  • With the advent of deep learning, Artificial Intelligence (AI) technology has experienced rapid advancements, extending its application across various industrial sectors. However, the focus has shifted from the independent use of AI technology to its dispersion and proliferation through the open AI ecosystem. This shift signifies the transition from a phase of research and development to an era where AI technology is becoming widely accessible to the general public. However, as this dispersion continues, there is an increasing demand for the verification of outcomes derived from AI technologies. Causal AI applies the traditional concept of causal inference to AI, allowing not only the analysis of data correlations but also the derivation of the causes of the results, thereby obtaining the optimal output values. Causal AI technology addresses these limitations by applying the theory of causal inference to machine learning and deep learning to derive the basis of the analysis results. This paper analyzes recent cases of causal AI technology and presents the major tasks and directions of causal AI, extracting patterns between data using the correlation between them and presenting the results of the analysis.

Organizational Memory Formulation by Inference Diagram

  • Lee, Kun-Chang;Nho, Jae-Bum
    • Proceedings of the Korean Operations and Management Science Society Conference
    • /
    • 1999.10a
    • /
    • pp.42-46
    • /
    • 1999
  • Knowledge management (KM) is emerging as a robust management mechanism with which an organization can remain highly intelligent and competitive in a turbulent market. Organizational memory (or knowledge) is at the heart of KM success. How to create organizational memory has been debated among researchers. In the literature, a wide variety of methods for creating organizational memory have been proposed, but their applicability has proved to be limited to decision-making problems that require shallow or non-causal knowledge. However, organizational memory that carries causal knowledge is needed to solve complicated decision-making problems in which various factors interact dynamically and influence each other through cause-and-effect relationships. In this respect, we propose a new approach to creating a causal-typed organizational memory (CATOM), which has the form of causal knowledge and is represented in matrix form, by using an inference diagram. An algorithm for CATOM creation is suggested and applied to an illustrative example. Results show that our proposed KM approach can effectively equip an organization with a semi-automated CATOM creation and inference process, which is deemed useful in a highly competitive business environment.


Regression discontinuity for survival data

  • Youngjoo Cho
    • Communications for Statistical Applications and Methods
    • /
    • v.31 no.1
    • /
    • pp.155-178
    • /
    • 2024
  • Regression discontinuity (RD) design is one of the most widely used methods in causal inference for estimating a treatment effect when the treatment is determined by a cutpoint on the covariate of interest. RD design has received little attention for censored data, although it provides a very useful tool for analyzing treatment effects in that setting. In this paper, we define the causal effect for the survival function in RD design when the treatment is assigned deterministically by the covariate of interest. We propose estimators of this causal effect for survival data by using a transformation, which leads to an unbiased estimator of the survival function with local linear regression. Simulation studies show the validity of our approach. We also illustrate our proposed method using the prostate, lung, colorectal and ovarian (PLCO) dataset.
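
The basic sharp RD estimate with local linear regression can be sketched as below for an uncensored numeric outcome. The censoring transformation that the paper proposes is not reproduced here; the triangular kernel, the bandwidth, the function names, and the toy data are assumptions for illustration.

```python
def local_linear_at(points, c, h):
    """Fitted value at x = c from a kernel-weighted least-squares line.

    points: list of (x, y); c: cutoff; h: bandwidth.
    Uses a triangular kernel, so points with |x - c| >= h get zero weight.
    Assumes at least two distinct x values carry positive weight.
    """
    sw = swx = swy = swxx = swxy = 0.0
    for x, y in points:
        d = x - c
        w = max(0.0, 1 - abs(d) / h)
        sw += w
        swx += w * d
        swy += w * y
        swxx += w * d * d
        swxy += w * d * y
    slope = (sw * swxy - swx * swy) / (sw * swxx - swx ** 2)
    # The intercept of the regression on (x - c) is the fitted value at c.
    return (swy - slope * swx) / sw

def sharp_rd(data, c, h):
    """Jump in the regression function at the cutoff (treated side: x >= c)."""
    left = [(x, y) for x, y in data if x < c]
    right = [(x, y) for x, y in data if x >= c]
    return local_linear_at(right, c, h) - local_linear_at(left, c, h)

# Toy data (assumed): linear trend with a jump of 3 at the cutoff 0.
xs = [i / 10 for i in range(-10, 11)]
data = [(x, 2 + 0.5 * x + (3 if x >= 0 else 0)) for x in xs]

effect = sharp_rd(data, c=0.0, h=0.5)
print(effect)
```

Because the toy outcome is exactly linear on each side of the cutoff, the estimate matches the jump built into the data up to floating-point error.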

Causal Inference Network of Genes Related with Bone Metastasis of Breast Cancer and Osteoblasts Using Causal Bayesian Networks

  • Park, Sung Bae;Chung, Chun Kee;Gonzalez, Efrain;Yoo, Changwon
    • Journal of Bone Metabolism
    • /
    • v.25 no.4
    • /
    • pp.251-266
    • /
    • 2018
  • Background: The causal networks among genes that are commonly expressed in osteoblasts and during bone metastasis (BM) of breast cancer (BC) are not well understood. Here, we developed a machine learning method to obtain a plausible causal network of genes that are commonly expressed during BM and in osteoblasts in BC. Methods: We selected BC genes that are commonly expressed during BM and in osteoblasts from the Gene Expression Omnibus database. Bayesian Network Inference with Java Objects (Banjo) was used to obtain the Bayesian network. Genes registered as BC related genes were included as candidate genes in the implementation of Banjo. Next, we obtained the Bayesian structure and assessed the prediction rate for BM, conditional independence among nodes, and causality among nodes. Furthermore, we reported the maximum relative risks (RRs) of combined gene expression of the genes in the model. Results: We mechanistically identified 33 significantly related and plausibly involved genes in the development of BC BM. Further model evaluations showed that 16 genes were enough for a model to be statistically significant in terms of maximum likelihood of the causal Bayesian networks (CBNs) and for correct prediction of BM of BC. Maximum RRs of combined gene expression patterns showed that the expression levels of UBIAD1, HEBP1, BTNL8, TSPO, PSAT1, and ZFP36L2 significantly affected development of BM from BC. Conclusions: The CBN structure can be used as a reasonable inference network for accurately predicting BM in BC.

Definition and Extraction of Causal Relations for Question-Answering on Fault-Diagnosis of Electronic Devices (전자장비 고장진단 질의응답을 위한 인과관계 정의 및 추출)

  • Lee, Sheen-Mok;Shin, Ji-Ae
    • Journal of KIISE:Software and Applications
    • /
    • v.35 no.5
    • /
    • pp.335-346
    • /
    • 2008
  • Causal relations in an ontology should be defined based on the inference types needed to solve problems specific to the application as well as the domain. In this paper, we present a model to define and extract causal relations for an application ontology for Question-Answering (QA) on fault diagnosis of electronic devices. Causal categories are defined by analyzing generic patterns of the QA application; the relations between concepts in the corpus that belong to the causal categories are defined as causal relations. Instances of causal relations are extracted using lexical patterns in the concept definitions of the domain, and extended incrementally with information from a thesaurus. In an evaluation by domain specialists, our model shows a precision of 92.3% in classifying relations and a precision of 80.7% in identifying causal relations at the extraction phase.

Application of Standardization for Causal Inference in Observational Studies: A Step-by-step Tutorial for Analysis Using R Software

  • Lee, Sangwon;Lee, Woojoo
    • Journal of Preventive Medicine and Public Health
    • /
    • v.55 no.2
    • /
    • pp.116-124
    • /
    • 2022
  • Epidemiological studies typically examine the causal effect of exposure on a health outcome. Standardization is one of the most straightforward methods for estimating causal estimands. However, compared to inverse probability weighting, there is a lack of user-centric explanations for implementing standardization to estimate causal estimands. This paper explains the standardization method using basic R functions only and how it is linked to the R package stdReg, which can be used to implement the same procedure. We provide a step-by-step tutorial for estimating causal risk differences, causal risk ratios, and causal odds ratios based on standardization. We also discuss how to carry out subgroup analysis in detail.
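
The standardization idea in this abstract can be sketched with a single discrete confounder: average the stratum-specific risks under each exposure level over the confounder distribution of the whole sample. The paper's tutorial uses R and the stdReg package; the Python function `standardized_risks` and the toy counts below are hypothetical stand-ins, not that package's interface.

```python
from collections import defaultdict

def standardized_risks(rows):
    """Return (risk under exposure, risk under no exposure) by standardization.

    rows: iterable of (x, a, y) with discrete confounder x,
    binary exposure a, and binary outcome y.
    Assumes every stratum of x contains exposed and unexposed units.
    """
    rows = list(rows)
    n = len(rows)
    n_x = defaultdict(int)      # stratum sizes
    count = defaultdict(int)    # (x, a) -> number of units
    events = defaultdict(int)   # (x, a) -> number of units with y = 1
    for x, a, y in rows:
        n_x[x] += 1
        count[(x, a)] += 1
        events[(x, a)] += y
    risk = {}
    for a in (0, 1):
        # Sum over strata of P(Y=1 | A=a, X=x) * P(X=x).
        risk[a] = sum(events[(x, a)] / count[(x, a)] * n_x[x] / n for x in n_x)
    return risk[1], risk[0]

# Toy counts (assumed): 8 units with x = 0 and 12 units with x = 1.
rows = ([(0, 1, 1)] * 2 + [(0, 1, 0)] * 2 + [(0, 0, 1)] * 1 + [(0, 0, 0)] * 3
        + [(1, 1, 1)] * 3 + [(1, 1, 0)] * 1 + [(1, 0, 1)] * 4 + [(1, 0, 0)] * 4)

r1, r0 = standardized_risks(rows)
risk_difference = r1 - r0
risk_ratio = r1 / r0
odds_ratio = (r1 / (1 - r1)) / (r0 / (1 - r0))
print(risk_difference, risk_ratio, odds_ratio)
```

The three printed quantities correspond to the causal risk difference, causal risk ratio, and causal odds ratio named in the abstract, under the usual assumption that x captures all confounding.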

Matrix-Based Intelligent Inference Algorithm Based On the Extended AND-OR Graph

  • Lee, Kun-Chang;Cho, Hyung-Rae
    • Proceedings of the Korea Intelligent Information System Society Conference
    • /
    • 1999.10a
    • /
    • pp.121-130
    • /
    • 1999
  • The objective of this paper is to apply Extended AND-OR Graph (EAOG)-related techniques to extract knowledge from a specific problem domain and to perform analysis in complicated decision-making areas. Expert systems use expertise about a specific domain as their primary source for solving problems belonging to that domain. However, such expertise is complicated as well as uncertain, because most knowledge is expressed as causal relationships between concepts or variables. Therefore, if expert systems are to provide more intelligent support for decision making in complicated, domain-specific problems, they should be equipped with a real-time inference mechanism. We develop two kinds of EAOG-driven inference mechanisms: (1) EAOG-based forward chaining and (2) EAOG-based backward chaining. The EAOG method possesses the following three characteristics. 1. Real-time inference: The EAOG inference mechanism is suitable for real-time inference because its computational mechanism is based on matrix computation. 2. Matrix operation: All the subjective knowledge is delineated in matrix form, so that the inference process can proceed by matrix operations, which are computationally efficient. 3. Bi-directional inference: Traditional inference methods in expert systems are based on either forward chaining or backward chaining, which are mutually exclusive in terms of logical process and computational efficiency. However, the proposed EAOG inference mechanism is generically bi-directional without loss of either speed or efficiency.

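
The idea of running inference in both directions over one matrix can be sketched with a plain Boolean implication matrix. This is a simplified illustration that handles only single-premise (OR-style) rules; it is not the EAOG algorithm from the paper, and the function names and the four-node example are assumptions.

```python
def step(state, matrix):
    """One inference step: node j becomes true if any true node i implies it."""
    n = len(state)
    return [state[j] or any(state[i] and matrix[i][j] for i in range(n))
            for j in range(n)]

def forward_chain(facts, matrix):
    """Repeat the matrix step until no new node becomes true."""
    state = [bool(f) for f in facts]
    while True:
        nxt = step(state, matrix)
        if nxt == state:
            return state
        state = nxt

def backward_chain(goal, matrix):
    """Find every node that can lead to the goal by chaining on the transpose."""
    transposed = [list(col) for col in zip(*matrix)]
    return forward_chain(goal, transposed)

# Nodes (assumed): 0 = A, 1 = B, 2 = C, 3 = D.
# matrix[i][j] = 1 means "node i implies node j": A -> B, B -> C, D -> C.
matrix = [
    [0, 1, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 0],
    [0, 0, 1, 0],
]

derived = forward_chain([1, 0, 0, 0], matrix)     # start from fact A
supports = backward_chain([0, 0, 1, 0], matrix)   # start from goal C
print(derived, supports)
```

Here the same matrix serves both directions: forward chaining propagates along its rows, and backward chaining reuses the forward routine on the transpose.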

Latent causal inference using the propensity score from latent class regression model (잠재범주회귀모형의 성향점수를 이용한 잠재변수의 원인적 영향력 추론 연구)

  • Lee, Misol;Chung, Hwan
    • The Korean Journal of Applied Statistics
    • /
    • v.30 no.5
    • /
    • pp.615-632
    • /
    • 2017
  • Unlike in randomized trials, observational studies require statistical strategies for inferring unbiased causal relationships. Matching on the propensity score is one of the most popular methods for controlling confounders in order to evaluate the effect of the treatment on the outcome variable. Recently, new methods for causal inference in latent class analysis (LCA) have been proposed to estimate the average causal effect (ACE) of the treatment on a latent discrete variable. They have focused on applications that estimate the ACE in LCA from real datasets. In practice, however, the true value of the ACE is not known, and it is difficult to evaluate the performance of the estimated ACE. In this study, we propose a method to generate synthetic data using the propensity score in the framework of LCA, where the treatment and outcome variables are latent. We then propose a new method for estimating the ACE in LCA and evaluate its performance via simulation studies. Furthermore, we present an empirical analysis based on data from the 'National Longitudinal Study of Adolescent Health', with puberty as a latent treatment and substance use as a latent outcome variable.