• Title/Summary/Keyword: semiparametric

Efficient variable selection method using conditional mutual information (조건부 상호정보를 이용한 분류분석에서의 변수선택)

  • Ahn, Chi Kyung; Kim, Donguk
    • Journal of the Korean Data and Information Science Society / v.25 no.5 / pp.1079-1094 / 2014
  • In this paper, we study efficient gene selection methods based on conditional mutual information. We suggest gene selection methods in which the conditional mutual information is estimated semiparametrically, using the multivariate normal distribution and the Edgeworth approximation. We compare the suggested methods with other methods, namely the mutual information filter, SVM-RFE, and the gene selection method of Cai et al. (2009) (MIGS-original), in SVM classification. Through these experiments, we show that gene selection methods using semiparametrically estimated conditional mutual information perform better than the mutual information filter. Furthermore, we show that they take far less computing time than the method of Cai et al. (2009) while achieving similar performance.
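
As a rough illustration of the normal-distribution part of the abstract above, the sketch below computes conditional mutual information for three scalar variables under the assumption that they are jointly Gaussian, in which case I(X;Y|Z) = -½·ln(1 - ρ²), where ρ is the partial correlation of X and Y given Z. The scalar setting, the function names `pearson` and `gaussian_cmi`, and the omission of the Edgeworth correction are simplifying assumptions made here; this is not the authors' procedure.

```python
import math

def pearson(a, b):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / math.sqrt(var_a * var_b)

def gaussian_cmi(x, y, z):
    """Plug-in estimate of I(X;Y|Z) in nats, assuming (X, Y, Z) is jointly Gaussian.

    Uses the partial correlation of X and Y given Z; undefined if any
    pair is perfectly correlated (the logarithm or square root degenerates).
    """
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    partial = (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))
    return -0.5 * math.log(1.0 - partial ** 2)

# Toy data (hypothetical): z is uncorrelated with both x and y.
x = [1, 1, -1, -1]
z = [1, -1, 1, -1]
y = [2, 0, -2, 0]
cmi = gaussian_cmi(x, y, z)
```

In a selection loop, a candidate gene would play the role of X, the class label Y, and an already selected gene Z; a candidate with larger conditional mutual information adds more information beyond what Z already carries.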

Overview of estimating the average treatment effect using dimension reduction methods (차원축소 방법을 이용한 평균처리효과 추정에 대한 개요)

  • Mijeong Kim
    • The Korean Journal of Applied Statistics / v.36 no.4 / pp.323-335 / 2023
  • In causal analysis of high dimensional data, it is important to reduce the dimension of the covariates and transform them appropriately in order to control for confounders that affect both treatment and potential outcomes. The augmented inverse probability weighting (AIPW) method is mainly used to estimate the average treatment effect (ATE). The AIPW estimator is obtained from an estimated propensity score and an estimated outcome model. The ATE estimator can be inconsistent or have large asymptotic variance when the propensity score and outcome model are estimated by parametric methods that include all covariates, especially for high dimensional data. For this reason, ATE estimation that combines an appropriate dimension reduction method with a semiparametric model is attracting attention for high dimensional data. A semiparametric method or a sparse sufficient dimension reduction method can be used to reduce the dimension when estimating the propensity score and outcome model. Recently, another method has been proposed that uses neither the propensity score nor outcome regression: after the dimension of the covariates is reduced, the ATE is estimated by matching. Four recently proposed methods for ATE estimation with high dimensional data are introduced, and the interpretation of the estimated ATE is discussed.
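
The AIPW estimator mentioned in the abstract above combines the two nuisance estimates in a standard way. The sketch below shows the usual textbook form, assuming the propensity scores and the outcome-model predictions have already been estimated by some method (for example after dimension reduction). The function name `aipw_ate` and its argument names are hypothetical, and the toy numbers are illustrative only.

```python
def aipw_ate(y, t, e_hat, m1_hat, m0_hat):
    """AIPW estimate of the average treatment effect.

    y      : observed outcomes
    t      : treatment indicators (1 = treated, 0 = control)
    e_hat  : estimated propensity scores P(T=1 | X), strictly between 0 and 1
    m1_hat : outcome-model predictions under treatment
    m0_hat : outcome-model predictions under control
    """
    total = 0.0
    for yi, ti, ei, m1, m0 in zip(y, t, e_hat, m1_hat, m0_hat):
        # Outcome-model prediction plus an inverse-probability-weighted residual.
        treated_term = m1 + ti * (yi - m1) / ei
        control_term = m0 + (1 - ti) * (yi - m0) / (1.0 - ei)
        total += treated_term - control_term
    return total / len(y)

# Toy inputs (hypothetical): outcome model fits the observed arm exactly,
# so the residual corrections vanish and the estimate is mean(m1 - m0).
y = [3, 1, 4, 2]
t = [1, 0, 1, 0]
e_hat = [0.5, 0.5, 0.5, 0.5]
ate_model = aipw_ate(y, t, e_hat, [3, 2, 4, 3], [2, 1, 3, 2])

# With a zero outcome model the formula reduces to plain inverse probability weighting.
ate_ipw = aipw_ate(y, t, e_hat, [0, 0, 0, 0], [0, 0, 0, 0])
```

The residual terms are what make the estimator doubly robust: it remains consistent if either the propensity score model or the outcome model is correctly specified.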

Bayesian Variable Selection in the Proportional Hazard Model with Application to DNA Microarray Data

  • Lee, Kyeon-Eun; Mallick, Bani K.
    • Proceedings of the Korean Society for Bioinformatics Conference / 2005.09a / pp.357-360 / 2005
  • In this paper we consider the well-known semiparametric proportional hazards (PH) models for survival analysis. These models are usually used with few covariates and many observations (subjects). For a typical setting of gene expression data from DNA microarrays, however, we need to consider the case where the number of covariates p exceeds the number of samples n. Given a vector of response values, which are times to event (death or censoring times), and p gene expressions (covariates), we address the issue of how to reduce the dimension by selecting the significant genes. This approach enables us to estimate the survival curve when $n \ll p$. In our approach, rather than fixing the number of selected genes, we assign a prior distribution to this number. The approach creates additional flexibility by allowing the imposition of constraints, such as bounding the dimension via a prior, which in effect works as a penalty. To implement our methodology, we use a Markov chain Monte Carlo (MCMC) method. We demonstrate the methodology on diffuse large B-cell lymphoma (DLBCL) complementary DNA (cDNA) data.

Bayesian curve-fitting with radial basis functions under functional measurement error model

  • Hwang, Jinseub; Kim, Dal Ho
    • Journal of the Korean Data and Information Science Society / v.26 no.3 / pp.749-754 / 2015
  • This article presents a Bayesian approach to regression splines with knots on a grid of equally spaced sample quantiles of the independent variables under a functional measurement error model. We consider a small area model that uses penalized splines to capture a non-linear pattern. Specifically, we use radial basis functions as the basis functions of the regression spline. To fit the model and estimate the parameters, we suggest a hierarchical Bayesian framework using Markov chain Monte Carlo methodology. Furthermore, we illustrate the method with an application to data. We check convergence using the potential scale reduction factor, and we use the posterior predictive p-value and the mean logarithmic conditional predictive ordinate to compare models.

On prediction of random effects in log-normal frailty models

  • Ha, Il-Do; Cho, Geon-Ho
    • Journal of the Korean Data and Information Science Society / v.20 no.1 / pp.203-209 / 2009
  • Frailty models are useful for the analysis of correlated and/or heterogeneous survival data. However, research has focused mainly on inference for the fixed parameters rather than the random effects. The prediction (or estimation) of random effects is also practically useful for investigating heterogeneity in hospital or patient effects. In this paper we propose an extension of the prediction method for random effects in HGLMs (hierarchical generalized linear models) to log-normal semiparametric frailty models with a nonparametric baseline hazard. The proposed method is demonstrated by a simulation study.

Small-Sample Inference in the Errors-in-Variables Model (소표본 errors-in-variables 모형에서의 통계 추론)

  • 소병수
    • Journal of Korean Society for Quality Management / v.25 no.1 / pp.69-79 / 1997
  • We consider the semiparametric linear errors-in-variables model $y_i = \alpha + \beta u_i + \varepsilon_i$, $x_i = u_i + \delta_i$, $i = 1, \ldots, n$, where $(x_i, y_i)$ stands for an observation vector, $(u_i)$ denotes a set of incidental nuisance parameters, $(\alpha, \beta)$ is a vector of regression parameters, and $(\varepsilon_i, \delta_i)$ are mutually uncorrelated measurement errors with zero mean and finite variances but otherwise unknown distributions. On the basis of a simple small-sample low-noise approximation, we propose a new method of comparing the mean squared errors (MSE) of the various competing estimators of the true regression parameters $(\alpha, \beta)$. We then show that a class of estimators including the classical least squares estimator and the maximum likelihood estimator is consistent and first-order efficient within the class of all regular consistent estimators, irrespective of the type of measurement errors.

Semiparametric Nu-Support Vector Regression (정해진 기저함수가 포함되는 Nu-SVR 학습방법)

  • 김영일; 조원희; 박주영
    • Proceedings of the Korean Institute of Intelligent Systems Conference / 2003.05a / pp.81-84 / 2003
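  • $\varepsilon$-SVR ($\varepsilon$-support vector regression) is a method for function approximation (regression) using support vectors (SVs) that has recently attracted attention. As a variant of the SVM (SV machine), it can replace other neural-network-based algorithms, which are limited by problems such as convergence to local optima during training. In standard $\varepsilon$-SVR, the tolerable error range $\varepsilon$ between the training data and the approximating function f is fixed before training. In contrast, Nu-SVR (the ν-version of SVR) obtains an optimized value of $\varepsilon$ as a result of training. $\varepsilon$-SVR with predetermined basis functions (semiparametric SVR) approximates a function using predetermined independent basis functions and has been shown to give better results than standard $\varepsilon$-SVR. Accordingly, this paper proposes a ν-SVR learning method that includes predetermined basis functions and derives the corresponding formulas. The applicability of the proposed semiparametric ν-SVR learning method is then examined through simulations.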

Semiparametric Kernel Fisher Discriminant Approach for Regression Problems

  • Park, Joo-Young; Cho, Won-Hee; Kim, Young-Il
    • International Journal of Fuzzy Logic and Intelligent Systems / v.3 no.2 / pp.227-232 / 2003
  • Support vector learning has recently attracted an enormous amount of interest in the areas of function approximation, pattern classification, and novelty detection. One of the main reasons for the success of support vector machines (SVMs) seems to be the availability of global and sparse solutions. Among the approaches that share these reasons for success and exhibit similarly good performance is the kernel Fisher discriminant (KFD) approach. In this paper, we consider the problem of function approximation utilizing both predetermined basis functions and the KFD approach for regression. After reviewing support vector regression, the semiparametric approach for including predetermined basis functions, and KFD regression, this paper extends the conventional KFD approach for regression so that it can utilize predetermined basis functions. The applicability of the presented method is illustrated via a regression example.

Analyzing Clustered and Interval-Censored Data based on the Semiparametric Frailty Model

  • Kim, Jin-Heum; Kim, Youn-Nam
    • The Korean Journal of Applied Statistics / v.25 no.5 / pp.707-718 / 2012
  • We propose a semiparametric model to analyze clustered and interval-censored data; in addition, we incorporate a gamma frailty into the model to measure the association among members of the same cluster. We propose an estimation procedure based on the EM algorithm. Simulation results showed that our estimation procedure may result in unbiased estimates. The standard error is smaller than expected and provides conservative results to estimate the coverage rate; however, this trend gradually disappeared as the number of members in the same cluster increased. In addition, the proposed method is illustrated with data from diabetic retinopathy studies that evaluated the effectiveness of laser photocoagulation in delaying or preventing the onset of blindness in individuals with diabetic retinopathy.