• Title/Abstract/Keywords: Penalized likelihood

Penalized variable selection for accelerated failure time models

  • Park, Eunyoung;Ha, Il Do
    • Communications for Statistical Applications and Methods
    • /
    • Vol. 25, No. 6
    • /
    • pp.591-604
    • /
    • 2018
  • The accelerated failure time (AFT) model is a linear model under the log-transformation of survival time that has been introduced as a useful alternative to the proportional hazards (PH) model. In this paper we propose variable-selection procedures for fixed effects in a parametric AFT model using penalized likelihood approaches. We use three popular penalty functions: the least absolute shrinkage and selection operator (LASSO), the adaptive LASSO, and the smoothly clipped absolute deviation (SCAD). With these procedures we can select important variables and estimate the fixed effects at the same time. The performance of the proposed method is evaluated using simulation studies, including an investigation of the impact of misspecifying the assumed distribution. The proposed method is illustrated with a primary biliary cirrhosis (PBC) data set.
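
For orientation, the three penalties named in this abstract are usually written as below. This is a sketch of the standard textbook forms, not the paper's own notation: the symbols \ell, \lambda, w_j, \gamma, a and d, and the scaling of the penalty by n, are assumptions introduced here for illustration.

```latex
% Penalized log-likelihood for d regression coefficients
% (multiplying the penalty by n is an assumed convention)
\ell_p(\beta) = \ell(\beta) - n \sum_{j=1}^{d} p_\lambda(|\beta_j|)

% LASSO
p_\lambda(|\beta_j|) = \lambda |\beta_j|

% Adaptive LASSO, with weights built from an initial estimate \tilde\beta_j
p_\lambda(|\beta_j|) = \lambda w_j |\beta_j|, \qquad
w_j = \frac{1}{|\tilde\beta_j|^{\gamma}}, \quad \gamma > 0

% SCAD, defined through its derivative for \theta > 0 and a > 2
p'_\lambda(\theta) = \lambda \left\{ I(\theta \le \lambda)
  + \frac{(a\lambda - \theta)_+}{(a - 1)\lambda} \, I(\theta > \lambda) \right\}
```

The SCAD derivative equals \lambda for small coefficients, as with the LASSO, and drops to zero once \theta exceeds a\lambda, so large coefficients are not shrunk.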

Penalized maximum likelihood estimation with symmetric log-concave errors and LASSO penalty

  • Seo-Young, Park;Sunyul, Kim;Byungtae, Seo
    • Communications for Statistical Applications and Methods
    • /
    • Vol. 29, No. 6
    • /
    • pp.641-653
    • /
    • 2022
  • Penalized least squares methods are important tools to simultaneously select variables and estimate parameters in linear regression. The penalized maximum likelihood can also be used for the same purpose, assuming that the error distribution falls in a certain parametric family of distributions. However, the use of a certain parametric family can suffer from a misspecification problem that undermines the estimation accuracy. To give sufficient flexibility to the error distribution, we propose to use the symmetric log-concave error distribution with the LASSO penalty. A feasible algorithm to estimate both nonparametric and parametric components in the proposed model is provided. Some numerical studies are also presented, showing that the proposed method produces more efficient estimators than some existing methods with similar variable selection performance.

Probabilistic penalized principal component analysis

  • Park, Chongsun;Wang, Morgan C.;Mo, Eun Bi
    • Communications for Statistical Applications and Methods
    • /
    • Vol. 24, No. 2
    • /
    • pp.143-154
    • /
    • 2017
  • A variable selection method based on probabilistic principal component analysis (PCA) using a penalized likelihood method is proposed. The proposed method is a two-step variable reduction method. The first step is based on the probabilistic principal component idea to identify principal components. The penalty function is used to identify important variables in each component. We then build a model on the original data space instead of building on the rotated data space through latent variables (principal components) because the proposed method achieves the goal of dimension reduction through identifying important observed variables. Consequently, the proposed method is of more practical use. The proposed estimators perform as the oracle procedure and are root-n consistent with a proper choice of regularization parameters. The proposed method can be successfully applied to high-dimensional PCA problems with a relatively large portion of irrelevant variables included in the data set. It is straightforward to extend our likelihood method to handle problems with missing observations using EM algorithms. Further, it could be effectively applied in cases where some data vectors exhibit one or more missing values at random.

Note on the Consistency of a Penalized Maximum Likelihood Estimate

  • 안성만
    • Communications for Statistical Applications and Methods
    • /
    • Vol. 16, No. 4
    • /
    • pp.573-578
    • /
    • 2009
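  • In this paper, we present a proof that the penalized likelihood estimate of Ahn (2001), which assumes an overfitted model and adds a penalty term on the number of components, achieves consistency. The overfitted model is used to show that the conditions of Wald (1949) are satisfied. We also resolve the problem that the likelihood goes to infinity when the parameters converge to a boundary point.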

Analysis of multi-center bladder cancer survival data using variable-selection method of multi-level frailty models

  • 김보현;하일도;이동환
    • Journal of the Korean Data and Information Science Society
    • /
    • Vol. 27, No. 2
    • /
    • pp.499-510
    • /
    • 2016
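  • Selecting appropriate variables is very important in regression models for survival analysis. In this paper, we introduce a procedure for penalized variable selection in multi-level frailty models, based on the "frailtyHL" R package (Ha et al., 2012). Model estimation is based on the penalized hierarchical likelihood, and three penalty functions (LASSO, SCAD and HL) are considered. To illustrate the method, we use multi-country, multi-center clinical trial data from the EORTC (European Organization for Research and Treatment of Cancer) in Belgium to compare the results of the three variable-selection methods and discuss their relative advantages and disadvantages. In particular, the data analysis shows that the SCAD and HL methods select important variables better than the LASSO.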

Sparse Matrix Computation in Mixed Effects Model

  • 손원;박용태;김유경;임요한
    • Korean Journal of Applied Statistics
    • /
    • Vol. 28, No. 2
    • /
    • pp.281-288
    • /
    • 2015
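  • In this study, we propose a fast computational procedure for the penalized maximum likelihood estimator used for inference in mixed models. The proposed procedure improves computation speed by approximating the Hessian matrix in the estimating equation of the penalized maximum likelihood estimator with a sparse matrix of arrowhead form. Through two simulation experiments, we examine the reduction in computation time obtained by using the proposed approximation, together with the approximation error that must be paid for it.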
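
To show why an arrowhead structure speeds things up, here is a minimal sketch of solving a linear system with an arrowhead matrix in linear time. It is a generic illustration, not the authors' procedure: the abstract does not give the exact form of their Hessian approximation, so the single-border layout and the names `solve_arrowhead`, `d`, `z`, `alpha` and `b` are assumptions made here.

```python
def solve_arrowhead(d, z, alpha, b):
    """Solve A x = b in O(n), where A = [[diag(d), z], [z^T, alpha]].

    d: n diagonal entries (all nonzero), z: n border entries,
    alpha: scalar corner entry, b: right-hand side of length n + 1.
    """
    n = len(d)
    if len(z) != n or len(b) != n + 1:
        raise ValueError("dimension mismatch")
    # Schur complement of the diagonal block with respect to the corner.
    schur = alpha - sum(zi * zi / di for zi, di in zip(z, d))
    # Solve for the last unknown first (zip stops after the first n entries of b).
    x_last = (b[n] - sum(zi * bi / di for zi, bi, di in zip(z, b, d))) / schur
    # Back-substitute into the diagonal rows.
    x_head = [(bi - zi * x_last) / di for bi, zi, di in zip(b, z, d)]
    return x_head + [x_last]


def arrowhead_matvec(d, z, alpha, x):
    """Multiply the same arrowhead matrix by x, used to check a solution."""
    n = len(d)
    head = [d[i] * x[i] + z[i] * x[n] for i in range(n)]
    tail = sum(z[i] * x[i] for i in range(n)) + alpha * x[n]
    return head + [tail]
```

A dense solve of the same system costs on the order of n cubed operations, while this one needs only a few passes over the vectors, which is the kind of saving such an approximation aims for.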

Improvement of Basis-Screening-Based Dynamic Kriging Model Using Penalized Maximum Likelihood Estimation

  • 김민근;김재승;한정우;이근호
    • Journal of the Computational Structural Engineering Institute of Korea
    • /
    • Vol. 36, No. 6
    • /
    • pp.391-398
    • /
    • 2023
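  • In this paper, we introduce a penalized maximum likelihood estimation (PMLE) method to improve the accuracy of the basis-screening-based Kriging model (BSKM). The maximum order and the types of basis functions used in the BSKM are determined according to their importance, where the measure of importance is the cross-validation error (CVE) of each basis function. To build the optimal combination of basis functions for a Kriging model (KM), the maximum basis-function order is selected first and the importance of each individual basis function is then evaluated. The optimal combination is found by adding basis functions one at a time, in decreasing order of importance, until the CVE of the Kriging model reaches its minimum. In this process the KM must be built repeatedly, and the hyper-parameters, which represent the correlation between data points, must also be computed by maximum likelihood estimation. Because the selected optimal combination of basis functions changes with the hyper-parameter values, these values have a large effect on the accuracy of the KM. The PMLE method was applied to compute accurate hyper-parameters, and its application to the Branin-Hoo function problem confirmed that the accuracy of the BSKM can be improved.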
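
For readers unfamiliar with the benchmark mentioned in this abstract, the Branin-Hoo function can be sketched as follows. The constants are the commonly used standard parameterization, which is an assumption here, since the abstract does not state the exact form or domain used in the paper; the name `branin_hoo` is illustrative.

```python
import math


def branin_hoo(x1, x2):
    """Branin-Hoo test function with the commonly used constants.

    Usually evaluated on x1 in [-5, 10] and x2 in [0, 15].
    """
    a = 1.0
    b = 5.1 / (4.0 * math.pi ** 2)
    c = 5.0 / math.pi
    r = 6.0
    s = 10.0
    t = 1.0 / (8.0 * math.pi)
    return a * (x2 - b * x1 ** 2 + c * x1 - r) ** 2 + s * (1.0 - t) * math.cos(x1) + s


# With these constants the global minimum value is 10 / (8 * pi),
# attained for example at (pi, 2.275) and (-pi, 12.275).
value_at_minimum = branin_hoo(math.pi, 2.275)
```

Because the function has several equal global minima and a curved valley, it is a common low-dimensional check for surrogate models such as Kriging.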

Semiparametric Kernel Poisson Regression for Longitudinal Count Data

  • Hwang, Chang-Ha;Shim, Joo-Yong
    • Communications for Statistical Applications and Methods
    • /
    • Vol. 15, No. 6
    • /
    • pp.1003-1011
    • /
    • 2008
  • Mixed-effect Poisson regression models are widely used for the analysis of correlated count data such as those found in longitudinal studies. In this paper, we consider kernel extensions with semiparametric fixed effects and parametric random effects. The estimation is through the penalized likelihood method based on the kernel trick, and our focus is on efficient computation and effective hyperparameter selection. For the selection of hyperparameters, cross-validation techniques are employed. Examples illustrating usage and features of the proposed method are provided.

Negative Binomial Varying Coefficient Partially Linear Models

  • Kim, Young-Ju
    • Communications for Statistical Applications and Methods
    • /
    • Vol. 19, No. 6
    • /
    • pp.809-817
    • /
    • 2012
  • We propose a semiparametric inference for a generalized varying coefficient partially linear model (VCPLM) for negative binomial data. The VCPLM is useful for modelling real data in that varying coefficients are a special type of interaction between explanatory variables, and partially linear models fit both parametric and nonparametric terms. The negative binomial distribution often arises in modelling count data, which are usually overdispersed. The varying coefficient function estimators and regression parameters in the generalized VCPLM are obtained by formulating a penalized likelihood through smoothing splines for negative binomial data when the shape parameter is known. The performance of the proposed method is then evaluated by simulations.

Semiparametric kernel logistic regression with longitudinal data

  • Shim, Joo-Yong;Seok, Kyung-Ha
    • Journal of the Korean Data and Information Science Society
    • /
    • Vol. 23, No. 2
    • /
    • pp.385-392
    • /
    • 2012
  • Logistic regression is a well-known binary classification method in the field of statistical learning. Mixed-effect regression models are widely used for the analysis of correlated data such as those found in longitudinal studies. We consider kernel extensions with semiparametric fixed effects and parametric random effects for the logistic regression. The estimation is performed through the penalized likelihood method based on the kernel trick, and our focus is on efficient computation and effective hyperparameter selection. For the selection of optimal hyperparameters, cross-validation techniques are employed. Numerical results are then presented to indicate the performance of the proposed procedure.