• Title/Abstract/Keywords: semiparametric models

32 search results (processing time: 0.026 s)

Semiparametric kernel logistic regression with longitudinal data

  • Shim, Joo-Yong;Seok, Kyung-Ha
    • Journal of the Korean Data and Information Science Society
    • /
    • Vol. 23, No. 2
    • /
    • pp.385-392
    • /
    • 2012
  • Logistic regression is a well-known binary classification method in the field of statistical learning. Mixed-effect regression models are widely used for the analysis of correlated data such as those found in longitudinal studies. We consider kernel extensions of logistic regression with semiparametric fixed effects and parametric random effects. Estimation is performed through the penalized likelihood method based on the kernel trick, and our focus is on efficient computation and effective hyperparameter selection. Cross-validation techniques are employed to select the optimal hyperparameters. Numerical results are then presented to indicate the performance of the proposed procedure.

Bayesian ordinal probit semiparametric regression models: KNHANES 2016 data analysis of the relationship between smoking behavior and coffee intake

  • 이다솜;이은지;조성일;최태련
    • The Korean Journal of Applied Statistics (응용통계연구)
    • /
    • Vol. 33, No. 1
    • /
    • pp.25-46
    • /
    • 2020
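  • This paper examines Bayesian ordinal probit semiparametric regression models based on the Bayesian spectral analysis regression (BSAR) methodology. The ordinal probit regression model is a method for modeling ordered categorical data: it models the probability of the response variable by linking the probability of each category to the explanatory variables through the probit link function, which is the inverse of the normal cumulative distribution function. The Bayesian probit regression model introduces normally distributed latent variables, which makes the posterior distribution easier to derive, and the responses are categorized according to where the latent variables fall relative to the cut-points. In this paper, we extend this latent variable approach and, building on the BSAR methodology, study Bayesian binary and ordinal probit semiparametric regression models that can incorporate shape constraints such as monotone increase or decrease. Through simulation studies, we compare the fit of the binary probit semiparametric regression model with that of other existing models, and compare the fits of the ordinal probit semiparametric regression model under different shape constraints. In addition, using data from the first year (2016) of the seventh Korean National Health and Nutrition Examination Survey (KNHANES), we apply the binary and ordinal probit semiparametric regression models considered in this paper to an empirical analysis of the relationship between smoking behavior and coffee intake.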
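
The latent-variable construction described in this abstract can be written out as follows. This is a textbook-style sketch, not the paper's own notation: the symbols $z_i$, $\gamma_k$, $\beta$, $f$, $x_i$, and $w_i$ are assumptions introduced here for illustration.

```latex
% Assumed notation: z_i latent variable, x_i parametric covariates,
% w_i covariate entering through an unknown smooth function f,
% gamma_0 = -infinity < gamma_1 < ... < gamma_K = +infinity cut-points.
\begin{align*}
  z_i &= x_i^{\top}\beta + f(w_i) + \varepsilon_i,
      \qquad \varepsilon_i \sim N(0, 1), \\
  y_i &= k \iff \gamma_{k-1} < z_i \le \gamma_k,
      \qquad k = 1, \dots, K, \\
  P(y_i \le k) &= \Phi\bigl(\gamma_k - x_i^{\top}\beta - f(w_i)\bigr),
  \qquad\text{equivalently}\qquad
  \Phi^{-1}\bigl(P(y_i \le k)\bigr) = \gamma_k - x_i^{\top}\beta - f(w_i).
\end{align*}
```

With $K = 2$ categories this reduces to the binary probit model. A shape constraint such as monotonicity would be imposed on $f$.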

A comparison on coefficient estimation methods in single index models

  • 최영웅;강기훈
    • Journal of the Korean Data and Information Science Society
    • /
    • Vol. 21, No. 6
    • /
    • pp.1171-1180
    • /
    • 2010
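  • It is well known that, in nonparametric fitting of a regression function, the asymptotic properties of the estimator deteriorate as the dimension of the covariates increases. One way to overcome this problem is to reduce the covariates to a single dimension by estimating a single index model. Coefficient estimation methods for the single index model include semiparametric least squares, which approximates the solution iteratively, and the weighted average derivative method, which is computed without iteration. Both estimators are known to be asymptotically normal at the same rate of convergence as parametric methods, but their practical performance has not been compared. In this paper, we use simulation to compare the variances of the estimates from the two methods and determine which method performs better.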
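
The two estimators named in this abstract can be summarized in formulas. The notation below ($g$, $\beta$, $m$, $\hat g_{\beta}$, $w$) is assumed here for illustration and is not taken from the paper.

```latex
% Single index model (assumed notation): g is an unknown link function,
% beta is identified only up to scale.
\begin{align*}
  y_i &= g\bigl(x_i^{\top}\beta\bigr) + \varepsilon_i,
      \qquad E[\varepsilon_i \mid x_i] = 0. \\[4pt]
  % Semiparametric least squares: iterate over beta, re-estimating g
  % nonparametrically (e.g. by kernel regression) at each candidate beta.
  \hat\beta_{\mathrm{SLS}} &= \arg\min_{\beta}
      \sum_{i=1}^{n} \bigl( y_i - \hat g_{\beta}(x_i^{\top}\beta) \bigr)^{2}. \\[4pt]
  % Weighted average derivative: with m(x) = E[y | x] = g(x^T beta),
  % the gradient of m is proportional to beta, so no iteration is needed.
  \nabla m(x) &= g'\bigl(x^{\top}\beta\bigr)\,\beta
  \quad\Longrightarrow\quad
  \delta_w = E\bigl[w(x)\,\nabla m(x)\bigr]
           = E\bigl[w(x)\,g'(x^{\top}\beta)\bigr]\,\beta \;\propto\; \beta.
\end{align*}
```

The second identity is why the derivative-based estimator is non-iterative: estimating $\delta_w$ directly recovers the direction of $\beta$.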

On prediction of random effects in log-normal frailty models

  • Ha, Il-Do;Cho, Geon-Ho
    • Journal of the Korean Data and Information Science Society
    • /
    • Vol. 20, No. 1
    • /
    • pp.203-209
    • /
    • 2009
  • Frailty models are useful for the analysis of correlated and/or heterogeneous survival data. However, the inferences of fixed parameters, rather than random effects, have been mainly studied. The prediction (or estimation) of random effects is also practically useful to investigate the heterogeneity of the hospital or patient effects. In this paper we propose how to extend the prediction method for random effects in HGLMs (hierarchical generalized linear models) to log-normal semiparametric frailty models with nonparametric baseline hazard. The proposed method is demonstrated by a simulation study.

A Comparative Study on the Performance of Bayesian Partially Linear Models

  • Woo, Yoonsung;Choi, Taeryon;Kim, Wooseok
    • Communications for Statistical Applications and Methods
    • /
    • Vol. 19, No. 6
    • /
    • pp.885-898
    • /
    • 2012
  • In this paper, we consider Bayesian approaches to partially linear models, in which the regression function is represented as the semiparametric sum of a parametric linear regression function and a nonparametric regression function. We compare the empirical performance of widely used Bayesian partially linear models. Specifically, we consider three Bayesian methods for estimating the nonparametric regression function: the first uses a Fourier series representation, the second is based on Gaussian process regression, and the third is based on the smoothness of the function and differencing. We compare the numerical performance of the three methods by the root mean squared error (RMSE). For the empirical analysis, we fit each of the three Bayesian methods to synthetic data in simulation studies and to real data, and compare the resulting RMSEs.

Long Memory Characteristics in the Korean Stock Market Volatility

  • Cho, Sinsup;Choe, Hyuk;Park, Joon Y
    • Communications for Statistical Applications and Methods
    • /
    • Vol. 9, No. 3
    • /
    • pp.577-594
    • /
    • 2002
  • The semiparametric approach of Geweke and Porter-Hudak (1983) is employed to estimate and test for long memory in the volatilities of stock indices and individual companies. The empirical study provides strong evidence of volatility persistence in the Korean stock market. Most indices and individual companies exhibit long-term dependence in volatility. Hence short memory models are unable to explain the volatilities in the Korean stock market.
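
A log-periodogram regression in the spirit of Geweke and Porter-Hudak (1983) can be sketched as below. This is a minimal illustration, not the authors' implementation: the function name `gph_estimate`, the default bandwidth `m = floor(sqrt(n))`, and the use of a plain least-squares slope are assumptions made here.

```python
import cmath
import math
import random


def gph_estimate(x, m=None):
    """Log-periodogram estimate of the long-memory parameter d.

    Regresses log I(lambda_j) on log(4 sin^2(lambda_j / 2)) over the first
    m Fourier frequencies; the estimate of d is minus the fitted slope.
    """
    n = len(x)
    if m is None:
        m = int(math.sqrt(n))  # assumed rule-of-thumb bandwidth
    mean = sum(x) / n
    centered = [v - mean for v in x]

    regressors, log_periodogram = [], []
    for j in range(1, m + 1):
        lam = 2.0 * math.pi * j / n  # j-th Fourier frequency
        dft = sum(v * cmath.exp(-1j * lam * t) for t, v in enumerate(centered))
        periodogram = abs(dft) ** 2 / (2.0 * math.pi * n)
        regressors.append(math.log(4.0 * math.sin(lam / 2.0) ** 2))
        log_periodogram.append(math.log(periodogram))

    # Ordinary least-squares slope of log-periodogram on the regressor.
    r_bar = sum(regressors) / m
    y_bar = sum(log_periodogram) / m
    sxy = sum((r - r_bar) * (y - y_bar)
              for r, y in zip(regressors, log_periodogram))
    sxx = sum((r - r_bar) ** 2 for r in regressors)
    return -sxy / sxx


if __name__ == "__main__":
    random.seed(0)
    white_noise = [random.gauss(0.0, 1.0) for _ in range(1024)]
    # White noise has no long memory, so the estimate should be near 0.
    print(gph_estimate(white_noise))
```

For volatility, the input series would typically be a proxy such as squared or absolute returns. That choice is also an assumption here, since the abstract does not say which proxy was used. An estimate of d clearly above 0 indicates long memory.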

Bayesian Variable Selection in the Proportional Hazard Model with Application to DNA Microarray Data

  • Lee, Kyeong-Eun;Mallick, Bani K.
    • Korean Society for Bioinformatics: Conference Proceedings
    • /
    • Korean Society for Bioinformatics and Systems Biology, BIOINFO 2005
    • /
    • pp.357-360
    • /
    • 2005
  • In this paper we consider the well-known semiparametric proportional hazards (PH) models for survival analysis. These models are usually used with few covariates and many observations (subjects). But, for a typical setting of gene expression data from DNA microarray, we need to consider the case where the number of covariates p exceeds the number of samples n. For a given vector of response values which are times to event (death or censored times) and p gene expressions (covariates), we address the issue of how to reduce the dimension by selecting the significant genes. This approach enables us to estimate the survival curve when $n \ll p$. In our approach, rather than fixing the number of selected genes, we assign a prior distribution to this number. The approach creates additional flexibility by allowing the imposition of constraints, such as bounding the dimension via a prior, which in effect works as a penalty. To implement our methodology, we use a Markov Chain Monte Carlo (MCMC) method. We demonstrate the methodology on diffuse large B-cell lymphoma (DLBCL) complementary DNA (cDNA) data.

Mapping Airbnb prices in a small city: A geographically weighted approach for Macau tourist attractions

  • 등한신;홍인수;유창석
    • The Korea Contents Association: Conference Proceedings
    • /
    • The Korea Contents Association 2019 Spring Conference
    • /
    • pp.211-212
    • /
    • 2019
  • The objectives of this research are to test the utility of semiparametric geographically weighted regression (SGWR, a spatial analysis method) on a small-scale urban sample, and to understand the geographic patterns of provision and pricing of sharing-economy accommodations in a tourist city. This paper focuses on how network distance to heritage sites and to casinos, residential unit prices, and five other attribute categories determine Airbnb prices in Macau SAR, China. Findings show that SGWR models outperformed OLS models. Moreover, compared with heritage sites, casinos are stronger drivers of the provision and prices of Airbnb rooms (including hostels), and residential unit prices are not related to Airbnb prices in the attraction clusters in Macau. This research provides a small example of the application of SGWR in a small city and of the analysis of online marketplace data as new urban study material. Practically, this study provides some scientific evidence to support decision making by hosts, guests, urban planners, and policymakers in Macau.

Bayesian curve-fitting with radial basis functions under functional measurement error model

  • Hwang, Jinseub;Kim, Dal Ho
    • Journal of the Korean Data and Information Science Society
    • /
    • Vol. 26, No. 3
    • /
    • pp.749-754
    • /
    • 2015
  • This article presents a Bayesian approach to regression splines with knots on a grid of equally spaced sample quantiles of the independent variables under a functional measurement error model. We consider a small area model that uses penalized splines to capture a non-linear pattern. Specifically, we use radial basis functions as the basis functions of the regression spline. To fit the model and estimate parameters, we suggest a hierarchical Bayesian framework using Markov Chain Monte Carlo methodology. Furthermore, we illustrate the method with an application to data. We check convergence with the potential scale reduction factor, and we use the posterior predictive p-value and the mean logarithmic conditional predictive ordinate to compare models.

Bayesian Variable Selection in the Proportional Hazard Model with Application to Microarray Data

  • Lee, Kyeong-Eun;Mallick, Bani K.
    • The Korean Statistical Society: Conference Proceedings
    • /
    • Proceedings of the Korean Statistical Society 2005 Spring Conference
    • /
    • pp.17-23
    • /
    • 2005
  • In this paper we consider the well-known semiparametric proportional hazards models for survival analysis. These models are usually used with few covariates and many observations (subjects). But, for a typical setting of gene expression data from DNA microarray, we need to consider the case where the number of covariates p exceeds the number of samples n. For a given vector of response values which are times to event (death or censored times) and p gene expressions (covariates), we address the issue of how to reduce the dimension by selecting the significant genes. This approach enables us to estimate the survival curve when $n \ll p$. In our approach, rather than fixing the number of selected genes, we assign a prior distribution to this number. The approach creates additional flexibility by allowing the imposition of constraints, such as bounding the dimension via a prior, which in effect works as a penalty. To implement our methodology, we use a Markov Chain Monte Carlo (MCMC) method. We demonstrate the methodology on diffuse large B-cell lymphoma (DLBCL) complementary DNA (cDNA) data and breast carcinoma data.