• Title/Summary/Keyword: cross-validation

Variable selection in L1 penalized censored regression

  • Hwang, Chang-Ha;Kim, Mal-Suk;Shim, Joo-Yong
    • Journal of the Korean Data and Information Science Society / v.22 no.5 / pp.951-959 / 2011
  • The proposed method is based on a penalized censored regression model with an L1 penalty. We use the iteratively reweighted least squares procedure to maximize the L1-penalized log-likelihood function of the censored regression model. The procedure provides efficient computation of the regression parameters, including variable selection, and leads to a generalized cross validation function for model selection. Numerical results are then presented to indicate the performance of the proposed method.

Partially linear support vector orthogonal quantile regression with measurement errors

  • Hwang, Changha
    • Journal of the Korean Data and Information Science Society / v.26 no.1 / pp.209-216 / 2015
  • Quantile regression models with covariate measurement errors have received a great deal of attention in both the theoretical and the applied statistical literature. A lot of effort has been devoted to developing effective estimation methods for such quantile regression models. In this paper we propose the partially linear support vector orthogonal quantile regression model in the presence of covariate measurement errors. We also provide a generalized approximate cross-validation method for choosing the hyperparameters and the ratios of the error variances that affect the performance of the proposed model. The proposed model is evaluated through simulations.

Semiparametric support vector machine for accelerated failure time model

  • Hwang, Chang-Ha;Shim, Joo-Yong
    • Journal of the Korean Data and Information Science Society / v.21 no.4 / pp.765-775 / 2010
  • For the accelerated failure time (AFT) model, a lot of effort has been devoted to developing effective estimation methods. The AFT model assumes a linear relationship between the logarithm of event time and covariates. In this paper we propose a semiparametric support vector machine to consider situations where the functional form of the effect of one or more covariates is unknown. The proposed estimating equation can be computed by solving a quadratic programming problem and a linear equation. We study the effect of several covariates on a censored response variable with an unknown probability distribution. We also provide a generalized approximate cross-validation method for choosing the hyper-parameters that affect the performance of the proposed approach. The proposed method is evaluated through simulations using an artificial example.

Support vector expectile regression using IRWLS procedure

  • Choi, Kook-Lyeol;Shim, Jooyong;Seok, Kyungha
    • Journal of the Korean Data and Information Science Society / v.25 no.4 / pp.931-939 / 2014
  • In this paper we propose the iteratively reweighted least squares procedure to solve the quadratic programming problem of support vector expectile regression with an asymmetrically weighted squared loss function. The proposed procedure enables us to select appropriate hyperparameters easily by using the generalized cross validation function. Through numerical studies on artificial and real data sets, we show the effectiveness of the proposed method in terms of estimation performance.
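
The reweighting idea in this abstract can be illustrated with a minimal sketch. It is not the authors' kernel-based method: it assumes a plain linear model with one covariate and an intercept, and the function names (`weighted_line_fit`, `expectile_irwls`) are hypothetical. Each pass fits a weighted least squares line, then sets the weight of every observation to tau when its residual is non-negative and to 1 - tau otherwise, which is the asymmetric squared loss of expectile regression.

```python
def weighted_line_fit(x, y, w):
    """Weighted least squares fit of y = a + b*x; returns (a, b)."""
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    b = sxy / sxx
    return my - b * mx, b


def expectile_irwls(x, y, tau=0.5, max_iter=100):
    """Linear tau-expectile fit by iteratively reweighted least squares.

    Illustrative assumption: a straight line with one covariate, not the
    support vector (kernel) formulation described in the abstract.
    """
    w = [1.0] * len(x)  # start from ordinary least squares
    for _ in range(max_iter):
        a, b = weighted_line_fit(x, y, w)
        # Asymmetric weights: tau above the fitted line, 1 - tau below it.
        new_w = [tau if yi - (a + b * xi) >= 0 else 1.0 - tau
                 for xi, yi in zip(x, y)]
        if new_w == w:  # weights stopped changing, so the fit is stable
            break
        w = new_w
    return a, b


if __name__ == "__main__":
    x = [0, 1, 2, 3, 4]
    y = [1, 3, 2, 5, 4]
    print(expectile_irwls(x, y, tau=0.5))  # coincides with ordinary least squares
    print(expectile_irwls(x, y, tau=0.9))  # line shifted toward the larger responses
```

With tau = 0.5 the weights are symmetric, so the result matches the ordinary least squares line; larger tau values pull the fitted line upward.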

GACV for partially linear support vector regression

  • Shim, Jooyong;Seok, Kyungha
    • Journal of the Korean Data and Information Science Society / v.24 no.2 / pp.391-399 / 2013
  • Partially linear regression is capable of providing a more complete description of the linear and nonlinear relationships among random variables. In support vector regression (SVR) the hyper-parameters are known to affect the performance of regression. In this paper we propose an iteratively reweighted least squares (IRWLS) procedure to solve the quadratic problem of partially linear support vector regression with a modified loss function, which enables us to use the generalized approximate cross validation function to select the hyper-parameters. Experimental results are then presented which illustrate the performance of the partially linear SVR using the IRWLS procedure.

A kernel machine for estimation of mean and volatility functions

  • Shim, Joo-Yong;Park, Hye-Jung;Hwang, Chang-Ha
    • Journal of the Korean Data and Information Science Society / v.20 no.5 / pp.905-912 / 2009
  • We propose a doubly penalized kernel machine (DPKM) which uses a heteroscedastic location-scale model as the basic model and estimates both the mean and volatility functions simultaneously by kernel machines. We also present a model selection method which employs generalized approximate cross validation techniques for choosing the hyperparameters that affect the performance of DPKM. Artificial examples are provided to indicate the usefulness of DPKM for estimating the mean and volatility functions.

Variance function estimation with LS-SVM for replicated data

  • Shim, Joo-Yong;Park, Hye-Jung;Seok, Kyung-Ha
    • Journal of the Korean Data and Information Science Society / v.20 no.5 / pp.925-931 / 2009
  • In this paper we propose a variance function estimation method for replicated data based on averages of squared residuals obtained from the mean function estimated by the least squares support vector machine. The Newton-Raphson method is used to obtain the associated parameter vector for the variance function estimation. Furthermore, cross validation functions are introduced to select the hyper-parameters that affect the performance of the proposed estimation method. Experimental results are then presented which illustrate the performance of the proposed procedure.

Fixed size LS-SVM for multiclassification problems of large data sets

  • Hwang, Hyung-Tae
    • Journal of the Korean Data and Information Science Society / v.21 no.3 / pp.561-567 / 2010
  • Multiclassification is typically performed using voting scheme methods based on combining a set of binary classifications. In this paper we use a multiclassification method based on the hat matrix of the least squares support vector machine (LS-SVM), which can be regarded as a revised one-against-all method. To tackle multiclass problems for large data, we use the Nyström approximation and the quadratic Rényi entropy with estimation in the primal space, as used in fixed size LS-SVM. For the selection of hyperparameters, generalized cross validation techniques are employed. Experimental results are then presented to indicate the performance of the proposed procedure.
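
Generalized cross validation, mentioned in this and several other abstracts, scores any linear smoother with hat matrix H by GCV = (RSS / n) / (1 - tr(H) / n)^2. The sketch below is an illustration only and does not reproduce the LS-SVM of the paper: it assumes ridge regression through the origin with a single covariate, where tr(H) has the closed form sxx / (sxx + lam). The function names `ridge_gcv` and `select_lambda` are hypothetical.

```python
def ridge_gcv(x, y, lam):
    """GCV score of one-covariate ridge regression through the origin.

    Illustrative assumption: the hat matrix is H = x x^T / (x.x + lam),
    so its trace is sxx / (sxx + lam). An LS-SVM would use a kernel-based
    hat matrix instead.
    """
    n = len(x)
    sxx = sum(xi * xi for xi in x)
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    b = sxy / (sxx + lam)                       # ridge slope estimate
    rss = sum((yi - b * xi) ** 2 for xi, yi in zip(x, y))
    trace_h = sxx / (sxx + lam)                 # effective degrees of freedom
    return (rss / n) / (1.0 - trace_h / n) ** 2


def select_lambda(x, y, grid):
    """Return the penalty in `grid` with the smallest GCV score."""
    return min(grid, key=lambda lam: ridge_gcv(x, y, lam))


if __name__ == "__main__":
    x = [-2, -1, 0, 1, 2]
    y = [-2, 0, -1, 2, 1]
    for lam in (0.0, 1.0, 10.0):
        print(lam, ridge_gcv(x, y, lam))
    print("selected:", select_lambda(x, y, [0.0, 1.0, 10.0]))
```

The appeal of GCV is that only the residual sum of squares and the trace of the hat matrix are needed, so no model has to be refitted on held-out subsets.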

Geographically weighted least squares-support vector machine

  • Hwang, Changha;Shim, Jooyong
    • Journal of the Korean Data and Information Science Society / v.28 no.1 / pp.227-235 / 2017
  • When the spatial information of each location is given specifically as coordinates, it is popular to use geographically weighted regression to incorporate the spatial information by assuming that the regression parameters vary spatially across locations. In this paper, we relax the linearity assumption of geographically weighted regression and propose a geographically weighted least squares-support vector machine for estimating the geographically weighted mean by using the basic concept of kernel machines. A generalized cross validation function is derived for model selection. Numerical studies with real datasets have been conducted to compare the performance of the proposed method with other methods for predicting the geographically weighted mean.

Prediction of retention of uncharged solutes in nanofiltration by means of molecular descriptors

  • Nowaczyk, Alicja;Nowaczyk, Jacek;Koter, Stanislaw
    • Membrane and Water Treatment / v.1 no.3 / pp.181-192 / 2010
  • A linear quantitative structure-property relationship (QSPR) model is presented for the prediction of rejection in permeation through a membrane. The model was produced by using the multiple linear regression (MLR) technique on a database consisting of retention data of 25 pesticides in 4 different membrane separation experiments. Among the 3224 different physicochemical, topological and structural descriptors that were considered as inputs to the model, only 50 were selected using several elimination criteria. The physical meaning of the chosen descriptors is discussed in detail. The accuracy of the proposed MLR models is illustrated using the following evaluation techniques: the leave-one-out cross validation procedure, the leave-many-out cross validation procedure and Y-randomization.
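
Leave-one-out cross validation, the first evaluation technique named in this abstract, can be sketched as follows. This is an illustration under simplifying assumptions, not the paper's procedure: it uses a single descriptor rather than multiple linear regression, the toy data are made up, and the function names (`fit_line`, `loo_press`, `q_squared`) are hypothetical. Each observation is left out in turn, the line is refitted on the rest, and the squared prediction errors are summed into PRESS; the cross-validated coefficient is Q^2 = 1 - PRESS / SS_tot.

```python
def fit_line(x, y):
    """Ordinary least squares fit of y = a + b*x; returns (a, b)."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b


def loo_press(x, y):
    """Predicted residual sum of squares from leave-one-out refits."""
    press = 0.0
    for i in range(len(x)):
        # Refit without observation i, then predict the held-out response.
        a, b = fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        press += (y[i] - (a + b * x[i])) ** 2
    return press


def q_squared(x, y):
    """Cross-validated R^2: 1 - PRESS / total sum of squares."""
    my = sum(y) / len(y)
    ss_tot = sum((yi - my) ** 2 for yi in y)
    return 1.0 - loo_press(x, y) / ss_tot


if __name__ == "__main__":
    x = [0, 1, 2, 3, 4]   # made-up descriptor values
    y = [1, 3, 2, 5, 4]   # made-up responses
    print("PRESS:", loo_press(x, y))
    print("Q^2:", q_squared(x, y))
```

A Q^2 much lower than the ordinary R^2 of the full fit suggests that the model depends heavily on individual observations.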