Kang, Jongkyeong;Park, Jaeshin;Bang, Sungwan
- The Korean Journal of Applied Statistics / v.30 no.1 / pp.135-145 / 2017
Principal component analysis (PCA) describes the variation of multivariate data in terms of a set of uncorrelated variables. Since each principal component is a linear combination of all variables and the loadings are typically non-zero, it is difficult to interpret the derived principal components. Sparse principal component analysis (SPCA) is a specialized technique using the elastic net penalty function to produce sparse loadings in principal component analysis. When data are structured by groups of variables, it is desirable to select variables in a grouped manner. In this paper, we propose a new PCA method to improve variable selection performance when variables are grouped, which not only selects important groups but also removes unimportant variables within identified groups. To incorporate group information into model fitting, we consider a hierarchical lasso penalty instead of the elastic net penalty in SPCA. Real data analyses demonstrate the performance and usefulness of the proposed method.
https://doi.org/10.5351/KJAS.2017.30.1.135
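As a rough illustration of why a lasso-type (L1) penalty yields sparse loadings, the sketch below applies the standard soft-thresholding operator to a vector of loadings. It is not the algorithm from the paper: the function names `soft_threshold` and `sparsify_loadings`, the example loadings, and the threshold value are all assumptions chosen for illustration.

```python
def soft_threshold(z: float, lam: float) -> float:
    """Soft-thresholding operator: shrink z toward zero by lam, clipping at zero."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def sparsify_loadings(loadings: list[float], lam: float) -> list[float]:
    """Apply soft-thresholding to each loading (hypothetical helper)."""
    return [soft_threshold(v, lam) for v in loadings]


# Hypothetical dense loadings; small entries are set exactly to zero.
dense = [0.70, -0.05, 0.10, -0.65, 0.02]
sparse = sparsify_loadings(dense, lam=0.15)
print(sparse)
```

Loadings whose magnitude falls below the threshold become exactly zero, which is what makes the resulting component easier to interpret. A group-aware penalty such as the hierarchical lasso mentioned above would additionally shrink whole groups together, which this sketch does not attempt.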

Kwon, Ji Hoon;Ha, Il Do
- The Korean Journal of Applied Statistics / v.34 no.3 / pp.411-425 / 2021
The accelerated failure time (AFT) model represents a linear relationship between the log-survival time and covariates. We are interested in inference on the effects of covariates on the variation of survival times in the AFT model, so we need to model the variance as well as the mean of survival times. We call the resulting model the mean and variance AFT (MV-AFT) model. In this paper, we propose a variable selection procedure for the regression parameters of the mean and variance in the MV-AFT model using a penalized likelihood function. For variable selection, we study four penalty functions: the least absolute shrinkage and selection operator (LASSO), adaptive lasso (ALASSO), smoothly clipped absolute deviation (SCAD), and hierarchical likelihood (HL). This procedure selects important covariates and estimates the regression parameters at the same time. The performance of the proposed method is evaluated using simulation studies, and the method is illustrated with a clinical example dataset.
https://doi.org/10.5351/KJAS.2021.34.3.411
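To make two of the penalty functions named above concrete, the following sketch evaluates the LASSO and SCAD penalties for a single coefficient, using their standard textbook definitions. The function names, the default `a = 3.7`, and the example values are assumptions for illustration; the sketch does not reproduce the paper's penalized likelihood or its ALASSO and HL penalties.

```python
def lasso_penalty(theta: float, lam: float) -> float:
    """LASSO penalty: grows linearly in |theta| without bound."""
    return lam * abs(theta)


def scad_penalty(theta: float, lam: float, a: float = 3.7) -> float:
    """SCAD penalty in its standard form (requires a > 2).

    Matches LASSO near zero, then tapers off and stays constant for
    large |theta|, so large coefficients are not penalized further.
    """
    t = abs(theta)
    if t <= lam:
        return lam * t
    if t <= a * lam:
        return (2 * a * lam * t - t * t - lam * lam) / (2 * (a - 1))
    return lam * lam * (a + 1) / 2


# Compare the two penalties over a few hypothetical coefficient values.
for theta in (0.5, 2.0, 10.0):
    print(theta, lasso_penalty(theta, 1.0), scad_penalty(theta, 1.0))
```

The comparison shows the design difference: both penalties agree for small coefficients, but SCAD flattens out, which reduces the shrinkage applied to large effects.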

Kim, Eunkyung;Jhun, Myoungshic;Bang, Sungwan
- The Korean Journal of Applied Statistics / v.29 no.5 / pp.961-975 / 2016
The hierarchically penalized support vector machine (H-SVM) has been developed to perform simultaneous classification and input variable selection when input variables are naturally grouped or generated by factors. However, the H-SVM may suffer from estimation inefficiency because it applies the same amount of shrinkage to each variable without assessing its relative importance. In addition, when analyzing imbalanced data with uneven class sizes, the classification accuracy of the H-SVM may drop significantly when predicting the minority class because its classifiers are undesirably biased toward the majority class. To remedy these problems, we propose the weighted adaptive H-SVM (WAH-SVM) method, which uses adaptive tuning parameters to improve variable selection performance and class weights to differentiate the misclassification costs of data points between classes. Numerical results are presented to demonstrate the competitive performance of the proposed WAH-SVM over existing SVM methods.
https://doi.org/10.5351/KJAS.2016.29.5.961
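The idea of weighting misclassification differently by class can be sketched with a class-weighted hinge loss. This is only a minimal illustration under stated assumptions: the weighting rule (inverse class frequency), the function names, and the example labels and scores are hypothetical and are not taken from the WAH-SVM paper, and the hierarchical penalty and adaptive tuning parameters are not included.

```python
def class_weights(labels: list[int]) -> dict[int, float]:
    """Assumed weighting rule: n / (2 * class size), so the minority class weighs more."""
    n = len(labels)
    counts = {c: labels.count(c) for c in (-1, 1)}
    return {c: n / (2 * counts[c]) for c in counts}


def weighted_hinge_loss(labels: list[int], scores: list[float]) -> float:
    """Sum of hinge losses max(0, 1 - y * f(x)), each scaled by its class weight."""
    w = class_weights(labels)
    return sum(w[y] * max(0.0, 1.0 - y * f) for y, f in zip(labels, scores))


# Hypothetical imbalanced sample: one positive, three negatives.
labels = [1, -1, -1, -1]
scores = [0.2, -2.0, 0.5, -0.3]  # decision values f(x) from some classifier
print(class_weights(labels))
print(weighted_hinge_loss(labels, scores))
```

With these weights, an error on the single positive example costs three times as much as the same error on a negative example, which counteracts the bias toward the majority class.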