• Title/Summary/Keyword: LIKELIHOOD CROSS-VALIDATION

Semiparametric kernel logistic regression with longitudinal data

  • Shim, Joo-Yong;Seok, Kyung-Ha
    • Journal of the Korean Data and Information Science Society / v.23 no.2 / pp.385-392 / 2012
  • Logistic regression is a well-known binary classification method in statistical learning. Mixed-effect regression models are widely used to analyze correlated data such as those found in longitudinal studies. We consider kernel extensions of logistic regression with semiparametric fixed effects and parametric random effects. Estimation is performed by the penalized likelihood method based on the kernel trick, with a focus on efficient computation and effective hyperparameter selection. Cross-validation techniques are employed to select the optimal hyperparameters. Numerical results are presented to illustrate the performance of the proposed procedure.

Estimation and variable selection in censored regression model with smoothly clipped absolute deviation penalty

  • Shim, Jooyong;Bae, Jongsig;Seok, Kyungha
    • Journal of the Korean Data and Information Science Society / v.27 no.6 / pp.1653-1660 / 2016
  • The smoothly clipped absolute deviation (SCAD) penalty is known to satisfy desirable properties for penalty functions such as unbiasedness, sparsity, and continuity. In this paper, we address regression function estimation and variable selection based on a SCAD-penalized censored regression model. We use the local linear approximation and the iteratively reweighted least squares algorithm to optimize the SCAD-penalized log-likelihood function. The proposed method provides an efficient means of variable selection and regression function estimation. A generalized cross-validation function is presented for model selection. Applications of the proposed method are illustrated with simulated and real examples.
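
For readers unfamiliar with the penalty named in this abstract, the following is a minimal sketch of the standard SCAD penalty and its derivative as defined by Fan and Li. It is not the paper's code: the function names are hypothetical, and `a = 3.7` is the commonly suggested default rather than a value taken from the abstract. In a local linear approximation step, the derivative evaluated at the current estimate serves as the weight on each coefficient's absolute value.

```python
def scad_penalty(theta, lam, a=3.7):
    """SCAD penalty p_lam(|theta|); assumes lam > 0 and a > 2."""
    t = abs(theta)
    if t <= lam:
        return lam * t  # behaves like the L1 penalty near zero
    if t <= a * lam:
        # quadratic transition between the L1 region and the flat region
        return (2 * a * lam * t - t * t - lam * lam) / (2 * (a - 1))
    return lam * lam * (a + 1) / 2  # flat: large coefficients get no extra penalty


def scad_derivative(theta, lam, a=3.7):
    """Derivative p'_lam(|theta|), the per-coefficient weight in an LLA step."""
    t = abs(theta)
    if t <= lam:
        return lam
    return max(a * lam - t, 0.0) / (a - 1)


# Small coefficients are penalized at rate lam; large ones have zero weight.
print(scad_derivative(0.2, 1.0), scad_derivative(10.0, 1.0))
```

The zero weight for large coefficients is what gives SCAD its near-unbiasedness, while the constant weight near zero produces sparsity.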

Variable selection in L1 penalized censored regression

  • Hwang, Chang-Ha;Kim, Mal-Suk;Shim, Joo-Yong
    • Journal of the Korean Data and Information Science Society / v.22 no.5 / pp.951-959 / 2011
  • The proposed method is based on a penalized censored regression model with an L1 penalty. We use the iteratively reweighted least squares procedure to optimize the L1-penalized log-likelihood function of the censored regression model. It provides efficient computation of the regression parameters, including variable selection, and leads to a generalized cross-validation function for model selection. Numerical results are presented to illustrate the performance of the proposed method.

Claims Reserving via Kernel Machine

  • Kim, Mal-Suk;Park, He-Jung;Hwang, Chang-Ha;Shim, Joo-Yong
    • Journal of the Korean Data and Information Science Society / v.19 no.4 / pp.1419-1427 / 2008
  • This paper presents a kernel Poisson regression that can be applied to claims reserving, where the row effect is assumed to be a nonlinear function of the row index. The paper concentrates on the chain-ladder technique, within the framework of the chain-ladder linear model. It is shown that the proposed method can provide better reserve estimates than the Poisson model. A cross-validation function is introduced to choose optimal hyperparameters in the procedure. Experimental results are presented to illustrate the performance of the proposed model.

Kernel Poisson Regression for Longitudinal Data

  • Shim, Joo-Yong;Seok, Kyung-Ha
    • Journal of the Korean Data and Information Science Society / v.19 no.4 / pp.1353-1360 / 2008
  • An estimation procedure is introduced for nonlinear mixed-effect Poisson regression in longitudinal studies, where data from different subjects are independent whereas data from the same subject are correlated. The proposed procedure provides estimates of the mean function of the response variables, where the canonical parameter is related to the input vector in a nonlinear form. A generalized cross-validation function is introduced to choose optimal hyperparameters in the procedure. Experimental results are presented to illustrate the performance of the proposed estimation procedure.

Mapping Landslide Susceptibility Based on Spatial Prediction Modeling Approach and Quality Assessment (공간예측모형에 기반한 산사태 취약성 지도 작성과 품질 평가)

  • Al, Mamun;Park, Hyun-Su;JANG, Dong-Ho
    • Journal of The Geomorphological Association of Korea / v.26 no.3 / pp.53-67 / 2019
  • The purpose of this study is to assess the quality of landslide susceptibility maps for a landslide-prone area (Jinbu-myeon, Gangwon-do, South Korea) produced by a spatial prediction modeling approach and to compare the results obtained. To this end, a landslide inventory map was prepared mainly from historical information and aerial photograph analysis (Daum Map, 2008), together with some field observation. Altogether, 550 landslides were counted in the study area. Of these, 182 were debris flows, and each group of landslides was entered in the inventory map separately. The landslide inventory was then randomly split in Excel: 50% of the landslides were used for model analysis and the remaining 50% for validation. A total of 12 contributing factors (slope, aspect, curvature, topographic wetness index (TWI), elevation, forest type, forest timber diameter, forest crown density, geology, land use, soil depth, and soil drainage) were used in the analysis. To identify the correlation between landslide causative factors and landslide occurrences, pixels were divided into several classes and the frequency ratio of each class was extracted. Six landslide susceptibility maps were then constructed using the Bayesian Predictive Discriminant (BPD), Empirical Likelihood Ratio (ELR), and Linear Regression Method (LRM) models on the different categories of data. Finally, in the cross-validation process, a receiver operating characteristic (ROC) curve was plotted for each landslide susceptibility map, the area under the curve (AUC) was calculated, and a success rate curve was extracted. The Bayesian, likelihood, and linear models achieved accuracies of 85.52%, 85.23%, and 83.49%, respectively, for the total data. For the debris flow category, the results were slightly better than for the total data, with accuracies of 86.33%, 85.53%, and 84.17%. This indicates that all three models are reasonable methods for landslide susceptibility analysis. The models produced reliable predictions for regional spatial planning or land-use planning.
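
The frequency ratio mentioned in this abstract can be illustrated with a short sketch. It uses the usual definition (share of landslide pixels falling in a class divided by that class's share of the study area); the abstract does not spell out the formula, so this definition, the function name, and the pixel counts below are all assumptions made for illustration.

```python
def frequency_ratios(class_pixels, class_landslide_pixels):
    """Frequency ratio per class: landslide share divided by area share."""
    total_pixels = sum(class_pixels.values())
    total_landslides = sum(class_landslide_pixels.values())
    ratios = {}
    for name, n_pixels in class_pixels.items():
        landslide_share = class_landslide_pixels.get(name, 0) / total_landslides
        area_share = n_pixels / total_pixels
        ratios[name] = landslide_share / area_share
    return ratios


# Hypothetical slope classes (degrees) with made-up pixel counts.
slope_pixels = {"0-15": 6000, "15-30": 3000, ">30": 1000}
slope_landslides = {"0-15": 20, "15-30": 50, ">30": 30}
fr = frequency_ratios(slope_pixels, slope_landslides)
for name, value in fr.items():
    print(f"{name}: {value:.2f}")
```

A ratio above 1 means landslides are over-represented in that class relative to its area, and a ratio below 1 means they are under-represented.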

Improvement of Basis-Screening-Based Dynamic Kriging Model Using Penalized Maximum Likelihood Estimation (페널티 적용 최대 우도 평가를 통한 기저 스크리닝 기반 크리깅 모델 개선)

  • Min-Geun Kim;Jaeseung Kim;Jeongwoo Han;Geun-Ho Lee
    • Journal of the Computational Structural Engineering Institute of Korea / v.36 no.6 / pp.391-398 / 2023
  • In this paper, a penalized maximum likelihood estimation (PMLE) method that applies a penalty to increase the accuracy of a basis-screening-based Kriging model (BSKM) is introduced. The maximum order and set of basis functions used in the BSKM are determined according to their importance, with the cross-validation error (CVE) for the basis functions serving as the indicator of importance. When constructing the Kriging model (KM), the maximum order of basis functions is determined, the importance of each basis function is evaluated according to the corresponding maximum order, and finally the optimal set of basis functions is determined. This optimal set is created by adding basis functions one by one in order of importance until the CVE of the KM is minimized. In this process, the KM must be generated repeatedly, and each time the hyper-parameters representing correlations between datasets must be calculated by maximum likelihood estimation. Because the optimal set of basis functions depends on these hyper-parameters, their calculation has a significant impact on the accuracy of the KM. The PMLE method is applied to calculate the hyper-parameters accurately. Applying the method to the Branin-Hoo problem confirmed that the accuracy of a BSKM can be improved.
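
The basis-selection loop described in this abstract can be sketched roughly as follows. This is not the authors' implementation: `select_basis` and the `cv_error` callback are hypothetical names, the callback stands in for refitting the Kriging model and computing its CVE, and stopping at the first non-improvement is an assumed reading of "until the CVE is minimized".

```python
def select_basis(ranked_basis, cv_error):
    """Add basis functions in order of importance until the CVE stops decreasing.

    ranked_basis: candidate basis functions, most important first.
    cv_error: callable that takes the list of chosen functions and returns the CVE
              of the model built with them (here a stand-in for a Kriging refit).
    """
    selected = []
    best_error = cv_error(selected)
    for basis in ranked_basis:
        trial_error = cv_error(selected + [basis])
        if trial_error >= best_error:
            break  # adding this function no longer lowers the CVE
        selected.append(basis)
        best_error = trial_error
    return selected, best_error


# Toy CVE values indexed by the number of basis functions (made up for illustration).
toy_cve = {0: 1.00, 1: 0.60, 2: 0.40, 3: 0.45, 4: 0.30}
ranked = ["x1", "x2", "x1*x2", "x1^2"]
selected, best_error = select_basis(ranked, lambda chosen: toy_cve[len(chosen)])
print(selected, best_error)
```

In this toy run the loop stops after two functions even though four would give a lower error, which shows one trade-off of greedy stopping; scanning every prefix and keeping the best is a simple alternative when refits are cheap.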

New Calibration Methods with Asymmetric Data

  • Kim, Sung-Su
    • The Korean Journal of Applied Statistics / v.23 no.4 / pp.759-765 / 2010
  • In this paper, two new inverse regression methods are introduced: a distance-based method and a likelihood-based method. Whereas the classical and inverse methods fit a model by minimizing the sum of squared prediction errors of the y's and the x's, respectively, the new distance-based method minimizes the sum of both squared prediction errors simultaneously. In the likelihood-based method, we propose an inverse regression with an Arnold-Beaver Skew Normal (ABSN) error distribution. Using cross-validation on an asymmetric real data set, the two new methods and two existing methods are compared on the relative prediction bias (RBP) criterion.
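
The three fitting criteria contrasted in this abstract can be written out for a straight-line calibration model $y = \alpha + \beta x$. The abstract gives no formulas, so the symbols below are assumptions, and the third line in particular is only one plausible reading of "the sum of both squared prediction errors".

```latex
\begin{align*}
\text{classical:} \quad
  & \min_{\alpha,\beta} \sum_{i=1}^{n} \bigl(y_i - \alpha - \beta x_i\bigr)^2 \\
\text{inverse:} \quad
  & \min_{\gamma,\delta} \sum_{i=1}^{n} \bigl(x_i - \gamma - \delta y_i\bigr)^2 \\
\text{distance-based (assumed form):} \quad
  & \min_{\alpha,\beta} \sum_{i=1}^{n}
    \Bigl[ \bigl(y_i - \alpha - \beta x_i\bigr)^2
         + \bigl(x_i - \tfrac{y_i - \alpha}{\beta}\bigr)^2 \Bigr]
\end{align*}
```

Under this reading, the distance-based fit penalizes error in predicting y from x and error in recovering x from y with the same line, rather than fitting each direction separately.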

Analysis of Forest Fire Vulnerable Areas Using GIS Spatial Analysis Techniques (GIS 공간분석기술을 이용한 산불취약지역 분석)

  • 한종규;연영광;지광훈
    • Proceedings of the Korean Association of Geographic Information Studies Conference / 2002.03b / pp.49-59 / 2002
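  • In this study, we developed models for analyzing forest-fire-vulnerable areas in Samcheok City, Gangwon Province, mapped the vulnerable areas on the basis of those models, and developed a computer program for this purpose. The spatial data used were 1:25,000-scale digital topographic maps and digital forest type maps built through the NGIS project, together with the locations of past forest fire ignitions. The models were built on the spatial distribution characteristics (topography, forest type, accessibility) of the ignition locations, and the spatial analysis used the conditional probability and likelihood ratio methods, which are simple and easy for non-specialists to understand. Each model was then cross-validated. For validation, the past ignition location data were split into two groups by time of occurrence, with one group used for prediction and the other for validation. The predictive performance of each model was judged by comparing prediction rate curves. For Samcheok City, the likelihood ratio model outperformed the conditional probability model. If the detailed forest-fire vulnerability maps produced by this technique are used alongside the nationwide forest fire danger index currently forecast by the Korea Forest Service, fire-watch personnel and facilities can be deployed more efficiently, which is expected to further improve the efficiency of forest fire prevention work at the city, county, and township level.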
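
The validation scheme in this abstract (build the map from earlier ignition locations, then check it against later ones) is usually summarized with a prediction rate curve. Below is a minimal sketch of that curve, assuming one susceptibility score per map cell and a count of held-out ignitions per cell; the function names and the toy numbers are hypothetical and do not come from the paper.

```python
def prediction_rate_curve(scores, validation_events):
    """Return (area_fraction, event_fraction) points, highest-scored cells first.

    scores: susceptibility score for each map cell.
    validation_events: number of held-out events in each cell, in the same order.
    Cells with tied scores keep their input order.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    total_events = sum(validation_events)
    curve = [(0.0, 0.0)]
    captured = 0
    for rank, i in enumerate(order, start=1):
        captured += validation_events[i]
        curve.append((rank / len(scores), captured / total_events))
    return curve


def area_under_curve(curve):
    """Trapezoidal area under a curve given as (x, y) points sorted by x."""
    return sum((x1 - x0) * (y0 + y1) / 2 for (x0, y0), (x1, y1) in zip(curve, curve[1:]))


# Toy map with four cells: scores from the earlier period, events from the later one.
scores = [0.9, 0.1, 0.5, 0.7]
later_events = [2, 0, 1, 1]
curve = prediction_rate_curve(scores, later_events)
auc = area_under_curve(curve)
print(curve)
print(auc)
```

A curve that rises steeply at small area fractions means the highest-scored cells capture most of the later events; comparing two models' curves on the same held-out group is one way to rank them, as the abstract describes.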

Extraction of Potential Area for Block Stream and Talus Using Spatial Integration Model (공간통합 모델을 적용한 암괴류 및 애추 지형 분포가능지 추출)

  • Lee, Seong-Ho;JANG, Dong-Ho
    • Journal of The Geomorphological Association of Korea / v.26 no.2 / pp.1-14 / 2019
  • This study analyzed the relationship between block stream and talus distributions by employing a likelihood ratio approach. Possible distribution sites for each debris slope landform were extracted by applying a spatial integration model that combines a fuzzy set model, a Bayesian predictive model, and a logistic regression model. To verify model performance, a success rate curve was prepared by cross-validation. The results showed that elevation, slope, curvature, topographic wetness index, geology, soil drainage, and soil depth were closely related to the debris slope landform sites. In addition, all spatial integration models displayed an accuracy of over 90%. The accuracy of the distribution potential area map of the block stream was highest in the logistic regression model (93.79%). Likewise, the accuracy of the distribution potential area map of the talus was highest in the logistic regression model (97.02%). We expect that the present results will provide essential data and methodologies for more efficient and systematic micro-landform studies. Moreover, our research may help to enhance field research and topographic resource management.