Integrated Search | Korea Science

Additive Regression Models for Censored Data

김철기
품질경영학회지 (Journal of Korean Society for Quality Management), Vol. 24, No. 1, pp. 32-43, 1996
In this paper we develop nonparametric methods for regression analysis when the response variable is subject to censoring that arises naturally in quality engineering. This development is based on a general missing information principle that enables us to apply, via an iterative scheme, nonparametric regression techniques for complete data to iteratively reconstructed data from a given sample with censored observations. In particular, additive regression models are extended to right-censored data. This nonparametric regression method is applied to a simulated data set and the estimated smooth functions provide insights into the relationship between failure time and explanatory variables in the data.

Comparison of tree-based ensemble models for regression

Park, Sangho;Kim, Chanmin
Communications for Statistical Applications and Methods, Vol. 29, No. 5, pp. 561-589, 2022
Tree-based ensemble models, such as random forest (RF) and Bayesian additive regression trees (BART), are produced by combining multiple classification and regression trees. In this study, we compare the model structures and performances of various ensemble models in regression settings. RF learns from bootstrapped samples and selects a splitting variable from a subset of predictors gathered at each node. The BART model is specified as a sum of trees and is fitted using the Bayesian backfitting algorithm. Through extensive simulation studies, we investigate the strengths and drawbacks of the two methods in the presence of missing data, high-dimensional data, or highly correlated data. In the presence of missing data, BART performs well in general, whereas RF provides adequate coverage. BART outperforms RF on high-dimensional, highly correlated data. However, RF has a shorter computation time in all of the scenarios considered. The performance of the two methods is also compared using two real data sets that represent the aforementioned situations, and the same conclusion is reached.
https://doi.org/10.29220/CSAM.2022.29.5.561

Comparison of Regression Models for Estimating Ventilation Rate of Mechanically Ventilated Swine Farm

조광곤;하태환;윤상후;장유나;정민웅
한국농공학회논문집 (Journal of the Korean Society of Agricultural Engineers), Vol. 62, No. 1, pp. 61-70, 2020
To estimate the ventilation volume of mechanically ventilated swine farms, various regression models were applied and their errors compared in order to select the model that best reproduces actual data. Linear regression, linear spline, polynomial regression (degrees 2 and 3), logistic curve, generalized additive model (GAM), and Gompertz curve were compared. Overfitting models were excluded even when their error rate was small. The evaluation criteria were root mean square error (RMSE) and mean absolute percentage error (MAPE). The degree-3 polynomial exhibited the lowest error rate; however, it overestimated the ventilation volume in a certain section. The logistic curve was the most stable and was judged superior to all the other models. Excluding the models with a large error rate and the model that overestimated, the logistic curve gave the smallest estimated ventilation volume among all the models.
https://doi.org/10.5389/KSAE.2020.62.1.061
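
The two evaluation criteria named in this abstract, RMSE and MAPE, can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function names and the sample ventilation values are hypothetical, and MAPE is assumed to be reported in percent with no zero-valued measurements.

```python
import math


def rmse(actual, predicted):
    """Root mean square error between two equal-length sequences."""
    n = len(actual)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)


def mape(actual, predicted):
    """Mean absolute percentage error, in percent.

    Assumes no entry of `actual` is zero (division by the measured value).
    """
    n = len(actual)
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / n


# Hypothetical measured and estimated ventilation volumes (arbitrary units).
measured = [100.0, 200.0, 400.0]
estimated = [110.0, 190.0, 400.0]

print(rmse(measured, estimated))
print(mape(measured, estimated))
```

RMSE penalizes large deviations more heavily, while MAPE is scale-free, which is one reason the two are often reported together.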

Kernel Regression Estimation for Permutation Fixed Design Additive Models

Baek, Jangsun;Wehrly, Thomas E.
Journal of the Korean Statistical Society, Vol. 25, No. 4, pp. 499-514, 1996
Consider an additive regression model of $Y$ on $X = (X_1, X_2, \ldots, X_p)$, $Y = \sum_{j=1}^{p} f_j(X_j) + \varepsilon$, where the $f_j$ are smooth functions to be estimated and $\varepsilon$ is a random error. If the $X_j$ are fixed design points, we call it the fixed design additive model. Since the response variable $Y$ is observed at fixed $p$-dimensional design points, the behavior of the nonparametric regression estimator depends on the design. We propose a fixed design called permutation fixed design, and fit the regression function by the kernel method. The estimator in the permutation fixed design achieves the univariate optimal rate of convergence in mean squared error for any $p \geq 2$.
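
As a rough illustration of what fitting by "the kernel method" means for one component function, the following is a minimal sketch of a univariate Nadaraya-Watson smoother with a Gaussian kernel. It is an assumption chosen for illustration: the abstract does not say which kernel estimator the paper uses, and the permutation fixed design itself is not reproduced here. All names and data are hypothetical.

```python
import math


def gaussian_kernel(u):
    """Standard normal density, used as the smoothing kernel."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def nadaraya_watson(x0, xs, ys, bandwidth):
    """Kernel-weighted average of ys at the point x0.

    Assumes bandwidth > 0 and that at least one design point is close
    enough to x0 for the weights not to underflow to zero.
    """
    weights = [gaussian_kernel((x0 - x) / bandwidth) for x in xs]
    total = sum(weights)
    return sum(w * y for w, y in zip(weights, ys)) / total


# Hypothetical equally spaced fixed design on [0, 1] with a noiseless response.
design = [i / 10.0 for i in range(11)]
response = [x * x for x in design]

print(nadaraya_watson(0.5, design, response, bandwidth=0.1))
```

The bandwidth controls the bias-variance trade-off: smaller values track the data more closely, larger values smooth more.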

Predicting the Young's modulus of frozen sand using machine learning approaches: State-of-the-art review

Reza Sarkhani Benemaran;Mahzad Esmaeili-Falak
Geomechanics and Engineering, Vol. 34, No. 5, pp. 507-527, 2023
Accurate estimation of geo-mechanical parameters in Artificial Ground Freezing (AGF) is an important topic in soil improvement and geotechnical engineering. One approach uses classical constitutive models based on theories such as critical state theory and Hooke's law, which are time-consuming, costly, and cumbersome. Another applies artificial intelligence (AI) techniques to predict the parameters and behaviors of interest accurately. This study presents a comprehensive data-mining-based model for predicting the Young's modulus of frozen sand under the triaxial test. To this end, several single and hybrid models were considered, including additive regression, bagging, M5-Rules, M5P, random forests (RF), support vector regression (SVR), locally weighted linear (LWL), Gaussian process regression (GPR), and multi-layered perceptron neural network (MLP). Cell pressure, strain rate, temperature, time, and strain were the input variables, and the Young's modulus was the target. The results showed that all selected single and hybrid models agree acceptably with the measured experimental results. In particular, the hybrid Additive Regression-Gaussian Process Regression and Bagging-Gaussian Process Regression models have the best accuracy according to the model performance assessment criteria.
https://doi.org/10.12989/gae.2023.34.5.507

An Additive Sparse Penalty for Variable Selection in High-Dimensional Linear Regression Model

Lee, Sangin
Communications for Statistical Applications and Methods, Vol. 22, No. 2, pp. 147-157, 2015
We consider a sparse high-dimensional linear regression model. Penalized methods using LASSO or non-convex penalties have been widely used for variable selection and estimation in high-dimensional regression models. In penalized regression, the selection and prediction performances depend on which penalty function is used. For example, it is known that LASSO has good prediction performance but tends to select more variables than necessary. In this paper, we propose an additive sparse penalty for variable selection using a combination of LASSO and minimax concave penalties (MCP). The proposed penalty is designed to retain the good properties of both LASSO and MCP. We develop an efficient algorithm to compute the proposed estimator by combining a concave-convex procedure and a coordinate descent algorithm. Numerical studies show that the proposed method has better selection and prediction performances than other penalized methods.
https://doi.org/10.5351/CSAM.2015.22.2.147
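
To make the two ingredients of this abstract concrete, the sketch below evaluates the LASSO penalty and the standard MCP on a single coefficient. The `combined_penalty` function is only an assumption: it adds the two penalties with separate tuning parameters as one plausible reading of "additive", and the abstract does not give the paper's exact form. All function and parameter names are hypothetical.

```python
def lasso_penalty(beta, lam):
    """LASSO (L1) penalty on one coefficient: lam * |beta|."""
    return lam * abs(beta)


def mcp_penalty(beta, lam, gamma):
    """Minimax concave penalty on one coefficient (gamma > 0).

    Grows like the LASSO near zero, then flattens to the constant
    gamma * lam**2 / 2 once |beta| exceeds gamma * lam.
    """
    b = abs(beta)
    if b <= gamma * lam:
        return lam * b - b * b / (2.0 * gamma)
    return gamma * lam * lam / 2.0


def combined_penalty(beta, lam_lasso, lam_mcp, gamma):
    """ASSUMPTION: a plain sum of the two penalties, for illustration only."""
    return lasso_penalty(beta, lam_lasso) + mcp_penalty(beta, lam_mcp, gamma)


# Large coefficients keep being penalized by the L1 part, while the MCP part
# has already flattened out.
for beta in (0.5, 3.0, 10.0):
    print(beta, combined_penalty(beta, lam_lasso=0.5, lam_mcp=1.0, gamma=3.0))
```

Because the MCP part is constant for large coefficients, it adds no further shrinkage there, whereas the L1 part keeps shrinking at a constant rate.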

Pure additive contribution of genetic variants to a risk prediction model using propensity score matching: application to type 2 diabetes

Park, Chanwoo;Jiang, Nan;Park, Taesung
Genomics & Informatics, Vol. 17, No. 4, pp. 47.1-47.12, 2019
The achievements of genome-wide association studies have suggested ways to predict diseases, such as type 2 diabetes (T2D), using single-nucleotide polymorphisms (SNPs). Most T2D risk prediction models have used SNPs in combination with demographic variables. However, it is difficult to evaluate the pure additive contribution of genetic variants to classically used demographic models. Since prediction models include some heritable traits, such as body mass index, the contribution of SNPs using unmatched case-control samples may be underestimated. In this article, we propose a method that uses propensity score matching to avoid underestimation by matching case and control samples, thereby determining the pure additive contribution of SNPs. To illustrate the proposed propensity score matching method, we used SNP data from the Korea Association Resources project and reported SNPs from the genome-wide association study catalog. We selected various SNP sets via stepwise logistic regression (SLR), least absolute shrinkage and selection operator (LASSO), and the elastic-net (EN) algorithm. Using these SNP sets, we made predictions using SLR, LASSO, and EN as logistic regression modeling techniques. The accuracy of the predictions was compared in terms of area under the receiver operating characteristic curve (AUC). The contribution of SNPs to T2D was evaluated by the difference in the AUC between models using only demographic variables and models that included the SNPs. The largest difference among our models showed that the AUC of the model using genetic variants with demographic variables could be 0.107 higher than that of the corresponding model using only demographic variables.
https://doi.org/10.5808/GI.2019.17.4.e47
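
The evaluation described in this abstract, comparing the AUC of a demographic-only model with that of a model that also includes SNPs, can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the AUC is computed by direct pairwise comparison of case and control scores, and the labels and risk scores are invented for the example.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen case (label 1) scores
    higher than a randomly chosen control (label 0); ties count as 0.5.
    """
    cases = [s for y, s in zip(labels, scores) if y == 1]
    controls = [s for y, s in zip(labels, scores) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for c in cases:
        for k in controls:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(cases) * len(controls))


# Hypothetical predicted risks from two models on the same four subjects.
labels = [1, 1, 0, 0]
demographic_only = [0.6, 0.4, 0.5, 0.3]
with_snps = [0.8, 0.6, 0.5, 0.3]

gain = auc(labels, with_snps) - auc(labels, demographic_only)
print(gain)
```

The pairwise form is quadratic in the sample size, so it is suitable for illustration only; rank-based formulas give the same value more efficiently.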

Oceanographic indicators for the occurrence of anchovy eggs inferred from generalized additive models

Kim, Jin Yeong;Lee, Jae Bong;Suh, Young-Sang
Fisheries and Aquatic Sciences, Vol. 23, No. 7, pp. 19.1-19.14, 2020
Three generalized additive models were applied to the distribution of anchovy eggs and oceanographic factors to determine the occurrence of anchovy spawning grounds in Korean waters and to identify the indicators of their occurrence using survey data from the spring and summer of 1985, 1995, and 2002. Binomial and Gaussian types of generalized additive models (GAM) and quantile generalized additive models (QGAM) revealed that egg density was influenced mostly by ocean temperature and salinity in spring, and by the vertical structure of temperature, salinity, dissolved oxygen, and zooplankton biomass during summer in the upper quantiles of egg density. The GAM and QGAM model deviance explained 18.5-63.2% of the egg distribution in summer in the East and West Sea. For the principal component analysis-based GAMs, the variance explained by the final regression model was 27.3-67.0%, higher than that of the regular models and QGAMs for egg density in the East and West Sea. By analyzing the distribution of anchovy eggs off the Korean coast, our results revealed the optimal temperature and salinity conditions, in addition to high production and high vertical mixing, as the key indicators of the major spawning grounds of anchovies.
https://doi.org/10.1186/s41240-020-00161-y

A Procedure for Fitting Nonadditive Models

Seo, Han-Son
Communications for Statistical Applications and Methods, Vol. 7, No. 2, pp. 393-401, 2000
Many graphical methods have been suggested for obtaining an impression of curvature in regression problems in which some covariates enter nonlinearly. However, when the true model does not belong to the class of additive models, graphical methods may contain a serious bias. A method is suggested that can avoid such bias in the fitting of nonadditive models.

A Comparative Study on the Performance of Bayesian Partially Linear Models

Woo, Yoonsung;Choi, Taeryon;Kim, Wooseok
Communications for Statistical Applications and Methods, Vol. 19, No. 6, pp. 885-898, 2012
In this paper, we consider Bayesian approaches to partially linear models, in which the regression function is represented in a semiparametric additive form as the sum of a parametric linear regression function and a nonparametric regression function. We carry out an empirical comparative study of the performance of widely used Bayesian partially linear models. Specifically, we deal with three Bayesian methods for estimating the nonparametric regression function: the first uses a Fourier series representation, the second is based on a Gaussian process regression approach, and the third is based on the smoothness of the function and differencing. We compare the numerical performance of the three methods by the root mean squared error (RMSE). For the empirical analysis, we consider synthetic data in simulation studies and a real data application, fitting each with the three Bayesian methods and comparing the RMSEs.
https://doi.org/10.5351/CKSS.2012.19.6.885
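
The "semiparametric additive form" described in this abstract is usually written as below. The notation is an assumption made for illustration ($x_i$ for the covariates entering linearly, $z_i$ for the covariate entering nonparametrically, and the normal error distribution), since the abstract does not fix any symbols:

```latex
% Partially linear model (notation assumed, not taken from the paper):
% parametric linear part + nonparametric part + random error
y_i = x_i^{\top}\beta + f(z_i) + \varepsilon_i,
\qquad \varepsilon_i \sim N(0, \sigma^2),
\qquad i = 1, \ldots, n .
```

Here $\beta$ is the finite-dimensional parameter of the linear part and $f$ is the unknown smooth function that the three Bayesian methods estimate in different ways.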

Search results: 64 items; processing time: 0.028 seconds
