• Title/Summary/Keyword: REGRESSION MODELS

Penalized logistic regression models for determining the discharge of dyspnea patients (호흡곤란 환자 퇴원 결정을 위한 벌점 로지스틱 회귀모형)

  • Park, Cheolyong;Kye, Myo Jin
    • Journal of the Korean Data and Information Science Society / v.24 no.1 / pp.125-133 / 2013
  • In this paper, penalized binary logistic regression models are employed as statistical models for determining the discharge of 668 patients with a chief complaint of dyspnea, based on the results of 11 blood tests. Specifically, the ridge model based on the $L^2$ penalty and the Lasso model based on the $L^1$ penalty are considered. Their prediction accuracy is compared with that of two unpenalized logistic regression models: one with all 11 explanatory variables and one with variables chosen by a variable selection method. The results show that, under 10-fold cross-validation, the ridge logistic regression model has the best prediction accuracy among the four models.

Quantile regression with errors in variables

  • Shim, Jooyong
    • Journal of the Korean Data and Information Science Society / v.25 no.2 / pp.439-446 / 2014
  • Quantile regression models with errors in variables have received a great deal of attention in the social and natural sciences. Some efforts have been devoted to developing effective estimation methods for such quantile regression models. In this paper we propose an orthogonal distance quantile regression model that effectively considers the errors in both the input and response variables. The performance of the proposed method is evaluated through simulation studies.

Nonlinear Regression Quantile Estimators

  • Park, Seung-Hoe;Kim, Hae kyung;Park, Kyung-Ok
    • Journal of the Korean Statistical Society / v.30 no.4 / pp.551-561 / 2001
  • This paper deals with the asymptotic properties for statistical inference on the parameters of nonlinear regression models. The regression quantile method is proposed as an optimality criterion for robust estimators of the regression parameters. The paper defines regression quantile estimators in nonlinear models and provides simple, practical sufficient conditions for the asymptotic normality of the proposed estimators when the parameter space is compact. Under asymmetric error distributions, the proposed estimator compares especially well in efficiency with the least squares and least absolute deviation estimators.

Disequilibrium econometric models and switching regression models (불균형계량경제모형과 교체회귀모형)

  • 이회경
    • The Korean Journal of Applied Statistics / v.2 no.2 / pp.37-45 / 1989
  • Switching regression models are commonly used for the statistical analysis of disequilibrium models. In this paper we show how switching regression models can be classified by the sample separation criterion and how they are related to disequilibrium models. The problems in estimating disequilibrium models are discussed for both known and unknown sample separation.

A Comparative Study on Arrhenius-Type Constitutive Models with Regression Methods

  • Lee, Kyunghoon;Murugesan, Mohanraj;Lee, Seung-Min;Kang, Beom-Soo
    • Transactions of Materials Processing / v.26 no.1 / pp.18-27 / 2017
  • A comparative study was performed on strain-compensated Arrhenius-type constitutive models established with two regression methods: polynomial regression and regression Kriging. For measurements at high temperatures, experimental data for 70Cr3Mo steel were adopted from previous research. An Arrhenius-type constitutive model requires strain compensation of its material constants to account for the effect of strain. To associate the material constants with strain, we first evaluated them at a set of discrete strains, then used surrogate modeling to represent the material constants as a function of strain. As a result, two different flow stress models were formed via the two regression methods. The constructed constitutive models were examined systematically against measured flow stresses by validation methods. The predicted material constants were found to be quite accurate compared to the actual material constants. However, notable mismatches between measured and predicted flow stresses were revealed by the proposed validation techniques, which carry out validation with a single tensile test case rather than with all test cases.

Evaluation of Regression Models with various Criteria and Optimization Methods for Pollutant Load Estimations (다양한 평가 지표와 최적화 기법을 통한 오염부하 산정 회귀 모형 평가)

  • Kim, Jonggun;Lim, Kyoung Jae;Park, Youn Shik
    • Proceedings of the Korea Water Resources Association Conference / 2018.05a / pp.448-448 / 2018
  • In this study, two regression models (Load ESTimator, or LOADEST, and the eight-parameter model) were evaluated for estimating instantaneous pollutant loads under various criteria and optimization methods. The results show that LOADEST, which is commonly used to interpolate pollutant loads, does not necessarily provide the best results with its automatically selected regression model. This suggests that the various regression models in LOADEST need to be considered to find the best solution for the characteristics of the watershed to which they are applied. The recently developed eight-parameter model, integrated with a Genetic Algorithm (GA) and the Gradient Descent Method (GDM), was also compared with LOADEST. The eight-parameter model performed better than LOADEST, but it behaved differently in calibration and validation. The eight-parameter model with GDM could reproduce the nitrogen loads properly outside the calibration period (validation). Furthermore, the accuracy and precision of the model estimates were evaluated using various criteria (e.g., $R^2$ and the gradient and constant of the linear regression line). The results showed higher precision in LOADEST, with $R^2$ values close to 1.0, and better accuracy in the eight-parameter model with GDM, with constants of the linear regression line close to 0.0. Hence, based on these findings, we recommend that users evaluate the regression models under various criteria and calibration methods to obtain more accurate and precise pollutant load estimates.

Application of Statistical Models for Default Probability of Loans in Mortgage Companies

  • Jung, Jin-Whan
    • Communications for Statistical Applications and Methods / v.7 no.2 / pp.605-616 / 2000
  • Three primary interests frequently raised by mortgage companies are introduced and the corresponding statistical approaches for the default probability in mortgage companies are examined. Statistical models considered in this paper are time series, logistic regression, decision tree, neural network, and discrete time models. Usage of the models is illustrated using an artificially modified data set and the corresponding models are evaluated in appropriate manners.

Development of Roundabout Accident Models by Region (지역별 회전교차로 사고모형 개발 및 논의)

  • Son, Seul Ki;Park, Byung Ho
    • International Journal of Highway Engineering / v.20 no.2 / pp.67-74 / 2018
  • PURPOSES: The goal of this study is to develop roundabout accident models for urban and non-urban areas. METHODS: This study performed a comparative analysis of the regional factors affecting accidents. Traffic accident data for the period 2010-2014 were collected from the TAAS data set of the Road Traffic Authority. Poisson and negative binomial regression models were used to develop the roundabout accident models. A total of 25 explanatory variables, such as geometry and traffic volume, were used. RESULTS: The key findings are as follows. First, the null hypothesis that the number of accidents is the same should be rejected. Second, three statistically significant Poisson regression accident models (${\rho}^2$ of 0.154 and 0.385) were developed. Third, although the variable common to the three models (models I-III) is the number of entry lanes, the model-specific variables are entry lane width, roundabout sign, number of circulatory roadways, splitter island, number of exit lanes, exit lane width, number of approach roads, and truck apron. CONCLUSIONS: The results of this study can suggest countermeasures for decreasing the number of roundabout accidents.

Prediction Models of Residual Chlorine in Sediment Basin to Control Pre-chlorination in Water Treatment Plant (정수장 전염소 공정 제어를 위한 침전지 잔류 염소 농도 예측모델 개발)

  • Lee, Kyung-Hyuk;Kim, Ju-Hwan;Lim, Jae-Lim;Chae, Seon Ha
    • Journal of Korean Society of Water and Wastewater / v.21 no.5 / pp.601-607 / 2007
  • To maintain a constant residual chlorine level in a sedimentation basin, it is necessary to develop a real-time prediction model of residual chlorine that considers water treatment plant data such as water quality, weather, and plant operating conditions. Based on operation data acquired from the K water treatment plant, prediction models of residual chlorine in the sedimentation basin were developed. The input parameters applied in the models were water temperature, turbidity, pH, conductivity, flow rate, alkalinity, and pre-chlorination dosage. Linear and non-linear multiple regression models were established with 5,448 data sets. The correlation coefficients (R) for the linear and non-linear models were 0.39 and 0.374, respectively. These low correlation coefficients indicate that the multiple regression models cannot represent the residual chlorine using input parameters that vary independently over time with weather conditions. Artificial neural network models were then applied under three different conditions. Their input parameters consisted of water quality data observed in the water treatment process, arranged in an auto-regressive model structure that considers a time lag. The coefficients of determination of the models in the verification process were 0.742, 0.754, and 0.869, respectively. Consequently, the neural network models can simulate the residual chlorine in the sedimentation basin better than the conventional linear and non-linear multiple regression models in terms of prediction performance. These results are expected to contribute to the automatic control of water treatment processes.

Development of Virtual Metrology Models in Semiconductor Manufacturing Using Genetic Algorithm and Kernel Partial Least Squares Regression (유전알고리즘과 커널 부분최소제곱회귀를 이용한 반도체 공정의 가상계측 모델 개발)

  • Kim, Bo-Keon;Yum, Bong-Jin
    • IE interfaces / v.23 no.3 / pp.229-238 / 2010
  • Virtual metrology (VM), a critical component of semiconductor manufacturing, is an efficient way of assessing the quality of wafers that are not actually measured. It relies on a model relating equipment sensor data (obtained for all wafers) to the quality characteristics of the wafers that are actually measured. This paper considers principal component regression (PCR), partial least squares regression (PLSR), kernel PCR (KPCR), and kernel PLSR (KPLSR) as VM models. For each regression model, two cases are considered: one uses all explanatory variables in developing the model, and the other selects significant variables using the genetic algorithm (GA). The prediction performance of the eight resulting regression models is compared on short- and long-term etch process data. Among other findings, the GA-KPLSR model performs best for both types of data. In particular, its prediction ability is within the requirement for the short-term data, implying that it can be used to implement VM for real etch processes.