• Title/Summary/Keyword: regression method (회귀 방법)

Relative Error Prediction via Penalized Regression (벌점회귀를 통한 상대오차 예측방법)

  • Jeong, Seok-Oh;Lee, Seo-Eun;Shin, Key-Il
    • The Korean Journal of Applied Statistics / v.28 no.6 / pp.1103-1111 / 2015
  • This paper presents a new prediction method based on relative error combined with penalized regression. The proposed method consists of fully data-driven procedures that are fast, simple, and easy to implement. A real data analysis example and simulation results are given to show that the proposed approach works in practice.

Determining the Monitoring Point using Entropy Method and Linear Regression (엔트로피 방법과 선형회귀식을 이용한 모니터링 지점선정)

  • Ryu, seung-hyun;Song, yang-ho;Lee, jung-ho
    • Proceedings of the Korea Contents Association Conference / 2012.05a / pp.111-112 / 2012
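  • Efficient management of a sewer system requires continuous monitoring of runoff, water quality, infiltration/inflow, and CSOs (Combined Sewer Overflows) within the sewers. However, monitoring every point of a single watershed's sewer system is impossible because of budget constraints. Monitoring points should therefore be chosen so that the most useful data can be acquired within the given budget. Nevertheless, research on clear criteria for selecting monitoring points, and on quantitative methods for evaluating the data obtained at the selected points, remains insufficient. This study therefore presents a method that uses the entropy method and a linear regression equation to select monitoring points from which downstream runoff can be predicted from upstream runoff. Verification showed that the proposed regression equation predicts downstream runoff stably. The regression equation derived in this study is expected to make advance prediction of downstream runoff possible.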

A study on the multivariate sliced inverse regression (다변량 분할 역회귀모형에 관한 연구)

  • 이용구;이덕기
    • The Korean Journal of Applied Statistics / v.10 no.2 / pp.293-308 / 1997
  • Sliced inverse regression is a method for reducing the dimension of the explanatory variable X without going through any parametric or nonparametric model fitting process. The method exploits the simplicity of the inverse view of regression; that is, instead of regressing the univariate output variable y against the multivariate X, we regress X against y. In this article, we propose bivariate sliced inverse regression, which regresses the multivariate X against the bivariate output variables $y_1, y_2$. Bivariate sliced inverse regression estimates the e.d.r. directions that satisfy two generalized regression models simultaneously. To apply bivariate sliced inverse regression, we decompose the output variable y into two variables: $\hat{y}$, obtained by projecting y onto the column space of X, and r, obtained by projecting y onto the space orthogonal to the column space of X. We then estimate the e.d.r. directions of the generalized regression model using the two variables simultaneously. As a result, bivariate sliced inverse regression, which considers $\hat{y}$ and r simultaneously, estimates the e.d.r. directions efficiently and stably when the regression model is linear, quadratic, or nonlinear.

Principal selected response reduction in multivariate regression (다변량회귀에서 주선택 반응변수 차원축소)

  • Yoo, Jae Keun
    • The Korean Journal of Applied Statistics / v.34 no.4 / pp.659-669 / 2021
  • Multivariate regression often appears in longitudinal or functional data analysis. Since multivariate regression involves multi-dimensional response variables, it is more strongly affected by the so-called curse of dimensionality than univariate regression. To overcome this issue, Yoo (2018) and Yoo (2019a) proposed three model-based response dimension reduction methodologies. According to various numerical studies in Yoo (2019a), the default method suggested in Yoo (2019a) is the least sensitive to the simulated models, but it is not the best one. To address this issue, the paper proposes a selection algorithm that compares the other two methods with the default one. This approach is called principal selected response reduction. Various simulation studies show that the proposed method provides more accurate estimation results than the default one by Yoo (2019a), confirming the practical and empirical usefulness of the proposed method over the default one.

Bayesian Inference for the Zero Inflated Negative Binomial Regression Model (제로팽창 음이항 회귀모형에 대한 베이지안 추론)

  • Shim, Jung-Suk;Lee, Dong-Hee;Jun, Byoung-Cheol
    • The Korean Journal of Applied Statistics / v.24 no.5 / pp.951-961 / 2011
  • In this paper, we propose Bayesian inference using the Markov Chain Monte Carlo (MCMC) method for the zero inflated negative binomial (ZINB) regression model. The proposed model allows a regression model for the zero inflation probability as well as a regression model for the mean of the dependent variable. This extends the work of Jang et al. (2010) to the fully defined ZINB regression model. In addition, we apply the proposed method to a real data example and compare its efficiency with the zero inflated Poisson (ZIP) model using the DIC. Since the DIC of the ZINB is smaller than that of the ZIP, the ZINB model shows superior performance over the ZIP model for zero inflated count data with overdispersion.

Graphical Method for Multiple Regression Model (다중회귀모형의 그래픽적 방법)

  • Lee, W.R.;Lee, U.K.;Hong, C.S.
    • The Korean Journal of Applied Statistics / v.20 no.1 / pp.195-204 / 2007
  • To represent multiple regression data, an alternative graphical method called the SSR plot is proposed, based on geometrical description methods. The plot uses the relation that the sum of squares for regression (SSR) of two explanatory variables equals the SSR of one variable plus the increase in SSR from adding the other variable to a model that already contains the first. This half-circle-shaped SSR plot contains vectors corresponding to the explanatory variables. We may conclude that explanatory variables whose vectors lie near the horizontal axis do affect the response variable. Also, for a regression model with two explanatory variables, the magnitude of the angle between the two vectors can be used to identify suppression.
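
The SSR relation that the plot relies on can be checked numerically. The sketch below is not the authors' plotting method; it only verifies, on small made-up data, that SSR(x1, x2) = SSR(x1) + SSR(x2 | x1), where the extra sum of squares is computed by regressing y on the part of x2 left over after removing x1. All function names and data values are hypothetical.

```python
def mean(v):
    return sum(v) / len(v)

def simple_fit(x, y):
    """Least-squares intercept and slope of y on a single regressor x."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope

def ssr_simple(x, y):
    """Regression sum of squares of y on x (model with intercept)."""
    b0, b1 = simple_fit(x, y)
    my = mean(y)
    return sum((b0 + b1 * a - my) ** 2 for a in x)

def residuals(x, y):
    """Residuals of y after a simple regression on x."""
    b0, b1 = simple_fit(x, y)
    return [b - (b0 + b1 * a) for a, b in zip(x, y)]

def ssr_two(x1, x2, y):
    """SSR of y on x1 and x2 jointly, from the centered normal equations."""
    m1, m2, my = mean(x1), mean(x2), mean(y)
    c1 = [a - m1 for a in x1]
    c2 = [a - m2 for a in x2]
    cy = [a - my for a in y]
    s11 = sum(a * a for a in c1)
    s22 = sum(a * a for a in c2)
    s12 = sum(a * b for a, b in zip(c1, c2))
    s1y = sum(a * b for a, b in zip(c1, cy))
    s2y = sum(a * b for a, b in zip(c2, cy))
    det = s11 * s22 - s12 ** 2  # nonzero as long as x1 and x2 are not collinear
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    return sum((b1 * a + b2 * b) ** 2 for a, b in zip(c1, c2))

# Made-up illustration data (not from the paper).
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x2 = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0]
y = [3.1, 3.9, 7.2, 7.8, 11.1, 11.9]

ssr_x1 = ssr_simple(x1, y)
# Extra SSR from adding x2 to a model that already contains x1.
ssr_x2_given_x1 = ssr_simple(residuals(x1, x2), y)
ssr_joint = ssr_two(x1, x2, y)

print(ssr_x1, ssr_x2_given_x1, ssr_joint)
```

Swapping the roles of x1 and x2 gives the other sequential decomposition of the same joint SSR.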

Comparison of GEE Estimation Methods for Repeated Binary Data with Time-Varying Covariates on Different Missing Mechanisms (시간-종속적 공변량이 포함된 이분형 반복측정자료의 GEE를 이용한 분석에서 결측 체계에 따른 회귀계수 추정방법 비교)

  • Park, Boram;Jung, Inkyung
    • The Korean Journal of Applied Statistics / v.26 no.5 / pp.697-712 / 2013
  • When analyzing repeated binary data, the generalized estimating equations (GEE) approach produces consistent estimates for regression parameters even if an incorrect working correlation matrix is used. However, for finite samples, the coefficients of time-varying covariates change more across working correlation structures than those of time-invariant covariates. In addition, the GEE approach may give biased estimates under missing at random (MAR). Weighted estimating equations and multiple imputation methods have been proposed to reduce biases in parameter estimates under MAR. This article studies whether the two methods produce robust estimates across various working correlation structures for longitudinal binary data with time-varying covariates under different missing mechanisms. Through simulation, we observe that time-varying covariates have greater differences in parameter estimates across different working correlation structures than time-invariant covariates. The multiple imputation method produces more robust estimates under any working correlation structure and smaller biases compared to the other two methods.

Statistical significance test of polynomial regression equation for Huff's quartile method of design rainfall (설계강우량의 Huff 4분위 방법 다항회귀식에 대한 유의성 검정)

  • Park, Jinhee;Lee, Jaejoon;Lee, Sungho
    • Journal of Korea Water Resources Association / v.51 no.3 / pp.263-272 / 2018
  • For the design of hydraulic structures, the design flood discharge corresponding to a specific frequency is generally obtained from the design storm calculated according to the rainfall-runoff relationship. In the past, empirical equations such as the rational equation were used to calculate the peak flow rate. However, as the duration of rainfall is prolonged, the outflow patterns differ from the actual events, so the accuracy of the temporal distribution of the probability rainfall becomes important. In the present work, Huff's quartile method is used for the temporal distribution of rainfall, and the third quartile is generally used. The regression equation for Huff's quadratic curve applies a sixth-order polynomial equation because of its high accuracy throughout the duration of rainfall. However, in statistical modeling, the regression equation needs to be concise in accordance with the principle of simplicity, and the regression coefficients should be determined based on the statistical significance level. Therefore, in this study, a statistical significance test of the regression equation for Huff's quartile method, which is used as the temporal distribution method of design rainfall, is conducted for 69 rainfall observation stations under the jurisdiction of the Korea Meteorological Administration. The results indicate that the regression equation of Huff's quartile method need only be considered up to a fourth-order polynomial, since the regression coefficients are significant at most of the 69 rainfall observation stations.

Settlement Prediction Accuracy Analysis of Weighted Nonlinear Regression Hyperbolic Method According to the Weighting Method (가중치 부여 방법에 따른 가중 비선형 회귀 쌍곡선법의 침하 예측 정확도 분석)

  • Kwak, Tae-Young;Woo, Sang-Inn;Hong, Seongho;Lee, Ju-Hyung;Baek, Sung-Ha
    • Journal of the Korean Geotechnical Society / v.39 no.4 / pp.45-54 / 2023
  • The settlement prediction during the design phase is primarily conducted using theoretical methods. However, measurement-based settlement prediction methods that predict future settlements based on measured settlement data over time are primarily used during construction due to accuracy issues. Among these methods, the hyperbolic method is commonly used. However, the existing hyperbolic method has accuracy issues and statistical limitations. Therefore, a weighted nonlinear regression hyperbolic method has been proposed. In this study, two weighting methods were applied to the weighted nonlinear regression hyperbolic method to compare and analyze the accuracy of settlement prediction. Measured settlement plate data from two sites located in Busan New Port were used. The settlement of the remaining sections was predicted by setting the regression analysis section to 30%, 50%, and 70% of the total data. Thus, regardless of the weight assignment method, the settlement prediction based on the hyperbolic method demonstrated a remarkable increase in accuracy as the regression analysis section increased. The weighted nonlinear regression hyperbolic method predicted settlement more accurately than the existing linear regression hyperbolic method. In particular, despite a smaller regression analysis section, the weighted nonlinear regression hyperbolic method showed higher settlement prediction performance than the existing linear regression hyperbolic method. Thus, it was confirmed that the weighted nonlinear regression hyperbolic method could predict settlement much faster and more accurately.
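
The evaluation procedure described above (fit on the first part of the record, predict the remainder) can be sketched for the linear regression hyperbolic baseline. This is a minimal sketch under stated assumptions: it uses the commonly cited hyperbolic form s(t) = t / (a + b t), which linearizes to t/s = a + b t, and it does not implement the paper's weighted nonlinear regression, whose weighting schemes the abstract does not specify. The data are synthetic and noise-free, not the Busan New Port measurements, and all names are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least-squares intercept and slope of y on x."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sum(
        (xi - mx) ** 2 for xi in x
    )
    return my - slope * mx, slope

def fit_hyperbolic(times, settlements, fraction):
    """Fit t/s = a + b*t using only the first `fraction` of the record."""
    k = max(2, int(len(times) * fraction))
    t = times[:k]
    s = settlements[:k]
    return fit_line(t, [ti / si for ti, si in zip(t, s)])

def predict(a, b, t):
    """Settlement at elapsed time t under the hyperbolic model."""
    return t / (a + b * t)

# Synthetic record generated from assumed parameters a = 2.0, b = 0.05.
true_a, true_b = 2.0, 0.05
times = [float(t) for t in range(10, 210, 10)]
settlements = [predict(true_a, true_b, t) for t in times]

# Regression section = first 50% of the data; the rest is predicted.
a_hat, b_hat = fit_hyperbolic(times, settlements, 0.5)
k = int(len(times) * 0.5)
errors = [abs(predict(a_hat, b_hat, t) - s) for t, s in zip(times[k:], settlements[k:])]
ultimate = 1.0 / b_hat  # limit of s(t) as t grows

print(a_hat, b_hat, max(errors), ultimate)
```

With real, noisy measurements the fitted parameters would depend on the length of the regression section, which is the effect the study compares at 30%, 50%, and 70%.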

A comparison study of inverse censoring probability weighting in censored regression (중도절단 회귀모형에서 역절단확률가중 방법 간의 비교연구)

  • Shin, Jungmin;Kim, Hyungwoo;Shin, Seung Jun
    • The Korean Journal of Applied Statistics / v.34 no.6 / pp.957-968 / 2021
  • Inverse censoring probability weighting (ICPW) is a popular technique in survival data analysis. In applications of the ICPW technique such as censored regression, it is crucial to estimate the censoring probability accurately. A simulation study is undertaken in this article to see how the censoring probability estimate influences model performance in censored regression using the ICPW scheme. We compare three censoring probability estimators: the Kaplan-Meier (KM) estimator, the Cox proportional hazards model estimator, and the local KM estimator. For the local KM estimator, we propose to reduce the predictor dimension to avoid the curse of dimensionality and consider two popular dimension reduction tools: principal component analysis and sliced inverse regression. Finally, we found that the Cox proportional hazards model estimator shows the best performance as a censoring probability estimator in both mean and median censored regressions.
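
As an illustration of the simplest of the three estimators, the sketch below computes ICPW weights from a Kaplan-Meier estimate of the censoring distribution. It assumes the common weight form w_i = δ_i / Ĝ(T_i−), where Ĝ is the KM estimate of the censoring survival function (censored observations are treated as the "events"), and it assumes no tied observation times. The function name and the data are hypothetical and are not taken from the paper.

```python
def icpw_weights(times, events):
    """ICPW weights delta_i / G(T_i-) with G from Kaplan-Meier on censoring.

    times  : observed times (assumed distinct)
    events : 1 if the event was observed, 0 if the observation was censored
    """
    n = len(times)
    order = sorted(range(n), key=lambda i: times[i])
    weights = [0.0] * n
    at_risk = n
    surv = 1.0  # G just before the current time
    for i in order:
        if events[i] == 1:
            # Uncensored observation: weight by the inverse probability
            # of still being uncensored just before its time.
            weights[i] = 1.0 / surv
        else:
            # Censoring acts as the "event" for G, so G drops here.
            surv *= (at_risk - 1) / at_risk
        at_risk -= 1
    return weights

# Made-up data: five subjects, two of them censored.
times = [2.0, 3.0, 5.0, 7.0, 8.0]
events = [1, 0, 1, 0, 1]
weights = icpw_weights(times, events)
print(weights)
```

Censored observations receive weight zero, and later uncensored observations are up-weighted to stand in for them. These weights would then multiply the loss terms of a mean or median regression fit.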