• Title/Abstract/Keywords: Data regression

Censored Kernel Ridge Regression

  • Shim, Joo-Yong
    • Journal of the Korean Data and Information Science Society / Vol. 16, No. 4 / pp.1045-1052 / 2005
  • This paper deals with the estimation of kernel ridge regression when the responses are subject to random right censoring. The weighted data are formed by redistributing the weights of the censored data to the uncensored data. Kernel ridge regression can then be applied to the weighted data. The hyperparameters of the model, which affect the performance of the proposed procedure, are selected by a generalized approximate cross validation (GACV) function. Experimental results are then presented which indicate the performance of the proposed procedure.
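
The weighting idea in the abstract above can be illustrated with a minimal sketch. It assumes that the redistribution follows the Kaplan-Meier "redistribute to the right" rule and that the kernel is Gaussian; neither choice is confirmed by the abstract, and GACV hyperparameter selection is not shown. The names `km_weights`, `rbf_kernel`, `weighted_krr_fit`, and `weighted_krr_predict` are hypothetical.

```python
import numpy as np

def km_weights(times, events):
    """Assumed scheme: each censored point passes its mass equally to all
    points with larger observed times, then keeps weight 0 itself."""
    times = np.asarray(times, dtype=float)
    events = np.asarray(events, dtype=bool)  # True = uncensored
    n = len(times)
    order = np.argsort(times, kind="stable")
    mass = np.full(n, 1.0 / n)
    for rank, idx in enumerate(order):
        if not events[idx]:
            later = order[rank + 1:]
            if len(later) > 0:
                mass[later] += mass[idx] / len(later)
            mass[idx] = 0.0
    return mass

def rbf_kernel(A, B, gamma):
    """Gaussian kernel matrix between the rows of A and the rows of B."""
    sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-gamma * sq)

def weighted_krr_fit(X, y, w, lam, gamma):
    """Minimize sum_i w_i (y_i - f(x_i))^2 + lam * ||f||^2 with f = K alpha,
    which gives the linear system (W K + lam I) alpha = W y."""
    K = rbf_kernel(X, X, gamma)
    return np.linalg.solve(np.diag(w) @ K + lam * np.eye(len(y)), w * y)

def weighted_krr_predict(X_train, alpha, X_new, gamma):
    return rbf_kernel(X_new, X_train, gamma) @ alpha

# Toy data: the second response is right-censored (event indicator 0).
X = np.array([[0.0], [1.0], [2.0], [3.0]])
times = np.array([1.0, 2.0, 3.0, 4.0])
events = np.array([1, 0, 1, 1])

w = km_weights(times, events)
alpha = weighted_krr_fit(X, times, w, lam=0.1, gamma=0.5)
print(w)
print(weighted_krr_predict(X, alpha, np.array([[1.5]]), gamma=0.5))
```

In this sketch a censored observation ends with weight zero, so its coefficient in `alpha` is zero and it influences the fit only through the extra mass it passes to later observations.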

An application to Multivariate Zero-Inflated Poisson Regression Model

  • Kim, Kyung-Moo
    • Journal of the Korean Data and Information Science Society / Vol. 14, No. 2 / pp.177-186 / 2003
  • The zero-inflated Poisson regression is a model for count data with excess zeros. When correlated response variables are of interest, the univariate zero-inflated regression model has to be extended to a multivariate model. In this paper, we study and simulate the multivariate zero-inflated regression model. The model is applied to a real example. Regression parameters are estimated by maximum likelihood estimation (MLE). We also compare the fit of the multivariate zero-inflated Poisson regression model with that of the decision tree model.

An application to Zero-Inflated Poisson Regression Model

  • Kim, Kyung-Moo
    • Journal of the Korean Data and Information Science Society / Vol. 14, No. 1 / pp.45-53 / 2003
  • The zero-inflated Poisson regression is a model for count data with excess zeros. When the response variables have excess zeros, it is not easy to apply the Poisson regression model. In this paper, we study and simulate the zero-inflated Poisson regression model. The model is applied to a real example. Regression parameters are estimated by maximum likelihood estimation (MLE). We also compare the fit of the zero-inflated Poisson model with that of the Poisson regression and decision tree models.
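
As a minimal sketch of the model named in the abstract above, the following Python code evaluates the zero-inflated Poisson probability mass function and the log-likelihood that maximum likelihood estimation would maximize. It omits covariates, and the names `zip_pmf`, `zip_log_likelihood`, `lam`, and `pi` are illustrative assumptions rather than the paper's notation.

```python
import math

def zip_pmf(y, lam, pi):
    """P(Y = y): with probability pi the count is a structural zero,
    otherwise it is drawn from Poisson(lam)."""
    poisson = math.exp(-lam) * lam ** y / math.factorial(y)
    return pi * (1.0 if y == 0 else 0.0) + (1.0 - pi) * poisson

def zip_log_likelihood(counts, lam, pi):
    """Log-likelihood of independent counts under a common (lam, pi)."""
    return sum(math.log(zip_pmf(y, lam, pi)) for y in counts)

counts = [0, 0, 0, 0, 1, 2, 0, 3, 0, 1]  # toy data with many zeros
print(zip_log_likelihood(counts, lam=1.5, pi=0.4))
print(zip_log_likelihood(counts, lam=0.7, pi=0.0))  # pi = 0 is plain Poisson
```

In a regression setting, `lam` and `pi` would typically depend on covariates through link functions; that step is not shown here.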

Quantile regression with errors in variables

  • Shim, Jooyong
    • Journal of the Korean Data and Information Science Society / Vol. 25, No. 2 / pp.439-446 / 2014
  • Quantile regression models with errors in variables have received a great deal of attention in the social and natural sciences. Some efforts have been devoted to developing effective estimation methods for such quantile regression models. In this paper we propose an orthogonal distance quantile regression model that effectively considers the errors on both input and response variables. The performance of the proposed method is evaluated through simulation studies.
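
For orientation, quantile regression is usually built on the check (pinball) loss. The sketch below shows that standard loss and uses it to recover a sample quantile. It does not implement the orthogonal distance variant proposed in the paper, whose exact formulation is not given in the abstract. The function names are hypothetical.

```python
def check_loss(residual, tau):
    """Check loss rho_tau(u) = u * (tau - 1[u < 0]) for quantile level tau."""
    return residual * (tau - (1.0 if residual < 0 else 0.0))

def empirical_quantile_by_loss(values, tau):
    """Return the sample value that minimizes the total check loss."""
    return min(values, key=lambda q: sum(check_loss(v - q, tau) for v in values))

data = [1.0, 2.0, 3.0, 4.0, 100.0]
# With tau = 0.5 the minimizer is the median, which the outlier does not move.
print(empirical_quantile_by_loss(data, 0.5))
```

Positive and negative residuals are penalized asymmetrically unless `tau` is 0.5, which is what steers the fit toward the chosen quantile.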

TIME SERIES PREDICTION USING INCREMENTAL REGRESSION

  • Kim, Sung-Hyun;Lee, Yong-Mi;Jin, Long;Chai, Duck-Jin;Ryu, Keun-Ho
    • Korean Society of Remote Sensing: Conference Proceedings / Korean Society of Remote Sensing 2006, Proceedings of ISRS 2006 PORSEC Volume II / pp.635-638 / 2006
  • Conventional regression-based prediction techniques in data mining use a model generated in the training step and apply it to new input data without any change. If such a model is applied directly to time series, prediction accuracy decreases. This paper proposes an incremental regression method for time series prediction, such as typhoon track prediction. The technique takes into account that the characteristics of a time series may change over time. It is composed of two steps. The first step executes a fractional process for applying input data to the regression model. The second step updates the model by using its information as new data. Additionally, the model is maintained using only recent data held in a queue. This approach has two advantages. It keeps only the minimum information of the model in a matrix, so space complexity is reduced. Moreover, it prevents the error rate from increasing by updating the model over time. The accuracy of the proposed method is measured by RME (Relative Mean Error) and RMSE (Root Mean Square Error). In the typhoon track prediction experiments, the proposed technique, IMLR (Incremental Multiple Linear Regression), is more efficient than MLR (Multiple Linear Regression) and SVR (Support Vector Regression).
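
One way to read the queue-and-matrix idea in the abstract above is a sliding-window least squares fit that stores only the matrices X^T X and X^T y plus a queue of recent rows. The sketch below is written under that assumption; it is not the paper's IMLR algorithm, it omits the "fractional process" step, and the class name `SlidingWindowLinearRegression` is hypothetical.

```python
from collections import deque
import numpy as np

class SlidingWindowLinearRegression:
    """Least squares over the most recent `window` samples only."""

    def __init__(self, n_features, window):
        d = n_features + 1  # one extra column for the intercept
        self.xtx = np.zeros((d, d))
        self.xty = np.zeros(d)
        self.queue = deque()
        self.window = window

    def update(self, x, y):
        row = np.append(1.0, np.asarray(x, dtype=float))
        self.queue.append((row, y))
        self.xtx += np.outer(row, row)
        self.xty += row * y
        if len(self.queue) > self.window:
            # Forget the oldest sample by subtracting its contribution.
            old_row, old_y = self.queue.popleft()
            self.xtx -= np.outer(old_row, old_row)
            self.xty -= old_row * old_y

    def coefficients(self):
        # lstsq tolerates a singular X^T X while the window is still filling.
        return np.linalg.lstsq(self.xtx, self.xty, rcond=None)[0]

    def predict(self, x):
        row = np.append(1.0, np.asarray(x, dtype=float))
        return float(row @ self.coefficients())

model = SlidingWindowLinearRegression(n_features=1, window=4)
for x in range(10):          # first regime: y = 2x + 1
    model.update([x], 2 * x + 1)
for x in range(10, 14):      # second regime: y = -x + 5
    model.update([x], -x + 5)
print(model.coefficients())  # intercept and slope fitted to the recent window
```

Because old rows are subtracted as they leave the queue, the fitted coefficients follow the most recent regime instead of averaging over the whole history.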

Development of Regression Models Resolving High-Dimensional Data and Multicollinearity Problem for Heavy Rain Damage Data

  • 김정환;박지현;최창현;김형수
    • Journal of the Korean Society of Civil Engineers (대한토목학회논문집) / Vol. 38, No. 6 / pp.801-808 / 2018
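  • Linear regression models are generally fitted stably under the assumptions that the number of observations is sufficiently larger than the number of explanatory variables and that there is no severe multicollinearity among the explanatory variables. In this study, we highlight the difficulty of model fitting when these assumptions are violated by analyzing actual heavy rain damage data, and, to resolve it, we examine aggregating the data and then using a principal component regression model or a ridge regression model. The predictive performance of the proposed models was evaluated on independent data separate from the data used for model fitting, and the proposed methods were confirmed to predict better than the linear regression model.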
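
The difficulty described in the abstract above can be reproduced on synthetic data. The sketch below is an illustration under stated assumptions: the data are simulated rather than the heavy rain damage data, only ridge regression is shown (not principal component regression), and `ridge_fit` and `alpha` are hypothetical names.

```python
import numpy as np

def ridge_fit(X, y, alpha):
    """Closed-form ridge coefficients without intercept:
    (X^T X + alpha I)^{-1} X^T y."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(d), X.T @ y)

rng = np.random.default_rng(0)
n, p = 10, 30                                   # fewer observations than predictors
X = rng.normal(size=(n, p))
X[:, 1] = X[:, 0] + 1e-6 * rng.normal(size=n)   # nearly duplicated column
y = X[:, 0] + 0.1 * rng.normal(size=n)

# The rank of X^T X is at most n, so with p > n it cannot be inverted
# and ordinary least squares has no unique solution.
print(np.linalg.matrix_rank(X.T @ X))

beta = ridge_fit(X, y, alpha=1.0)
print(beta[:2])  # the two nearly identical columns receive nearly equal weight
```

Adding `alpha` to the diagonal makes the system solvable and spreads the weight across collinear columns rather than letting the estimates blow up.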

Efficiency of Aggregate Data in Non-linear Regression

  • Huh, Jib
    • Communications for Statistical Applications and Methods / Vol. 8, No. 2 / pp.327-336 / 2001
  • This work concerns estimating a regression function, which is not linear, using aggregate data. In much empirical research, data are aggregated for various reasons before statistical analysis. In a traditional parametric approach, a linear estimation of the non-linear function with aggregate data can result in unstable estimators of the parameters. A more serious consequence is the bias in the estimation of the non-linear function. The approach we employ is kernel regression smoothing. We describe the conditions under which the aggregate data can be used to estimate the regression function efficiently. Numerical examples illustrate our findings.

Modelling Online Word-of-Mouth Effect on Korean Box-Office Sales Based on Kernel Regression Model

  • Park, Si-Yun;Kim, Jin-Gyo
    • Journal of the Korean Data and Information Science Society / Vol. 18, No. 4 / pp.995-1004 / 2007
  • In this paper, we analyse online word-of-mouth and Korean box-office sales data using a kernel regression method. To do this, we consider a regression model with mixed data and apply the least squares cross-validation method proposed by Li and Racine (2004) to the model. We found that box-office sales can be explained by the volume of online word-of-mouth and the characteristics of the movies.

Parameter Estimation and Prediction for NHPP Software Reliability Model and Time Series Regression in Software Failure Data

  • Song, Kwang-Yoon;Chang, In-Hong
    • Journal of Integrative Natural Science (통합자연과학논문집) / Vol. 7, No. 1 / pp.67-73 / 2014
  • We consider the mean value function for the NHPP software reliability model and a time series regression model for software failure data. We estimate the parameters of the proposed models from two data sets. The values of SSE and MSE are presented for the two data sets. We compare the predicted number of faults with the actual values in the two data sets using the mean value function and the regression curve.

A Study on the Several Robust Regression Estimators

  • Kim, Jee-Yun;Roh, Kyung-Mi;Hwang, Jin-Soo
    • Journal of the Korean Data and Information Science Society / Vol. 15, No. 2 / pp.307-316 / 2004
  • Principal Component Regression (PCR) and Partial Least Squares Regression (PLSR) are the two most popular regression techniques in chemometrics. In chemometrics, the number of regressor variables usually greatly exceeds the number of observations, so the number of regressors has to be reduced to avoid the identifiability problem. In this paper we compare PCR and PLSR techniques combined with various robust regression methods, including regression depth estimation. We compare the efficiency, goodness-of-fit and robustness of each estimator under several contamination schemes.
