• Title/Summary/Keyword: Least-squares Regression

Search Results: 567

A new classification method using penalized partial least squares (벌점 부분최소자승법을 이용한 분류방법)

  • Kim, Yun-Dae;Jun, Chi-Hyuck;Lee, Hye-Seon
    • Journal of the Korean Data and Information Science Society
    • /
    • v.22 no.5
    • /
    • pp.931-940
    • /
    • 2011
  • Classification generates a rule for assigning objects to one of several categories based on a learning sample. A good classification model should classify new objects with a low misclassification error. Many classification methods have been developed, including logistic regression, discriminant analysis, and classification trees. This paper presents a new classification method using penalized partial least squares, which makes the model more robust and remedies the multicollinearity problem. The proposed method is compared with logistic regression and PCA-based discriminant analysis on real and artificial data, and it is concluded that the new method performs better than the other methods.
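
As a rough illustration of classification via partial least squares, the sketch below fits a PLS-DA-style classifier on dummy-coded labels with scikit-learn. It is an assumed simplification for orientation only; the paper's penalty term is not implemented.

```python
# Minimal PLS-DA sketch: regress dummy-coded class labels on X with PLS and
# classify by the largest predicted score. The penalized PLS of the paper is
# NOT implemented here; function name and component count are assumptions.
import numpy as np
from sklearn.cross_decomposition import PLSRegression
from sklearn.preprocessing import LabelBinarizer

def pls_da_fit_predict(X_train, y_train, X_test, n_components=2):
    lb = LabelBinarizer()
    Y = lb.fit_transform(y_train)            # one column per class (one for binary)
    pls = PLSRegression(n_components=n_components)
    pls.fit(X_train, Y)
    scores = pls.predict(X_test)             # continuous scores per class
    if scores.shape[1] == 1:                 # binary case: threshold at 0.5
        idx = (scores.ravel() > 0.5).astype(int)
    else:                                    # multiclass: pick largest score
        idx = scores.argmax(axis=1)
    return lb.classes_[idx]
```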

Short-Term Wind Speed Forecast Based on Least Squares Support Vector Machine

  • Wang, Yanling;Zhou, Xing;Liang, Likai;Zhang, Mingjun;Zhang, Qiang;Niu, Zhiqiang
    • Journal of Information Processing Systems
    • /
    • v.14 no.6
    • /
    • pp.1385-1397
    • /
    • 2018
  • Many factors affect wind speed, and its randomness leads to low prediction accuracy. To address this, this paper constructs a short-term forecasting model based on the least squares support vector machine (LSSVM). The model builds on support vector regression (SVR), which estimates the regression relationship between historical wind speed data and the values to be forecast. To improve forecast precision, the historical data are first clustered so that records whose trend is similar to the forecast period can be selected; these filtered records are used as training samples for the SVR, and its parameters are optimized by particle swarm optimization (PSO). The forecasting model is tested on actual data, and its forecast precision exceeds industry standards, demonstrating the feasibility and reliability of the model.
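
For reference, a least squares SVM regression of the kind underlying the model above can be fitted by solving a single linear system. The sketch below assumes an RBF kernel and illustrative hyperparameter values; the clustering of historical data and the PSO tuning described in the abstract are not reproduced.

```python
# Minimal LSSVM regression: solve  [[0, 1^T], [1, K + I/gamma]] [b; alpha] = [0; y].
# gamma (regularization) and sigma (RBF width) are illustrative values only.
import numpy as np

def rbf_kernel(A, B, sigma=1.0):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * sigma ** 2))

def lssvm_fit(X, y, gamma=10.0, sigma=1.0):
    n = X.shape[0]
    K = rbf_kernel(X, X, sigma)
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = 1.0
    A[1:, 0] = 1.0
    A[1:, 1:] = K + np.eye(n) / gamma
    rhs = np.concatenate(([0.0], y))
    sol = np.linalg.solve(A, rhs)
    return sol[1:], sol[0]                    # alpha, bias b

def lssvm_predict(X_train, alpha, b, X_new, sigma=1.0):
    return rbf_kernel(X_new, X_train, sigma) @ alpha + b
```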

The Identification Of Multiple Outliers

  • Park, Jin-Pyo
    • Journal of the Korean Data and Information Science Society
    • /
    • v.11 no.2
    • /
    • pp.201-215
    • /
    • 2000
  • The classical method for regression analysis is least squares. However, if the data contain significant outliers, the least squares estimator can break down. To remedy this problem, robust methods are an important complement to least squares. Robust methods downweight or completely ignore the outliers, but this is not always best, because outliers can contain very important information about the population. If they can be detected, outliers can be inspected further and appropriate action taken. In this paper, I propose a sequential outlier test to identify outliers. It is based on a nonrobust estimate and a robust estimate of the scatter of robust regression residuals and is applied in a forward procedure, removing the most extreme observation at each step, until the test fails to detect outliers. Unlike other forward procedures, the present one is unaffected by swamping or masking effects because the statistic is based on robust regression residuals. I derive the asymptotic distribution of the test statistic and apply the test to several real and simulated data sets, where it is shown to perform fairly well.
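
A forward outlier-screening loop of the general kind described above might look like the following sketch. The robust fit (scikit-learn's HuberRegressor) and the MAD-based cutoff rule are stand-ins chosen for illustration, not the paper's test statistic.

```python
# Forward outlier screening on robust regression residuals: repeatedly drop the
# observation with the largest robust standardized residual until none exceeds
# the cutoff. The cutoff rule (robust z-score > 3 via the MAD) is illustrative.
import numpy as np
from sklearn.linear_model import HuberRegressor

def forward_outlier_screen(X, y, cutoff=3.0):
    keep = np.arange(len(y))
    while len(keep) > X.shape[1] + 1:
        model = HuberRegressor().fit(X[keep], y[keep])
        resid = y[keep] - model.predict(X[keep])
        mad = max(np.median(np.abs(resid - np.median(resid))) * 1.4826, 1e-12)
        z = np.abs(resid) / mad
        worst = np.argmax(z)
        if z[worst] <= cutoff:                # no remaining residual is extreme
            break
        keep = np.delete(keep, worst)         # remove the most extreme point
    return keep                               # indices of retained observations
```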


Kernel Ridge Regression with Randomly Right Censored Data

  • Shim, Joo-Yong;Seok, Kyung-Ha
    • Communications for Statistical Applications and Methods
    • /
    • v.15 no.2
    • /
    • pp.205-211
    • /
    • 2008
  • This paper deals with the estimation of kernel ridge regression when the responses are subject to random right censoring. An iteratively reweighted least squares (IRWLS) procedure is employed to treat censored observations. The hyperparameters of the model, which affect the performance of the proposed procedure, are selected by a generalized cross-validation (GCV) function. Experimental results indicating the performance of the proposed procedure are presented.
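
For context, a standard GCV criterion for a kernel ridge smoother can be computed as below; the RBF kernel width and the candidate grid are illustrative assumptions, and the IRWLS treatment of censored responses is not reproduced.

```python
# GCV for kernel ridge regression: GCV(lam) = n * ||(I - H) y||^2 / tr(I - H)^2,
# where H = K (K + lam I)^{-1} is the hat matrix. Kernel width sigma and the
# lambda grid are illustrative choices.
import numpy as np

def rbf_kernel(X, sigma=1.0):
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * sigma ** 2))

def gcv_select_lambda(K, y, lambdas):
    n = len(y)
    best_lam, best_gcv = None, np.inf
    for lam in lambdas:
        H = K @ np.linalg.inv(K + lam * np.eye(n))
        resid = y - H @ y
        gcv = n * (resid @ resid) / np.trace(np.eye(n) - H) ** 2
        if gcv < best_gcv:
            best_gcv, best_lam = gcv, lam
    return best_lam

# usage sketch: lam = gcv_select_lambda(rbf_kernel(X), y, np.logspace(-3, 2, 20))
```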

Multiple Structural Change-Point Estimation in Linear Regression Models

  • Kim, Jae-Hee
    • Communications for Statistical Applications and Methods
    • /
    • v.19 no.3
    • /
    • pp.423-432
    • /
    • 2012
  • This paper is concerned with the detection of multiple change-points in linear regression models. The proposed procedure relies on local estimation for global change-point estimation. We propose a multiple change-point estimator based on local least squares estimators of the regression coefficients and a split measure when the number of change-points is unknown. Its statistical properties are derived, and its performance is assessed by simulations and real data applications.
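
To illustrate the local least squares idea for a single change-point, one can scan candidate split points and minimize the combined residual sum of squares, as in this assumed sketch; the paper's split measure and its extension to an unknown number of change-points are not reproduced.

```python
# Single change-point scan for linear regression: fit least squares separately
# on the two segments at each candidate split and keep the split with the
# smallest combined RSS. min_seg (minimum segment length) is an assumption.
import numpy as np

def rss(X, y):
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    r = y - X @ beta
    return r @ r

def best_split(X, y, min_seg=5):
    n = len(y)
    candidates = range(min_seg, n - min_seg)
    return min(candidates, key=lambda k: rss(X[:k], y[:k]) + rss(X[k:], y[k:]))
```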

Biased-Recovering Algorithm to Solve a Highly Correlated Data System (상관관계가 강한 독립변수들을 포함한 데이터 시스템 분석을 위한 편차 - 복구 알고리듬)

  • 이미영
    • Journal of the Korean Operations Research and Management Science Society
    • /
    • v.28 no.3
    • /
    • pp.61-66
    • /
    • 2003
  • In many multiple regression analyses, the multicollinearity problem arises because some independent variables are highly correlated with one another. In practice, ridge regression is often adopted to deal with the problems resulting from multicollinearity. We propose a better alternative: an iterative method that obtains an exact least squares estimator. We prove the solvability of the proposed algorithm mathematically and then compare our method with the traditional one.
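
The paper's biased-recovering algorithm is not spelled out in the abstract. As a loosely related illustration only, the generic iterated-ridge refinement below starts from a ridge estimate and converges to the exact least squares solution while inverting only the better-conditioned matrix X'X + cI; it is not presented as the paper's method.

```python
# Iterated ridge refinement: beta_{k+1} = (X'X + cI)^{-1} (X'y + c * beta_k).
# The fixed point satisfies X'X beta = X'y (the exact least squares solution),
# but each step only solves with the regularized matrix X'X + cI. The constant
# c and iteration count are illustrative assumptions.
import numpy as np

def iterated_ridge(X, y, c=1.0, n_iter=200):
    XtX, Xty = X.T @ X, X.T @ y
    A = XtX + c * np.eye(X.shape[1])
    beta = np.linalg.solve(A, Xty)            # plain ridge start
    for _ in range(n_iter):
        beta = np.linalg.solve(A, Xty + c * beta)
    return beta
```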

The Bias of the Least Squares Estimator of Variance, the Autocorrelation of the Regressor Matrix, and the Autocorrelation of Disturbances

  • Jeong, Ki-Jun
    • Journal of the Korean Statistical Society
    • /
    • v.12 no.2
    • /
    • pp.81-90
    • /
    • 1983
  • The least squares estimator of the disturbance variance in a regression model is biased under serial correlation. Under an AR(1) assumption, Theil (1971) crudely related the bias to the autocorrelation of the disturbances and the autocorrelation of the explanatory variable for a simple regression. In this paper we derive, with improved precision, a relation between the bias, the autocorrelation of the disturbances, and the autocorrelation of the explanatory variables for a multiple regression.
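
The bias can be illustrated numerically: with disturbance covariance sigma^2 * Omega and residual-maker M = I - X(X'X)^{-1}X', the expectation of the usual variance estimator s^2 is sigma^2 * tr(M Omega) / (n - k). The regressor, sample size, and AR(1) coefficient below are arbitrary illustrative choices.

```python
# Numeric illustration of the bias of s^2 under AR(1) disturbances.
import numpy as np

rho, n, sigma2 = 0.7, 50, 1.0
X = np.column_stack([np.ones(n), np.arange(n, dtype=float)])   # toy regressor
k = X.shape[1]

idx = np.arange(n)
Omega = rho ** np.abs(idx[:, None] - idx[None, :])   # AR(1): Omega_ij = rho^|i-j|

M = np.eye(n) - X @ np.linalg.inv(X.T @ X) @ X.T
expected_s2 = sigma2 * np.trace(M @ Omega) / (n - k)
print(f"E[s^2] = {expected_s2:.3f}  versus true sigma^2 = {sigma2:.3f}")
```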


Support vector expectile regression using IRWLS procedure

  • Choi, Kook-Lyeol;Shim, Jooyong;Seok, Kyungha
    • Journal of the Korean Data and Information Science Society
    • /
    • v.25 no.4
    • /
    • pp.931-939
    • /
    • 2014
  • In this paper we propose an iteratively reweighted least squares procedure to solve the quadratic programming problem of support vector expectile regression with an asymmetrically weighted squares loss function. The proposed procedure enables us to select appropriate hyperparameters easily by using a generalized cross-validation function. Through numerical studies on artificial and real data sets, we show the effectiveness of the proposed method in terms of estimation performance.
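
The asymmetric reweighting idea can be sketched for plain linear expectile regression as below: observations with nonnegative residuals get weight tau, the rest 1 - tau, and a weighted least squares fit is repeated until the coefficients stabilize. The kernelized support vector formulation and GCV-based hyperparameter selection of the paper are not reproduced, and the stopping rule is an assumption.

```python
# IRWLS for linear expectile regression with the asymmetric squared loss.
import numpy as np

def expectile_irwls(X, y, tau=0.5, n_iter=100, tol=1e-8):
    Xd = np.column_stack([np.ones(len(y)), X])        # add intercept
    beta = np.linalg.lstsq(Xd, y, rcond=None)[0]      # OLS start
    for _ in range(n_iter):
        resid = y - Xd @ beta
        w = np.where(resid >= 0, tau, 1.0 - tau)      # asymmetric weights
        W = Xd * w[:, None]
        beta_new = np.linalg.solve(Xd.T @ W, W.T @ y) # weighted least squares
        if np.max(np.abs(beta_new - beta)) < tol:
            beta = beta_new
            break
        beta = beta_new
    return beta
```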

A Method for Screening Product Design Variables for Building A Usability Model : Genetic Algorithm Approach (사용편의성 모델수립을 위한 제품 설계 변수의 선별방법 : 유전자 알고리즘 접근방법)

  • Yang, Hui-Cheol;Han, Seong-Ho
    • Journal of the Ergonomics Society of Korea
    • /
    • v.20 no.1
    • /
    • pp.45-62
    • /
    • 2001
  • This study suggests a genetic algorithm-based partial least squares (GA-based PLS) method for selecting the design variables used to build a usability model. The GA-based PLS uses a genetic algorithm to minimize the root-mean-squared error of a partial least squares regression model. A multiple linear regression method is then applied to build a usability model containing the variables selected by the GA-based PLS. The performance of this usability model turned out to be generally better than that of previous usability models built with other variable selection methods such as expert rating, principal component analysis, cluster analysis, and partial least squares. Furthermore, model performance improved drastically when the usability model was supplemented with the category-type variables selected by the GA-based PLS. It is recommended that GA-based PLS be applied to variable selection when developing usability models.
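
A compact genetic-algorithm variable-selection loop of this general flavor is sketched below, with fitness taken as the cross-validated RMSE of a scikit-learn PLS regression on the selected columns; the population size, mutation rate, and selection scheme are illustrative assumptions rather than the paper's settings.

```python
# GA over binary column masks; fitness = cross-validated RMSE of PLS regression.
import numpy as np
from sklearn.cross_decomposition import PLSRegression
from sklearn.model_selection import cross_val_score

def rmse_of_mask(mask, X, y, n_components=2):
    if mask.sum() < n_components:                 # too few variables to fit PLS
        return np.inf
    pls = PLSRegression(n_components=n_components)
    scores = cross_val_score(pls, X[:, mask.astype(bool)], y,
                             scoring="neg_root_mean_squared_error", cv=5)
    return -scores.mean()

def ga_select(X, y, pop_size=30, n_gen=40, p_mut=0.05, seed=0):
    rng = np.random.default_rng(seed)
    n_var = X.shape[1]
    pop = rng.integers(0, 2, size=(pop_size, n_var))
    for _ in range(n_gen):
        fit = np.array([rmse_of_mask(ind, X, y) for ind in pop])
        parents = pop[np.argsort(fit)[: pop_size // 2]]      # truncation selection
        children = []
        for _ in range(pop_size - len(parents)):
            a, b = parents[rng.integers(len(parents), size=2)]
            cut = rng.integers(1, n_var)
            child = np.concatenate([a[:cut], b[cut:]])        # one-point crossover
            flip = rng.random(n_var) < p_mut                  # bit-flip mutation
            children.append(np.where(flip, 1 - child, child))
        pop = np.vstack([parents, np.array(children)])
    fit = np.array([rmse_of_mask(ind, X, y) for ind in pop])
    return pop[np.argmin(fit)].astype(bool)                   # best mask found
```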


Preference Map using Weighted Regression

  • S.Y. Hwang;Jung, Su-Jin;Kim, Young-Won
    • Communications for Statistical Applications and Methods
    • /
    • v.8 no.3
    • /
    • pp.651-659
    • /
    • 2001
  • A preference map is a widely used graphical method for preference data sets, which are frequently encountered in marketing research. It provides a joint configuration, usually in a two-dimensional space, of "products" and their "attributes". Whereas the classical preference map adopts the ordinary least squares method in deriving the map, the present article suggests a weighted least squares approach that provides better graphical display and interpretation than the classical one. Internet search engine data from Korea are analyzed for illustration.
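
For reference, weighted least squares coefficients solve (X'WX) b = X'W y for a diagonal weight matrix W. The sketch below treats the weights as given, since the paper's choice of weights for the preference map is not specified in the abstract.

```python
# Weighted least squares with a diagonal weight matrix built from vector w.
import numpy as np

def wls(X, y, w):
    Xd = np.column_stack([np.ones(len(y)), X])   # add intercept
    XtW = Xd.T * w                               # X' W  (W diagonal with entries w)
    return np.linalg.solve(XtW @ Xd, XtW @ y)    # solve (X'WX) b = X'W y
```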
