• Title/Summary/Keyword: least-squares regression analysis


The Identification Of Multiple Outliers

  • Park, Jin-Pyo
    • Journal of the Korean Data and Information Science Society / v.11 no.2 / pp.201-215 / 2000
  • The classical method for regression analysis is the least squares method. However, if the data contain significant outliers, the least squares estimator can break down. To remedy this problem, robust methods are an important complement to the least squares method. Robust methods downweight or completely ignore the outliers. This is not always best, because the outliers can contain very important information about the population. If they can be detected, the outliers can be inspected further and appropriate action can be taken based on the results. In this paper, I propose a sequential outlier test to identify outliers. It is based on the nonrobust and robust estimates of scatter of the robust regression residuals and is applied in a forward procedure, removing the most extreme observation at each step until the test fails to detect outliers. Unlike other forward procedures, the present one is unaffected by swamping or masking effects because the statistic is based on the robust regression residuals. I present the asymptotic distribution of the test statistic and apply the test to several real and simulated data sets, where it performs fairly well.

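The forward procedure described in this abstract can be illustrated with a minimal sketch. It is not the author's test: the robust fit is swapped for a simple Theil-Sen line, the statistic is taken to be the ratio of the ordinary standard deviation to a MAD-based scale of the robust residuals, and the `cutoff` value is a hypothetical placeholder rather than a critical value from the asymptotic distribution.

```python
import numpy as np

def theil_sen(x, y):
    """Robust line fit: median of pairwise slopes, then median intercept."""
    i, j = np.triu_indices(len(x), k=1)
    slope = np.median((y[j] - y[i]) / (x[j] - x[i]))
    intercept = np.median(y - slope * x)
    return slope, intercept

def forward_outlier_search(x, y, cutoff=2.0, max_remove=None):
    """Drop the most extreme point while SD / robust scale exceeds `cutoff`.

    `cutoff` is an illustrative assumption, not a derived critical value.
    """
    keep = np.arange(len(x))
    removed = []
    if max_remove is None:
        max_remove = len(x) // 2
    while len(removed) < max_remove:
        slope, intercept = theil_sen(x[keep], y[keep])
        r = y[keep] - (intercept + slope * x[keep])      # robust residuals
        robust_scale = 1.4826 * np.median(np.abs(r - np.median(r)))  # MAD
        if robust_scale == 0 or np.std(r) / robust_scale <= cutoff:
            break                                        # test no longer rejects
        worst = int(np.argmax(np.abs(r)))
        removed.append(int(keep[worst]))
        keep = np.delete(keep, worst)
    return sorted(removed)

# Toy data (hypothetical): a line with small deterministic noise
x = np.arange(20, dtype=float)
y = 2.0 * x + 1.0 + 0.1 * np.sin(1.7 * x)
y[[3, 10, 17]] += 20.0                                   # three planted outliers
flagged = forward_outlier_search(x, y)
print(flagged)
```

Because the residuals come from a robust fit that is refitted after each removal, the planted points stay extreme at every step, which is the property the abstract relies on to avoid masking.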

Switching Regression Analysis via Fuzzy LS-SVM

  • Hwang, Chang-Ha
    • Journal of the Korean Data and Information Science Society / v.17 no.2 / pp.609-617 / 2006
  • A new fuzzy c-regression algorithm for switching regression analysis is presented, which combines fuzzy c-means clustering and the least squares support vector machine. This algorithm can detect outliers in switching regression models while simultaneously yielding estimates of the associated parameters and a fuzzy c-partition of the data. It can be employed for model-free nonlinear regression, which does not assume the underlying form of the regression function. We illustrate the new approach with some numerical examples that show how it can be used to fit switching regression models to almost all types of mixed data.


A Study on the Improvement of the Accuracy for the Least-Squares Method Using Orthogonal Function (직교함수를 이용한 최소자승법의 정밀도 향상에 관한 연구)

  • Cho, Won Cheol;Lee, Jae Joon
    • KSCE Journal of Civil and Environmental Engineering Research / v.6 no.4 / pp.43-52 / 1986
  • With the increasing use of computers, the least squares method is now widely used in the regression analysis of various data. Unreliable regression coefficients caused by floating-point arithmetic, and other problems of the ordinary least squares method, are described in detail. To address these problems, a least squares method using orthogonal functions is developed. The methods are compared and analyzed through a numerical test example, and a re-orthogonalization method is used to increase the accuracy. As an example of application, the optimum order of the AR process for the time series of monthly flow at the Pyungchang station is determined using Akaike's FPE (Final Prediction Error), which selects the optimum order of an AR process. The result shows that the AR(2) process is optimal for the series at the station.

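The numerical point made in this abstract can be sketched as follows. The example uses a QR factorization as a stand-in for the paper's orthogonal-function approach (an assumption; the paper's exact construction is not given here) and compares it with the normal equations on a small, hypothetical polynomial fit.

```python
import numpy as np

def lstsq_normal_equations(A, y):
    """Solve min ||A b - y|| through the normal equations (A^T A) b = A^T y."""
    return np.linalg.solve(A.T @ A, A.T @ y)

def lstsq_orthogonal(A, y):
    """Solve the same problem through A = Q R, with orthonormal columns in Q."""
    Q, R = np.linalg.qr(A)              # reduced QR: Q is n x p, R is p x p
    return np.linalg.solve(R, Q.T @ y)

# Hypothetical test problem: cubic polynomial sampled without noise
x = np.linspace(0.0, 1.0, 20)
A = np.vander(x, N=4, increasing=True)  # columns 1, x, x^2, x^3
true_coef = np.array([1.0, -2.0, 0.5, 3.0])
y = A @ true_coef

coef_qr = lstsq_orthogonal(A, y)
coef_ne = lstsq_normal_equations(A, y)

# Forming A^T A squares the condition number, which is where accuracy is lost
cond_A = np.linalg.cond(A)
cond_AtA = np.linalg.cond(A.T @ A)
print(cond_A, cond_AtA)
```

On a problem this small both routes recover the coefficients; the gap between the two condition numbers indicates why the orthogonal route is preferred as the polynomial degree grows.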

Pathway and Network Analysis in Glioma with the Partial Least Squares Method

  • Gu, Wen-Tao;Gu, Shi-Xin;Shou, Jia-Jun
    • Asian Pacific Journal of Cancer Prevention / v.15 no.7 / pp.3145-3149 / 2014
  • Gene expression profiling facilitates the understanding of the biological characteristics of gliomas. Previous studies mainly used regression or variance analysis without considering various background biological and environmental factors. The aim of this study was to investigate gene expression differences between grade III and IV gliomas through partial least squares (PLS) based analysis. The expression data set was from the Gene Expression Omnibus database. PLS based analysis was performed with the R statistical software. A total of 1,378 differentially expressed genes were identified. Survival analysis identified four pathways, including Prion diseases, colorectal cancer, CAMs, and PI3K-Akt signaling, which may be related to the prognosis of the patients. Network analysis identified two hub genes, ELAVL1 and FN1, which have previously been reported to be related to glioma. Our results provide new understanding of glioma pathogenesis and prognosis, in the hope of offering theoretical support for future therapeutic studies.

Analysis of market share attraction data using LS-SVM (최소제곱 서포트벡터기계를 이용한 시장점유율 자료 분석)

  • Park, Hye-Jung
    • Journal of the Korean Data and Information Science Society / v.20 no.5 / pp.879-886 / 2009
  • The purpose of this article is to present an application of the Least Squares Support Vector Machine to the analysis of the existing structure of brands. We estimate the parameters of the Market Share Attraction Model using a non-parametric technique for function estimation called the Least Squares Support Vector Machine, which allows us to perform even nonlinear regression by constructing a linear regression function in a high dimensional feature space. This makes the Least Squares Support Vector Machine technique a good candidate for estimating the Market Share Attraction Model. To illustrate the performance of the proposed method, we use car sales data from South Korea's car market.

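As a minimal sketch of how a Least Squares Support Vector Machine performs nonlinear regression through a kernel, the standard LS-SVM linear system can be solved directly. The RBF kernel, the values of `gamma` and `sigma`, and the sine-curve data are illustrative assumptions; nothing here reproduces the Market Share Attraction Model or the car sales data.

```python
import numpy as np

def rbf_kernel(A, B, sigma):
    """Gaussian (RBF) kernel matrix between the rows of A and the rows of B."""
    sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-sq / (2.0 * sigma ** 2))

def lssvm_fit(X, y, gamma=1000.0, sigma=1.0):
    """Solve [[0, 1^T], [1, K + I/gamma]] [b; alpha] = [0; y]."""
    n = len(y)
    M = np.zeros((n + 1, n + 1))
    M[0, 1:] = 1.0
    M[1:, 0] = 1.0
    M[1:, 1:] = rbf_kernel(X, X, sigma) + np.eye(n) / gamma
    rhs = np.concatenate(([0.0], y))
    sol = np.linalg.solve(M, rhs)
    return sol[0], sol[1:]              # bias b, dual weights alpha

def lssvm_predict(X_train, b, alpha, X_new, sigma=1.0):
    """Prediction is a kernel expansion over the training points plus bias."""
    return rbf_kernel(X_new, X_train, sigma) @ alpha + b

# Hypothetical data: a smooth nonlinear curve
X = np.linspace(0.0, 2.0 * np.pi, 30).reshape(-1, 1)
y = np.sin(X).ravel()

b, alpha = lssvm_fit(X, y)
y_hat = lssvm_predict(X, b, alpha, X)
max_err = float(np.max(np.abs(y_hat - y)))
print(max_err)
```

The equality constraints of the LS-SVM turn training into one linear solve rather than a quadratic program, which is the practical appeal of the method; `gamma` trades smoothness against fit and would normally be tuned.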

Modeling of compressive strength of HPC mixes using a combined algorithm of genetic programming and orthogonal least squares

  • Mousavi, S.M.;Gandomi, A.H.;Alavi, A.H.;Vesalimahmood, M.
    • Structural Engineering and Mechanics / v.36 no.2 / pp.225-241 / 2010
  • In this study, a hybrid search algorithm combining genetic programming with orthogonal least squares (GP/OLS) is utilized to generate prediction models for compressive strength of high performance concrete (HPC) mixes. The GP/OLS models are developed based on a comprehensive database containing 1133 experimental test results obtained from previously published papers. A multiple least squares regression (LSR) analysis is performed to benchmark the GP/OLS models. A subsequent parametric study is carried out to verify the validity of the models. The results indicate that the proposed models are effectively capable of evaluating the compressive strength of HPC mixes. The derived formulas are very simple, straightforward and provide an analysis tool accessible to practicing engineers.

Influencing factors and prediction of carbon dioxide emissions using factor analysis and optimized least squares support vector machine

  • Wei, Siwei;Wang, Ting;Li, Yanbin
    • Environmental Engineering Research / v.22 no.2 / pp.175-185 / 2017
  • As energy and environmental problems become increasingly severe, research on carbon dioxide emissions has attracted widespread attention. Accurate prediction of carbon dioxide emissions is essential for controlling carbon emissions. In this paper, we analyze the relationship between carbon dioxide emissions and influencing factors comprehensively through correlation analysis and regression analysis, screening the key factors from 16 preliminarily selected factors including GDP, total population, total energy consumption, power generation, steel production, coal consumption, the number of privately owned automobiles, etc. The fruit fly algorithm is then used to optimize the parameters of the least squares support vector machine. The optimized model is used for prediction, overcoming the blindness of parameter selection in the least squares support vector machine and improving the training speed and global searching ability accordingly. The results show that the prediction accuracy of carbon dioxide emissions is improved effectively. In addition, we draw economic and environmental policy implications from the analysis and calculations.

A New Deletion Criterion of Principal Components Regression with Orientations of the Parameters

  • Lee, Won-Woo
    • Journal of the Korean Statistical Society / v.16 no.2 / pp.55-70 / 1987
  • Principal components regression is one of the substitutes for the least squares method when multicollinearity exists in the multiple linear regression model. It is observed graphically that the performance of principal components regression depends strongly on the values of the parameters. Accordingly, a new deletion criterion, which determines the principal components to be deleted from the analysis, is developed, and its usefulness is checked by simulations.

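A minimal sketch of principal components regression may help make the idea of deleting components concrete. It uses the conventional rule of dropping the components with the smallest variance, which is an assumption and not the new deletion criterion proposed in the paper; the collinear data are hypothetical.

```python
import numpy as np

def pcr_fit(X, y, n_components):
    """Regress y on the leading principal components of X, then map back."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    V_k = Vt[:n_components].T                 # retained principal directions
    scores = Xc @ V_k                         # orthogonal score columns
    gamma = (scores.T @ yc) / s[:n_components] ** 2
    beta = V_k @ gamma                        # coefficients on original scale
    intercept = y_mean - x_mean @ beta
    return intercept, beta

# Hypothetical data with two nearly collinear predictors
rng = np.random.default_rng(0)
x1 = rng.normal(size=50)
x2 = x1 + 0.01 * rng.normal(size=50)
x3 = rng.normal(size=50)
X = np.column_stack([x1, x2, x3])
y = 1.0 + x1 + x2 + 0.5 * x3 + 0.1 * rng.normal(size=50)

_, beta_full = pcr_fit(X, y, n_components=3)      # nothing deleted
_, beta_reduced = pcr_fit(X, y, n_components=2)   # smallest component deleted

# Ordinary least squares on centered data, for comparison
beta_ols, *_ = np.linalg.lstsq(X - X.mean(axis=0), y - y.mean(), rcond=None)
print(beta_ols, beta_reduced)
```

Keeping every component reproduces ordinary least squares; deleting the near-degenerate direction stabilizes the coefficients of the collinear pair at the cost of some bias, and how large that bias is depends on the true parameter values, as the abstract notes.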

Comparison of linear and non-linear equation for the calibration of roxithromycin analysis using liquid chromatography/mass spectrometry

  • Lim, Jong-Hwan;Yun, Hyo-In
    • Korean Journal of Veterinary Research / v.50 no.1 / pp.11-17 / 2010
  • Linear and non-linear regressions were used to derive the calibration function for the measurement of roxithromycin plasma concentration. Their results were compared with those of weighted least squares regression using the usual weighting factors. In this paper, the performance of a non-linear calibration equation with the capacity to account empirically for curvature, $y = ax^{b} + c$ ($b \neq 1$), is compared with the commonly used linear equation, $y = ax + b$, as well as the quadratic equation, $y = ax^{2} + bx + c$. In the calibration curve of roxithromycin (range of 0.01 to 10 $\mu$g/mL), both heteroscedasticity and nonlinearity were present; therefore, linear least squares regression methods could result in large errors in the determination of roxithromycin concentration. With non-linear and weighted least squares regression, the accuracy of the analytical method was improved at the lower end of the calibration curve. This study suggests that the non-linear calibration equation should be considered when a curve must be fitted to low-dose calibration data that exhibit slight curvature.
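The weighted least squares step mentioned in this abstract can be sketched as below. The weighting factor $1/x^{2}$ is an assumption (the abstract only says the usual weighting factors were used), and the concentrations and responses are hypothetical, noise-free values chosen to span the stated calibration range.

```python
import numpy as np

def weighted_linear_fit(x, y, weights):
    """Minimize sum_i weights_i * (y_i - (a * x_i + b))**2; returns (a, b)."""
    A = np.column_stack([x, np.ones_like(x)])
    sw = np.sqrt(weights)                     # scale rows by sqrt(weight)
    coef, *_ = np.linalg.lstsq(A * sw[:, None], y * sw, rcond=None)
    return float(coef[0]), float(coef[1])

# Hypothetical calibration standards over 0.01 to 10 (concentration units)
conc = np.array([0.01, 0.05, 0.1, 0.5, 1.0, 5.0, 10.0])
resp = 3.0 * conc + 0.5                       # hypothetical instrument response

# 1/x^2 weighting gives the low standards as much influence as the high ones
a, b = weighted_linear_fit(conc, resp, 1.0 / conc ** 2)
print(a, b)
```

With unweighted least squares the largest standards dominate the fit, so relative errors at the lower end of the curve can be large; weighting by inverse squared concentration is one common way to counter that when the variance grows with concentration.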

AI Technology Analysis using Partial Least Square Regression

  • Choi, JunHyeog;Jun, Sunghae
    • Journal of the Korea Society of Computer and Information / v.25 no.3 / pp.109-115 / 2020
  • In this paper, we propose an artificial intelligence (AI) technology analysis using a partial least squares (PLS) regression model. AI technology now affects most areas of our society, so it is necessary to understand this technology. To analyze AI technology, we collect patent documents related to AI from patent databases worldwide. We extract AI technology keywords from the patent documents using text mining techniques. We then analyze the AI keyword data with the PLS regression model. This regression model is based on the partial least squares technique used in advanced analyses in fields such as bioinformatics, social science, and engineering. To show the performance of the proposed method, we conduct experiments using AI patent documents and illustrate how our research can be applied to real problems. The approach is applicable not only to AI technology but also to other technological fields, and it contributes to understanding various other technologies through PLS regression analysis.
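For readers unfamiliar with PLS regression, the following is a minimal sketch of the single-response NIPALS algorithm (PLS1). It is a generic textbook formulation, not the authors' pipeline, and the data are random placeholders rather than patent keyword counts.

```python
import numpy as np

def pls1_fit(X, y, n_components):
    """PLS1 by NIPALS: extract components that covary with y, then regress."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xk, yk = X - x_mean, y - y_mean
    W, P, q = [], [], []
    for _ in range(n_components):
        w = Xk.T @ yk                  # direction of maximal covariance with y
        w = w / np.linalg.norm(w)
        t = Xk @ w                     # component scores
        tt = t @ t
        p = Xk.T @ t / tt              # X loadings
        q_k = (yk @ t) / tt            # y loading
        Xk = Xk - np.outer(t, p)       # deflate X
        yk = yk - q_k * t              # deflate y
        W.append(w)
        P.append(p)
        q.append(q_k)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    beta = W @ np.linalg.solve(P.T @ W, q)   # coefficients on original scale
    intercept = y_mean - x_mean @ beta
    return intercept, beta

# Placeholder data: three predictors on different scales
rng = np.random.default_rng(1)
X = rng.normal(size=(40, 3)) * np.array([1.0, 3.0, 0.3])
y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=40)

_, beta_one = pls1_fit(X, y, n_components=1)   # heavily reduced model
_, beta_all = pls1_fit(X, y, n_components=3)   # as many components as predictors

beta_ols, *_ = np.linalg.lstsq(X - X.mean(axis=0), y - y.mean(), rcond=None)
print(beta_one, beta_all, beta_ols)
```

Using as many components as predictors reproduces ordinary least squares here; the usual reason to choose PLS is to keep fewer components when predictors are numerous and correlated, as keyword frequencies from text mining tend to be.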