• Title/Summary/Keyword: least squares regression analysis (최소제곱회귀분석)

An Analysis on the Spatio-temporal Heterogeneity of Real Transaction Price of Apartment in Seoul Using the Geostatistical Methods (공간통계기법을 이용한 서울시 아파트 실거래가 변인의 시공간적 이질성 분석)

  • Kim, Jung Hee
    • Journal of Korean Society for Geospatial Information Science / v.24 no.4 / pp.75-81 / 2016
  • This study explores the real transaction prices of apartments and the spatial and temporal heterogeneity of the variables that influence them. The independent variables considered were transport, local characteristics, educational conditions, population, and economic characteristics. The influence and spatial distribution patterns of these independent variables were analyzed from both global and local perspectives, along with the spatial and temporal patterns of change in the dependent variable, the real transaction price of apartments. First, to establish an analysis model, OLS and GWR analyses were conducted and the more efficient and appropriate model was selected. Second, to identify the spatial and temporal heterogeneity of the independent variables with the selected GWR model, the local $R^2$ was used for local analysis. Third, kriging was carried out to examine the spatial distribution of the independent variables. Based on the results, a finer-grained housing submarket analysis appears feasible, and the findings can lay a foundation for establishing real estate policy.

Procedure for the Selection of Principal Components in Principal Components Regression (주성분회귀분석에서 주성분선정을 위한 새로운 방법)

  • Kim, Bu-Yong;Shin, Myung-Hee
    • The Korean Journal of Applied Statistics / v.23 no.5 / pp.967-975 / 2010
  • Because least squares estimation is not appropriate when multicollinearity exists among the regressors of a linear regression model, principal components regression is used to deal with the multicollinearity problem. This article suggests a new procedure for selecting suitable principal components. The procedure is based on the condition index instead of the eigenvalue. Principal components whose condition indices are larger than the upper limit of the cutoff value are removed from the model, whereas those whose condition indices are smaller than the lower limit are included. The forward inclusion method is employed to select proper principal components among those whose condition indices lie between the two limits. The limits are obtained from a linear model constructed on the basis of conjoint analysis. The procedure is evaluated by Monte Carlo simulation in terms of the mean squared error of the estimator. The simulation results indicate that the proposed procedure is superior to the existing methods.
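
The idea of screening principal components by condition index can be sketched as follows. This is only an illustration, not the authors' procedure: it uses a single hypothetical cutoff (`cutoff=30.0`) in place of the upper and lower limits derived from conjoint analysis, omits the forward inclusion step, and defines the condition index in the usual way as the square root of the largest eigenvalue of the correlation matrix divided by each eigenvalue. The function names are assumptions made for this example.

```python
import numpy as np

def condition_indices(X):
    """Standardize X and return (Z, eigenvectors, condition indices)."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    R = Z.T @ Z / (len(Z) - 1)                 # sample correlation matrix
    eigvals, eigvecs = np.linalg.eigh(R)       # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]          # reorder to descending
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    eigvals = np.clip(eigvals, 1e-15, None)    # guard against rounding below zero
    indices = np.sqrt(eigvals[0] / eigvals)    # first index is exactly 1
    return Z, eigvecs, indices

def pcr_fit(X, y, cutoff=30.0):
    """Principal components regression keeping components with index <= cutoff.

    The single cutoff is a hypothetical stand-in for the paper's two limits.
    Returns coefficients on the standardized regressors.
    """
    Z, eigvecs, indices = condition_indices(X)
    keep = indices <= cutoff
    V = eigvecs[:, keep]
    scores = Z @ V                             # retained component scores
    gamma, *_ = np.linalg.lstsq(scores, y - y.mean(), rcond=None)
    beta_std = V @ gamma                       # map back to standardized regressors
    return beta_std, indices, keep
```

With nearly collinear regressors, the component with the smallest eigenvalue receives a very large condition index and is dropped, which is what stabilizes the coefficient estimates.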

A Study on the Treatment of Uncertainty in Linear Regression Method for Chemical Analysis (회귀식 사용에 따른 화학 분석 과정의 불확도 처리 연구)

  • Woo, Jin-Chun;Suh, JungKee;Lim, MyungChul;Park, MinSu
    • Analytical Science and Technology / v.16 no.3 / pp.185-190 / 2003
  • We applied a modified least squares method (MLS) and the ordinary least squares method (OLS) to a first-order equation to compare the uncertainties calculated by the two methods. The uncertainty calculated by OLS covered a statistically safe interval because it was overestimated in many cases of measurement and concentration level. However, when the uncertainty of the concentration used as a reference value was comparatively large (about 5% relative standard deviation of random scatter about the regression line and about 7% relative standard uncertainty of the reference values), the uncertainty calculated by OLS was seriously underestimated at high concentration levels, and the calculated uncertainty did not cover a statistically safe interval at the stated confidence level. The MLS method described in the previous article was found to be valid for this calculation of uncertainty.

Mixed effects least squares support vector machine for survival data analysis (생존자료분석을 위한 혼합효과 최소제곱 서포트벡터기계)

  • Hwang, Chang-Ha;Shim, Joo-Yong
    • Journal of the Korean Data and Information Science Society / v.23 no.4 / pp.739-748 / 2012
  • In this paper we propose a mixed effects least squares support vector machine (LS-SVM) for censored data observed from different groups. Random right censoring is taken into account in the nonlinear regression through weights formed from Kaplan-Meier estimates of the censoring distribution. The proposed model includes a random effects term representing inter-group variation. Furthermore, a generalized cross validation function is proposed for selecting the optimal values of the hyper-parameters. Experimental results are then presented that compare the performance of the proposed LS-SVM with that of a standard LS-SVM for censored data.

Influence Comparison of Customer Satisfaction Factor using Quantile Regression Model (분위회귀모형을 이용한 고객만족도 요인의 영향력 비교)

  • Kim, Seong-Yoon;Kim, Yong-Tae;Lee, Sang-Jun
    • Journal of Digital Convergence / v.13 no.6 / pp.125-132 / 2015
  • A number of issues are currently being raised about how weights are calculated from customer satisfaction surveys. This study investigated how the satisfaction weights differ across quantiles by comparing an ordinary least squares regression model with a quantile regression model, and carried out bootstrap tests to assess the differences in the regression coefficients across quantiles. The analysis, performed with the open-source software R (quantreg package), showed that the influence of the satisfaction factors varied by quantile and that the regression coefficients differed significantly across quantiles. Using a quantile regression model, which gives the influence of each satisfaction factor for customer groups at different satisfaction levels, would therefore contribute to planning a quantitative convergence policy for customer satisfaction.

Time series analysis for Korean COVID-19 confirmed cases: HAR-TP-T model approach (한국 COVID-19 확진자 수에 대한 시계열 분석: HAR-TP-T 모형 접근법)

  • Yu, SeongMin;Hwang, Eunju
    • The Korean Journal of Applied Statistics / v.34 no.2 / pp.239-254 / 2021
  • This paper studies time series analysis, with estimation and forecasting, for Korean COVID-19 confirmed cases, based on a heterogeneous autoregressive (HAR) model with two-piece t (TP-T) distributed errors. We consider HAR-TP-T time series models and suggest a step-by-step method to estimate the HAR coefficients as well as the TP-T distribution parameters. In the proposed step-by-step estimation, the ordinary least squares method is used to estimate the HAR coefficients, while maximum likelihood estimation (MLE) is adopted to estimate the TP-T error parameters. A simulation study of the step-by-step method shows good performance. For the empirical analysis of the Korean COVID-19 confirmed cases, estimates in HAR-TP-T models of order p = 2, 3, 4 are computed for several selected sets of lags, including the optimal lags chosen by minimizing the mean squared errors of the models. The estimation results of the proposed method and of MLE alone are compared using several criteria. The proposed step-by-step method outperforms MLE in two respects: the mean squared error of the HAR model and the mean squared difference between the TP-T residuals and their densities. Moreover, forecasting for the Korean COVID-19 confirmed cases is discussed with the optimally selected HAR-TP-T model. The mean absolute percentage error of one-step-ahead out-of-sample forecasts is 0.0953% in the proposed model. We conclude that the proposed HAR-TP-T time series model with optimally selected lags, together with its step-by-step estimation, provides accurate forecasting performance for the Korean COVID-19 confirmed cases.
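
The first step of such a procedure, ordinary least squares estimation of the HAR coefficients, might look like the minimal sketch below. It assumes the common HAR form in which each regressor is the average of the previous h observations; the lag set `(1, 7, 14)` and the function names are hypothetical choices for illustration and are not taken from the paper. The second step, fitting the TP-T error parameters by MLE to the residuals, is not shown.

```python
import numpy as np

def har_design(y, lags):
    """Build the HAR design matrix: an intercept plus, for each lag h,
    the mean of the h observations preceding time t."""
    y = np.asarray(y, dtype=float)
    max_lag = max(lags)
    rows = []
    for t in range(max_lag, len(y)):
        rows.append([1.0] + [y[t - h:t].mean() for h in lags])
    return np.array(rows), y[max_lag:]

def har_ols(y, lags=(1, 7, 14)):
    """Estimate HAR coefficients by OLS and return (coefficients, residuals)."""
    X, target = har_design(y, lags)
    coef, *_ = np.linalg.lstsq(X, target, rcond=None)
    resid = target - X @ coef      # residuals would feed the error-distribution step
    return coef, resid
```

Comparing the mean squared residual across candidate lag sets is one simple way to choose among them, in the spirit of the lag selection described in the abstract.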

A study on semi-supervised kernel ridge regression estimation (준지도 커널능형회귀모형에 관한 연구)

  • Seok, Kyungha
    • Journal of the Korean Data and Information Science Society / v.24 no.2 / pp.341-353 / 2013
  • In many practical machine learning and data mining applications, unlabeled data are inexpensive and easy to obtain. Semi-supervised learning tries to use such data to improve prediction performance. In this paper, a semi-supervised regression method, semi-supervised kernel ridge regression estimation, is proposed on the basis of the kernel ridge regression model. The proposed method does not require a pilot estimate of the labels of the unlabeled data. As a result, it has advantages including fewer parameters, easy computation, and good generalization ability. Experiments show that the proposed method can effectively utilize unlabeled data to improve regression estimation.
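
For orientation, the supervised kernel ridge regression model that the proposed estimator builds on can be sketched as below. This is the standard base model only, not the semi-supervised method of the paper, whose use of unlabeled data is not described in the abstract. The Gaussian (RBF) kernel, the parameter names `lam` and `gamma`, and the function names are assumptions made for this example.

```python
import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    """Gaussian kernel matrix between the rows of A and the rows of B."""
    sq = (A ** 2).sum(axis=1)[:, None] + (B ** 2).sum(axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(sq, 0.0))   # clip tiny negative distances

def krr_fit(X, y, lam=1e-2, gamma=1.0):
    """Solve (K + lam * I) alpha = y for the dual coefficients alpha."""
    K = rbf_kernel(X, X, gamma)
    return np.linalg.solve(K + lam * np.eye(len(X)), y)

def krr_predict(X_train, alpha, X_new, gamma=1.0):
    """Predict at new points as a kernel-weighted sum of the dual coefficients."""
    return rbf_kernel(X_new, X_train, gamma) @ alpha
```

The ridge penalty `lam` keeps the linear system well conditioned even when the kernel matrix is nearly singular, which is the same role the ridge term plays in ordinary ridge regression.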

Preliminary test estimation method accounting for error variance structure in nonlinear regression models (비선형 회귀모형에서 오차의 분산에 따른 예비검정 추정방법)

  • Yu, Hyewon;Lim, Changwon
    • The Korean Journal of Applied Statistics / v.29 no.4 / pp.595-611 / 2016
  • Nonlinear regression models (such as the Hill model) are used to analyze data in toxicology and pharmacology. In nonlinear regression models, both the parameter estimator and the estimate of its uncertainty are influenced by the variance structure of the error. Estimation methods should therefore differ depending on whether the data are homoscedastic or heteroscedastic. However, the variance structure of the error is not known until the data are actually analyzed, so developing estimation methods that are robust to the variance structure of the error is an important problem. In this paper we propose a method for estimating parameters in nonlinear regression models based on a preliminary test. We define an estimator that uses either the ordinary least squares method or the iterative weighted least squares method according to the result of a simple preliminary test for equality of the error variances. The performance of the proposed estimator is compared with that of existing estimators in simulation studies. We also compare the estimation methods using real data obtained from the National Toxicology Program of the United States.

A study on the properties of sensitivity analysis in principal component regression and latent root regression (주성분회귀와 고유값회귀에 대한 감도분석의 성질에 대한 연구)

  • Shin, Jae-Kyoung;Chang, Duk-Joon
    • Journal of the Korean Data and Information Science Society / v.20 no.2 / pp.321-328 / 2009
  • In regression analysis, the ordinary least squares estimates of the regression coefficients become poor when the correlations among the predictor variables are high. This phenomenon, called multicollinearity, causes serious problems in actual data analysis. Many methods have been proposed to overcome multicollinearity, including ridge regression, shrinkage estimators, and methods based on principal component analysis (PCA) such as principal component regression (PCR) and latent root regression (LRR). In the last decade, many statisticians have discussed sensitivity analysis (SA) in ordinary multiple regression and in PCR, LRR, and logistic principal component regression (LPCR). PCA plays an important role in these methods, and many statisticians have also discussed SA in PCA and related multivariate methods. We introduce the methods of PCR and LRR, together with the methods of SA in PCR and LRR, and discuss the properties of SA in PCR and LRR.

A procedure for simultaneous variable selection, variable transformation and outlier identification in linear regression (선형회귀에서 변수선택, 변수변환과 이상치 탐지의 동시적 수행을 위한 절차)

  • Seo, Han Son;Yoon, Min
    • The Korean Journal of Applied Statistics / v.33 no.1 / pp.1-10 / 2020
  • We propose a unified approach to variable selection, variable transformation, and outlier identification in the linear model. The procedure includes a sequential method for outlier detection and a least trimmed squares estimator for variable transformation. It uses all possible subsets regression for model selection. Real data analyses and simulation results are provided to show the efficiency of the methods in terms of correct variable selection and the fit of the estimated model.