• Title/Abstract/Keywords: weighted least squares regression

Comparisons of Kruglyak and Lander's Nonparametric Linkage Test and Weighted Regression Incorporating Replications

  • 최은경;송혜향
    • 응용통계연구 / Vol. 21, No. 1 / pp. 1-17 / 2008
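  • For linkage tests using continuous-trait data from sib pairs, the ordinary least squares (OLS) regression method of Haseman and Elston (1972) is the one most commonly used. The test statistic of Kruglyak and Lander (1995), proposed as a nonparametric method, looks like a counterpart of the Haseman and Elston (1972) method but is in fact very different. This paper explains the relationship between the Kruglyak and Lander (1995) test statistic and the Haseman and Elston (1972) test statistic, and compares the power of the two by simulation. A distinctive feature of the sibling data used in linkage analysis is that a very large number of observations are replicated at a limited set of values of the explanatory variable, and we propose a weighted regression method that is better suited to such replicated data. Examining the efficiency of the weighted regression method on simulated continuous-trait data, both normally and non-normally distributed, we confirmed that it has higher power than the other tests for linkage in sib-pair data.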
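The idea of weighting a regression when many observations share a few predictor values can be sketched as follows. This is a minimal illustration, not the paper's estimator: the function name `weighted_fit_replicated`, the weights n_k / s_k^2 on the group means, and the toy data (a response standing in for a sib-pair trait statistic at IBD proportions 0, 0.5 and 1) are all assumptions.

```python
import numpy as np

def weighted_fit_replicated(x, y):
    """Fit a line by WLS when each distinct x value has many replicated y values.

    Each group mean is weighted by n_k / s_k^2 (group size over sample
    variance). This weighting is an illustrative assumption.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    levels = np.unique(x)
    means = np.array([y[x == v].mean() for v in levels])
    variances = np.array([y[x == v].var(ddof=1) for v in levels])
    counts = np.array([(x == v).sum() for v in levels])
    w = counts / variances
    design = np.column_stack([np.ones_like(levels), levels])
    XtW = design.T * w                      # scale each column of X^T by its weight
    return np.linalg.solve(XtW @ design, XtW @ means)  # [intercept, slope]

# Toy data: only three distinct predictor values, with unequal spread.
rng = np.random.default_rng(0)
ibd = rng.choice([0.0, 0.5, 1.0], size=600, p=[0.25, 0.5, 0.25])
response = 4.0 - 2.0 * ibd + rng.normal(scale=1.0 + ibd, size=600)
intercept, slope = weighted_fit_replicated(ibd, response)
```

In a Haseman-Elston style analysis a negative slope would be the signal of interest. Here the simulated slope is -2 by construction.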

Weighted Support Vector Machines for Heteroscedastic Regression

  • Park, Hye-Jung;Hwang, Chang-Ha
    • Journal of the Korean Data and Information Science Society / Vol. 17, No. 2 / pp. 467-474 / 2006
  • In this paper we present a weighted support vector machine (SVM) and a weighted least squares support vector machine (LS-SVM) for prediction in the heteroscedastic regression model. Adding weights to the standard SVM and LS-SVM yields a better fit when the errors are heteroscedastic. In the numerical studies, we illustrate the prediction performance of the proposed procedure by comparing it with a procedure that combines the standard SVM and LS-SVM with the wild bootstrap.
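A weighted LS-SVM regression reduces to one linear system in the bias `b` and the dual coefficients `alpha`. The sketch below uses the standard weighted LS-SVM formulation with an RBF kernel. The inverse-variance weights, the parameter values and the function names are assumptions for illustration, not the authors' procedure.

```python
import numpy as np

def rbf_kernel(A, B, sigma=1.0):
    """Gaussian (RBF) kernel matrix between the rows of A and the rows of B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def fit_weighted_lssvm(X, y, weights, gamma=10.0, sigma=1.0):
    """Solve [[0, 1^T], [1, K + diag(1 / (gamma * v))]] [b; alpha] = [0; y]."""
    n = len(y)
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = 1.0
    A[1:, 0] = 1.0
    A[1:, 1:] = rbf_kernel(X, X, sigma) + np.diag(1.0 / (gamma * weights))
    sol = np.linalg.solve(A, np.concatenate([[0.0], y]))
    return sol[0], sol[1:]                  # bias b, dual coefficients alpha

def predict_lssvm(X_train, b, alpha, X_new, sigma=1.0):
    return rbf_kernel(X_new, X_train, sigma) @ alpha + b

# Toy heteroscedastic data: the noise level grows with x.
rng = np.random.default_rng(1)
x = np.sort(rng.uniform(0.0, 3.0, size=80))
noise_sd = 0.05 + 0.3 * x
y = np.sin(2.0 * x) + rng.normal(scale=noise_sd)
weights = 1.0 / noise_sd ** 2               # assumed known variance function
weights /= weights.mean()
X = x[:, None]
b, alpha = fit_weighted_lssvm(X, y, weights, gamma=10.0, sigma=0.5)
fitted = predict_lssvm(X, b, alpha, X, sigma=0.5)
```

Observations with small weights get a large diagonal term, so the fit is allowed to miss them by more. In practice the variance function is unknown and would have to be estimated, for example from the residuals of an unweighted fit.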

Estimation of nonlinear GARCH-M model

  • 심주용;이장택
    • Journal of the Korean Data and Information Science Society / Vol. 21, No. 5 / pp. 831-839 / 2010
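  • The least squares support vector machine (LS-SVM) is a kernel method widely used for nonlinear regression and classification. To estimate the mean and volatility of financial time series data, this paper proposes a nonlinear GARCH-M (generalized autoregressive conditional heteroscedasticity in mean) model that uses a weighted LS-SVM to estimate the mean and an LS-SVM to estimate the volatility. Estimation on real data shows that the proposed model has better estimation ability than the linear GARCH and linear GARCH-M models.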

Weighted Least Absolute Deviation Lasso Estimator

  • Jung, Kang-Mo
    • Communications for Statistical Applications and Methods / Vol. 18, No. 6 / pp. 733-739 / 2011
  • The least absolute shrinkage and selection operator (Lasso) method improves on the low prediction accuracy and poor interpretability of the ordinary least squares (OLS) estimate through the use of $L_1$ regularization on the regression coefficients. However, the Lasso is not robust to outliers, because it minimizes the sum of squared residuals. Although the least absolute deviation (LAD) estimator is an alternative to the OLS estimate, it is sensitive to leverage points. We propose a robust Lasso estimator that is not sensitive to outliers, heavy-tailed errors, or leverage points.
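As a hedged illustration only, one common way to combine a weighted LAD loss with an $L_1$ penalty is the objective below. The observation weights $w_i$ (for example, small for high-leverage points) and the tuning parameter $\lambda$ are assumed symbols, and the paper's exact estimator may differ.

$$\hat{\beta} = \arg\min_{\beta} \sum_{i=1}^{n} w_i \left| y_i - x_i^{\top}\beta \right| + \lambda \sum_{j=1}^{p} |\beta_j|$$

Setting all $w_i = 1$ gives the LAD-Lasso, and setting $\lambda = 0$ gives weighted LAD regression.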

Robustness of Minimum Disparity Estimators in Linear Regression Models

  • Pak, Ro-Jin
    • Journal of the Korean Statistical Society / Vol. 24, No. 2 / pp. 349-360 / 1995
  • This paper deals with the robustness properties of minimum disparity estimation in linear regression models. The estimators are defined as the statistical quantities which minimize the blended weight Hellinger distance between a weighted kernel density estimator of the residuals and a smoothed model density of the residuals. It is shown that if the weights of the density estimator are appropriately chosen, the estimates of the regression parameters are robust.

Exploring Spatial Patterns of Theft Crimes Using Geographically Weighted Regression

  • Yoo, Youngwoo;Baek, Taekyung;Kim, Jinsoo;Park, Soyoung
    • 한국측량학회지 / Vol. 35, No. 1 / pp. 31-39 / 2017
  • The goal of this study was to efficiently analyze the relationships between the number of thefts and related factors, considering the spatial patterns of theft crimes. Theft crime data for a 5-year period (2009-2013) were collected from Haeundae Police Station. A logarithmic transformation was performed to ensure an effective statistical analysis, and the number of theft crimes was used as the dependent variable. Related factors were selected through a literature review and divided into social, environmental, and defensive factors. Seven factors were selected as independent variables: the numbers of foreigners, aged persons, single households, companies, entertainment venues, community security centers, and CCTV (Closed-Circuit Television) systems. OLS (Ordinary Least Squares) and GWR (Geographically Weighted Regression) were used to analyze the relationship between the dependent variable and the independent variables. In the GWR results, each independent variable had regression coefficients that differed by location over the study area. The GWR model calculated local values for, and could explain the relationships between, variables more efficiently than the OLS model. Additionally, the adjusted R-squared value of the GWR model was 10% higher than that of the OLS model, and the GWR model produced an AICc (Corrected Akaike Information Criterion) value that was lower by 230, as well as lower Moran's I values. From these results, it was concluded that the GWR model was more robust in explaining the relationship between the number of thefts and the factors related to theft crime.
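The statement that GWR yields coefficients that differ by location can be illustrated by fitting one weighted least squares regression per location, with weights that decay with distance. The sketch below assumes a fixed Gaussian kernel and a hand-picked bandwidth. The function name, the toy data and the bandwidth value are hypothetical and unrelated to the study's data or software.

```python
import numpy as np

def gwr_coefficients(coords, X, y, bandwidth):
    """Return one row of local coefficients (intercept first) per location."""
    n = len(y)
    design = np.column_stack([np.ones(n), X])
    betas = np.empty((n, design.shape[1]))
    for i in range(n):
        dist = np.linalg.norm(coords - coords[i], axis=1)
        w = np.exp(-0.5 * (dist / bandwidth) ** 2)   # Gaussian distance decay
        XtW = design.T * w
        betas[i] = np.linalg.solve(XtW @ design, XtW @ y)
    return betas

# Toy data: the effect of x on y increases from west to east.
rng = np.random.default_rng(2)
coords = rng.uniform(0.0, 1.0, size=(200, 2))
x = rng.normal(size=200)
true_slope = 1.0 + 2.0 * coords[:, 0]
y = 1.0 + true_slope * x + rng.normal(scale=0.1, size=200)
local = gwr_coefficients(coords, x, y, bandwidth=0.2)
```

A global OLS fit would report a single slope for the whole area, whereas `local[:, 1]` tracks the west-to-east trend. Real analyses usually choose the bandwidth by cross-validation or AICc rather than fixing it by hand.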

Modeling mechanical strength of self-compacting mortar containing nanoparticles using wavelet-based support vector machine

  • Khatibinia, Mohsen;Feizbakhsh, Abdosattar;Mohseni, Ehsan;Ranjbar, Malek Mohammad
    • Computers and Concrete / Vol. 18, No. 6 / pp. 1065-1082 / 2016
  • The main aim of this study is to predict the compressive and flexural strengths of self-compacting mortar (SCM) containing nano-$SiO_2$, nano-$Fe_2O_3$ and nano-CuO using a wavelet-based weighted least squares support vector machine (WLS-SVM) approach, called WWLS-SVM. The WWLS-SVM regression model is a relatively new metamodel that has been successfully applied as a machine learning algorithm to engineering problems and has yielded encouraging results. To achieve the aim of this study, the WLS-SVM and WWLS-SVM models are first developed based on a database. In the database, nine variables which consist of cement, sand, NS, NF, NC, superplasticizer dosage, slump flow diameter and V-funnel flow time are considered as the input parameters of the models. The compressive and flexural strengths of SCM are chosen as the output parameters of the models. Finally, a statistical analysis is performed to demonstrate the generalization performance of the models in predicting the compressive and flexural strengths. The numerical results show that both metamodels achieve the desired accuracy and applicability. Furthermore, by adopting these predictive metamodels, costly and time-consuming laboratory tests can be eliminated.

Number of sampling leaves for reflectance measurement of Chinese cabbage and kale

  • Chung, Sun-Ok;Ngo, Viet-Duc;Kabir, Md. Shaha Nur;Hong, Soon-Jung;Park, Sang-Un;Kim, Sun-Ju;Park, Jong-Tae
    • 농업과학연구 / Vol. 41, No. 3 / pp. 169-175 / 2014
  • The objective of this study was to investigate the effects of the pre-processing method and the number of sampling leaves on the stability of reflectance measurements for Chinese cabbage and kale leaves. Chinese cabbage and kale were transplanted and cultivated in a plant factory. Leaf samples of the kale and cabbage were collected 4 weeks after transplanting of the seedlings. Spectral data were collected with a UV/VIS/NIR spectrometer in the wavelength region from 190 to 1130 nm. All leaves (mature and young) were measured at 9 and 12 points in the upper area of the blade for kale and cabbage leaves, respectively. To reduce spectral noise, the raw spectral data were preprocessed by different methods: i) moving average, ii) Savitzky-Golay filter, iii) local regression using weighted linear least squares and a $1^{st}$ degree polynomial model (lowess), iv) local regression using weighted linear least squares and a $2^{nd}$ degree polynomial model (loess), v) a robust version of 'lowess', and vi) a robust version of 'loess', each with 7, 11, and 15 smoothing points. The effects of the number of sampling leaves were investigated by the reflectance difference (RD) and cross-correlation (CC) methods. Results indicated that spectral data collected from 4 sampling leaves were adequate for both crops, giving reflectance measurements whose stability did not change much. Furthermore, the moving average method with 11 smoothing points was believed to provide reliable pre-processed data for further analysis.
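Method iii), local regression with weighted linear least squares and a first-degree polynomial, can be sketched as below. This is a minimal, non-robust version that assumes tricube weights over the nearest `n_points` samples. The function name and the synthetic spectrum are illustrative assumptions, not the study's data or software.

```python
import numpy as np

def lowess_smooth(x, y, n_points=11):
    """Smooth y(x) with a local weighted straight-line fit at every sample."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    smoothed = np.empty_like(y)
    for i in range(len(x)):
        dist = np.abs(x - x[i])
        idx = np.argsort(dist)[:n_points]               # nearest neighbours
        d_max = dist[idx].max()
        w = (1.0 - (dist[idx] / d_max) ** 3) ** 3       # tricube weights
        A = np.column_stack([np.ones(len(idx)), x[idx] - x[i]])
        AtW = A.T * w
        coef = np.linalg.solve(AtW @ A, AtW @ y[idx])
        smoothed[i] = coef[0]                           # fitted value at x[i]
    return smoothed

# Synthetic "spectrum" over the 190-1130 nm range with additive noise.
rng = np.random.default_rng(3)
wavelength = np.linspace(190.0, 1130.0, 200)
clean = 0.3 + 0.2 * np.sin(wavelength / 150.0)
noisy = clean + rng.normal(scale=0.02, size=wavelength.size)
smoothed = lowess_smooth(wavelength, noisy, n_points=11)
```

Replacing the two-column design matrix with a three-column quadratic one would give the second-degree (loess) variant. The robust versions add a further reweighting step based on the residuals.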

A study on robust regression estimators in heteroscedastic error models

  • Son, Nayeong;Kim, Mijeong
    • Journal of the Korean Data and Information Science Society / Vol. 28, No. 5 / pp. 1191-1204 / 2017
  • Weighted least squares (WLS) estimation is often used for data with heteroscedastic errors because it is intuitive and computationally inexpensive. However, the WLS estimator is not robust to even a few outliers and may sometimes be inefficient. To overcome the robustness problems, the Box-Cox transformation, Huber's M estimation, bisquare estimation, and Yohai's MM estimation have been proposed. Estimation methods more efficient than WLS have also been suggested for heteroscedastic error models, such as Bayesian methods (Cepeda and Achcar, 2009) and semiparametric methods (Kim and Ma, 2012). Recently, Çelik (2015) proposed weighting methods applicable to heteroscedasticity patterns including butterfly-distributed residuals and megaphone-shaped residuals. In this paper, we review heteroscedastic regression estimators related to robust or efficient estimation and describe their properties. We also analyze cost data of U.S. electricity producers in 1955 using the methods discussed in the paper.
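Huber's M estimation, one of the robust alternatives listed above, is commonly computed by iteratively reweighted least squares (IRLS). The sketch below assumes the usual tuning constant c = 1.345 and a MAD-based scale estimate. The function name `huber_irls` and the toy data with a few gross outliers are illustrative and are not taken from the paper.

```python
import numpy as np

def huber_irls(X, y, c=1.345, n_iter=50, tol=1e-8):
    """Huber M-estimate of a linear model via iteratively reweighted least squares."""
    design = np.column_stack([np.ones(len(y)), X])
    beta = np.linalg.lstsq(design, y, rcond=None)[0]       # OLS starting values
    for _ in range(n_iter):
        resid = y - design @ beta
        scale = np.median(np.abs(resid - np.median(resid))) / 0.6745  # MAD scale
        u = np.abs(resid) / max(scale, 1e-12)
        w = c / np.maximum(u, c)            # weight 1 if u <= c, else c / u
        XtW = design.T * w
        new_beta = np.linalg.solve(XtW @ design, XtW @ y)
        converged = np.max(np.abs(new_beta - beta)) < tol
        beta = new_beta
        if converged:
            break
    return beta                              # [intercept, slope, ...]

# Toy data: a clean linear trend plus five gross outliers at the high end.
rng = np.random.default_rng(4)
x = np.linspace(0.0, 10.0, 100)
y = 1.0 + 2.0 * x + rng.normal(scale=0.5, size=100)
y[-5:] += 30.0
ols = np.linalg.lstsq(np.column_stack([np.ones(100), x]), y, rcond=None)[0]
robust = huber_irls(x, y)
```

Each pass is itself a weighted least squares fit, and large residuals are downweighted rather than discarded. Huber weights do not protect against high-leverage points, which is one motivation for MM-type estimators.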

Exploring NDVI Gradient Varying Across Landform and Solar Intensity using GWR: a Case Study of Mt. Geumgang in North Korea

  • 김준우;엄정섭
    • 대한공간정보학회지 / Vol. 21, No. 4 / pp. 73-81 / 2013
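  • Identifying the correlation between vegetation distribution and landform and solar intensity is an analysis of spatial data that carry spatial heterogeneity, yet many existing linear models fail to account for the spatial characteristics of such data. To overcome this problem, geographically weighted regression (GWR) was applied to Mt. Geumgang, relating NDVI (Normalized Difference Vegetation Index), which quantifies vegetation distribution, to solar radiation, sunshine duration, elevation, and slope. Compared with the global OLS (Ordinary Least Squares) model, GWR showed markedly higher explanatory power and goodness of fit, and the spatial autocorrelation of the residuals was also resolved. The OLS results estimated a single effect of landform and solar intensity on NDVI for the whole study area, whereas GWR estimated the effect of each factor on NDVI locally and in finer detail, showing more clearly how each factor's influence varies across spatial units. The locally estimated correlations between NDVI and landform and solar intensity could serve as an important reference for more objective and detailed analysis when surveying vegetation distribution.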