• Title/Abstract/Keyword: regression function

Search results: 2,134 items

Distribution of the Estimator for the Peak of a Regression Function Using the Concomitants of Extreme Order Statistics

  • Kim, S.H.;Kim, T.S.
    • Communications for Statistical Applications and Methods
    • /
    • Vol. 5, No. 3
    • /
    • pp.855-868
    • /
    • 1998
  • For a random sample of size n from the general linear model $Y_i = \theta(X_i) + \varepsilon_i$, let $Y_{i:n}$ denote the ith order statistic of the Y sample values. The X-value associated with $Y_{i:n}$ is denoted by $X_{[i:n]}$ and is called the concomitant of the ith order statistic. An estimator of the location of the maximum of a regression function $\theta(x)$ was proposed by (equation omitted), and its convergence rate was found under certain weak assumptions on $\theta$. We discuss the asymptotic distributions of both $\theta(X_{[n-r+1:n]})$ and (equation omitted) when r is fixed as $n \rightarrow \infty$ (i.e., the extreme case), on the basis of the theorem on the concomitants of order statistics. We also investigate the asymptotic behavior of $\max\{\theta(X_{[n-r+1:n]}), \ldots, \theta(X_{[n:n]})\}$ as an estimator for the peak of a regression function.
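A minimal simulation sketch of the max-of-concomitants idea described in this abstract. The regression function theta, the noise level, n, and r below are illustrative assumptions (not values from the paper), and theta is treated as known only for the purpose of the illustration.

```python
import numpy as np

# Illustrative sketch only: theta, the noise level, n and r are arbitrary choices,
# and theta is treated as known purely to show the max-of-concomitants idea.
rng = np.random.default_rng(0)
n, r = 2000, 5
theta = lambda x: np.exp(-(x - 0.6) ** 2 / 0.02)   # unimodal regression function, peak at x = 0.6

x = rng.uniform(0.0, 1.0, n)
y = theta(x) + rng.normal(scale=0.1, size=n)       # Y_i = theta(X_i) + eps_i

order = np.argsort(y)                              # indices sorting Y in increasing order
concomitants = x[order[-r:]]                       # X_[n-r+1:n], ..., X_[n:n]
peak_location_estimate = concomitants.mean()       # simple summary of the concomitants
peak_height_estimate = theta(concomitants).max()   # Max{theta(X_[n-r+1:n]), ..., theta(X_[n:n])}
print(peak_location_estimate, peak_height_estimate)
```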


Deep LS-SVM for regression

  • Hwang, Changha;Shim, Jooyong
    • Journal of the Korean Data and Information Science Society
    • /
    • Vol. 27, No. 3
    • /
    • pp.827-833
    • /
    • 2016
  • In this paper, we propose a deep least squares support vector machine (LS-SVM) for regression problems, which consists of the input layer and the hidden layer. In the hidden layer, LS-SVMs are trained with the original input variables and the perturbed responses. For the final output, the main LS-SVM is trained with the outputs from the LS-SVMs of the hidden layer as input variables and the original responses. In contrast to the multilayer neural network (MNN), the LS-SVMs in the deep LS-SVM are trained to minimize the penalized objective function. Thus, the learning dynamics of the deep LS-SVM are entirely different from those of the MNN, in which all weights and biases are trained to minimize one final error function. Compared to MNN approaches, the deep LS-SVM does not make use of any combination weights, but trains all LS-SVMs in the architecture. Experimental results from real datasets illustrate that the deep LS-SVM significantly outperforms state-of-the-art machine learning methods on regression problems.
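A rough, self-contained sketch in the spirit of this abstract (not the authors' code): each hidden-layer LS-SVM is fit on the original inputs with a perturbed response, and a main LS-SVM is fit on the hidden-layer outputs. The RBF kernel, the regularization value gamma, the bandwidths, and the Gaussian perturbation scheme are illustrative assumptions.

```python
import numpy as np

def rbf_kernel(A, B, sigma=1.0):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2.0 * sigma ** 2))

class LSSVR:
    """Minimal LS-SVM for regression: solve the dual system
    [[0, 1^T], [1, K + I/gamma]] [b; alpha] = [0; y]."""
    def __init__(self, gamma=10.0, sigma=1.0):
        self.gamma, self.sigma = gamma, sigma
    def fit(self, X, y):
        n = len(y)
        K = rbf_kernel(X, X, self.sigma)
        A = np.block([[np.zeros((1, 1)), np.ones((1, n))],
                      [np.ones((n, 1)), K + np.eye(n) / self.gamma]])
        sol = np.linalg.solve(A, np.concatenate([[0.0], y]))
        self.b, self.alpha, self.X = sol[0], sol[1:], X
        return self
    def predict(self, X_new):
        return rbf_kernel(X_new, self.X, self.sigma) @ self.alpha + self.b

# "Deep" arrangement in the spirit of the abstract: hidden LS-SVMs see the original
# inputs with perturbed responses; the main LS-SVM is trained on their outputs.
rng = np.random.default_rng(1)
X = rng.uniform(-3.0, 3.0, (200, 1))
y = np.sinc(X[:, 0]) + rng.normal(0.0, 0.1, 200)
hidden = [LSSVR(sigma=s).fit(X, y + rng.normal(0.0, 0.05, len(y))) for s in (0.5, 1.0, 2.0)]
H = np.column_stack([m.predict(X) for m in hidden])   # hidden-layer outputs as new inputs
main = LSSVR(sigma=1.0).fit(H, y)
```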

QUASI-LIKELIHOOD REGRESSION FOR VARYING COEFFICIENT MODELS WITH LONGITUDINAL DATA

  • Kim, Choong-Rak;Jeong, Mee-Seon;Kim, Woo-Chul;Park, Byeong-U.
    • Journal of the Korean Statistical Society
    • /
    • Vol. 33, No. 4
    • /
    • pp.367-379
    • /
    • 2004
  • This article deals with the nonparametric analysis of longitudinal data when there exist possible correlations among repeated measurements for a given subject. We consider a quasi-likelihood regression model where a transformation of the regression function through a link function is linear in time-varying coefficients. We investigate the local polynomial approach to estimate the time-varying coefficients, and derive the asymptotic distribution of the estimators in this quasi-likelihood context. A real data set is analyzed as an illustrative example.
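In generic notation (ours, not necessarily the paper's), a quasi-likelihood varying coefficient model of this kind links the conditional mean through a known link $g$ to coefficients that vary with time,

$$g\{\mu_i(t)\} = \sum_{j=1}^{p} \beta_j(t)\, x_{ij}(t), \qquad \beta_j(t) \approx \beta_j(t_0) + \beta_j'(t_0)(t - t_0) \ \text{ for } t \text{ near } t_0,$$

and the local polynomial approach estimates $\beta_j(t_0)$ by maximizing a kernel-weighted quasi-likelihood in which the observation at time $t_{ij}$ receives weight $K_h(t_{ij} - t_0)$.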

Estimating Regression Function with $\varepsilon$-Insensitive Supervised Learning Algorithm

  • Hwang, Chang-Ha
    • Journal of the Korean Data and Information Science Society
    • /
    • Vol. 15, No. 2
    • /
    • pp.477-483
    • /
    • 2004
  • One of the major paradigms for supervised learning in the neural network community is back-propagation learning. The standard implementations of back-propagation learning are optimal under the assumption of identical and independent Gaussian noise. In this paper, for regression function estimation, we introduce an $\varepsilon$-insensitive back-propagation learning algorithm, which corresponds to minimizing the least absolute error. We compare this algorithm with the support vector machine (SVM), which is another $\varepsilon$-insensitive supervised learning algorithm and has been very successful in pattern recognition and function estimation problems. For comparison, we consider a more realistic model that allows the noise variance itself to depend on the input variables.
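As a small illustration (our own sketch, not the paper's implementation), the $\varepsilon$-insensitive loss and its subgradient, which would replace the squared-error derivative in the back-propagation weight updates, can be written as follows; the value of eps is an arbitrary placeholder.

```python
import numpy as np

def eps_insensitive_loss(y, f, eps=0.1):
    # L_eps(y, f) = max(0, |y - f| - eps): residuals inside the eps-tube cost nothing
    return np.maximum(0.0, np.abs(y - f) - eps)

def eps_insensitive_subgrad(y, f, eps=0.1):
    # subgradient with respect to the prediction f: 0 inside the tube, -sign(y - f) outside;
    # in back-propagation this term replaces the usual (f - y) of the squared-error loss
    r = y - f
    return np.where(np.abs(r) <= eps, 0.0, -np.sign(r))
```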


Kernel Poisson Regression for Longitudinal Data

  • Shim, Joo-Yong;Seok, Kyung-Ha
    • Journal of the Korean Data and Information Science Society
    • /
    • Vol. 19, No. 4
    • /
    • pp.1353-1360
    • /
    • 2008
  • An estimating procedure is introduced for nonlinear mixed-effect Poisson regression for longitudinal studies, where data from different subjects are independent whereas data from the same subject are correlated. The proposed procedure provides estimates of the mean function of the response variables, where the canonical parameter is related to the input vector in a nonlinear form. The generalized cross-validation function is introduced to choose the optimal hyperparameters in the procedure. Experimental results are then presented, which indicate the performance of the proposed estimating procedure.
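A compressed sketch of a penalized kernel Poisson regression fit by Newton (IRLS) steps, showing only the fixed-effect core; the longitudinal random effects and the GCV-based hyperparameter choice that the paper addresses are deliberately left out. The RBF kernel, lambda, sigma, and the iteration count are illustrative assumptions.

```python
import numpy as np

def rbf_kernel(A, B, sigma=1.0):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def kernel_poisson_fit(X, y, lam=1.0, sigma=1.0, iters=25):
    """Penalized kernel Poisson regression: eta = K @ alpha, mu = exp(eta),
    fitted by Newton (IRLS) steps on the penalized negative log-likelihood."""
    n = len(y)
    K = rbf_kernel(X, X, sigma)
    alpha = np.zeros(n)
    for _ in range(iters):
        eta = np.clip(K @ alpha, -20.0, 20.0)      # crude safeguard against overflow
        mu = np.exp(eta)
        grad = -K @ (y - mu) + lam * K @ alpha     # gradient of the penalized objective
        hess = K @ np.diag(mu) @ K + lam * K + 1e-8 * np.eye(n)
        alpha -= np.linalg.solve(hess, grad)
    return alpha, K

# toy usage with synthetic count data
rng = np.random.default_rng(0)
X = rng.uniform(-2.0, 2.0, (100, 1))
y = rng.poisson(np.exp(np.sin(X[:, 0])))
alpha, K = kernel_poisson_fit(X, y)
```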


Sparse Kernel Regression using IRWLS Procedure

  • Park, Hye-Jung
    • Journal of the Korean Data and Information Science Society
    • /
    • Vol. 18, No. 3
    • /
    • pp.735-744
    • /
    • 2007
  • The support vector machine (SVM) is capable of providing a more complete description of the linear and nonlinear relationships among random variables. In this paper we propose a sparse kernel regression (SKR) to overcome a weak point of the SVM, namely the steep growth in the number of support vectors as the number of training data increases. The iterative reweighted least squares (IRWLS) procedure is used to solve the optimization problem of SKR with a Laplacian prior. Furthermore, the generalized cross-validation (GCV) function is introduced to select the hyperparameters which affect the performance of SKR. Experimental results are then presented which illustrate the performance of the proposed procedure.
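A bare-bones sketch of the IRWLS idea with a Laplacian (L1) prior on the kernel expansion coefficients: the absolute-value penalty is replaced at each pass by a quadratic weighted by the current coefficients, which drives small coefficients toward zero. The squared-error data term, the RBF kernel, lambda, sigma, and the stopping tolerance are our illustrative assumptions; the GCV selection described in the abstract is not shown.

```python
import numpy as np

def rbf_kernel(A, B, sigma=1.0):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def sparse_kernel_regression(X, y, lam=1.0, sigma=1.0, iters=30, tol=1e-6):
    """IRWLS for min (1/2)||y - K a||^2 + lam * sum |a_j| (Laplacian prior on a)."""
    K = rbf_kernel(X, X, sigma)
    a = np.linalg.solve(K.T @ K + lam * np.eye(len(y)), K.T @ y)   # ridge start
    for _ in range(iters):
        D = np.diag(1.0 / (np.abs(a) + tol))          # reweighting induced by the L1 penalty
        a_new = np.linalg.solve(K.T @ K + lam * D, K.T @ y)
        if np.max(np.abs(a_new - a)) < tol:
            a = a_new
            break
        a = a_new
    return a, K
```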


Nonparametric Estimation of Univariate Binary Regression Function

  • Jung, Shin Ae;Kang, Kee-Hoon
    • International Journal of Advanced Culture Technology
    • /
    • Vol. 10, No. 1
    • /
    • pp.236-241
    • /
    • 2022
  • We consider methods of estimating a binary regression function using nonparametric kernel estimation when there is only one covariate. For this, the Nadaraya-Watson estimator with single and double bandwidths is used. For choosing a proper amount of smoothing, the cross-validation and plug-in methods are compared. As a case study with real data, the German credit data and heart disease data are used. We examine whether the nonparametric estimation of the binary regression function is successful with the smoothing parameters chosen by the two approaches above, and the performance is compared.
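A hedged sketch of the single-bandwidth Nadaraya-Watson estimator for a binary response, with a leave-one-out cross-validation bandwidth choice; the Gaussian kernel, the candidate grid, and the synthetic data are our own illustrative choices, and the plug-in method from the abstract is not shown.

```python
import numpy as np

def nw_binary(x_grid, X, Y, h):
    # Nadaraya-Watson estimate of p(x) = E[Y | X = x] with a Gaussian kernel
    w = np.exp(-0.5 * ((x_grid[:, None] - X[None, :]) / h) ** 2)
    return (w @ Y) / w.sum(axis=1)

def loo_cv_bandwidth(X, Y, candidates):
    # leave-one-out cross-validation score for each candidate bandwidth
    scores = []
    for h in candidates:
        w = np.exp(-0.5 * ((X[:, None] - X[None, :]) / h) ** 2)
        np.fill_diagonal(w, 0.0)                   # drop the ith point when predicting Y_i
        p_loo = (w @ Y) / w.sum(axis=1)
        scores.append(np.mean((Y - p_loo) ** 2))
    return candidates[int(np.argmin(scores))]

# toy usage with a synthetic binary response
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, 300)
Y = (rng.uniform(size=300) < 1.0 / (1.0 + np.exp(-8.0 * (X - 0.5)))).astype(float)
h = loo_cv_bandwidth(X, Y, [0.02, 0.05, 0.1, 0.2])
p_hat = nw_binary(np.linspace(0.0, 1.0, 50), X, Y, h)
```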

Some efficient ratio-type exponential estimators using the Robust regression's Huber M-estimation function

  • Vinay Kumar Yadav;Shakti Prasad
    • Communications for Statistical Applications and Methods
    • /
    • Vol. 31, No. 3
    • /
    • pp.291-308
    • /
    • 2024
  • The current article discusses ratio-type exponential estimators for estimating the mean of a finite population in sample surveys. The estimators use the robust regression Huber M-estimation function, and their bias as well as mean squared error expressions are derived. They were compared with the Kadilar, Candan, and Cingi (Hacet J Math Stat, 36, 181-188, 2007) estimators. The circumstances under which the suggested estimators perform better than competing estimators are discussed. Five different population datasets with a well-recognized outlier are used extensively in the numerical and simulation-based study. These thorough studies seek to provide strong evidence to back up our claims by carefully assessing and validating the theoretical results reported in our study. The proposed estimators are intended to significantly improve both the efficiency and accuracy of estimating the mean of a finite population. As a result, the results obtained from statistical analyses will be more reliable and precise.
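A plain IRLS sketch of Huber M-estimation of the regression coefficients of the study variable on the auxiliary variable; the robust slope could then be plugged into a ratio/regression-type mean estimator. The tuning constant k = 1.345 and the MAD scale are conventional choices, not necessarily the paper's, and the exponential ratio-type form itself is not reproduced here.

```python
import numpy as np

def huber_m_slope(x, y, k=1.345, iters=50):
    """IRLS for Huber M-estimation of the intercept and slope of y on x."""
    X = np.column_stack([np.ones_like(x), x])
    beta = np.linalg.lstsq(X, y, rcond=None)[0]                    # ordinary least squares start
    for _ in range(iters):
        r = y - X @ beta
        s = np.median(np.abs(r - np.median(r))) / 0.6745 + 1e-12   # robust (MAD) scale
        u = np.abs(r) / s
        w = np.minimum(1.0, k / np.maximum(u, 1e-12))              # Huber weights
        W = np.diag(w)
        beta = np.linalg.solve(X.T @ W @ X, X.T @ W @ y)
    return beta  # beta[1] is the robust slope a ratio-type estimator would use
```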

ADF를 사용한 유전프로그래밍 기반 비선형 회귀분석 기법 개선 및 풍속 예보 보정 응용 (Improvement of Genetic Programming Based Nonlinear Regression Using ADF and Application for Prediction MOS of Wind Speed)

  • 오승철;서기성
    • 전기학회논문지
    • /
    • Vol. 64, No. 12
    • /
    • pp.1748-1755
    • /
    • 2015
  • A linear regression is widely used for prediction problems, but it is hard for it to handle the irregular nature of a nonlinear system. Although nonlinear regression methods have been adopted, most of them fit only problems of low and limited structure with a small number of independent variables. However, real-world problems such as weather prediction require complex nonlinear regression with a large number of variables. A GP (Genetic Programming) based evolutionary nonlinear regression method is an efficient approach to attack this challenging problem. This paper introduces an improvement of a GP-based nonlinear regression method using ADFs (Automatically Defined Functions). It is believed that ADFs allow the evolution of modular solutions and, consequently, improve the performance of the GP technique. The suggested ADF-based GP nonlinear regression method is compared with UM, MLR, and the previous GP method for 3-day prediction of wind speed using MOS (Model Output Statistics) for parts of South Korea. The UM and KLAPS data for the years 2007-2009 and 2011-2013 are used for the experiments.
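A purely illustrative sketch of how an ADF-bearing GP individual can be represented and evaluated: the ADF is a reusable subexpression in formal arguments that the result-producing branch may call repeatedly. This shows only the representation; a real GP run would encode both branches as expression trees and evolve them, and the paper's MOS/wind-speed setup is not reproduced here.

```python
import math

# Example individual with one automatically defined function (ADF).
# Both lambdas stand in for evolved expression trees; crossover, mutation,
# selection, and fitness evaluation of a full GP run are omitted.
def make_individual():
    adf = lambda a0, a1: a0 * a1 + math.sin(a0)                   # example evolved subroutine
    main = lambda x1, x2, ADF: ADF(x1, x2) + 0.5 * ADF(x2, x2)    # example result-producing branch
    return adf, main

def evaluate(individual, x1, x2):
    adf, main = individual
    return main(x1, x2, adf)

print(evaluate(make_individual(), 1.2, 0.7))
```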

다중회귀분석에 의한 하천 월 유출량의 추계학적 추정에 관한 연구 (A Study on Stochastic Estimation of Monthly Runoff by Multiple Regression Analysis)

  • 김태철;정하우
    • 한국농공학회지
    • /
    • Vol. 22, No. 3
    • /
    • pp.75-87
    • /
    • 1980
  • Most hydrologic phenomena are the complex and organic products of multiple causes such as climatic and hydro-geological factors. A certain significant correlation with the runoff in a river basin would be expected and foreseen in advance, and the effect of each of these causal and associated factors (independent variables: present-month rainfall, previous-month runoff, evapotranspiration, relative humidity, etc.) upon the present-month runoff (dependent variable) may be determined by multiple regression analysis. Functions between the independent and dependent variables should be fitted repeatedly until a satisfactory and optimal combination of independent variables is obtained. The reliability of the estimated function should be tested with statistical criteria such as analysis of variance, the coefficient of determination, and significance tests of the regression coefficients before the first estimated multiple regression model in the historical sequence is determined. But some error between the observed and estimated runoff still remains. The error arises because the model used is an inadequate description of the system and because the data constituting the record represent only a sample from a population of monthly discharge observations, so that estimates of the model parameters are subject to sampling errors. Since this error, which is a deviation from the multiple regression plane, cannot be explained by the first estimated multiple regression equation, it can be considered a random error governed by the law of chance in nature. This variance unexplained by the multiple regression equation can be handled by a stochastic approach; that is, the random error can be simulated stochastically by multiplying a random normal variate by the standard error of estimate. Finally, a hybrid model for the estimation of monthly runoff in a non-historical sequence can be determined by combining the deterministic component of the multiple regression equation and the stochastic component of the random errors. Monthly runoff at the Naju station in the Yong-San river basin is estimated by the multiple regression model and the hybrid model, and comparisons are made between the observed and estimated runoff, and between the multiple regression model and existing estimation methods such as the Gajiyama formula, the tank model, and the Thomas-Fiering model. The results are as follows.
    (1) The optimal function to estimate monthly runoff in the historical sequence is a multiple linear regression equation on an overall-month basis, that is, $Q_n = 0.788P_n + 0.130Q_{n-1} - 0.273E_n - 0.10$. About 85% of the total variance of monthly runoff can be explained by the multiple linear regression equation, and its coefficient of determination ($R^2$) is 0.843. This means that monthly runoff in the historical sequence can be estimated highly significantly, even with a short record of observations, by the above equation.
    (2) The optimal function to estimate monthly runoff in a non-historical sequence is the hybrid model combining the multiple linear regression equation on an overall-month basis with the stochastic component, that is, $Q_n = 0.788P_n + 0.130Q_{n-1} - 0.273E_n - 0.10 + S_{y \cdot t}$, where the stochastic term is the standard error of estimate multiplied by a random normal variate. The remaining 15% of the unexplained variance of monthly runoff can be accounted for by the addition of the stochastic process, and somewhat more reliable statistical characteristics of monthly runoff in the non-historical sequence are derived. The monthly runoff estimated in the non-historical sequence exhibits, as a random component, extraordinary values (maximum and minimum values) that do not appear in the observed runoff.
    (3) The "frequency best fit coefficient" ($R^2_f$) of the multiple linear regression equation is 0.847, which is the same value as Gajiyama's. This implies that the multiple linear regression equation and the Gajiyama formula are both theoretically rather reasonable functions.
