• Title/Summary/Keyword: nonparametric regression


Bias corrected non-response estimation using nonparametric function estimation of super population model (선형 응답률 모형에서 초모집단 모형의 비모수적 함수 추정을 이용한 무응답 편향 보정 추정)

  • Sim, Joo-Yong;Shin, Key-Il
    • The Korean Journal of Applied Statistics
    • /
    • v.34 no.6
    • /
    • pp.923-936
    • /
    • 2021
  • A large number of non-responses occur in sample surveys, and various methods have been developed to handle them appropriately. In particular, the bias caused by non-ignorable non-response greatly reduces the accuracy of estimation and makes non-response processing difficult. Recently, Chung and Shin (2017, 2020) proposed an estimator that improves the accuracy of estimation using a parametric super-population model and a response rate model. In this study, we suggested a bias-corrected non-response mean estimator using a nonparametric function that generalizes the form of the parametric super-population model. We confirmed the superiority of the proposed estimator through simulation studies.
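The nonparametric mean-function estimation the abstract refers to can be illustrated with a generic Nadaraya-Watson kernel smoother. This is a minimal sketch of kernel regression in general, not the authors' bias-corrected estimator; the data, bandwidth, and function names are illustrative.

```python
import numpy as np

def nadaraya_watson(x_train, y_train, x_eval, bandwidth=0.3):
    """Nadaraya-Watson kernel estimate of E[y | x] with a Gaussian kernel."""
    # Scaled distances between each evaluation point and each training point
    u = (x_eval[:, None] - x_train[None, :]) / bandwidth
    w = np.exp(-0.5 * u**2)               # Gaussian kernel weights
    return (w @ y_train) / w.sum(axis=1)  # locally weighted average

# Noisy observations from a smooth mean function
rng = np.random.default_rng(0)
x = rng.uniform(0, 1, 200)
y = np.sin(2 * np.pi * x) + rng.normal(0, 0.2, 200)

grid = np.linspace(0.1, 0.9, 5)
est = nadaraya_watson(x, y, grid)
```

The estimator is a weighted average of responses near each evaluation point, so it requires no parametric form for the mean function, which is the generalization the abstract describes.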

Implicit Treatment of Technical Specification and Thermal Hydraulic Parameter Uncertainties in Gaussian Process Model to Estimate Safety Margin

  • Fynan, Douglas A.;Ahn, Kwang-Il
    • Nuclear Engineering and Technology
    • /
    • v.48 no.3
    • /
    • pp.684-701
    • /
    • 2016
  • The Gaussian process model (GPM) is a flexible surrogate model that can be used for nonparametric regression for multivariate problems. A unique feature of the GPM is that a prediction variance is automatically provided with the regression function. In this paper, we estimate the safety margin of a nuclear power plant by performing regression on the output of best-estimate simulations of a large-break loss-of-coolant accident with sampling of safety system configuration, sequence timing, technical specifications, and thermal hydraulic parameter uncertainties. The key aspect of our approach is that the GPM regression is performed only on the dominant input variables, the safety injection flow rate and the delay time for AC-powered pumps to start (representing sequence timing uncertainty), providing a predictive model for the peak clad temperature during the reflood phase. Other uncertainties are interpreted as contributors to the measurement noise of the code output and are implicitly treated in the GPM in the noise variance term, providing local uncertainty bounds for the peak clad temperature. We discuss the applicability of the foregoing method to reduce the use of conservative assumptions in best estimate plus uncertainty (BEPU) and Level 1 probabilistic safety assessment (PSA) success criteria definitions while dealing with a large number of uncertainties.
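The core mechanics of GPM regression, including the automatic prediction variance mentioned above, can be sketched with a plain-numpy RBF-kernel implementation. This is a generic illustration on toy data, not the reactor-safety application; the kernel and hyperparameters are our own choices. The noise variance term plays the role the abstract describes: it absorbs variation from inputs not included in the regression.

```python
import numpy as np

def gp_regression(X, y, X_star, length_scale=1.0, signal_var=1.0, noise_var=0.1):
    """Gaussian process regression with an RBF kernel.

    Returns posterior mean and predictive variance at test points X_star.
    """
    def rbf(A, B):
        d2 = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2 * A @ B.T
        return signal_var * np.exp(-0.5 * d2 / length_scale**2)

    K = rbf(X, X) + noise_var * np.eye(len(X))  # noisy training covariance
    K_s = rbf(X_star, X)                        # test-train covariance
    mean = K_s @ np.linalg.solve(K, y)
    v = np.linalg.solve(K, K_s.T)
    var = signal_var - np.sum(K_s * v.T, axis=1) + noise_var
    return mean, var

rng = np.random.default_rng(1)
X = rng.uniform(-3, 3, (40, 1))
y = np.sin(X[:, 0]) + rng.normal(0, 0.1, 40)
mean, var = gp_regression(X, y, np.array([[0.0], [5.0]]))
# The second test point lies outside the data, so its variance is larger
```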

Generalization of Fisher's linear discriminant analysis via the approach of sliced inverse regression

  • Chen, Chun-Houh;Li, Ker-Chau
    • Journal of the Korean Statistical Society
    • /
    • v.30 no.2
    • /
    • pp.193-217
    • /
    • 2001
  • Despite the rich literature in discriminant analysis, much of this complicated subject remains to be explored. In this article, we study the theoretical foundation that supports Fisher's linear discriminant analysis (LDA) by setting up the classification problem under the dimension reduction framework of Li (1991) for introducing sliced inverse regression (SIR). Through the connection between SIR and LDA, our theory helps identify sources of strength and weakness in using CRIMCOORDS (Gnanadesikan, 1977) as a graphical tool for displaying group separation patterns. This connection also leads to several ways of generalizing LDA for better exploration and exploitation of nonlinear data patterns.
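SIR itself, as introduced in Li (1991) and used above, admits a compact sketch: standardize X, slice the data on y, and take the leading eigenvectors of the covariance of the slice means. The example below is a generic illustration with simulated data; variable names and settings are ours.

```python
import numpy as np

def sir_directions(X, y, n_slices=10, n_dirs=1):
    """Sliced inverse regression: estimate e.d.r. directions (a sketch)."""
    n, p = X.shape
    # Standardize X so its covariance is the identity
    mu, cov = X.mean(0), np.cov(X, rowvar=False)
    L = np.linalg.cholesky(np.linalg.inv(cov))
    Z = (X - mu) @ L
    # Slice on y and accumulate the covariance of slice means
    order = np.argsort(y)
    M = np.zeros((p, p))
    for idx in np.array_split(order, n_slices):
        m = Z[idx].mean(0)
        M += (len(idx) / n) * np.outer(m, m)
    # Leading eigenvectors, mapped back to the original X scale
    vals, vecs = np.linalg.eigh(M)
    B = L @ vecs[:, ::-1][:, :n_dirs]
    return B / np.linalg.norm(B, axis=0)

rng = np.random.default_rng(2)
X = rng.normal(size=(500, 5))
y = (X @ np.array([1., 1., 0., 0., 0.]))**3 + rng.normal(0, 0.5, 500)
b = sir_directions(X, y)[:, 0]
# b should align (up to sign) with the true direction (1,1,0,0,0)/sqrt(2)
```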


On Adaptation to Sparse Design in Bivariate Local Linear Regression

  • Hall, Peter;Seifert, Burkhardt;Turlach, Berwin A.
    • Journal of the Korean Statistical Society
    • /
    • v.30 no.2
    • /
    • pp.231-246
    • /
    • 2001
  • Local linear smoothing enjoys several excellent theoretical and numerical properties, and in a range of applications it is the method most frequently chosen for fitting curves to noisy data. Nevertheless, it suffers numerical problems in places where the distribution of design points (often called predictors, or explanatory variables) is sparse. In the case of univariate design, several remedies have been proposed for overcoming this problem, one of which involves adding additional "pseudo" design points in places where the original design points were too widely separated. This approach is particularly well suited to treating the sparse bivariate design problem, and in fact attractive, elegant geometric analogues of the univariate imputation and interpolation rules are appropriate for that case. In the present paper we introduce and develop pseudo-data rules for bivariate design, and apply them to real data.
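A univariate local linear fit, together with its sparse-design fragility, can be sketched as follows. The small ridge term is a generic stabilizer of our own choosing, not the pseudo-data remedy the paper develops; without some such safeguard, the weighted design matrix becomes near-singular in gaps between design points.

```python
import numpy as np

def local_linear(x, y, x0, h=0.2, ridge=1e-8):
    """Local linear fit at x0 with Gaussian kernel weights."""
    w = np.exp(-0.5 * ((x - x0) / h) ** 2)
    X = np.column_stack([np.ones_like(x), x - x0])  # intercept + local slope
    A = X.T @ (w[:, None] * X) + ridge * np.eye(2)  # ridge guards sparse gaps
    b = X.T @ (w * y)
    return np.linalg.solve(A, b)[0]                 # intercept = fit at x0

rng = np.random.default_rng(3)
# Design with a deliberate gap between 0.4 and 0.8
x = np.concatenate([rng.uniform(0, 0.4, 80), rng.uniform(0.8, 1.0, 20)])
y = x**2 + rng.normal(0, 0.05, 100)
fit_dense = local_linear(x, y, 0.2)   # dense region: reliable
fit_gap = local_linear(x, y, 0.6)     # sparse gap: effectively extrapolated
```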


Nonparametric test on dimensionality of explanatory variables (설명변수 차원 축소에 관한 비모수적 검정)

  • 서한손
    • The Korean Journal of Applied Statistics
    • /
    • v.8 no.2
    • /
    • pp.65-75
    • /
    • 1995
  • For determining the dimension of the e.d.r. space, both Sliced Inverse Regression (SIR) and Principal Hessian Directions (PHD) provide an asymptotic test. However, the asymptotic test requires normality and a large sample of the explanatory variables. Cook and Weisberg (1991) suggested permutation tests instead. In this study, permutation tests are actually carried out, and their power is compared with that of the asymptotic tests in the case of SIR and PHD.
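The permutation-test idea, permute the response to destroy any X-y dependence and recompute the test statistic, can be sketched generically. The statistic below is a toy squared correlation standing in for the SIR or PHD eigenvalue statistics studied in the paper; everything here is illustrative.

```python
import numpy as np

def permutation_pvalue(X, y, stat, n_perm=200, seed=0):
    """Generic permutation p-value: permuting y breaks the X-y link."""
    rng = np.random.default_rng(seed)
    observed = stat(X, y)
    perm = [stat(X, rng.permutation(y)) for _ in range(n_perm)]
    return (1 + sum(p >= observed for p in perm)) / (n_perm + 1)

# Toy dependence statistic: squared correlation with the first column of X
stat = lambda X, y: np.corrcoef(X[:, 0], y)[0, 1] ** 2

rng = np.random.default_rng(4)
X = rng.normal(size=(100, 3))
y_dep = 2 * X[:, 0] + rng.normal(0, 1, 100)   # genuine dependence
y_indep = rng.normal(size=100)                # no dependence
p_dep = permutation_pvalue(X, y_dep, stat)    # small p-value expected
p_indep = permutation_pvalue(X, y_indep, stat)
```

No distributional assumptions on X are needed, which is the advantage over the asymptotic tests noted in the abstract.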


A Local Linear Kernel Estimator for Sparse Multinomial Data

  • Baek, Jangsun
    • Journal of the Korean Statistical Society
    • /
    • v.27 no.4
    • /
    • pp.515-529
    • /
    • 1998
  • Burman (1987) and Hall and Titterington (1987) studied kernel smoothing for sparse multinomial data in detail. Both of their estimators for cell probabilities are sparse-asymptotically consistent under some restrictive conditions on the true cell probabilities. Dong and Simonoff (1994) adopted boundary kernels to relax these restrictive conditions. We propose a local linear kernel estimator, popular in nonparametric regression, to estimate cell probabilities. No boundary adjustment is necessary for this estimator, since it adapts automatically to estimation at the boundaries. It is shown that our estimator attains the optimal rate of convergence in mean sum of squared error under sparseness. Some simulation results and a real data application are presented to illustrate the performance of the estimator.


Logistic Regression Method in Interval-Censored Data

  • Yun, Eun-Young;Kim, Jin-Mi;Ki, Choong-Rak
    • The Korean Journal of Applied Statistics
    • /
    • v.24 no.5
    • /
    • pp.871-881
    • /
    • 2011
  • In this paper we propose a logistic regression method to estimate the survival function and the median survival time in interval-censored data. The proposed method is motivated by the data augmentation technique with no sacrifice in augmenting data. In addition, we develop a cross validation criterion to determine the size of data augmentation. We compare the proposed estimator with other existing methods such as the parametric method, the single point imputation method, and the nonparametric maximum likelihood estimator through extensive numerical studies to show that the proposed estimator performs better than others in the sense of the mean squared error. An illustrative example based on a real data set is given.

A note on nonparametric density deconvolution by weighted kernel estimators

  • Lee, Sungho
    • Journal of the Korean Data and Information Science Society
    • /
    • v.25 no.4
    • /
    • pp.951-959
    • /
    • 2014
  • Recently, Hazelton and Turlach (2009) proposed a weighted kernel density estimator for the deconvolution problem. In the case of Gaussian kernels and measurement error, they argued that the weighted kernel density estimator is competitive with the classical deconvolution kernel estimator. In this paper we consider weighted kernel density estimators when the sample observations are contaminated by double-exponentially distributed errors. The performance of the weighted kernel density estimators is compared with the classical deconvolution kernel estimator and with a kernel density estimator based on the support vector regression method by means of a simulation study. The weighted density estimator with the Gaussian kernel shows numerical instability in the practical implementation of the optimization function. The weighted density estimates with the double exponential kernel have patterns very similar to the classical kernel density estimates in the simulations, but the shape is less satisfactory than that of the classical kernel density estimator with the Gaussian kernel.
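For double-exponential (Laplace) measurement error, the classical deconvolution kernel estimator referenced above has a simple closed form: with kernel K, the deconvoluting kernel is L(u) = K(u) - (σ²/h²)K″(u), since the Laplace characteristic function is 1/(1 + σ²t²). The sketch below uses a Gaussian K; the simulated data, bandwidth, and error scale are our own choices, not the paper's settings.

```python
import numpy as np

def deconv_density(w, x_grid, h=0.4, sigma=0.3):
    """Classical deconvolution kernel density estimate for Laplace(sigma) error.

    With Gaussian K, L(u) = K(u) - (sigma^2/h^2) K''(u)
                          = K(u) * (1 + (sigma/h)^2 * (1 - u^2)).
    """
    u = (x_grid[:, None] - w[None, :]) / h
    phi = np.exp(-0.5 * u**2) / np.sqrt(2 * np.pi)   # Gaussian kernel K
    L = phi * (1 + (sigma / h) ** 2 * (1 - u**2))    # deconvoluting kernel
    return L.mean(axis=1) / h

rng = np.random.default_rng(6)
x = rng.normal(size=2000)             # true density: standard normal
w = x + rng.laplace(0, 0.3, 2000)     # contaminated observations
grid = np.linspace(-4, 4, 81)
f_hat = deconv_density(w, grid)
# The estimate integrates to roughly 1 and peaks near 0
```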

Feature selection in the semivarying coefficient LS-SVR

  • Hwang, Changha;Shim, Jooyong
    • Journal of the Korean Data and Information Science Society
    • /
    • v.28 no.2
    • /
    • pp.461-471
    • /
    • 2017
  • In this paper we propose a feature selection method for identifying important features in the semivarying coefficient model. One important issue in the semivarying coefficient model is how to estimate the parametric and nonparametric components. Another is how to identify the important features among the varying and the constant effects. We propose a feature selection method that addresses this issue using the generalized cross-validation functions of the varying coefficient least squares support vector regression (LS-SVR) and the linear LS-SVR. Numerical studies indicate that the proposed method is quite effective in identifying important features in the varying and the constant effects of the semivarying coefficient model.
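The LS-SVR building block underlying the method reduces to a single linear system in the dual variables. The sketch below shows a plain RBF-kernel LS-SVR fit, without the varying-coefficient or GCV feature-selection machinery of the paper; the hyperparameters and data are illustrative.

```python
import numpy as np

def ls_svr_fit(X, y, gamma=100.0, sigma=0.5):
    """Least squares SVR with an RBF kernel (dual linear-system solve)."""
    def K(A, B):
        d2 = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2 * A @ B.T
        return np.exp(-0.5 * d2 / sigma**2)

    n = len(X)
    # LS-SVR dual system: [[0, 1'], [1, K + I/gamma]] [b; alpha] = [0; y]
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = 1.0
    A[1:, 0] = 1.0
    A[1:, 1:] = K(X, X) + np.eye(n) / gamma
    sol = np.linalg.solve(A, np.concatenate([[0.0], y]))
    b, alpha = sol[0], sol[1:]
    return lambda Xs: K(Xs, X) @ alpha + b          # prediction function

rng = np.random.default_rng(5)
X = rng.uniform(-2, 2, (100, 1))
y = np.sin(2 * X[:, 0]) + rng.normal(0, 0.1, 100)
predict = ls_svr_fit(X, y)
```

Unlike standard SVR, all training points carry a (dense) dual weight, which is what makes the closed-form solve, and hence GCV-style model selection, tractable.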

A study on the multivariate sliced inverse regression (다변량 분할 역회귀모형에 관한 연구)

  • 이용구;이덕기
    • The Korean Journal of Applied Statistics
    • /
    • v.10 no.2
    • /
    • pp.293-308
    • /
    • 1997
  • Sliced inverse regression is a method for reducing the dimension of the explanatory variable X without going through any parametric or nonparametric model-fitting process. This method explores the simplicity of the inverse view of regression; that is, instead of regressing the univariate output variable y against the multivariate X, we regress X against y. In this article, we propose bivariate sliced inverse regression, which regresses the multivariate X against the bivariate output variables $y_1, y_2$. Bivariate sliced inverse regression estimates the e.d.r. directions satisfying two generalized regression models simultaneously. For the application of bivariate sliced inverse regression, we decompose the output variable y into two variables: one obtained by projecting y onto the column space of X, and the other, r, obtained by projecting y onto the space orthogonal to the column space of X; we then estimate the e.d.r. directions of the generalized regression model by utilizing the two variables simultaneously. As a result, by considering the variables y and r simultaneously, bivariate sliced inverse regression estimates the e.d.r. directions efficiently and stably whether the regression model is linear, quadratic, or nonlinear.
