• Title/Summary/Keyword: Nonparametric Statistics Analysis


Comparison of estimation methods for expectile regression

  • Kim, Jong Min;Kang, Kee-Hoon
    • The Korean Journal of Applied Statistics
    • /
    • v.31 no.3
    • /
    • pp.343-352
    • /
    • 2018
  • Quantile regression and expectile regression can estimate trends in the extreme regions of a response variable, as well as its average trend, given explanatory variables. In this paper, we compare the performance of parametric and nonparametric methods for expectile regression. We introduce each estimation method and evaluate it through various simulations and an application to real data. The nonparametric model showed better results when the model is complex and the relationship between variables is difficult to deduce. Nonparametric methods can therefore be recommended, given the difficulty of assuming a parametric model in expectile regression.
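For readers unfamiliar with expectiles, the following is a minimal sketch of a parametric (straight-line) expectile fit by iteratively reweighted least squares. It is not the authors' implementation; the function name `expectile_line`, the single-covariate setup, and the stopping rule are assumptions made for illustration.

```python
def expectile_line(x, y, tau=0.5, max_iter=200):
    """Fit y ~ a + b*x at expectile level tau (0 < tau < 1).

    Minimizes sum_i |tau - 1(y_i < a + b*x_i)| * (y_i - a - b*x_i)**2 by
    alternating a weighted least-squares fit with a weight update.
    Assumes x is not constant (otherwise the slope is undefined).
    """
    w = [0.5] * len(x)  # equal weights: the first pass is ordinary least squares
    a = b = 0.0
    for _ in range(max_iter):
        sw = sum(w)
        mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
        my = sum(wi * yi for wi, yi in zip(w, y)) / sw
        sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
        sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
        b = sxy / sxx
        a = my - b * mx
        # Points above the current line get weight tau, the rest 1 - tau.
        new_w = [tau if yi > a + b * xi else 1.0 - tau for xi, yi in zip(x, y)]
        if new_w == w:  # weights stopped changing, so the fit is a fixed point
            break
        w = new_w
    return a, b
```

With `tau=0.5` the fit reduces to ordinary least squares; levels closer to 0 or 1 move the line toward the lower or upper region of the response.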

Portfolio Selection for Socially Responsible Investment via Nonparametric Frontier Models

  • Jeong, Seok-Oh;Hoss, Andrew;Park, Cheolwoo;Kang, Kee-Hoon;Ryu, Youngjae
    • Communications for Statistical Applications and Methods
    • /
    • v.20 no.2
    • /
    • pp.115-127
    • /
    • 2013
  • This paper provides an effective stock portfolio screening tool for socially responsible investment (SRI) based upon corporate social responsibility (CSR) and financial performance. The proposed approach utilizes nonparametric frontier models. Data envelopment analysis (DEA) has been used to build SRI portfolios in a few previous works; however, we show that free disposal hull (FDH), a similar model that does not assume the convexity of the technology, yields superior results when applied to a stock universe of 253 Korean companies. Over a four-year time span (from 2006 to 2009) the portfolios selected by the proposed method consistently outperform those selected by DEA as well as the benchmark.
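As a rough illustration of the free disposal hull idea, the sketch below computes an input-oriented FDH efficiency score. Which quantities serve as inputs and outputs in the paper's CSR and financial-performance setting is not stated in the abstract, so the data layout and the name `fdh_input_efficiency` are hypothetical.

```python
def fdh_input_efficiency(inputs, outputs, o):
    """Input-oriented FDH efficiency of unit o.

    inputs[j] and outputs[j] are lists of positive numbers for unit j.
    Only observed units that produce at least as much of every output as
    unit o are considered; no convex combinations are formed, which is
    what distinguishes FDH from DEA.
    """
    best = float("inf")
    for xj, yj in zip(inputs, outputs):
        if all(yjr >= yor for yjr, yor in zip(yj, outputs[o])):
            # Largest input ratio: how far unit o's inputs could be scaled
            # down while still being dominated by unit j.
            ratio = max(xji / xoi for xji, xoi in zip(xj, inputs[o]))
            best = min(best, ratio)
    return best  # 1.0 means no observed unit dominates unit o
```

A unit always dominates itself with ratio 1, so the score never exceeds 1.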

Local linear regression analysis for interval-valued data

  • Jang, Jungteak;Kang, Kee-Hoon
    • Communications for Statistical Applications and Methods
    • /
    • v.27 no.3
    • /
    • pp.365-376
    • /
    • 2020
  • Interval-valued data, a type of symbolic data, record each observation as an interval rather than a single value. Such data often arise when large databases are aggregated into a form that is easy to manage. Various regression methods for interval-valued data have been proposed relatively recently. In this paper, we introduce a nonparametric regression model using the kernel function and a nonlinear regression model for interval-valued data. We also propose applying the local linear regression model, one of the nonparametric methods, to interval-valued data. Simulations based on several distributions of the center point and the range are conducted using each of the methods presented in this paper. Under various conditions, the proposed local linear estimator performs better than the others.
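The local linear estimator mentioned in this abstract can be sketched as follows. The Gaussian kernel, the scalar predictor, and the choice to smooth interval centers and half-ranges separately are assumptions for illustration; the abstract does not specify how the authors combine the two, and `fit_interval` is a hypothetical helper.

```python
import math

def local_linear(x, y, x0, h):
    """Local linear estimate of E[y | x = x0] with a Gaussian kernel, bandwidth h."""
    w = [math.exp(-0.5 * ((xi - x0) / h) ** 2) for xi in x]
    s0 = sum(w)
    s1 = sum(wi * (xi - x0) for wi, xi in zip(w, x))
    s2 = sum(wi * (xi - x0) ** 2 for wi, xi in zip(w, x))
    t0 = sum(wi * yi for wi, yi in zip(w, y))
    t1 = sum(wi * (xi - x0) * yi for wi, xi, yi in zip(w, x, y))
    # Intercept of the kernel-weighted least-squares line centered at x0.
    return (s2 * t0 - s1 * t1) / (s0 * s2 - s1 ** 2)

def fit_interval(x, lower, upper, x0, h):
    """Predict an interval at x0 by smoothing centers and half-ranges separately."""
    centers = [(lo + up) / 2 for lo, up in zip(lower, upper)]
    halves = [(up - lo) / 2 for lo, up in zip(lower, upper)]
    c = local_linear(x, centers, x0, h)
    r = max(local_linear(x, halves, x0, h), 0.0)  # keep the half-range non-negative
    return c - r, c + r
```

Because the fit is locally a straight line, the estimator reproduces exactly linear relationships at any evaluation point, which is one reason it tends to behave better near the boundary than a plain kernel average.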

Empirical variogram for achieving the best valid variogram

  • Mahdi, Esam;Abuzaid, Ali H.;Atta, Abdu M.A.
    • Communications for Statistical Applications and Methods
    • /
    • v.27 no.5
    • /
    • pp.547-568
    • /
    • 2020
  • Modeling the statistical autocorrelations in spatial data is often achieved through the estimation of variograms, where the selection of an appropriate valid variogram model, especially for small samples, is crucial for achieving precise spatial prediction results from kriging interpolations. To estimate such a variogram, we traditionally start by computing the empirical variogram (using the traditional Matheron, robust Cressie-Hawkins, or kernel-based nonparametric approaches). In this article, we conduct numerical studies comparing the performance of these empirical variograms. In most situations, the nonparametric empirical variable nearest-neighbor (VNN) variogram showed better performance than its competitors (Matheron, Cressie-Hawkins, and Nadaraya-Watson). The analysis of the spatial groundwater dataset used in this article suggests that the wave variogram model, with hole effect structure, fitted to the empirical VNN variogram is the most appropriate choice. This selected variogram is used with the ordinary kriging model to produce the predicted pollution map of the nitrate concentrations in the groundwater dataset.
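Of the empirical variograms compared in this abstract, the classical Matheron estimator is the simplest to write down. The sketch below is an illustration only: the lag-tolerance binning and the name `matheron_variogram` are assumptions, and the VNN, Cressie-Hawkins, and Nadaraya-Watson variants are not shown.

```python
import math

def matheron_variogram(coords, values, lags, tol):
    """Classical (Matheron) empirical semivariogram.

    coords: list of (x, y) locations; values: observations at those locations.
    For each lag h, averages (z_i - z_j)**2 / 2 over all pairs whose
    distance lies within tol of h. Lags with no pairs are omitted.
    """
    gamma = {}
    n = len(coords)
    for h in lags:
        total, count = 0.0, 0
        for i in range(n):
            for j in range(i + 1, n):
                if abs(math.dist(coords[i], coords[j]) - h) <= tol:
                    total += (values[i] - values[j]) ** 2
                    count += 1
        if count:
            gamma[h] = total / (2 * count)
    return gamma
```

A valid parametric model (for example the wave model named in the abstract) would then be fitted to these empirical values before kriging.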

A study on the Bayesian nonparametric model for predicting group health claims

  • Muna Mauliza;Jimin Hong
    • Communications for Statistical Applications and Methods
    • /
    • v.31 no.3
    • /
    • pp.323-336
    • /
    • 2024
  • The accurate forecasting of insurance claims is a critical component of insurers' risk management decisions. Hierarchical Bayesian parametric (BP) models can be used for health insurance claims forecasting, but they do not describe the claims distribution satisfactorily. Bayesian nonparametric (BNP) models can therefore be a more suitable alternative for dealing with the complex characteristics of the health insurance claims distribution, including heavy tails, skewness, and multimodality. In this study, we apply both a BP model and a BNP model to predict group health claims using simulated and real-world data for a private life insurer in Indonesia. The findings show that the BNP model outperforms the BP model in terms of claims prediction accuracy. Furthermore, our analysis highlights the flexibility and robustness of BNP models in handling diverse data structures in health insurance claims.

A Comparative Study on the Performance of Bayesian Partially Linear Models

  • Woo, Yoonsung;Choi, Taeryon;Kim, Wooseok
    • Communications for Statistical Applications and Methods
    • /
    • v.19 no.6
    • /
    • pp.885-898
    • /
    • 2012
  • In this paper, we consider Bayesian approaches to partially linear models, in which a regression function is represented by a semiparametric additive form of a parametric linear regression function and a nonparametric regression function. We make a comparative study of the performance of widely used Bayesian partially linear models in terms of empirical analysis. Specifically, we deal with three Bayesian methods to estimate the nonparametric regression function: one using a Fourier series representation, another based on the Gaussian process regression approach, and a third based on the smoothness of the function and differencing. We compare the numerical performance of the three methods by the root mean squared error (RMSE). For the empirical analysis, we consider synthetic data in simulation studies and a real data application, fitting each with the three Bayesian methods and comparing the RMSEs.

Nonparametric Bayesian Multiple Change Point Problems

  • Kim, Chansoo;Younshik Chung
    • Journal of the Korean Statistical Society
    • /
    • v.31 no.1
    • /
    • pp.1-16
    • /
    • 2002
  • Because changepoint identification is important in many data analysis problems, we wish to make inferences about the locations of one or more changepoints in a sequence. We consider Bayesian nonparametric inference for the multiple changepoint problem using a Bayesian segmentation procedure proposed by Yang and Kuo (2000). A mixture of products of Dirichlet processes is used as a prior distribution. To decide whether a single change exists, our approach relies on a nonparametric Bayesian Schwarz information criterion at each step. We discuss how to choose the precision parameter (total mass parameter) in the nonparametric setting and show that the discreteness of the Dirichlet process prior can have a large effect on the nonparametric Bayesian Schwarz information criterion, leading to conclusions very different from those of a reasonable parametric model. One example is presented to show this effect.

Nonparametric Bayesian methods: a gentle introduction and overview

  • MacEachern, Steven N.
    • Communications for Statistical Applications and Methods
    • /
    • v.23 no.6
    • /
    • pp.445-466
    • /
    • 2016
  • Nonparametric Bayesian methods have seen rapid and sustained growth over the past 25 years. We present a gentle introduction to the methods, motivating the methods through the twin perspectives of consistency and false consistency. We then step through the various constructions of the Dirichlet process, outline a number of the basic properties of this process and move on to the mixture of Dirichlet processes model, including a quick discussion of the computational methods used to fit the model. We touch on the main philosophies for nonparametric Bayesian data analysis and then reanalyze a famous data set. The reanalysis illustrates the concept of admissibility through a novel perturbation of the problem and data, showing the benefit of shrinkage estimation and the much greater benefit of nonparametric Bayesian modelling. We conclude with a too-brief survey of fancier nonparametric Bayesian methods.
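One of the standard constructions of the Dirichlet process is stick-breaking. The sketch below draws a truncated approximation; the truncation level, the function names, and the `base_sampler` callback are assumptions for illustration and do not come from the paper.

```python
import random

def stick_breaking(alpha, n_atoms, rng):
    """Truncated stick-breaking weights for a Dirichlet process with mass alpha."""
    weights, remaining = [], 1.0
    for _ in range(n_atoms - 1):
        v = rng.betavariate(1.0, alpha)  # fraction of the remaining stick to break off
        weights.append(remaining * v)
        remaining *= 1.0 - v
    weights.append(remaining)  # truncation: the last atom absorbs the leftover mass
    return weights

def draw_dp(alpha, n_atoms, base_sampler, seed=0):
    """Draw atoms and weights of an approximate DP(alpha, G0) realization.

    base_sampler(rng) should return one draw from the base measure G0.
    """
    rng = random.Random(seed)
    weights = stick_breaking(alpha, n_atoms, rng)
    atoms = [base_sampler(rng) for _ in range(n_atoms)]
    return atoms, weights
```

For example, `draw_dp(2.0, 50, lambda r: r.gauss(0.0, 1.0))` gives 50 atoms from a standard normal base measure. The realization is a discrete distribution, and smaller values of `alpha` concentrate the mass on fewer atoms.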

Practical statistics in pain research

  • Kim, Tae Kyun
    • The Korean Journal of Pain
    • /
    • v.30 no.4
    • /
    • pp.243-249
    • /
    • 2017
  • Pain is subjective, while statistics related to pain research are objective. This review was written to help researchers involved in pain research make statistical decisions. The main issues concern the level of scales often used in pain research, the choice between parametric and nonparametric statistics, and problems that arise from repeated measurements. In the field of pain research, parametric statistics used to be applied in an erroneous way. This is closely related to the scales of data and repeated measurements. The level of scales includes nominal, ordinal, interval, and ratio scales, and it affects the choice between parametric and nonparametric methods. In the field of pain research, the most frequently used pain assessment scale is the ordinal scale, which would include the visual analogue scale (VAS). There used to be another view, however, which considered the VAS to be an interval or ratio scale, so that the use of parametric statistics would be accepted practically in some cases. Repeated measurements of the same subjects always complicate statistics: the measurements are inevitably correlated with each other, which precludes the application of one-way ANOVA, in which independence between the measurements is necessary. Repeated measures ANOVA (RMANOVA), however, permits the comparison between correlated measurements as long as the sphericity assumption is satisfied. In conclusion, parametric statistical methods should be used only when the assumptions of parametric statistics, such as normality and sphericity, are established.

Nonparametric Estimation of Mean Residual Life Function under Random Censorship

  • Park, Byung-Gu;Sohn, Joong-Kweon;Lee, Sang-Bock
    • Journal of the Korean Statistical Society
    • /
    • v.22 no.2
    • /
    • pp.147-157
    • /
    • 1993
  • In survival analysis, the problem of estimating the mean residual life function (MRLF) under random censoring is very important. In this paper we propose and study a nonparametric estimator of the MRLF, which is a functional form based on the estimator of the survival function due to Susarla and Van Ryzin (1980). The proposed estimator is shown to be better than some other estimators in terms of mean squared error for the exponential and Weibull cases via Monte Carlo simulation studies.
