• Title/Summary/Keyword: statistical measures


Comparison of time series clustering methods and application to power consumption pattern clustering

  • Kim, Jaehwi;Kim, Jaehee
    • Communications for Statistical Applications and Methods
    • /
    • v.27 no.6
    • /
    • pp.589-602
    • /
    • 2020
  • The development of smart grids has enabled the easy collection of a large amount of power data. Power consumption follows some common patterns, which makes clustering useful when analyzing big power data. In this paper, clustering analysis is based on distance functions for time series and clustering algorithms to discover patterns in power consumption data. In clustering, we use 10 distance measures to find clusters that reflect the characteristics of time series data. A simulation study is done to compare the distance measures for clustering. Cluster validity measures such as error rate, similarity index, Dunn index and silhouette values are also calculated and compared. Real power consumption data are then clustered with the five distance measures that performed better than the others in the simulation.
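The Dunn index named in this abstract can be illustrated with a short sketch. The abstract does not list its 10 distance measures, so the Euclidean distance below is only an assumed stand-in, and `dunn_index` and `euclidean` are hypothetical helper names. The index is taken here as the smallest between-cluster distance divided by the largest within-cluster distance, so larger values indicate compact, well-separated clusters.

```python
from itertools import combinations


def euclidean(a, b):
    """Euclidean distance between two equal-length series (assumed stand-in)."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def dunn_index(series, labels, dist=euclidean):
    """Smallest between-cluster distance over largest within-cluster distance."""
    min_between = float("inf")
    max_within = 0.0
    for i, j in combinations(range(len(series)), 2):
        d = dist(series[i], series[j])
        if labels[i] == labels[j]:
            max_within = max(max_within, d)
        else:
            min_between = min(min_between, d)
    if max_within == 0.0:
        raise ValueError("every cluster is a single point or has zero diameter")
    return min_between / max_within


# Toy data: two short "load curves" per cluster (illustrative values only).
toy_series = [[0, 0], [0, 1], [5, 5], [5, 6]]
toy_labels = [0, 0, 1, 1]
print(dunn_index(toy_series, toy_labels))
```

Any other time series distance can be passed through the `dist` argument, which is how different distance measures could be compared under the same validity index.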

Accuracy of Multiple Outlier Tests in Nonlinear Regression

  • Kahng, Myung-Wook
    • Communications for Statistical Applications and Methods
    • /
    • v.18 no.1
    • /
    • pp.131-136
    • /
    • 2011
  • The original Bates-Watts framework applies only to the complete parameter vector. Thus, guidelines developed in that framework can be misleading when the adequacy of the linear approximation is very different for different subsets. The subset curvature measures appear to be reliable indicators of the adequacy of linear approximation for an arbitrary subset of parameters in nonlinear models. Given the specific mean shift outlier model, the standard approaches to obtaining test statistics for outliers are discussed. The accuracy of outlier tests is investigated using subset curvatures.

A View on Extension of Utility-Based on Links with Information Measures

  • Hoseinzadeh, A.R.;Borzadaran, G.R.Mohtashami;Yari, G.H.
    • Communications for Statistical Applications and Methods
    • /
    • v.16 no.5
    • /
    • pp.813-820
    • /
    • 2009
  • In this paper, we review the utility-based generalizations of the Shannon entropy and the Kullback-Leibler information measure, namely the U-entropy and the U-relative entropy, which were introduced by Friedman et al. (2007). Then, we derive some relations between the U-relative entropy and other information measures based on a parametric family of utility functions.
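As background for the two classical quantities being generalized, here is a minimal sketch of the Shannon entropy and the Kullback-Leibler divergence for discrete distributions. It does not implement the U-entropy or U-relative entropy of the paper; the function names and the use of natural logarithms are assumptions made for illustration.

```python
import math


def shannon_entropy(p):
    """H(p) = -sum p_i log p_i, in nats; zero-probability terms contribute 0."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0)


def kl_divergence(p, q):
    """D(p || q) = sum p_i log(p_i / q_i); assumes q_i > 0 wherever p_i > 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


p = [0.5, 0.5]
q = [0.25, 0.75]
print(shannon_entropy(p), kl_divergence(p, q))
```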

Minimum Distance Estimation Based On The Kernels For U-Statistics

  • Park, Hyo-Il
    • Journal of the Korean Statistical Society
    • /
    • v.27 no.1
    • /
    • pp.113-132
    • /
    • 1998
  • In this paper, we consider minimum distance (M.D.) estimation based on kernels for U-statistics. We use a Cramér-von Mises type distance function, which measures the discrepancy between the U-empirical distribution function (d.f.) and the modeled d.f. of the kernel. In the distance function, we allow various integrating measures, which can be finite, $\sigma$-finite or discrete. Then we derive the asymptotic normality and study the qualitative robustness of the M.D. estimates.
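The idea of minimum distance estimation can be sketched in its simplest setting. The code below uses the ordinary empirical d.f. rather than the U-empirical d.f. of the paper, an exponential model, and a grid search over candidate rates; all three choices, and the function names, are assumptions made for illustration. The distance is the standard one-sample Cramér-von Mises statistic $W^2 = \frac{1}{12n} + \sum_{i=1}^{n}\left(\frac{2i-1}{2n} - F(x_{(i)})\right)^2$.

```python
import math


def cramer_von_mises(sample, cdf):
    """One-sample Cramer-von Mises statistic between the empirical d.f. and cdf."""
    xs = sorted(sample)
    n = len(xs)
    return 1.0 / (12 * n) + sum(
        ((2 * i - 1) / (2 * n) - cdf(x)) ** 2 for i, x in enumerate(xs, start=1)
    )


def exponential_cdf(rate):
    """Return the d.f. of an exponential distribution with the given rate."""
    return lambda x: 1.0 - math.exp(-rate * x) if x > 0 else 0.0


def minimum_distance_rate(sample, candidate_rates):
    """Pick the candidate rate whose model d.f. is closest to the empirical d.f."""
    return min(
        candidate_rates,
        key=lambda r: cramer_von_mises(sample, exponential_cdf(r)),
    )


# Illustrative sample: exact quantiles of an exponential with rate 2.
n = 8
sample = [-math.log(1 - (2 * i - 1) / (2 * n)) / 2.0 for i in range(1, n + 1)]
print(minimum_distance_rate(sample, [0.5, 1.0, 2.0, 4.0]))
```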

Different estimation methods for the unit inverse exponentiated Weibull distribution

  • Amal S Hassan;Reem S Alharbi
    • Communications for Statistical Applications and Methods
    • /
    • v.30 no.2
    • /
    • pp.191-213
    • /
    • 2023
  • Unit distributions are frequently used in probability theory and statistics to describe variables that take values between zero and one. Using a convenient transformation, this study proposes the unit inverse exponentiated Weibull (UIEW) distribution, which is useful for modelling data on the unit interval. The statistical properties provided for this distribution include the quantile function, moments, incomplete moments, uncertainty measures, stochastic ordering, and stress-strength reliability. To estimate the parameters of the proposed distribution, well-known estimation techniques are used: maximum likelihood, maximum product of spacings, least squares, weighted least squares, Cramér-von Mises, Anderson-Darling, and Bayesian. Using simulated data, we compare how well the various estimators perform. According to the simulation results, the maximum product of spacings estimates have lower values of the accuracy measures than the alternative estimates in the majority of situations. For two real datasets, the proposed model outperforms the beta, Kumaraswamy, unit Gompertz, unit Lomax and complementary unit Weibull distributions based on various comparative indicators.

On Minimal Sufficient Statistics

  • Nabeya, Seiji
    • Journal of the Korean Statistical Society
    • /
    • v.10
    • /
    • pp.83-90
    • /
    • 1981
  • Let (X, A) be a measurable space, i.e. X is a non-empty set and A is a $\sigma$-field of subsets of X. Let $\Omega$ be a parameter space and P be a family of probability measures $P_\theta, \theta \in \Omega$ defined on (X, A).


Optimal Criterion of Classification Accuracy Measures for Normal Mixture (정규혼합에서 분류정확도 측도들의 최적기준)

  • Yoo, Hyun-Sang;Hong, Chong-Sun
    • Communications for Statistical Applications and Methods
    • /
    • v.18 no.3
    • /
    • pp.343-355
    • /
    • 2011
  • For data assumed to follow a mixture distribution, it is important to find an appropriate threshold and evaluate its performance. We establish the relationships among nine well-known classification accuracy measures: MVD, Youden's index, the closest-to-(0, 1) criterion, the amended closest-to-(0, 1) criterion, SSS, symmetry point, accuracy area, TA, and TR. The conditions of these measures are then categorized into seven groups. Under the normal mixture assumption, we calculate thresholds based on these measures and obtain the corresponding type I and II errors. We then explore which classification measure has the minimum type I and II errors for the estimated mixture distribution, in order to understand the strengths and weaknesses of these classification measures.
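One of the nine measures, Youden's index, can be sketched for two normal components. The component parameters, the grid search, and the function name `youden_threshold` are assumptions made for illustration and are not taken from the paper. Here the type I error is taken as the proportion of the lower component classified above the threshold and the type II error as the proportion of the upper component classified below it, which is one common convention.

```python
from statistics import NormalDist


def youden_threshold(neg, pos, grid):
    """Grid search for the threshold maximizing J = sensitivity + specificity - 1."""

    def youden(c):
        specificity = neg.cdf(c)        # lower component correctly below c
        sensitivity = 1.0 - pos.cdf(c)  # upper component correctly above c
        return sensitivity + specificity - 1.0

    best = max(grid, key=youden)
    type_i = 1.0 - neg.cdf(best)   # lower component classified above c
    type_ii = pos.cdf(best)        # upper component classified below c
    return best, youden(best), type_i, type_ii


# Assumed components: N(0, 1) and N(2, 1), with thresholds on a 0.01 grid.
neg = NormalDist(0.0, 1.0)
pos = NormalDist(2.0, 1.0)
grid = [i / 100 for i in range(-300, 501)]
c, j, type_i, type_ii = youden_threshold(neg, pos, grid)
print(c, j, type_i, type_ii)
```

With equal variances the maximizing threshold falls at the midpoint of the two means, and the two error rates coincide; with unequal variances they generally differ.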

Inference on Overlapping Coefficients in Two Exponential Populations Using Ranked Set Sampling

  • Samawi, Hani M.;Al-Saleh, Mohammad F.
    • Communications for Statistical Applications and Methods
    • /
    • v.15 no.2
    • /
    • pp.147-159
    • /
    • 2008
  • We consider using ranked set sampling methods to draw inference about the three well-known measures of overlap, namely Matusita's measure $\rho$, Morisita's measure $\lambda$ and Weitzman's measure $\Delta$. Two exponential populations with different means are considered. Due to the difficulties of calculating the precision or the bias of the resulting estimators of overlap measures, because there are no closed-form exact formulas for their variances and their exact sampling distributions, Monte Carlo evaluations are used. Confidence intervals for those measures are also constructed via the bootstrap method and Taylor series approximation.
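For reference, the population values of the three overlap measures can be computed directly for two exponential densities. The sketch below parameterizes the densities by their rates (an assumption; the abstract speaks of means) and uses the standard definitions $\rho = \int \sqrt{f_1 f_2}$, $\lambda = 2\int f_1 f_2 \,/ \left(\int f_1^2 + \int f_2^2\right)$ and $\Delta = \int \min(f_1, f_2)$, integrated in closed form. It does not implement the ranked set sampling estimators or the bootstrap intervals of the paper, and the function name is hypothetical.

```python
import math


def exponential_overlap(rate1, rate2):
    """Return (rho, lam, delta) for two exponential densities with the given rates."""
    # Matusita: integral of sqrt(f1 * f2) over (0, inf).
    rho = 2.0 * math.sqrt(rate1 * rate2) / (rate1 + rate2)
    # Morisita: 2 * int f1 f2 / (int f1^2 + int f2^2).
    lam = 4.0 * rate1 * rate2 / (rate1 + rate2) ** 2
    # Weitzman: integral of min(f1, f2); the densities cross once at x_star.
    if rate1 == rate2:
        delta = 1.0
    else:
        hi, lo = max(rate1, rate2), min(rate1, rate2)
        x_star = math.log(hi / lo) / (hi - lo)
        # Below x_star the smaller-rate density is the minimum, above it the larger.
        delta = (1.0 - math.exp(-lo * x_star)) + math.exp(-hi * x_star)
    return rho, lam, delta


print(exponential_overlap(2.0, 1.0))
```

All three measures equal 1 when the two rates coincide and decrease toward 0 as the rates move apart.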