• Title/Summary/Keyword: statistical measures

Equivalence-Singularity Dichotomies of Gaussian and Poisson Processes from The Kolmogorov's Zero-One Law

  • Park, Jeong-Soo
    • Journal of the Korean Statistical Society
    • /
    • v.23 no.2
    • /
    • pp.367-378
    • /
    • 1994
  • Let P and Q be probability measures on a measurable space $(\Omega, F)$, and let $\{F_n\}_{n \geq 1}$ be an increasing sequence of sub-$\sigma$-fields that generates F. For each $n \geq 1$, let $P_n$ and $Q_n$ be the restrictions of P and Q to $F_n$, respectively. Under the assumption that $Q_n \ll P_n$ for every $n \geq 1$, a zero-one condition is derived for P and Q to satisfy the dichotomy, i.e., either $Q \ll P$ or $Q \perp P$. Using this condition and Kolmogorov's zero-one law, we then give new and simple proofs, with examples, of the dichotomy theorems for pairs of Gaussian measures and of Poisson processes.

A Study on Decision Tree for Multiple Binary Responses

  • Lee, Seong-Keon
    • Communications for Statistical Applications and Methods
    • /
    • v.10 no.3
    • /
    • pp.971-980
    • /
    • 2003
  • The tree method can be extended to multivariate responses, such as repeated measures and longitudinal data, by modifying the split function to accommodate multiple responses. Decision trees for multiple responses have been constructed by Segal (1992) and Zhang (1998). Segal suggested a tree that can analyze continuous longitudinal responses using the Mahalanobis distance as a within-node homogeneity measure, and Zhang suggested a tree that can analyze multiple binary responses using a generalized entropy criterion that is proportional to the maximum likelihood of the joint distribution of the multiple binary responses. In this paper, we modify the CART procedure and suggest a new tree-based method that can analyze multiple binary responses using similarity measures.

Recovery Levels of Clustering Algorithms Using Different Similarity Measures for Functional Data

  • Chae, Seong San;Kim, Chansoo;Warde, William D.
    • Communications for Statistical Applications and Methods
    • /
    • v.11 no.2
    • /
    • pp.369-380
    • /
    • 2004
  • Clustering algorithms with different similarity measures are commonly used to find an optimal clustering or one close to the original clustering. The recovery level obtained with Euclidean distance and with distances transformed from correlation coefficients is evaluated and compared using Rand's (1971) C statistic. The C values indicate how close the resulting clustering is to the original clustering. In a simulation study, the recovery level improves when the correlation coefficients between objects are used. Recovery levels with different similarity measures are also presented for the data set from Spellman et al. (1998). In general, the recovery level of the true clusters increased when the correlation coefficients were used.
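As a rough illustration of the comparison described in this abstract, the sketch below computes Rand's agreement statistic between two labelings and one possible correlation-based distance. The abstract does not say how correlations are transformed into distances, so the transform `1 - r` and both function names are assumptions made for illustration only.

```python
import math
from itertools import combinations


def rand_index(labels_a, labels_b):
    """Share of object pairs that two clusterings treat the same way."""
    if len(labels_a) != len(labels_b):
        raise ValueError("both labelings must cover the same objects")
    pairs = list(combinations(range(len(labels_a)), 2))
    if not pairs:
        raise ValueError("need at least two objects")
    # A pair agrees when both clusterings join it, or both split it.
    agreements = sum(
        (labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
        for i, j in pairs
    )
    return agreements / len(pairs)


def correlation_distance(x, y):
    """Assumed transform d = 1 - r, with r the Pearson correlation.

    Raises ZeroDivisionError if either profile is constant.
    """
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return 1.0 - sxy / math.sqrt(sxx * syy)
```

A value of 1 from `rand_index` means the recovered clustering matches the original one up to a relabeling of the clusters; lower values mean more object pairs are grouped differently.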

STATISTICAL NOISE BAND REMOVAL FOR SURFACE CLUSTERING OF HYPERSPECTRAL DATA

  • Huan, Nguyen Van;Kim, Hak-Il
    • Proceedings of the KSRS Conference
    • /
    • 2008.10a
    • /
    • pp.111-114
    • /
    • 2008
  • Noise bands may deform the typical shape of the spectrum and thereby degrade clustering accuracy. This paper proposes a statistical approach to removing noise bands from hyperspectral data, using the correlation coefficient between bands as an indicator. When each band is treated as a random variable, two adjacent signal bands in hyperspectral data are highly correlated, whereas the presence of a noise band produces a low correlation. For clustering, the unsupervised $k$-nearest neighbor clustering method is implemented with three well-accepted spectral matching measures, namely ED, SAM and SID. The paper also proposes a hierarchical scheme for combining those measures. Finally, a separability assessment based on the between-class and within-class scatter matrices is carried out to evaluate the applicability of the proposed noise band removal method, and the spectral matching measures are compared.
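The adjacent-band idea in this abstract can be sketched as follows. This is only a minimal illustration, not the authors' procedure: the decision rule (flag a band whose correlation with every adjacent band falls below a cutoff), the cutoff value 0.9, and the function names are all assumptions.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if undefined)."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant band carries no correlation information
    return sxy / math.sqrt(sxx * syy)


def flag_noise_bands(bands, threshold=0.9):
    """Return indices of bands that correlate poorly with all neighbours.

    bands:     list of bands, each a list of pixel values in the same order.
    threshold: hypothetical cutoff; a real data set would need tuning.
    """
    flagged = []
    for b in range(len(bands)):
        neighbours = [n for n in (b - 1, b + 1) if 0 <= n < len(bands)]
        if not neighbours:
            continue  # a single band cannot be judged this way
        best = max(pearson(bands[b], bands[n]) for n in neighbours)
        if best < threshold:
            flagged.append(b)
    return flagged
```

Taking the maximum over both neighbours keeps a clean band that merely sits next to a noisy one, since it still correlates well with its other neighbour.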

ESTIMATING VARIOUS MEASURES IN NORMAL POPULATION THROUGH A SINGLE CLASS OF ESTIMATORS

  • Sharad Saxena;Housila P. Singh
    • Journal of the Korean Statistical Society
    • /
    • v.33 no.3
    • /
    • pp.323-337
    • /
    • 2004
  • This article proposes a general class of estimators for various measures in a normal population when an a priori or guessed value of the standard deviation $\sigma$ is available in addition to sample information. The class of estimators is primarily defined for a function of the standard deviation. An unbiased estimator and the minimum mean squared error estimator are worked out, and the suggested class of estimators is compared with these classical estimators. Numerical computations of percent relative efficiency and absolute relative bias establish the merits of the proposed class of estimators, especially for small samples. A simulation study confirms the good performance of the proposed class. A notable feature of this article is that various measures, such as the standard deviation, variance, Fisher information, precision of the sample mean, process capability index $C_{p}$, fourth moment about the mean, and mean deviation about the mean, are estimated as particular cases of the proposed class of estimators.

MEASURES FOR STABILITY OF SLOPE ESTIMATION ON THE SECOND ORDER RESPONSE SURFACE AND EQUALLY-STABLE SLOPE ROTATABILITY

  • Park, Sung H.;Kang, Ho-Seog;Kang, Kee-Hoon
    • Journal of the Korean Statistical Society
    • /
    • v.32 no.4
    • /
    • pp.337-357
    • /
    • 2003
  • This paper introduces new measures for the stability of slope estimation on the second order response surface at a point and on a sphere. As a measure of point stability of slope estimation, we suggest a point dispersion measure of slope variances over all directions at a point. A spherical dispersion measure is also proposed as a measure of spherical stability of slope estimation on each sphere. Some designs are studied to explore the usefulness of the proposed measures. Using the point dispersion measure, another concept of slope rotatability called equally-stable slope rotatability is proposed as a useful property of response surface designs. We provide a set of conditions for a design to have equally-stable slope rotatability.

A Note on the Efficiency Based Reliability Measures for Heterogeneous Populations

  • Cha, Ji-Hwan
    • Communications for Statistical Applications and Methods
    • /
    • v.18 no.2
    • /
    • pp.201-211
    • /
    • 2011
  • In many cases, real-world populations are composed of different subpopulations. In addition to heterogeneity in the lifetimes of items, there may also be heterogeneity in the efficiency or performance of items. In this case, the reliability measures should be defined in a different way. In this article, we consider mixtures of stochastically ordered subpopulations. Efficiency based reliability measures are defined for the case in which the performance of items in the subpopulations has different levels. Discrete and continuous mixing models are studied. The concept of association between the lifetime and the performance of items in subpopulations is defined. It is shown that taking efficiency into account can change the shape of the mixture failure rate dramatically, especially when the lifetime and the performance of items in subpopulations are negatively associated. Furthermore, the modelling method proposed in this paper is applied to the case in which the stress levels of the operating environments of items differ.

Input Variable Importance in Supervised Learning Models

  • Huh, Myung-Hoe;Lee, Yong Goo
    • Communications for Statistical Applications and Methods
    • /
    • v.10 no.1
    • /
    • pp.239-246
    • /
    • 2003
  • Statisticians, or data miners, are often asked to assess the importance of input variables in a given supervised learning model. For this purpose, one may rely on separate ad hoc measures that depend on the model type, such as linear regression, neural networks or trees. Consequently, input variable importance measures lack conceptual consistency, so they cannot be used directly to compare different types of models, as is often done in data mining. In this short communication, we propose a unified approach to measuring the importance of input variables. Our method uses sensitivity analysis, which begins by perturbing the values of the input variables and then monitors the change in the output. The scope of the research is limited to models for continuous output, although it is not difficult to extend the method to supervised learning models for categorical outcomes.
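The perturb-and-monitor idea in this abstract can be illustrated with a model-agnostic sketch. The abstract does not specify the perturbation scheme, so the choices here (shift one input at a time by a fraction of its standard deviation, then average the absolute change in output) and the names `perturbation_importance` and `scale` are assumptions, not the authors' method.

```python
import statistics


def perturbation_importance(predict, rows, scale=0.1):
    """Mean absolute output change when each input is shifted in turn.

    predict: callable mapping one row (a list of numbers) to a number;
             any fitted model with continuous output can be wrapped this way.
    rows:    list of input rows, all of the same length.
    scale:   shift size as a fraction of each column's standard deviation
             (a hypothetical default, not taken from the paper).
    """
    baseline = [predict(row) for row in rows]
    scores = []
    for j in range(len(rows[0])):
        step = scale * statistics.pstdev([row[j] for row in rows])
        changes = []
        for row, base_output in zip(rows, baseline):
            shifted = list(row)
            shifted[j] += step  # perturb only variable j
            changes.append(abs(predict(shifted) - base_output))
        scores.append(sum(changes) / len(changes))
    return scores
```

Because the function only calls `predict`, the same scores can be computed for a regression, a neural network or a tree, which is the kind of cross-model comparability the abstract argues for.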

Generalized Measure of Departure From Global Symmetry for Square Contingency Tables with Ordered Categories

  • Tomizawa, Sadao;Saitoh, Kayo
    • Journal of the Korean Statistical Society
    • /
    • v.27 no.3
    • /
    • pp.289-303
    • /
    • 1998
  • For square contingency tables with ordered categories, Tomizawa (1995) considered two kinds of measures to represent the degree of departure from global symmetry, which means that the probability that an observation falls in one of the cells in the upper-right triangle of the square table is equal to the probability that it falls in one of the cells in the lower-left triangle. This paper proposes a generalization of those measures. The proposed measure is expressed using Cressie and Read's (1984) power divergence or Patil and Taillie's (1982) diversity index. Special cases of the proposed measure include Tomizawa's measures. The proposed measure would be useful for comparing the degree of departure from global symmetry in several tables.

Empirical Comparisons of Disparity Measures for Partial Association Models in Three Dimensional Contingency Tables

  • Jeong, D.B.;Hong, C.S.;Yoon, S.H.
    • Communications for Statistical Applications and Methods
    • /
    • v.10 no.1
    • /
    • pp.135-144
    • /
    • 2003
  • This work is concerned with the comparison of recently developed disparity measures for the partial association model in three dimensional categorical data. Data are generated by simulating each term in the log-linear model equation based on the partial association model, which is the method proposed in this paper. These alternative Monte Carlo methods are used to study the behavior of disparity measures such as the power divergence statistic I(λ), the Pearson chi-square statistic $X^2$, the likelihood ratio statistic $G^2$, the blended weight chi-square statistic BWCS(λ), the blended weight Hellinger distance statistic BWHD(λ), and the negative exponential disparity statistic NED(λ) for moderate sample sizes. We find that the power divergence statistic I(2/3) and the blended weight Hellinger distance family BWHD(1/9) are the best tests with respect to size and power.
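For orientation, the power divergence family named in this abstract can be sketched in its standard Cressie and Read form, $\frac{2}{\lambda(\lambda+1)} \sum_i O_i \left[ (O_i/E_i)^{\lambda} - 1 \right]$, with observed counts $O_i$ and expected counts $E_i$. Whether the paper uses exactly this scaling is an assumption, and the function below is an illustrative sketch rather than the authors' simulation code.

```python
import math


def power_divergence(observed, expected, lam):
    """Cressie-Read power divergence statistic for index lam.

    When observed and expected totals match, lam = 1 gives the Pearson
    chi-square statistic; lam = 0 is handled as the limiting case, the
    likelihood ratio statistic. This sketch assumes strictly positive counts.
    """
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    if any(o <= 0 or e <= 0 for o, e in zip(observed, expected)):
        raise ValueError("this sketch assumes strictly positive counts")
    if lam == -1:
        raise ValueError("lam = -1 is a limiting case not covered here")
    if lam == 0:
        # Limit as lam -> 0: 2 * sum O * log(O / E)
        return 2.0 * sum(o * math.log(o / e) for o, e in zip(observed, expected))
    total = sum(o * ((o / e) ** lam - 1.0) for o, e in zip(observed, expected))
    return 2.0 * total / (lam * (lam + 1.0))
```

For example, `power_divergence(counts, fitted, 2 / 3)` would evaluate I(2/3) for a table, where `counts` and `fitted` are hypothetical flattened lists of cell counts and fitted values from the log-linear model.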