• Title/Summary/Keyword: Statistics Classification

Improvement and Analysis for an Electrical Fire Cause Classification (전기화재원인분류의 문제점 분석 및 개선안 제시)

  • Lee, Jong-Ho;Kim, Doo-Hyun;Kim, Sung-Chul
    • Fire Science and Engineering / v.23 no.2 / pp.36-40 / 2009
  • This paper presents research on developing an electrical fire cause classification in order to improve the reliability of electrical fire statistics and to collect electrical fire data efficiently. Incorrect and biased knowledge about electrical fires has caused certain types of fires to be reclassified from non-electrical to electrical. A standardized form is both convenient and necessary, so that fire investigators assessing the cause of an electrical fire can either tick the appropriate box on the fire report form directly or assess a text description. This study proposes a newly developed electrical fire cause classification structure: a well-defined hierarchy in which cause categories neither depend on nor overlap with one another. The proposed structure can be used for electrical fire investigation and statistics, and it minimizes the mistake of diagnosing non-electrical fires as electrical ones.

Local Linear Logistic Classification of Microarray Data Using Orthogonal Components (직교요인을 이용한 국소선형 로지스틱 마이크로어레이 자료의 판별분석)

  • Baek, Jang-Sun;Son, Young-Sook
    • The Korean Journal of Applied Statistics / v.19 no.3 / pp.587-598 / 2006
  • In microarray data, the number of variables exceeds the number of samples. We propose a nonparametric local linear logistic classification procedure using orthogonal components for classifying high-dimensional microarray data. The proposed method is based on the local likelihood and can be applied to multi-class classification. We applied the local linear logistic classification method, using PCA, PLS, and factor analysis components as new features, to leukemia data and colon data, and compared its performance with that of conventional statistical classification procedures. The proposed method outperforms the conventional ones for each type of component, and among the three orthogonal components PLS performs best when embedded in the proposed method.

Comparison Studies of Classification Methods based on L1-Distance and L1-Data Depth (L1-거리와 L1-데이터뎁스를 이용한 분류방법의 비교연구)

  • Baek, Soo-Jin;Hwang, Jin-Soo;Kim, Jean-Kyung
    • The Korean Journal of Applied Statistics / v.19 no.1 / pp.183-193 / 2006
  • We consider a new classification method (DnDclass) that combines two classification rules, one based on $L_1$-distance (L1DISTclass) and one based on $L_1$-data depth (L1DDclass). To investigate the characteristics and evaluate the performance of these classification methods, we use simulated data in various settings. This simulation study confirms that the new method, DnDclass, performs relatively well in many cases.

Adaptive Nearest Neighbors for Classification (Adaptive Nearest Neighbors를 활용한 판별분류방법)

  • Jhun, Myoung-Shic;Choi, In-Kyung
    • The Korean Journal of Applied Statistics / v.22 no.3 / pp.479-488 / 2009
  • ${\kappa}$-Nearest Neighbors Classification (KNNC) is a popular nonparametric classification method that assigns a fixed number ${\kappa}$ of neighbors to every observation without considering the local features of each observation. In this paper, we propose Adaptive Nearest Neighbors Classification (ANNC) as an alternative to KNNC. The proposed ANNC method adapts the number of neighbors to local features of the observation, such as the density of the data. To verify the characteristics of ANNC, we compare its number of misclassified observations with that of KNNC in a Monte Carlo study and confirm the potential performance of the ANNC method.
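
For readers unfamiliar with the fixed-neighbor baseline, the following is a minimal sketch of KNNC as a majority vote among the nearest training points. The function name `knn_predict`, the Euclidean metric, and the toy data are assumptions made for illustration; the abstract does not state how ANNC chooses its neighbor count, so only the fixed-${\kappa}$ rule is shown.

```python
import numpy as np
from collections import Counter

def knn_predict(X_train, y_train, x, k=3):
    """Classify x by majority vote among its k nearest training points.

    Euclidean distance is an assumption; the abstract names no metric.
    """
    dists = np.linalg.norm(X_train - x, axis=1)   # distance to every training point
    nearest = np.argsort(dists)[:k]               # indices of the k closest points
    votes = Counter(y_train[i] for i in nearest)  # count labels among the neighbors
    return votes.most_common(1)[0][0]

# Hypothetical toy data: two well-separated clusters.
X_train = np.array([[0.0, 0.0], [0.1, 0.2], [0.2, 0.1],
                    [1.0, 1.0], [1.1, 0.9], [0.9, 1.2]])
y_train = np.array([0, 0, 0, 1, 1, 1])

pred_low = knn_predict(X_train, y_train, np.array([0.15, 0.15]), k=3)
pred_high = knn_predict(X_train, y_train, np.array([1.0, 1.05]), k=3)
```

Here `k` is the same for every query point, which is the limitation the abstract attributes to KNNC.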

Standard Criterion of VUS for ROC Surface (ROC 곡면에서 VUS의 판단기준)

  • Hong, C.S.;Jung, E.S.;Jung, D.G.
    • The Korean Journal of Applied Statistics / v.26 no.6 / pp.977-985 / 2013
  • In the real world, many situations are classified into more than two categories. In this work, we consider the ROC surface and the VUS (volume under the ROC surface), which are graphical representation methods for classification models with three categories. The standard criteria of AUC for the probability of default based on Basel II are extended to the VUS for the ROC surface; thus, standardized criteria of VUS for the classification model are proposed. The ranges of the AUC, K-S, and mean difference statistics corresponding to the VUS values for each class of the standard criteria are obtained. The standard criteria of VUS for the ROC surface can be established by exploring the relationships among these statistics.
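
As a rough illustration of the quantity being standardized, the VUS for three ordered classes is commonly estimated as the fraction of score triples, one from each class, that appear in the correct order. The sketch below assumes that definition; the function name `empirical_vus` and the toy scores are hypothetical, and ties are simply counted as incorrectly ordered, which the abstract does not address.

```python
import numpy as np

def empirical_vus(s1, s2, s3):
    """Fraction of triples (a, b, c), one score per class, with a < b < c.

    s1, s2, s3 hold classifier scores for the lowest, middle, and highest
    class. Ties count as not ordered (an assumption of this sketch).
    """
    s1, s2, s3 = (np.asarray(s, dtype=float) for s in (s1, s2, s3))
    # ordered[i, j, k] is True when s1[i] < s2[j] < s3[k]
    ordered = (s1[:, None, None] < s2[None, :, None]) & \
              (s2[None, :, None] < s3[None, None, :])
    return float(ordered.mean())

vus_perfect = empirical_vus([1, 2], [3, 4], [5, 6])   # every triple is ordered
vus_reversed = empirical_vus([5, 6], [3, 4], [1, 2])  # no triple is ordered
vus_partial = empirical_vus([1, 4], [2, 3], [5])      # half of the triples are ordered
```

Under this definition a classifier that scores at random has an expected VUS of 1/6, since only one of the six orderings of three distinct scores is correct.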

Classification Analysis for Unbalanced Data (불균형 자료에 대한 분류분석)

  • Kim, Dongah;Kang, Suyeon;Song, Jongwoo
    • The Korean Journal of Applied Statistics / v.28 no.3 / pp.495-509 / 2015
  • We study the classification problem in which the proportions of two groups differ significantly, known as the unbalanced classification problem. It is usually more difficult to classify classes accurately in unbalanced data than in balanced data. If we apply classification methods to unbalanced data, most observations are likely to be classified into the bigger group because doing so minimizes the misclassification loss. However, misclassifying the smaller group as the larger group can cause a bigger loss in most real applications. We compare several classification methods for unbalanced data using sampling techniques (up and down sampling). We also check the total loss of the different classification methods when an asymmetric loss is applied to simulated and real data. We use the misclassification rate, G-mean, ROC, and AUC (area under the curve) for the performance comparison.
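
The point about the bigger group can be made concrete with a small sketch. It assumes the usual two-class definition of G-mean (the square root of sensitivity times specificity) and plain random up-sampling with replacement; the function names, the 9-to-1 toy labels, and the seed are hypothetical and not taken from the paper.

```python
import numpy as np

def misclassification_rate(y_true, y_pred):
    """Share of observations whose predicted label is wrong."""
    return float(np.mean(y_true != y_pred))

def g_mean(y_true, y_pred, positive=1):
    """Geometric mean of sensitivity and specificity (two-class case)."""
    pos = y_true == positive
    sensitivity = np.mean(y_pred[pos] == positive)   # recall on the positive class
    specificity = np.mean(y_pred[~pos] != positive)  # recall on the other class
    return float(np.sqrt(sensitivity * specificity))

def upsample_minority(X, y, minority=1, seed=0):
    """Resample the minority class with replacement until both classes match.

    Assumes the minority class really is the smaller one.
    """
    rng = np.random.default_rng(seed)
    minority_idx = np.flatnonzero(y == minority)
    majority_idx = np.flatnonzero(y != minority)
    extra = rng.choice(minority_idx,
                       size=len(majority_idx) - len(minority_idx),
                       replace=True)
    keep = np.concatenate([majority_idx, minority_idx, extra])
    return X[keep], y[keep]

# Hypothetical 9:1 data and a rule that always predicts the bigger group.
y_true = np.array([0] * 9 + [1])
always_majority = np.zeros(10, dtype=int)
rate = misclassification_rate(y_true, always_majority)  # low error rate
gm = g_mean(y_true, always_majority)                     # but zero G-mean

X = np.arange(10).reshape(-1, 1)
X_up, y_up = upsample_minority(X, y_true)
```

The always-majority rule looks good on misclassification rate yet has a G-mean of zero, because it never identifies the smaller group. This is one reason the two measures are reported together.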

On EM Algorithm For Discrete Classification With Bahadur Model: Unknown Prior Case

  • Kim, Hea-Jung;Jung, Hun-Jo
    • Journal of the Korean Statistical Society / v.23 no.1 / pp.63-78 / 1994
  • For discrimination with binary variables, reformulated full and first-order Bahadur models with incomplete observations are presented. This allows the prior probabilities associated with multiple populations to be estimated for the sample-based classification rule. The EM algorithm is adopted to provide the maximum likelihood estimates of the parameters of interest. Some experiences with the models are evaluated and discussed.

Confidence Intervals on Variance Components in Two-Way Classification with Interaction Model

  • Kim, Jung I.;Park, Sung H.
    • Journal of Korean Society for Quality Management / v.10 no.1 / pp.7-12 / 1982
  • Arvesen (1969) presented a procedure that produces an approximate confidence interval for a variance component in the unbalanced one-way classification model. In this paper, his work is extended to the case where the model of interest is an unbalanced two-way classification. Following the procedure described in this paper, approximate confidence intervals are computed by Monte Carlo simulation.

Pre-Adjustment of Incomplete Group Variable via K-Means Clustering

  • Hwang, S.Y.;Hahn, H.E.
    • Journal of the Korean Data and Information Science Society / v.15 no.3 / pp.555-563 / 2004
  • In classification and discrimination, we are often faced with an incomplete group variable, typically arising from many missing values and/or cases that are not credible. This paper suggests using K-means clustering to pre-adjust for the incompleteness, after which classification based on a generalized statistical distance is performed. To illustrate the proposed procedure, a simulation study compares it with CART from data mining and with traditional techniques that ignore the incompleteness of the group variable. The simulation study shows that our methodology outperforms these alternatives.
