• Title/Summary/Keyword: classification error


Estimation of Classification Error Based on the Bhattacharyya Distance for Data with Multimodal Distribution (Multimodal 분포 데이터를 위한 Bhattacharyya distance 기반 분류 에러예측 기법)

  • 최의선;이철희
    • Proceedings of the IEEK Conference / 2000.06d / pp.85-87 / 2000
  • In pattern classification, the Bhattacharyya distance has been used as a class separability measure and provides useful information for feature selection and extraction. In this paper, we propose a method to predict the classification error for multimodal data based on the Bhattacharyya distance. In our approach, we first approximate the pdf of the multimodal distribution with a Gaussian mixture model and then find the Bhattacharyya distance and the classification error. Experimental results showed that there is a strong relationship between the Bhattacharyya distance and the classification error for multimodal data.

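
The quantity discussed in the abstract above can be illustrated with a small sketch. It assumes one-dimensional Gaussian mixtures and evaluates the Bhattacharyya distance B = -ln ∫ sqrt(p_a(x) p_b(x)) dx by the trapezoid rule, then reports the standard Bhattacharyya upper bound on the two-class Bayes error, sqrt(P_a P_b) exp(-B). The function names, the integration range, and the mixture parameters are illustrative assumptions, not the authors' estimation equation.

```python
import math

def gaussian_pdf(x, mean, std):
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

def mixture_pdf(x, components):
    # components: list of (weight, mean, std); weights are assumed to sum to 1
    return sum(w * gaussian_pdf(x, m, s) for w, m, s in components)

def bhattacharyya_distance(mix_a, mix_b, lo=-20.0, hi=20.0, steps=4000):
    # Trapezoid-rule estimate of B = -ln integral sqrt(p_a * p_b) dx.
    # The range [lo, hi] is assumed to cover essentially all of the probability mass.
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps + 1):
        x = lo + i * h
        f = math.sqrt(mixture_pdf(x, mix_a) * mixture_pdf(x, mix_b))
        total += f if 0 < i < steps else 0.5 * f
    return -math.log(total * h)

def bhattacharyya_error_bound(distance, prior_a=0.5, prior_b=0.5):
    # Upper bound on the Bayes error of a two-class problem
    return math.sqrt(prior_a * prior_b) * math.exp(-distance)

# Hypothetical bimodal class versus a unimodal class
class_a = [(0.5, -3.0, 1.0), (0.5, 3.0, 1.0)]
class_b = [(1.0, 0.0, 1.0)]
b = bhattacharyya_distance(class_a, class_b)
print(b, bhattacharyya_error_bound(b))
```

For two single Gaussians N(0, 1) and N(2, 1) the closed-form distance is 0.5, which gives a quick sanity check of the numerical integration.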

Utilizing Principal Component Analysis in Unsupervised Classification Based on Remote Sensing Data

  • Lee, Byung-Gul;Kang, In-Joan
    • Proceedings of the Korean Environmental Sciences Society Conference / 2003.11a / pp.33-36 / 2003
  • Principal component analysis (PCA) was used to improve image classification by an unsupervised classification technique, K-means. To do this, we selected a Landsat TM scene of Jeju Island, Korea, and applied two PCA methods: unstandardized PCA (UPCA) and standardized PCA (SPCA). The accuracy of the image classification of the Jeju area was estimated with an error matrix, which was derived from three unsupervised classification methods. The error matrices indicated that classifications done on the first three principal components of the scene, for both UPCA and SPCA, were more accurate than those done on the seven bands of the raw Landsat TM data. The classification of TM data by the K-means algorithm was particularly poor at distinguishing different land covers on the island. From the classification results, we also found that the principal-component-based classifications were largely independent of the unsupervised technique (numerical algorithm), while the TM-data-based classifications depended strongly on the technique. This means that PCA data have uniform characteristics for image classification that are less affected by the choice of classification scheme. We also found that UPCA results were better than SPCA results, since UPCA retains a wider range of digital numbers in the image.

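
The error matrix mentioned in the abstract above can be sketched as follows. This is a generic illustration that assumes every pixel already has a reference label and a predicted label (for unsupervised output, clusters would first have to be mapped to land-cover classes). The class names and sample labels are made up for the example.

```python
def error_matrix(reference, predicted, classes):
    # Rows: reference class, columns: predicted class
    index = {c: i for i, c in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for ref, pred in zip(reference, predicted):
        matrix[index[ref]][index[pred]] += 1
    return matrix

def overall_accuracy(matrix):
    # Fraction of samples on the diagonal of the error matrix
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total

# Hypothetical labels for four pixels
classes = ["water", "forest", "urban"]
reference = ["water", "water", "forest", "urban"]
predicted = ["water", "forest", "forest", "urban"]
m = error_matrix(reference, predicted, classes)
print(m, overall_accuracy(m))
```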

Bootstrap Confidence Intervals of Classification Error Rate for a Block of Missing Observations

  • Chung, Hie-Choon
    • Communications for Statistical Applications and Methods / v.16 no.4 / pp.675-686 / 2009
  • In this paper, we assume that there are two distinct multivariate normal populations with a common covariance matrix. We also assume that the two populations are equally likely and that the costs of misclassification are equal. The classification rule depends on whether or not the training samples include missing values. We consider bootstrap confidence intervals for the classification error rate when a block of observations is missing.
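
As a rough illustration of the idea in the abstract above, the sketch below computes a percentile bootstrap confidence interval for an error rate from a list of 0/1 misclassification indicators. It is a simplified assumption: it does not model the missing block or the paper's classification rule, and the function name, resample count, and seed are hypothetical.

```python
import random

def bootstrap_error_interval(errors, n_boot=2000, alpha=0.05, seed=0):
    # errors: list of 0/1 indicators (1 = misclassified)
    rng = random.Random(seed)
    n = len(errors)
    rates = []
    for _ in range(n_boot):
        # Resample the indicators with replacement and record the error rate
        sample = [errors[rng.randrange(n)] for _ in range(n)]
        rates.append(sum(sample) / n)
    rates.sort()
    # Percentile interval from the sorted bootstrap error rates
    lower = rates[int((alpha / 2) * n_boot)]
    upper = rates[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper

# Hypothetical test set: 10 errors out of 100 samples
errors = [1] * 10 + [0] * 90
print(bootstrap_error_interval(errors))
```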

Data-Adaptive ECOC for Multicategory Classification

  • Seok, Kyung-Ha
    • Journal of the Korean Data and Information Science Society / v.19 no.1 / pp.25-36 / 2008
  • Error Correcting Output Codes (ECOC) can improve generalization performance when applied to multicategory classification problems. In this study we propose a new criterion for selecting the hyperparameters of the ECOC scheme. Instead of the margins of the data, we propose to use the probability of misclassification, since it makes the criterion simple. Using this criterion, we obtain an upper bound on the leave-one-out error of the OVA (one vs all) method. Our experiments on real and synthetic data indicate that the bound leads to good estimates of the parameters.

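
For readers unfamiliar with ECOC, the sketch below shows the basic decoding step only: each class has a binary codeword, each bit is predicted by one binary classifier, and the predicted bit string is assigned to the class with the nearest codeword in Hamming distance. The 7-bit code matrix and the class names are illustrative assumptions; the hyperparameter selection criterion proposed in the paper is not reproduced here.

```python
def one_vs_all_code(classes):
    # OVA code: bit j is 1 only for the class that binary learner j treats as positive
    return {c: [1 if i == j else 0 for j in range(len(classes))]
            for i, c in enumerate(classes)}

def hamming(a, b):
    return sum(x != y for x, y in zip(a, b))

def decode(bits, code):
    # Pick the class whose codeword is closest to the predicted bits
    return min(code, key=lambda c: hamming(bits, code[c]))

# Hypothetical 4-class, 7-bit code with minimum Hamming distance 4,
# so a single wrong binary prediction can still be corrected
code = {
    "a": [1, 1, 1, 1, 1, 1, 1],
    "b": [0, 0, 0, 0, 1, 1, 1],
    "c": [0, 0, 1, 1, 0, 0, 1],
    "d": [0, 1, 0, 1, 0, 1, 0],
}
noisy = [1, 0, 1, 1, 0, 0, 1]  # codeword of "c" with the first bit flipped
print(decode(noisy, code))
```

The OVA code has minimum Hamming distance 2 between codewords, so it cannot correct a wrong bit in the same way; longer codes trade extra binary classifiers for that robustness.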

Optimal feature extraction for normally distributed multiclass data (가우시안 분포의 다중클래스 데이터에 대한 최적 피춰추출 방법)

  • 최의선;이철희
    • Proceedings of the IEEK Conference / 1998.10a / pp.1263-1266 / 1998
  • In this paper, we propose an optimal feature extraction method for normally distributed multiclass data. We search the whole feature space to find a set of features that gives the smallest classification error for the Gaussian ML classifier. Initially, we start with an arbitrary feature vector. Assuming that the feature vector is used for classification, we compute the classification error. Then we move the feature vector slightly and compute the classification error with the moved vector. Finally, we update the feature vector in the direction in which the classification error decreases most rapidly, which is found by taking the gradient. Alternatively, the initial vector can be one found by a conventional feature extraction algorithm. We propose two search methods, sequential search and global search. Experimental results show that the proposed method compares favorably with conventional feature extraction methods.

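
The search loop described in the abstract above (evaluate the error, perturb the feature vector slightly, step in the direction of steepest decrease) can be sketched in a toy setting. The sketch assumes two equally likely Gaussian classes with identity covariance, so that projecting onto a unit vector w gives a closed-form error Φ(-|w·(m_a - m_b)| / 2). The finite-difference gradient, step size, and all names are assumptions for illustration, not the authors' implementation.

```python
import math

def normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def classification_error(w, mean_a, mean_b):
    # Error of the ML classifier on the 1-D projection onto w
    # (two equally likely classes, identity covariance assumed)
    norm = math.sqrt(sum(wi * wi for wi in w))
    gap = abs(sum(wi * (a - b) for wi, a, b in zip(w, mean_a, mean_b))) / norm
    return normal_cdf(-gap / 2.0)

def refine_feature(w, mean_a, mean_b, step=0.5, eps=1e-5, iters=200):
    w = list(w)
    for _ in range(iters):
        base = classification_error(w, mean_a, mean_b)
        grad = []
        for i in range(len(w)):
            moved = list(w)
            moved[i] += eps  # move the feature vector slightly
            grad.append((classification_error(moved, mean_a, mean_b) - base) / eps)
        # Step against the gradient, then renormalize to unit length
        w = [wi - step * gi for wi, gi in zip(w, grad)]
        norm = math.sqrt(sum(wi * wi for wi in w))
        w = [wi / norm for wi in w]
    return w

# Hypothetical class means; the best direction here is along the first axis
mean_a, mean_b = (0.0, 0.0), (3.0, 0.0)
start = [0.6, 0.8]
best = refine_feature(start, mean_a, mean_b)
print(best, classification_error(best, mean_a, mean_b))
```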

Comparison Study of Multi-class Classification Methods

  • Bae, Wha-Soo;Jeon, Gab-Dong;Seok, Kyung-Ha
    • Communications for Statistical Applications and Methods / v.14 no.2 / pp.377-388 / 2007
  • ECOC (Error Correcting Output Coding) is a multi-class classification method known to have a low classification error rate. This paper aims to suggest an effective multi-class classification method (1) by comparing various encoding and decoding methods within ECOC and (2) by comparing ECOC with direct classification. Both SVM (Support Vector Machine) and logistic regression models were used as binary classifiers in the comparison.

Error Estimation Based on the Bhattacharyya Distance for Classifying Multimodal Data (Multimodal 데이터에 대한 분류 에러 예측 기법)

  • Choe, Ui-Seon;Kim, Jae-Hui;Lee, Cheol-Hui
    • Journal of the Institute of Electronics Engineers of Korea SP / v.39 no.2 / pp.147-154 / 2002
  • In this paper, we propose an error estimation method based on the Bhattacharyya distance for multimodal data. First, we try to find the empirical relationship between the classification error and the Bhattacharyya distance. Then, we investigate the possibility of deriving an error estimation equation based on the Bhattacharyya distance for multimodal data. We assume that the distribution of multimodal data can be approximated as a mixture of several Gaussian distributions. Experimental results with remotely sensed data showed that there is a strong relationship between the Bhattacharyya distance and the classification error and that it is possible to predict the classification error using the Bhattacharyya distance for multimodal data.

Learning Reference Vectors by the Nearest Neighbor Network (최근점 이웃망에의한 참조벡터 학습)

  • Kim Baek Sep
    • Journal of the Korean Institute of Telematics and Electronics B / v.31B no.7 / pp.170-178 / 1994
  • The nearest neighbor classification rule is widely used because it is simple and its error rate is asymptotically less than twice the Bayes theoretical minimum error. However, the method uses all of the training patterns as reference vectors, so both storage and classification time increase as the number of training patterns increases. LVQ (Learning Vector Quantization) resolved this problem by training the reference vectors instead of just storing all of the training patterns. But it is a heuristic algorithm with no theoretical background, it has no terminating condition, and it requires many iterations to reach a meaningful result. This paper proposes a new training method for the reference vectors that minimizes a given error function. The nearest neighbor network, the network version of the nearest neighbor classification rule, is proposed. The network is functionally identical to the nearest neighbor classification rule, and the reference vectors are represented by the weights between the nodes. The network is trained to minimize the error function with respect to the weights by the steepest descent method. The learning algorithm is derived, and it is shown that the proposed method can adjust more reference vectors than LVQ in each iteration. Experiments showed that the proposed method requires fewer iterations and yields a smaller error rate than LVQ2.

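
For context on the baseline named in the abstract above, here is a minimal sketch of nearest neighbor classification with reference vectors and the classic LVQ1 update (move the winning reference vector toward a correctly labeled sample, away from an incorrectly labeled one). This is the textbook LVQ1 rule, not the paper's nearest neighbor network or LVQ2; the learning rate and the sample data are hypothetical.

```python
def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(prototypes, x):
    # Index of the reference vector closest to x
    return min(range(len(prototypes)),
               key=lambda i: squared_distance(prototypes[i][0], x))

def classify(prototypes, x):
    # Nearest neighbor rule over the reference vectors
    return prototypes[nearest(prototypes, x)][1]

def lvq1_step(prototypes, x, label, rate=0.1):
    # Only the winning reference vector is adjusted in each step
    i = nearest(prototypes, x)
    vector, proto_label = prototypes[i]
    sign = 1.0 if proto_label == label else -1.0
    moved = tuple(v + sign * rate * (xi - v) for v, xi in zip(vector, x))
    updated = list(prototypes)
    updated[i] = (moved, proto_label)
    return updated

# Hypothetical reference vectors: (vector, class label)
prototypes = [((0.0, 0.0), "a"), ((1.0, 1.0), "b")]
print(classify(prototypes, (0.2, 0.0)))
print(lvq1_step(prototypes, (0.2, 0.0), "a", rate=0.5))
```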

Robust Minimum Squared Error Classification Algorithm with Applications to Face Recognition

  • Liu, Zhonghua;Yang, Chunlei;Pu, Jiexin;Liu, Gang;Liu, Sen
    • KSII Transactions on Internet and Information Systems (TIIS) / v.10 no.1 / pp.308-320 / 2016
  • Although the face almost always has an axisymmetric structure, a face image is generally not symmetric. However, the mirror image of a face image can reflect possible variations in pose and illumination opposite to those of the original face image. A robust minimum squared error classification (RMSEC) algorithm is proposed in this paper. Specifically, the original training samples and their mirror images are combined to form a new training set, and the generated training set is used to perform the modified minimum squared error classification (MMSEC) algorithm. Extensive experiments show that the accuracy rate of the proposed RMSEC is greatly increased and that the proposed RMSEC is not sensitive to variations in the parameters.
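
The training-set construction described in the abstract above can be sketched as a simple augmentation step: each image is paired with its left-right mirror, and both carry the same label. The sketch assumes images stored as lists of pixel rows; the function names and the tiny sample image are hypothetical, and the MMSEC classifier itself is not shown.

```python
def mirror(image):
    # Left-right flip: reverse every pixel row
    return [list(reversed(row)) for row in image]

def augment_with_mirrors(samples):
    # samples: list of (image, label); returns originals plus mirrored copies
    augmented = []
    for image, label in samples:
        augmented.append((image, label))
        augmented.append((mirror(image), label))
    return augmented

# Hypothetical 2x3 "image" for one subject
samples = [([[1, 2, 3], [4, 5, 6]], "subject_1")]
print(augment_with_mirrors(samples))
```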

Bathymetric mapping in Dong-Sha Atoll using SPOT data

  • Huang, Shih-Jen;Wen, Yao-Chung
    • Proceedings of the KSRS Conference / v.2 / pp.525-528 / 2006
  • Remote sensing data can be used to calculate water depth, especially in clear and shallow water. In this study, SPOT data were used for bathymetric mapping in Dong-Sha Atoll, located in the northern South China Sea. In situ sea depth was also collected by echo sounder, and a global positioning system was employed to locate the sampling points accurately. An empirical model between measured sea depth and band digital count was determined by least squares regression analysis. Both non-classification and unsupervised classification were used in this study. The results show that the standard error is less than 0.9 m for non-classification, and the 10% error relative to the measured water depth is satisfied for more than 85% of the in situ data points. The 10% relative error is satisfied for more than 97%, 69%, and 51% of the data points in classes 4, 5, and 6, respectively, if supervised classification is applied. We also find that unsupervised classification estimates water depth more accurately, with standard errors of less than 0.63, 0.93, and 0.68 m in classes 4, 5, and 6, respectively.

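
The evaluation quantities in the abstract above (a least squares fit, its standard error, and the share of points within a 10% relative error) can be sketched as follows. The sketch assumes a plain straight-line model between digital count and depth, which may differ from the empirical model actually used in the study, and the numbers are synthetic.

```python
import math

def fit_line(x, y):
    # Ordinary least squares for y = slope * x + intercept
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def standard_error(x, y, slope, intercept):
    # Residual standard error with n - 2 degrees of freedom
    residuals = [yi - (slope * xi + intercept) for xi, yi in zip(x, y)]
    return math.sqrt(sum(r * r for r in residuals) / (len(x) - 2))

def fraction_within(measured, estimated, tolerance=0.10):
    # Share of points whose estimate is within the relative tolerance
    hits = sum(abs(e - m) <= tolerance * abs(m) for m, e in zip(measured, estimated))
    return hits / len(measured)

# Synthetic digital counts and depths in metres
counts = [1.0, 2.0, 3.0, 4.0]
depths = [3.0, 5.0, 7.0, 9.0]
slope, intercept = fit_line(counts, depths)
estimated = [slope * c + intercept for c in counts]
print(slope, intercept, standard_error(counts, depths, slope, intercept))
print(fraction_within(depths, estimated))
```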