Search | Korea Science

Comparison Study of Multi-class Classification Methods

Bae, Wha-Soo;Jeon, Gab-Dong;Seok, Kyung-Ha
- Communications for Statistical Applications and Methods / v.14 no.2 / pp.377-388 / 2007
Among multi-class classification methods, the ECOC (Error Correcting Output Coding) method is known to have a low classification error rate. This paper aims to suggest an effective multi-class classification method (1) by comparing various encoding and decoding methods within ECOC and (2) by comparing ECOC with direct classification. Both SVM (Support Vector Machine) and logistic regression models were used as binary classifiers in the comparison.
https://doi.org/10.5351/CKSS.2007.14.2.377
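
The ECOC idea in the abstract above can be illustrated with a minimal sketch of Hamming-distance decoding. The code matrix, the class labels, and the function names are assumptions made for illustration; the abstract does not say which encodings or decoders the paper compares. Each class gets a binary codeword, one binary classifier (for example an SVM or a logistic regression) predicts each bit, and the predicted bit vector is assigned to the class with the nearest codeword.

```python
def hamming(a, b):
    """Number of positions at which two equal-length bit tuples differ."""
    return sum(x != y for x, y in zip(a, b))


def ecoc_decode(code_matrix, predicted_bits):
    """Return the class whose codeword is closest to the predicted bits."""
    return min(code_matrix, key=lambda label: hamming(code_matrix[label], predicted_bits))


# Hypothetical 7-bit code matrix for four classes. Every pair of codewords
# differs in 4 positions, so a single wrong bit can still be decoded correctly.
CODE_MATRIX = {
    "A": (1, 1, 1, 1, 1, 1, 1),
    "B": (0, 0, 0, 0, 1, 1, 1),
    "C": (0, 0, 1, 1, 0, 0, 1),
    "D": (0, 1, 0, 1, 0, 1, 0),
}

# Outputs of the seven binary classifiers for one sample: the codeword
# of class "C" with the sixth bit flipped by a classifier error.
noisy_bits = (0, 0, 1, 1, 0, 1, 1)
decoded = ecoc_decode(CODE_MATRIX, noisy_bits)
```

Hamming decoding is only one possible decoder; loss-based decoders that use classifier scores instead of hard bits are a common alternative.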

Evaluation of the classification method using ancestry SNP markers for ethnic group

Lee, Hyo Jung;Hong, Sun Pyo;Lee, Soong Deok;Rhee, Hwan seok;Lee, Ji Hyun;Jeong, Su Jin;Lee, Jae Won
- Communications for Statistical Applications and Methods / v.26 no.1 / pp.1-9 / 2019
Various probabilistic methods have been proposed for using interpopulation allele frequency differences to infer the ethnic group of a DNA specimen. The selection of the statistical method is critical because the accuracy of the classification results varies with the method. For ancestry classification, we propose a new ancestry evaluation method that estimates the combined ethnicity index, and we compare its performance with various classical classification methods using two real data sets. We selected 13 single nucleotide polymorphisms (SNPs) that are useful for the inference of ethnic origin. These SNPs were analyzed by restriction fragment mass polymorphism assay, followed by classification among ethnic groups. We genotyped 400 individuals from four ethnic groups (100 African-American, 100 Caucasian, 100 Korean, and 100 Mexican-American) for the 13 SNPs, whose allele frequencies differed among the four ethnic groups. Additionally, we applied our new method to HapMap SNP genotypes for 1,011 samples from 4 populations (African, European, East Asian, and Central-South Asian). Our proposed method yielded the highest accuracy among the statistical classification methods compared. Our ethnic group classification system, based on the analysis of ancestry informative SNP markers, can provide a useful statistical tool to identify ethnic groups.
https://doi.org/10.29220/CSAM.2019.26.1.001

Bivariate ROC Curve and Optimal Classification Function

Hong, C.S.;Jeong, J.A.
- Communications for Statistical Applications and Methods / v.19 no.4 / pp.629-638 / 2012
We propose methods to obtain optimal thresholds and classification functions using various cutoff criteria based on the bivariate ROC curve, which represents bivariate cumulative distribution functions. The false positive and false negative rates are calculated with these classification functions for bivariate normal distributions.
https://doi.org/10.5351/CKSS.2012.19.4.629

Classification via principal differential analysis

Jang, Eunseong;Lim, Yaeji
- Communications for Statistical Applications and Methods / v.28 no.2 / pp.135-150 / 2021
We propose classification methods based on principal differential analysis (PDA). Computations of the squared multiple correlation function (RSQ) and PDA scores are reviewed; in addition, we combine PDA results with logistic regression for binary classification. In the numerical study, we compare the PDA-based classification methods with classification based on functional principal component analysis. Various scenarios are considered in a simulation study, and the PDA-based classification methods classify the functional data well. Gene expression data is considered for real data analysis. We observe that the PDA score based method also performs well.
https://doi.org/10.29220/CSAM.2021.28.2.135

A Note on Fuzzy Support Vector Classification

Lee, Sung-Ho;Hong, Dug-Hun
- Communications for Statistical Applications and Methods / v.14 no.1 / pp.133-140 / 2007
The support vector machine has been well developed as a powerful tool for solving classification problems. In many real-world applications, each training point has a different effect on constructing the classification rule. Lin and Wang (2002) proposed fuzzy support vector machines for this kind of classification problem, which assign fuzzy memberships to the input data and reformulate the support vector classification. In this paper, another intuitive approach is proposed using the fuzzy $\alpha$-cut set. This approach shows the trend of the classification functions as $\alpha$ changes.
https://doi.org/10.5351/CKSS.2007.14.1.133
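
For readers unfamiliar with the $\alpha$-cut mentioned in the abstract above, here is a minimal sketch of the standard definition only: the $\alpha$-cut of a fuzzy set keeps the elements whose membership is at least $\alpha$. The membership values and the function name are hypothetical, and how the paper feeds the cut into support vector classification is not stated in the abstract.

```python
def alpha_cut(memberships, alpha):
    """Indices of training points whose fuzzy membership is at least alpha."""
    return [i for i, mu in enumerate(memberships) if mu >= alpha]


# Hypothetical fuzzy memberships for four training points.
memberships = [0.2, 0.9, 0.5, 1.0]

# Raising alpha shrinks the set of points that remain in the cut.
cut_low = alpha_cut(memberships, 0.5)
cut_high = alpha_cut(memberships, 0.95)
```

Refitting a classifier on each cut as $\alpha$ increases would be one way to observe how the classification function changes, under the assumption that the cut is used to select training points.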

Binary classification on compositional data

Joo, Jae Yun;Lee, Seokho
- Communications for Statistical Applications and Methods / v.28 no.1 / pp.89-97 / 2021
Because compositional data are bounded and subject to a sum constraint, they are often transformed by a logratio transformation, and the transformed data are then put into traditional binary classification or discriminant analysis. However, it may be problematic to directly apply traditional multivariate approaches to the transformed data, because the class distributions are not Gaussian and the Bayes decision boundaries are not polynomial in the transformed space. In this study, we propose applying flexible classification approaches to the transformed data for compositional data classification. Empirical studies using synthetic and real examples demonstrate that flexible approaches outperform traditional multivariate classification or discriminant analysis.
https://doi.org/10.29220/CSAM.2021.28.1.089
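
As an illustration of the logratio step described in the abstract above, the sketch below implements the centered logratio (clr) transform. Choosing clr is an assumption: the abstract says only "logratio transformation", and additive or isometric variants are also in common use. The function name and sample values are hypothetical.

```python
import math


def clr(composition):
    """Centered logratio: log of each part minus the mean of the logs.

    Assumes every part is strictly positive; zeros need separate handling.
    """
    logs = [math.log(v) for v in composition]
    mean_log = sum(logs) / len(logs)
    return [value - mean_log for value in logs]


# A hypothetical three-part composition that sums to one.
transformed = clr([0.2, 0.3, 0.5])
```

The transformed coordinates sum to zero and are unchanged when the composition is rescaled, which is why the sum constraint no longer restricts the transformed data.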

Selection Method of Fuzzy Partitions in Fuzzy Rule-Based Classification Systems

Son, Chang-S.;Chung, Hwan-M.;Kwon, Soon-H.
- Journal of the Korean Institute of Intelligent Systems / v.18 no.3 / pp.360-366 / 2008
The initial fuzzy partitions in fuzzy rule-based classification systems are determined by considering the domain region of each attribute in the given data, and the optimal classification boundaries within the fuzzy partitions can be discovered by tuning their parameters with learning processes such as neural networks and genetic algorithms. In this paper, we propose a selection method for fuzzy partitions, based on statistical information, that maximizes the performance of pattern classification without learning processes. The statistical information is used to extract from the numerical data the uncertainty regions of each input attribute (i.e., the regions in which the classification boundaries of pattern classification problems are determined). We additionally discuss methods for extracting the candidate rules associated with the partition intervals generated from the statistical information and for minimizing the coupling problem between the candidate rules. To show the effectiveness of the proposed method, we compared its classification accuracy with that of conventional methods on the IRIS and New Thyroid Cancer data. The experimental results confirm that the proposed method, which considers only statistical information of the numerical patterns, provides classification accuracy equal to or better than that of the conventional methods.
https://doi.org/10.5391/JKIIS.2008.18.3.360

Logistic Regression Classification by Principal Component Selection

Kim, Kiho;Lee, Seokho
- Communications for Statistical Applications and Methods / v.21 no.1 / pp.61-68 / 2014
We propose binary classification methods that modify logistic regression classification. We apply variable selection procedures to select principal components rather than original variables. We describe the resulting classifiers and discuss their properties. The performance of our proposals is illustrated numerically and compared with other existing classification methods using synthetic and real datasets.
https://doi.org/10.5351/CSAM.2014.21.1.061

Discriminant Analysis of Binary Data by Using the Maximum Entropy Distribution

Lee, Jung Jin;Hwang, Joon
- Communications for Statistical Applications and Methods / v.10 no.3 / pp.909-917 / 2003
Although many classification models have been used to classify binary data, none of them dominates under all circumstances, which vary with the number of variables and the size of the data (Asparoukhov and Krzanowski (2001)). This paper proposes a classification model that uses information on the marginal distributions of sub-variables and their maximum entropy distribution. Classification experiments using simulation are discussed.
https://doi.org/10.5351/CKSS.2003.10.3.909

A Note on Linear SVM in Gaussian Classes

Jeon, Yongho
- Communications for Statistical Applications and Methods / v.20 no.3 / pp.225-233 / 2013
The linear support vector machine (SVM) is motivated by the maximal margin separating hyperplane and is a popular tool for binary classification tasks. Many studies exist on the consistency properties of SVM; however, it is unknown whether the linear SVM is consistent for estimating the optimal classification boundary even in the simple case of two Gaussian classes with a common covariance, where the optimal classification boundary is linear. In this paper, we show that the linear SVM can be inconsistent in the univariate Gaussian classification problem with a common variance, even when the best tuning parameter is used.
https://doi.org/10.5351/CSAM.2013.20.3.225
