• Title/Summary/Keyword: Classification Methods

A Study on the Relationship between Class Similarity and the Performance of Hierarchical Classification Method in a Text Document Classification Problem (텍스트 문서 분류에서 범주간 유사도와 계층적 분류 방법의 성과 관계 연구)

  • Jang, Soojung;Min, Daiki
    • The Journal of Society for e-Business Studies
    • /
    • v.25 no.3
    • /
    • pp.77-93
    • /
    • 2020
  • The literature has reported that hierarchical classification methods generally outperform flat classification methods in multi-class document classification problems. Unlike prior studies that constructed a class hierarchy, this paper evaluates the performance of hierarchical and flat classification methods when the class hierarchy is predefined. We conducted numerical evaluations on two data sets: research papers on climate change adaptation technologies in the water sector and the 20NewsGroup open data set. The evaluation results show that the hierarchical classification method outperforms the flat classification methods under a certain condition, which differs from the literature. The advantage of the hierarchical classification method over the flat classification method depends on the class similarities at each level of the class structure. More importantly, the hierarchical classification method works better when the upper-level similarity is less than the lower-level similarity.
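
The abstract above ties the benefit of a hierarchical classifier to how class similarity at the upper level compares with similarity at the lower level. The abstract does not say how similarity was measured, so the following is only a minimal sketch under stated assumptions: cosine similarity between class centroids is an assumed measure, and the names `level_similarities` and `toy_hierarchy`, as well as the toy term-count vectors, are hypothetical.

```python
import math
from itertools import combinations

def centroid(vectors):
    """Component-wise mean of equal-length term vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]

def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def mean_pairwise_similarity(centroids):
    """Average cosine similarity over all unordered pairs (needs >= 2 centroids)."""
    pairs = list(combinations(centroids, 2))
    return sum(cosine(u, v) for u, v in pairs) / len(pairs)

def level_similarities(hierarchy):
    """Return (upper, lower) similarity for a two-level class hierarchy.

    upper: mean similarity between parent-class centroids.
    lower: mean similarity between sibling centroids, averaged over parents.
    Assumption: centroid cosine similarity stands in for the paper's measure.
    """
    parent_centroids, sibling_sims = [], []
    for children in hierarchy.values():
        child_centroids = [centroid(docs) for docs in children.values()]
        all_docs = [doc for docs in children.values() for doc in docs]
        parent_centroids.append(centroid(all_docs))
        if len(child_centroids) > 1:
            sibling_sims.append(mean_pairwise_similarity(child_centroids))
    upper = mean_pairwise_similarity(parent_centroids)
    lower = sum(sibling_sims) / len(sibling_sims)
    return upper, lower

# Toy term-count vectors (illustrative only, not data from the paper).
toy_hierarchy = {
    "sports": {
        "baseball": [[5, 1, 0, 0], [4, 2, 0, 0]],
        "hockey": [[4, 0, 1, 0], [5, 1, 1, 0]],
    },
    "science": {
        "space": [[0, 0, 5, 1], [0, 1, 4, 2]],
        "medicine": [[0, 0, 4, 0], [1, 0, 5, 1]],
    },
}

upper_sim, lower_sim = level_similarities(toy_hierarchy)
print(f"upper-level similarity: {upper_sim:.3f}")
print(f"lower-level similarity: {lower_sim:.3f}")
print("upper < lower:", upper_sim < lower_sim)
```

In this toy hierarchy the parent classes are far apart while siblings are close, which corresponds to the condition the abstract reports as favorable to the hierarchical method.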

Bivariate ROC Curve and Optimal Classification Function

  • Hong, C.S.;Jeong, J.A.
    • Communications for Statistical Applications and Methods
    • /
    • v.19 no.4
    • /
    • pp.629-638
    • /
    • 2012
  • We propose methods for obtaining optimal thresholds and classification functions using various cutoff criteria based on the bivariate ROC curve, which represents bivariate cumulative distribution functions. The false positive rate and false negative rate are calculated with these classification functions for bivariate normal distributions.
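
As a rough illustration of computing false positive and false negative rates for a classification function under bivariate normal distributions, the sketch below assumes a linear classification function `w·x > cutoff` and uses Youden's J as the cutoff criterion. Both choices are assumptions made for this example (the abstract does not name the paper's classification functions or criteria), and all parameter values are illustrative.

```python
import math

def norm_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def projected_params(weights, mean, cov):
    """Mean and standard deviation of the linear score w.x for bivariate normal x."""
    w1, w2 = weights
    mu = w1 * mean[0] + w2 * mean[1]
    var = w1 * w1 * cov[0][0] + 2 * w1 * w2 * cov[0][1] + w2 * w2 * cov[1][1]
    return mu, math.sqrt(var)

def error_rates(cutoff, weights, neg, pos):
    """FPR and FNR of the rule 'predict positive when w.x > cutoff'.

    neg and pos are (mean, covariance) pairs for the two classes.
    """
    mu0, sd0 = projected_params(weights, *neg)
    mu1, sd1 = projected_params(weights, *pos)
    fpr = 1.0 - norm_cdf((cutoff - mu0) / sd0)  # negatives scored above the cutoff
    fnr = norm_cdf((cutoff - mu1) / sd1)        # positives scored at or below it
    return fpr, fnr

def youden_cutoff(weights, neg, pos, grid):
    """Grid cutoff maximizing Youden's J = 1 - FPR - FNR (assumed criterion)."""
    return max(grid, key=lambda c: 1.0 - sum(error_rates(c, weights, neg, pos)))

# Illustrative parameters (assumed, not taken from the paper).
neg = ((0.0, 0.0), ((1.0, 0.3), (0.3, 1.0)))
pos = ((1.5, 1.5), ((1.0, 0.3), (0.3, 1.0)))
weights = (1.0, 1.0)
grid = [i / 100 for i in range(-300, 601)]

best = youden_cutoff(weights, neg, pos, grid)
fpr, fnr = error_rates(best, weights, neg, pos)
print(f"cutoff={best:.2f}  FPR={fpr:.3f}  FNR={fnr:.3f}")
```

Because a linear combination of bivariate normal variables is itself normal, the two error rates reduce to univariate normal tail probabilities, which keeps the sketch free of numerical integration.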

The Methods for the Improvement of the KDC 5th Edition of Architecture Engineering Classification System (KDC 제5판 건축공학분야 분류체계 개선 방안)

  • Kim, Yeon-Rye
    • Journal of Korean Library and Information Science Society
    • /
    • v.40 no.4
    • /
    • pp.401-425
    • /
    • 2009
  • This study presents methods for improving the KDC classification system for the field of architectural engineering, based on a comparative analysis of the academic system of architectural engineering, the classification systems of KDC, DDC, and LCC, and the research field classification system of the National Research Foundation of Korea. The analysis revealed that the architectural engineering section of the KDC 5th edition needs to be improved and corrected in several respects: adding classification items that reflect trends in academic development, properly developing the hierarchy of classification terms for architectural structural engineering, adding detailed subjects, selecting proper classification terms, correcting errors in classification symbols and English expressions, and supplying correlative indexes omitted from the classification items. This study proposes improvements to solve these problems.

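A Study on the Introduction of Analytic-Assembling Methods into Enumerative-Hierarchical Classification Schemes: Focusing on KDC (열거식 계층분류체계에 분석합성식 기법의 도입에 관한 연구-KDC를 중심으로)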

  • 도태현
    • Journal of Korean Library and Information Science Society
    • /
    • v.29
    • /
    • pp.241-272
    • /
    • 1998
  • The purpose of this paper is to examine the analytic-assembling (faceted analysis) methods applied in enumerative-hierarchical classification schemes, mainly in KDC. The methods are summarized as follows. 1. In enumerative-hierarchical classification schemes, subjects are in principle divided into subdivisions step by step, by only one facet at each level. However, some subjects in KDC, for example 'library and information science' and 'education', are divided into subdivisions by multiple facets at the same level, as in Colon Classification. 2. Most enumerative-hierarchical classification schemes have various kinds of auxiliary tables, such as standard subdivisions, areas, periods, and languages. The entries in each table can be regarded as the foci of a facet that is applied to subdivide all subjects, or certain special subjects, at a lower level. 3. To classify compound subjects with phase relations, KDC provides ready-made classification numbers, or notes that say 'divide by 001-999' (all subjects) or 'divide by xxx-xxx' (a limited range of subjects). These ready-made compound subjects, and subdivision by all or a limited range of subjects, are similar to the representation of phase relations in Colon Classification. Yet these analytic-assembling methods in KDC need to be supplemented and amended. Subdividing methods for faceted analysis should be unified throughout the schedules, and the auxiliary tables should be enlarged and subdivided more specifically. For the representation of phase relations, linking signs could be useful in KDC, as they are in UDC and in other analytic-assembling classification schemes such as Colon Classification.

Evaluation of the classification method using ancestry SNP markers for ethnic group

  • Lee, Hyo Jung;Hong, Sun Pyo;Lee, Soong Deok;Rhee, Hwan seok;Lee, Ji Hyun;Jeong, Su Jin;Lee, Jae Won
    • Communications for Statistical Applications and Methods
    • /
    • v.26 no.1
    • /
    • pp.1-9
    • /
    • 2019
  • Various probabilistic methods have been proposed for using interpopulation allele frequency differences to infer the ethnic group of a DNA specimen. The selection of the statistical method is critical because the accuracy of statistical classification results varies. For ancestry classification, we proposed a new ancestry evaluation method that estimates a combined ethnicity index, and we compared its performance with various classical classification methods using two real data sets. We selected 13 single nucleotide polymorphisms (SNPs) that are useful for the inference of ethnic origin. These SNPs were analyzed by restriction fragment mass polymorphism assay, followed by classification among ethnic groups. We genotyped 400 individuals from four ethnic groups (100 African-American, 100 Caucasian, 100 Korean, and 100 Mexican-American) for 13 SNPs whose allele frequencies differed among the four ethnic groups. Additionally, we applied our new method to HapMap SNP genotypes for 1,011 samples from four populations (African, European, East Asian, and Central-South Asian). Our proposed method yielded the highest accuracy among the statistical classification methods. Our ethnic group classification system, based on the analysis of ancestry-informative SNP markers, can provide a useful statistical tool for identifying ethnic groups.

A Comparison Study of Multiclass SVM Methods in Microarray Data

  • Hwang, Jin-Soo;Lee, Ji-Young;Kim, Jee-Yun
    • Journal of the Korean Data and Information Science Society
    • /
    • v.17 no.2
    • /
    • pp.311-324
    • /
    • 2006
  • The Support Vector Machine (SVM) is a highly functional and efficient classification method compared with other classification methods. However, its optimal extension to more than two classes is not obvious. In this paper, several multi-category SVM methods are introduced and compared using simulated and real data sets. A comparison between traditional multi-category classification methods and SVM-based methods is also performed.

New Splitting Criteria for Classification Trees

  • Lee, Yung-Seop
    • Communications for Statistical Applications and Methods
    • /
    • v.8 no.3
    • /
    • pp.885-894
    • /
    • 2001
  • Decision tree methods are one family of data mining techniques. Classification trees are used to predict a class label. When a tree is grown, the conventional splitting criteria measure node impurity using the weighted average of the left and right child nodes. In this paper, new splitting criteria for classification trees are proposed that improve the interpretability of trees compared to the conventional methods. The criteria search only for interesting subsets of the data, as opposed to modeling all of the data equally well. As a result, the tree is very unbalanced but extremely interpretable.
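
To make the contrast in the abstract above concrete, the sketch below scores candidate splits in two ways: with the conventional size-weighted average of child impurities, and with a one-sided score that looks only at the purer child. The Gini index is assumed as the impurity measure, and `purest_child_impurity` is a hypothetical stand-in for a criterion that seeks interesting subsets. It is not the criterion proposed in the paper, whose definition the abstract does not give.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels: 1 - sum_k p_k^2."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def weighted_split_impurity(left, right):
    """Conventional criterion: size-weighted average of the child impurities."""
    n = len(left) + len(right)
    return (len(left) / n) * gini(left) + (len(right) / n) * gini(right)

def purest_child_impurity(left, right):
    """Hypothetical one-sided criterion: impurity of the purer child only."""
    return min(gini(left), gini(right))

# Two candidate splits of the same parent node (4 'a' and 4 'b'); toy labels.
split_a = (["a", "a", "a", "b"], ["a", "b", "b", "b"])  # balanced split
split_c = (["a"], ["a", "a", "a", "b", "b", "b", "b"])  # peels off one pure leaf

for name, (left, right) in [("split_a", split_a), ("split_c", split_c)]:
    print(
        name,
        f"weighted={weighted_split_impurity(left, right):.3f}",
        f"purest_child={purest_child_impurity(left, right):.3f}",
    )
```

Lower scores are better under both criteria. The weighted average prefers the balanced `split_a`, while the one-sided score prefers `split_c`, which isolates a small pure subset. Repeating that choice at every node would yield the kind of unbalanced tree the abstract describes.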

Application of a Geomechanical Classification for Rock Slope (암반 사면에 대한 새로운 암반 분류안의 적용)

  • 김대복
    • Tunnel and Underground Space
    • /
    • v.4 no.3
    • /
    • pp.215-227
    • /
    • 1994
  • Rock mass classifications have been developed in many European countries. The most widely used classification methods are the Rock Mass Rating (RMR) system proposed by Bieniawski (1973) and the Q-system developed by Barton et al. (1974). These methods have also been adopted at many mountain tunnels and subway sites in our country. Here, a geomechanical classification for slopes in rock, the "Slope Mass Rating" (SMR), is presented for the preliminary assessment of slope stability. This method can be applied to excavation and support design in tunnel portal and cutting areas, as a guideline and recommendation on support methods, which allows a systematic use of geomechanical classification for rock slopes.
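
For orientation, the Slope Mass Rating is commonly stated as the basic RMR adjusted by joint-orientation factors and an excavation-method factor, SMR = RMR + (F1 · F2 · F3) + F4, with five stability classes in bands of 20 points. The sketch below encodes that commonly cited form. It is not taken from the abstract, the function names are hypothetical, and the input values are illustrative rather than data from the paper.

```python
def slope_mass_rating(rmr_basic, f1, f2, f3, f4):
    """SMR = RMR_basic + F1 * F2 * F3 + F4 (commonly cited form).

    f1, f2: joint/slope orientation factors (typically 0.15 to 1.0)
    f3:     joint dip vs. slope dip adjustment (typically -60 to 0)
    f4:     excavation-method adjustment (typically -8 to +15)
    """
    return rmr_basic + f1 * f2 * f3 + f4

def stability_class(smr):
    """Map an SMR value to its commonly tabulated stability class."""
    if smr > 80:
        return "I (completely stable)"
    if smr > 60:
        return "II (stable)"
    if smr > 40:
        return "III (partially stable)"
    if smr > 20:
        return "IV (unstable)"
    return "V (completely unstable)"

# Illustrative inputs (assumed values, not measurements from the paper).
smr = slope_mass_rating(rmr_basic=60, f1=0.7, f2=1.0, f3=-25, f4=0)
print(f"SMR = {smr:.1f}, class {stability_class(smr)}")
```

Since F3 is zero or negative, the product term can only lower the rating, so unfavorable joint orientations reduce the SMR relative to the basic RMR.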

A Study on the Improvement Directions of Data Classification Format for Efficient Information Management System (효율적인 정보화경영을 위한 데이터분류체계의 개선방안에 관한 연구)

  • Park, Jae-Yong
    • International Commerce and Information Review
    • /
    • v.6 no.3
    • /
    • pp.41-61
    • /
    • 2004
  • Today, most companies need to take an interest in e-Biz and information management systems. In particular, a data classification format system is very important for effective and efficient management decision support. Such a system should include a main entry consisting of department, employee's name, title, and publication date. At present, companies are using eleven different methods in their data classification format systems. According to the findings of this paper, nine data classification methods are used for general management documents and two data classification methods are used for special report management documents. The aim of this study is to investigate the problems of the present data classification format system and the concerns that should be taken into account when modifying the data classification system or changing to a new one. This study is based on a survey given to company managers at 35 companies throughout the nation. The survey indicates that the crucial concerns of the participating managers are ineffective management information sources and the duplication of data classification systems. This paper is a transcendental study on the introduction of data classification format systems to business companies in Korea. It provides fundamental data for effective business process reengineering in business activities for management information.

The Methods for the Improvement of the KDC 5th Edition of Education Classification System (KDC 제5판 교육학분야 분류체계 개선 방안)

  • Kim, Yeon-Rye
    • Journal of Korean Library and Information Science Society
    • /
    • v.41 no.4
    • /
    • pp.5-33
    • /
    • 2010
  • This study presents methods for improving the KDC classification system for the field of education, based on a comparative analysis of the academic system of education, the classification systems of KDC, NDC, DDC, and LCC, and the research field classification system of the National Research Foundation of Korea. The analysis revealed that the education section of the KDC 5th edition needs to be improved and corrected in several respects: adding classification items that reflect trends in academic development, properly developing the hierarchy of classification terms for the detailed fields of education, adding detailed subjects, correcting errors in classification symbols, and supplying correlative indexes omitted from the classification items. This study proposes improvements to solve these problems.
