• Title/Summary/Keyword: Group Classification Method


Design and Performance Measurement of a Genetic Algorithm-based Group Classification Method : The Case of Bond Rating (유전 알고리듬 기반 집단분류기법의 개발과 성과평가 : 채권등급 평가를 중심으로)

  • Min, Jae-H.;Jeong, Chul-Woo
    • Journal of the Korean Operations Research and Management Science Society
    • /
    • v.32 no.1
    • /
    • pp.61-75
    • /
    • 2007
  • The purpose of this paper is to develop a new group classification method based on a genetic algorithm and to compare its prediction performance with that of existing methods in the area of bond rating. To this end, we conduct various experiments with pilot and general models. Specifically, we first conduct experiments employing two pilot models: one searching for the cluster center of each group, and the other searching for both the cluster centers and the attribute weights in order to maximize classification accuracy. The results from the pilot experiments show that the classification accuracy ratio of the latter is higher than that of the former, which provides the rationale for searching for both the cluster center of each group and the attribute weights to improve classification accuracy. With this lesson in mind, we design two generalized models employing a genetic algorithm: one maximizes the classification accuracy and the other minimizes the total misclassification cost. We compare the performance of these two models with that of existing statistical and artificial intelligence models such as MDA, ANN, and Decision Tree, and conclude that the genetic algorithm-based group classification method proposed in this paper significantly outperforms the other methods with respect to both classification accuracy ratio and misclassification cost.
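
As a rough illustration of why searching for attribute weights in addition to cluster centers can help, the sketch below scores a nearest-center classifier under a weighted Euclidean distance. It is a minimal sketch, not the authors' model: the distance form, the function names, and the toy data are all assumptions, and the genetic search itself is omitted (the `accuracy` function is only the kind of fitness such a search could maximize).

```python
from math import sqrt

def weighted_distance(x, center, weights):
    """Weighted Euclidean distance between a sample and a group center."""
    return sqrt(sum(w * (a - c) ** 2 for a, c, w in zip(x, center, weights)))

def classify(x, centers, weights):
    """Assign x to the group whose center is nearest under the given weights."""
    return min(centers, key=lambda g: weighted_distance(x, centers[g], weights))

def accuracy(samples, labels, centers, weights):
    """Share of samples assigned to their true group (a candidate GA fitness)."""
    hits = sum(classify(x, centers, weights) == y for x, y in zip(samples, labels))
    return hits / len(samples)

# Hypothetical toy data: two attributes per bond, two rating groups.
# The second attribute is deliberately uninformative.
samples = [(0.9, 0.1), (0.7, 0.9), (0.2, 0.7), (0.3, 0.2)]
labels = ["A", "A", "B", "B"]
centers = {"A": (0.8, 0.2), "B": (0.2, 0.8)}

equal_weights = (1.0, 1.0)   # centers only, every attribute counts equally
first_only = (1.0, 0.0)      # weights that switch off the noisy attribute

print(accuracy(samples, labels, centers, equal_weights))
print(accuracy(samples, labels, centers, first_only))
```

With these toy values, down-weighting the uninformative attribute raises the accuracy of the same centers, which mirrors the pilot result described in the abstract.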

Evaluation of the classification method using ancestry SNP markers for ethnic group

  • Lee, Hyo Jung;Hong, Sun Pyo;Lee, Soong Deok;Rhee, Hwan seok;Lee, Ji Hyun;Jeong, Su Jin;Lee, Jae Won
    • Communications for Statistical Applications and Methods
    • /
    • v.26 no.1
    • /
    • pp.1-9
    • /
    • 2019
  • Various probabilistic methods have been proposed for using interpopulation allele frequency differences to infer the ethnic group of a DNA specimen. The selection of the statistical method is critical because the accuracy of the statistical classification results varies. For ancestry classification, we proposed a new ancestry evaluation method that estimates the combined ethnicity index and compared its performance with various classical classification methods using two real data sets. We selected 13 SNPs that are useful for the inference of ethnic origin. These single nucleotide polymorphisms (SNPs) were analyzed by restriction fragment mass polymorphism assay, followed by classification among ethnic groups. We genotyped 400 individuals from four ethnic groups (100 African-American, 100 Caucasian, 100 Korean, and 100 Mexican-American) for 13 SNPs with allele frequencies that differed among the four ethnic groups. Additionally, we applied our new method to HapMap SNP genotypes for 1,011 samples from 4 populations (African, European, East Asian, and Central-South Asian). Our proposed method yielded the highest accuracy among statistical classification methods. Our ethnic group classification system based on the analysis of ancestry informative SNP markers can provide a useful statistical tool to identify ethnic groups.
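
The general idea of inferring a population from allele frequency differences can be sketched with a classical maximum-likelihood assignment. This is not the combined ethnicity index proposed in the paper, whose definition the abstract does not give. The sketch assumes Hardy-Weinberg proportions and independent SNPs, and the population names and frequencies are made up for illustration.

```python
from math import log

def genotype_log_likelihood(genotype, freqs):
    """Log-likelihood of a multi-SNP genotype in one population.

    genotype: copies (0, 1 or 2) of the reference allele at each SNP.
    freqs: reference-allele frequency at each SNP, strictly between 0 and 1.
    Assumes Hardy-Weinberg proportions and independence between SNPs.
    """
    total = 0.0
    for copies, p in zip(genotype, freqs):
        q = 1.0 - p
        prob = {0: q * q, 1: 2 * p * q, 2: p * p}[copies]
        total += log(prob)
    return total

def assign_population(genotype, pop_freqs):
    """Return the population under which the genotype is most likely."""
    return max(pop_freqs,
               key=lambda pop: genotype_log_likelihood(genotype, pop_freqs[pop]))

# Hypothetical allele frequencies for three SNPs in two populations.
pop_freqs = {
    "pop1": [0.9, 0.8, 0.1],
    "pop2": [0.2, 0.3, 0.7],
}

print(assign_population([2, 2, 0], pop_freqs))
print(assign_population([0, 0, 2], pop_freqs))
```

SNPs whose frequencies differ strongly between populations contribute the largest log-likelihood differences, which is why marker selection matters for this kind of classifier.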

Varietal Classification on the Basis of Cluster Analysis in Burley Tobacco of N. tabacum L. (Cluster분석에 의한 버어리종 담배품종의 분류)

  • Ann, Dai-Jin;Kim, Yoon-Dong
    • Journal of the Korean Society of Tobacco Science
    • /
    • v.5 no.2
    • /
    • pp.25-32
    • /
    • 1983
  • To obtain basic information on the breeding of burley tobacco, 41 varieties were classified by cluster analysis of correlation coefficients and taxonomic distance based on twenty-one agronomic characters. Eight characters (days to flowering, length of flower axis, internode length, leaf length, yield, leaf angle to stem, vein angle to midrib, and plant height) were useful in monothetic classification. The forty-one varieties were classified into four groups (I, II, III, and IV) with the weighted variable group method (WVGM) and the weighted paired group method (WPGM); the classification of 33 of these varieties by WVGM coincided with that by WPGM. As for the characteristics of each group, group I was related to late maturity, tall height, and high yield; group II to intermediate maturity, tall height, and low yield; group III to early maturity, intermediate height, and low yield; and group IV to early maturity, short height, and intermediate yield.
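
For readers unfamiliar with pair-group clustering, the following is a minimal sketch of agglomerative clustering on a taxonomic distance table. It assumes that WPGM refers to the usual weighted pair-group method with arithmetic averages (WPGMA), in which the distance from a merged cluster to any other cluster is the plain average of its two parents' distances. The variety names and distances are invented, and WVGM is not shown.

```python
from itertools import combinations

def wpgma(distances, n_groups):
    """Merge the two closest clusters until n_groups remain.

    distances: dict mapping every pair (a, b) of item names to a distance.
    Returns the clusters as a sorted list of tuples of item names.
    """
    # Every item starts as its own one-element cluster.
    d = {frozenset([(a,), (b,)]): v for (a, b), v in distances.items()}
    clusters = {c for pair in d for c in pair}
    while len(clusters) > n_groups:
        # Find the closest pair of current clusters.
        pair = min((frozenset(p) for p in combinations(clusters, 2)),
                   key=d.__getitem__)
        a, b = tuple(pair)
        merged = tuple(sorted(a + b))
        clusters -= {a, b}
        # WPGMA update: simple average of the two parent distances.
        for c in clusters:
            d[frozenset([merged, c])] = (d[frozenset([a, c])] + d[frozenset([b, c])]) / 2
        clusters.add(merged)
    return sorted(clusters)

# Hypothetical taxonomic distances between four varieties.
distances = {
    ("V1", "V2"): 1.0, ("V3", "V4"): 2.0,
    ("V1", "V3"): 6.0, ("V1", "V4"): 7.0,
    ("V2", "V3"): 8.0, ("V2", "V4"): 9.0,
}
print(wpgma(distances, 2))
```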

A Study on the Relationship between Class Similarity and the Performance of Hierarchical Classification Method in a Text Document Classification Problem (텍스트 문서 분류에서 범주간 유사도와 계층적 분류 방법의 성과 관계 연구)

  • Jang, Soojung;Min, Daiki
    • The Journal of Society for e-Business Studies
    • /
    • v.25 no.3
    • /
    • pp.77-93
    • /
    • 2020
  • The literature has reported that hierarchical classification methods generally outperform flat classification methods for multi-class document classification problems. Unlike prior studies that constructed a class hierarchy, this paper evaluates the performance of hierarchical and flat classification methods in a situation where the class hierarchy is predefined. We conducted numerical evaluations on two data sets: research papers on climate change adaptation technologies in the water sector and the 20NewsGroup open data set. The evaluation results show that the hierarchical classification method outperforms the flat classification methods only under a certain condition, which differs from the literature. The performance of the hierarchical classification method relative to the flat classification method depends on class similarities at each level of the class structure. More importantly, the hierarchical classification method works better when the upper-level similarity is less than the lower-level similarity.

A Resetting Scheme for Process Parameters using the Mahalanobis-Taguchi System

  • Park, Chang-Soon
    • The Korean Journal of Applied Statistics
    • /
    • v.25 no.4
    • /
    • pp.589-603
    • /
    • 2012
  • The Mahalanobis-Taguchi system (MTS) is a statistical tool for classifying the normal group and the abnormal group in multivariate data structures. In addition to the classification itself, the MTS uses a method for selecting variables useful for the classification. This method can be used efficiently, especially when the abnormal group data are scattered without a specific directionality. When a feedback adjustment procedure that controls process input variables through measurements of the process output is not practically possible, a reset procedure can be an alternative. This article proposes a reset procedure using the MTS. Moreover, a method for identifying the input variables to reset is proposed by the use of the contribution. Identifying the root-cause parameters using the existing dimension-reduced contribution tends to be difficult due to the variety of correlation relationships in multivariate data structures. However, an improved decision becomes possible when it is used together with the location-centered contribution and the individual-parameter contribution.
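
The classification step of the MTS rests on the Mahalanobis distance of an observation from the normal group. The sketch below computes that distance for two variables, divided by the number of variables as is customary in MTS. It is a minimal illustration only: the function name and data are hypothetical, and neither the variable selection step nor the contribution measures discussed in the abstract are shown.

```python
def mean(values):
    return sum(values) / len(values)

def scaled_mahalanobis_2d(point, normal_group):
    """Squared Mahalanobis distance of a 2-variable point from the
    normal group, divided by the number of variables (2)."""
    xs = [p[0] for p in normal_group]
    ys = [p[1] for p in normal_group]
    mx, my = mean(xs), mean(ys)
    n = len(normal_group)
    # Sample variances and covariance of the normal group.
    sxx = sum((x - mx) ** 2 for x in xs) / (n - 1)
    syy = sum((y - my) ** 2 for y in ys) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in normal_group) / (n - 1)
    det = sxx * syy - sxy ** 2
    dx, dy = point[0] - mx, point[1] - my
    # Quadratic form with the inverse of the 2x2 covariance matrix.
    d2 = (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det
    return d2 / 2

# Hypothetical normal group with positively correlated variables.
normal_group = [(1, 2), (2, 1), (3, 4), (4, 3), (5, 5)]

print(scaled_mahalanobis_2d((5, 5), normal_group))  # follows the correlation
print(scaled_mahalanobis_2d((5, 1), normal_group))  # breaks the correlation
```

Both test points are equally far from the group mean in Euclidean terms, but the one that violates the correlation pattern of the normal group receives a much larger distance, which is what makes the measure useful for flagging abnormal observations.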

Temporal Classification Method for Forecasting Power Load Patterns From AMR Data

  • Lee, Heon-Gyu;Shin, Jin-Ho;Park, Hong-Kyu;Kim, Young-Il;Lee, Bong-Jae;Ryu, Keun-Ho
    • Korean Journal of Remote Sensing
    • /
    • v.23 no.5
    • /
    • pp.393-400
    • /
    • 2007
  • In this paper we present a novel power load prediction method using temporal pattern mining from AMR (Automatic Meter Reading) data. Since power load patterns vary over time and differ greatly by hour, day, week, and so on, traditional data mining alone gives uninformative results. Moreover, research on data mining for analyzing electric load patterns has focused on cluster analysis and classification methods. However, despite the usefulness of rules that include a temporal dimension and the fact that AMR data have a temporal attribute, these methods were limited to static pattern extraction and did not consider temporal attributes. Therefore, we propose a new classification method for predicting power load patterns. The main tasks comprise a clustering method and a temporal classification method. Cluster analysis is used to create load pattern classes and the representative load profiles for each class. Next, the classification method uses the representative load profiles to build a classifier able to assign different load patterns to the existing classes. The proposed classification method is calendar-based temporal mining, which discovers electric load patterns at multiple time granularities. Lastly, we show that the proposed method, applied to AMR data, discovered more interesting patterns.

Varietal Classification on the Basis of Cluster Analysis in Local Tobacco (Cluster분석에 의한 재래종 담배 품종의 분류에 관하여)

  • Ann, Dai-Jin;Kim, Yoon-Dong
    • Journal of the Korean Society of Tobacco Science
    • /
    • v.4 no.1
    • /
    • pp.37-42
    • /
    • 1982
  • Korean local and introduced varieties were classified by cluster analysis of correlation and taxonomic distance based on nineteen growth characters. 1. Thirty-six varieties could be classified into three groups (I, II, III) by WVGM (weighted variable group method). 2. The major characters for classifying cultivars were days to flowering, number of leaves, leaf length, stem diameter, and width of midrib; these five characters seemed to be useful in monothetic classification. 3. Korean varieties were similar to Oriental varieties, and Japanese varieties to Taiwanese varieties. 4. WVGM was more accurate and meaningful than classification by WPGM (weighted paired group method) and the reticulate diagram of correlation. 5. Characteristics of each group: Group I was closely related to many leaves, late maturity, and a broad leaf type; Group II to a medium number of leaves, late maturity, and a narrow leaf type; and Group III to few leaves, early maturity, and a medium leaf type.

Clinical Outcomes according to Radiological Classification of Brainstem Hemorrhages (방사선학적 출혈양상에 근거한 뇌간출혈의 분류에 따른 임상결과)

  • Cho, Won Jung;Moon, Seong Ho;Lee, Seung Min;Yang, Jae Young;Choi, Chun Sik;Ju, Mun Bae
    • Journal of Korean Neurosurgical Society
    • /
    • v.29 no.2
    • /
    • pp.217-221
    • /
    • 2000
  • Objective: Brainstem hemorrhages usually result in much higher mortality and morbidity than any other intracranial vascular lesions. The purpose of the study is to evaluate the relationship between the radiological classification of the lesions and the clinical outcomes, and to evaluate the value of such classification for the choice of management modality. Method: Thirty-seven patients with primary brainstem hemorrhage were managed medically or surgically between Oct. 1995 and Mar. 1998. The lesions were classified into two groups based on radiological findings as follows: focal subependymal hematoma (group I, n=7) and diffuse tegmentobasilar hemorrhage (group II, n=30). The outcomes at discharge were retrospectively reviewed according to this classification. Result: The most common clinical pictures and radiological findings in each group were as follows: 1) Group I: a focal compressive lesion which displaces rather than destroys brain tissue. It occurs in a younger age group and causes neurological deficits which are often partially reversible. Operative hematoma evacuation was performed in 43.3%. The mean improvement in Glasgow Coma Scale (GCS) score was 4.7. 2) Group II: hypertensive brainstem hemorrhage. It usually causes a diffuse lesion, occurs in an older age group, and is most often associated with profound irreversible neurological deficits which are often fatal. Operative hematoma evacuation was performed in 16.7%. The mean improvement in GCS score was 1.4. Conservatively treated patients in both groups I and II showed no significant clinical improvement. Conclusion: Although there is an overlap between the groups and the size of the groups is small, the pathophysiologic classification of this lesion based on clinical features and radiological findings may be useful for deciding the treatment method.

Study on Systematizing the Combination of Method of Treatment and Symptoms Using the Basic Traditional Medicine Theory (한의 기초 이론을 이용한 치법-증상 조합 분류, 체계화 연구)

  • Oh, Yong Taek;Kim, An Na;Kim, Sang Kyun;Seo, Jin Soon;Jang, Hyun Chul
    • Journal of Physiology & Pathology in Korean Medicine
    • /
    • v.27 no.4
    • /
    • pp.383-390
    • /
    • 2013
  • In order to improve the integration accuracy and the serviceability of the KM (Korean Medicine) ontology constructed by the Korea Institute of Oriental Medicine, this research simplified the many-to-many correspondence between groups of treatment methods and groups of accompanying symptoms in the disease ontology and systematically categorized the relationship. We first extracted the combinations of treatment methods and accompanying symptoms from the KM ontology, then categorized the attributes of the combinations that occurred more than 10 times by analyzing the definitions of KM terms and basic KM theory. We constructed a classification hierarchy with 14 kinds of classification in 4 steps and extracted 450 meaningful combinations. Through this classification system, the research improved the integration accuracy and the serviceability of KM information.

A Study on the Application of Combined Interpolation and Terrain Classification in Digital Terrain Model (수치지형모형에 있어 지형의 분석과 조합보관법의 적용에 관한 연구)

  • Yeu, Bock-Mo;Park, Woon-Yong;Kwon, Hyon;Mun, Du-Yeoul
    • Journal of the Korean Society of Surveying, Geodesy, Photogrammetry and Cartography
    • /
    • v.8 no.2
    • /
    • pp.53-61
    • /
    • 1990
  • In this study, terrain was classified using quantitative classification parameters, and a suitable interpolation method was applied to improve the accuracy of digital terrain models and to broaden their practical applications. The study area was classified into three groups using the quantitative classification parameters, and an interpolation equation suitable for each group was used so that the interpolation method could be applied economically. For large grid intervals, the accuracy of the digital terrain models was improved by applying the combined interpolation method suited to each terrain group.
