• Title/Summary/Keyword: correct classification rate (정분류율)

Diversity based Ensemble Genetic Programming for Improving Classification Performance (분류 성능 향상을 위한 다양성 기반 앙상블 유전자 프로그래밍)

  • Hong Jin-Hyuk;Cho Sung-Bae
    • Journal of KIISE: Software and Applications / v.32 no.12 / pp.1229-1237 / 2005
  • Combining multiple classifiers has been actively exploited to improve classification performance. A good ensemble classifier requires a pool of accurate and diverse base classifiers. Conventionally, ensemble learning techniques such as bagging and boosting have been used, and the diversity of base classifiers has been estimated on the training set, but this approach has limitations in classifying gene expression profiles since only a few training samples are available. This paper proposes an ensemble technique that analyzes the diversity of classification rules obtained by genetic programming. Genetic programming generates interpretable rules, and a sample is classified by combining the most diverse set of rules. We have applied the proposed method to cancer classification with gene expression profiles. Experiments on a lymphoma cancer dataset, a prostate cancer dataset, and an ovarian cancer dataset have illustrated the usefulness of the proposed method. A higher classification accuracy has been obtained with the proposed method than without considering diversity. It has also been confirmed that diversity increases classification performance.

Development of game indicators and winning forecasting models with game data (게임 데이터를 이용한 지표 개발과 승패예측모형 설계)

  • Ku, Jimin;Kim, Jaehee
    • Journal of the Korean Data and Information Science Society / v.28 no.2 / pp.237-250 / 2017
  • E-sports, a new field, has gained great popularity in Korea as well as abroad. AOS (aeon of strife) genre games are quickly gaining popularity with gamers from all over the world, and game companies hold game competitions. E-sports broadcasting teams and webzines use a variety of statistical indicators. In this paper, game data from League of Legends, an AOS genre game, is analyzed statistically, using such indicators to predict the outcome. We develop new indicators with factor analysis to improve on existing indicators. We also consider discriminant function, neural network, and SVM (support vector machine) models to build winning forecasting models. As a result, the new position indicators reflect the nature of each role in the game, and the winning forecasting models show more than 95 percent accuracy.

Predictive Analysis of Problematic Smartphone Use by Machine Learning Technique

  • Kim, Yu Jeong;Lee, Dong Su
    • Journal of the Korea Society of Computer and Information / v.25 no.2 / pp.213-219 / 2020
  • In this paper, we propose a classification analysis method for diagnosing and predicting problematic smartphone use in order to provide policy data on problematic smartphone use, which is getting worse year after year. We also attempt to identify the key variables that affect the classification. For this purpose, we compared the classification rates of three machine learning methods: Decision Tree, Random Forest, and Support Vector Machine. The data came from 25,465 people who responded to the '2018 Problematic Smartphone Use Survey' provided by the Korea Information Society Agency and were analyzed using the R statistical package (ver. 3.6.2). As a result, the three classification techniques showed similar classification rates, and there was no problem of model overfitting. The classification rate of the Support Vector Machine was the highest among the three classification methods, followed by Decision Tree and Random Forest. The top three variables affecting the classification rate among smartphone use types were Life Service type, Information Seeking type, and Leisure Activity Seeking type.

A credit classification method based on generalized additive models using factor scores of mixtures of common factor analyzers (공통요인분석자혼합모형의 요인점수를 이용한 일반화가법모형 기반 신용평가)

  • Lim, Su-Yeol;Baek, Jang-Sun
    • Journal of the Korean Data and Information Science Society / v.23 no.2 / pp.235-245 / 2012
  • Logistic discrimination is a useful statistical technique for quantitative analysis in the financial service industry. In particular, it is not only easy to implement but also has a good classification rate. The generalized additive model is useful for credit scoring since it has the same advantages as logistic discrimination as well as the ability to account for the nonlinear effects of the explanatory variables. It may, however, need too many additive terms in the model when the number of explanatory variables is very large, and there may exist dependencies among the variables. Mixtures of factor analyzers can be used for dimension reduction of high-dimensional features. This study proposes to use the low-dimensional factor scores of mixtures of factor analyzers as the new features in the generalized additive model. Its application is demonstrated in the classification of some real credit scoring data. The comparison of correct classification rates of competing techniques shows the superiority of the generalized additive model using factor scores.

Development of a Driver Safety Information Service Model Using Point Detectors at Signalized Intersections (지점검지자료 기반 신호교차로 운전자 안전서비스 개발)

  • Jang, Jeong-A;Choe, Gi-Ju;Mun, Yeong-Jun
    • Journal of Korean Society of Transportation / v.27 no.5 / pp.113-124 / 2009
  • This paper suggests a new approach for providing information for driver safety at signalized intersections. Particularly dangerous situations at signalized intersections such as red-light violations, accelerating through yellow intervals, red-light running, and stopping abruptly due to the dilemma zone problem are considered in this study. This paper presents the development of a dangerous vehicle determination algorithm that collects real-time vehicle speeds and times from multiple point detectors while vehicles are traveling during a phase change. To evaluate this algorithm, VISSIM is used to simulate a real-time multiple-detection situation by changing input data such as inflow volume, design speed, driver perception, and response time. As a result, the correct-classification rate is approximately 98.5% and the prediction rate of the algorithm is approximately 88.5%. This paper also shows the sensitivity results obtained by changing the input data. These results show that the new approach can be used to improve safety at signalized intersections.

Predicting Relative Superiority of TV Drama First Episodes based on the Quantitative Competency Index of the Cast and Crew (TV드라마 참여 인물의 계량 능력지표에 기반한 첫 회 시청률 상대적 우위 예측)

  • Ju, Sang Phil;Hong, June Seok;Kim, Wooju
    • The Journal of the Korea Contents Association / v.19 no.6 / pp.179-191 / 2019
  • It is not easy to predict the return on investment in the content business, and there is no index to evaluate cast & crew. The absolute number of TV ratings is steadily declining, but there is no substitute index yet. In this study, we tried to predict the relative popularity of the drama by designing the relative superiority of the individual drama viewership as the response variable and designing the relative superiority of the drama participants as the explanatory variables. We used various machine learning algorithms and added explanatory variables that were found to be useful in previous studies. As a result, with properly combined explanatory variables, a high prediction accuracy of 84% is obtained. In this study, we intend to promote the investment efficiency of the entire contents industry by predicting the relative popularity of the contents.

Comparison of evaluation measures for classification models on binary data (이진자료 분류모형에 대한 평가측도의 특성 비교)

  • Kim, Byungsoo;Kwon, Soyoung
    • The Korean Journal of Applied Statistics / v.32 no.2 / pp.291-300 / 2019
  • This study investigates the characteristics of evaluation measures for classification models on a binary response variable in order to evaluate their suitability for use. Six measures are considered: Accuracy, Sensitivity, Specificity, Precision, F-measure, and Heidke's skill score (HSS). The evaluation measures are reformulated using x (the proportion of cases that are actually 1), y (the proportion predicted as 1), and z (the proportion both actually 1 and predicted as 1) from the confusion matrix. We suggest two necessary conditions to assess the suitability of the evaluation measures. The first condition is that the measure function is constant in x and y in the case of a random model. The second condition is that the measure function is increasing in z and decreasing in x and y. Since only HSS satisfies the two conditions, it is always appropriate as an evaluation measure for a classification model on a binary response variable, and the other measures should be used within a limited range.
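
The reformulation in terms of x, y, and z described in the abstract above can be sketched as follows. This is a minimal illustration built from the standard confusion-matrix definitions of the six measures, not the paper's own derivation; the function names `xyz_from_counts` and `measures_from_xyz` are hypothetical, and the closed forms assume all cell counts are expressed as proportions of the sample.

```python
def xyz_from_counts(tp, fp, fn, tn):
    """Convert confusion-matrix counts to the proportions x, y, z.

    x = proportion actually 1, y = proportion predicted as 1,
    z = proportion both actually 1 and predicted as 1.
    """
    n = tp + fp + fn + tn
    return (tp + fn) / n, (tp + fp) / n, tp / n


def measures_from_xyz(x, y, z):
    """Six evaluation measures written as functions of x, y, z.

    Derived from the standard definitions (an assumption, not quoted
    from the paper): FN = x - z, FP = y - z, TN = 1 - x - y + z.
    """
    tn = 1.0 - x - y + z  # proportion actually 0 and predicted as 0
    return {
        "accuracy": z + tn,
        "sensitivity": z / x,
        "specificity": tn / (1.0 - x),
        "precision": z / y,
        "f_measure": 2.0 * z / (x + y),
        # Heidke's skill score: agreement beyond what chance would give
        "hss": 2.0 * (z - x * y) / (x + y - 2.0 * x * y),
    }


if __name__ == "__main__":
    x, y, z = xyz_from_counts(tp=40, fp=10, fn=10, tn=40)
    print(measures_from_xyz(x, y, z))
    # Random model: predictions independent of the truth, so z = x * y.
    print(measures_from_xyz(0.3, 0.6, 0.3 * 0.6))
```

Under a random model, where z = x * y, the HSS expression above is 0 for any x and y, while the accuracy expression still varies with x and y. This is consistent with the first condition stated in the abstract.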

A VALIDITY STUDY OF PARENT BEHAVIORAL RATING SCALES AS DIAGNOSTIC TOOLS OF ATTENTION DEFICIT/HYPERACTIVITY DISORDER (주의력결핍/과잉운동장애(ADHD) 아동의 진단도구로서 부모용 행동 평가지의 타당도 연구 - 한국아동인성검사와 아동·청소년 행동평가척도를 중심으로 -)

  • Kim, Ji-Hae;So, Yoo-Kyung;Jung, Yoo-Sook;Lee, Im-Soon;Hong, Sung-Do
    • Journal of the Korean Academy of Child and Adolescent Psychiatry / v.11 no.2 / pp.282-289 / 2000
  • This study was designed to examine the validity of the HPR subscale in the Korean Personality Inventory for Children (KPI-C) and the Attention Problems subscale in the Korean Child Behavior Checklist (K-CBCL) as diagnostic tools for Attention-Deficit/Hyperactivity Disorder (ADHD). Nineteen ADHD-1 type, twenty-three ADHD-H type, sixteen Neurosis, and fifteen normal children aged 6 to 12 were selected based on DSM-IV, and their responses on the KPI-C and CBCL were analyzed. Omnibus F-test results showed that there were significant differences in the F scores of HPR and the Attention Problems T scores (p<.05). In post hoc analysis, however, the HPR and AP scores in the three clinical groups were significantly higher than in the normal group, but there was no group difference among the three clinical groups (p<.05). These results show that the HPR subscale and the Attention Problems subscale may be useful tools for screening clinical groups (vs. the normal group), but that the clinical validity of the two subscales as diagnostic tools for the subtypes of ADHD is limited.
