• Title/Abstract/Keywords: binary classification model

180 search results; processing time 0.029 seconds

Eigenvoice Adaptation of Classification Model for Binary Mask Estimation (Eigenvoice를 이용한 이진 마스크 분류 모델 적응 방법)

  • Kim, Gibak
    • Journal of Broadcast Engineering / Vol. 20, No. 1 / pp. 164-170 / 2015
  • This paper deals with adapting the classification model used in the binary mask approach to noise suppression in noisy environments. Binary mask estimation is known to improve the intelligibility of noisy speech. However, the training data for the classification model must include the same type of noise as the test data. Eigenvoice adaptation is applied to a noise-independent classification model, and the adapted model is used as a noise-dependent model. Results are reported as hit rates and false alarm rates. The experiments confirmed that classification accuracy improves as the number of adaptation sentences increases.
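
The abstract reports hit rates and false alarm rates but does not define them. A minimal sketch, assuming the usual reading in binary mask work: the hit rate is the fraction of target-dominated units (ideal mask = 1) that the estimated mask also marks 1, and the false alarm rate is the fraction of masker-dominated units (ideal mask = 0) wrongly marked 1. The function name and the toy masks are hypothetical.

```python
def hit_and_false_alarm_rates(ideal_mask, estimated_mask):
    """Return (hit_rate, false_alarm_rate) for two flat 0/1 masks of equal length.

    Assumed definitions (not taken from the paper):
      hit rate         = correctly kept target units / all target units
      false alarm rate = wrongly kept masker units   / all masker units
    """
    if len(ideal_mask) != len(estimated_mask):
        raise ValueError("masks must have the same length")

    target_units = sum(1 for t in ideal_mask if t == 1)
    masker_units = len(ideal_mask) - target_units

    hits = sum(1 for t, e in zip(ideal_mask, estimated_mask) if t == 1 and e == 1)
    false_alarms = sum(1 for t, e in zip(ideal_mask, estimated_mask) if t == 0 and e == 1)

    # Guard against masks that contain only one class.
    hit_rate = hits / target_units if target_units else 0.0
    fa_rate = false_alarms / masker_units if masker_units else 0.0
    return hit_rate, fa_rate


# Toy masks, flattened over time-frequency units (illustrative values only).
ideal = [1, 1, 0, 0, 1, 0, 1, 0]
estimated = [1, 0, 0, 1, 1, 0, 1, 0]
hit, fa = hit_and_false_alarm_rates(ideal, estimated)
print(f"HIT = {hit:.2f}, FA = {fa:.2f}")
```

Reporting the two rates separately matters because a mask that keeps every unit reaches a perfect hit rate while also producing the worst possible false alarm rate.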

Discriminant Analysis of Binary Data by Using the Maximum Entropy Distribution

  • Lee, Jung Jin;Hwang, Joon
    • Communications for Statistical Applications and Methods / Vol. 10, No. 3 / pp. 909-917 / 2003
  • Although many classification models have been used to classify binary data, none dominates across all circumstances, which vary with the number of variables and the size of the data (Asparoukhov and Krzanowski, 2001). This paper proposes a classification model that uses information on the marginal distributions of sub-variables and the associated maximum entropy distribution. Classification experiments using simulation are discussed.

Could Decimal-binary Vector be a Representative of DNA Sequence for Classification?

  • Sanjaya, Prima;Kang, Dae-Ki
    • International Journal of Advanced Smart Convergence / Vol. 5, No. 3 / pp. 8-15 / 2016
  • In recent years, the Deep Belief Network (DBN), a deep learning model formed by stacking restricted Boltzmann machines in a greedy fashion, has been widely used for classification and recognition. With its ability to extract high-level abstract features and to handle high-dimensional data, this model has achieved outstanding results in image and speech recognition. In this research, we assess the applicability of deep learning to DNA classification. Since the training phase of a DBN is expensive, especially for DNA sequences with thousands of variables, we introduce a new encoding method that uses a decimal-binary vector to represent the sequence as input to the model, and we compare it with one-hot-vector encoding on two datasets. We evaluated the proposed model with different contrastive algorithms and achieved a significant improvement in training speed with comparable classification results. This result shows the potential of using decimal-binary vectors with DBNs for DNA sequences to solve other sequence problems in bioinformatics.
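
The abstract does not spell out how the decimal-binary vector is built. As an illustration only, the sketch below assumes one plausible reading: each nucleotide is mapped to a decimal index (A=0, C=1, G=2, T=3) and that index is written as a fixed-width binary number, which halves the input length relative to one-hot encoding. The mapping, the bit width, and both function names are assumptions, not the authors' method.

```python
NUCLEOTIDES = "ACGT"  # assumed ordering; the decimal index of a base is its position here


def one_hot_encode(sequence):
    """Encode each base as 4 bits with exactly one bit set."""
    vector = []
    for base in sequence:
        index = NUCLEOTIDES.index(base)  # raises ValueError on an unknown base
        vector.extend(1 if i == index else 0 for i in range(len(NUCLEOTIDES)))
    return vector


def decimal_binary_encode(sequence, bits=2):
    """Encode each base as the fixed-width binary form of its decimal index (hypothetical scheme)."""
    vector = []
    for base in sequence:
        index = NUCLEOTIDES.index(base)
        # Most significant bit first, e.g. G -> 2 -> [1, 0].
        vector.extend((index >> shift) & 1 for shift in range(bits - 1, -1, -1))
    return vector


sequence = "GATC"
print(len(one_hot_encode(sequence)), one_hot_encode(sequence))
print(len(decimal_binary_encode(sequence)), decimal_binary_encode(sequence))
```

Under this reading, a sequence of n bases needs 4n inputs with one-hot encoding and 2n with the binary form, which is consistent with the training-speed motivation in the abstract.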

On EM Algorithm For Discrete Classification With Bahadur Model: Unknown Prior Case

  • Kim, Hea-Jung;Jung, Hun-Jo
    • Journal of the Korean Statistical Society / Vol. 23, No. 1 / pp. 63-78 / 1994
  • For discrimination with binary variables, reformulated full and first-order Bahadur models with incomplete observations are presented. This allows the prior probabilities associated with multiple populations to be estimated for the sample-based classification rule. The EM algorithm is adopted to provide the maximum likelihood estimates of the parameters of interest. Some experiences with the models are evaluated and discussed.

A GA-based Binary Classification Method for Bankruptcy Prediction (도산예측을 위한 유전 알고리듬 기반 이진분류기법의 개발)

  • Min, Jae-H.;Jeong, Chul-Woo
    • Journal of the Korean Operations Research and Management Science Society / Vol. 33, No. 2 / pp. 1-16 / 2008
  • The purpose of this paper is to propose a new binary classification method, based on a genetic algorithm, for predicting corporate failure, and to validate its predictive power through empirical analysis. The proposed method establishes two virtual companies, one representing bankrupt companies and one representing non-bankrupt companies, measures the similarity between each virtual company and the subject of prediction, and classifies the subject as either bankrupt or non-bankrupt. The values of the classification variables of the virtual companies and the weights of the variables are determined with a genetic algorithm so as to maximize the hit ratio on the training data set. To test the validity of the proposed method, we compare its prediction accuracy with that of existing methods such as multi-discriminant analysis, logistic regression, decision tree, and artificial neural network, and show that the proposed binary classification method can serve as a promising alternative to the existing methods for bankruptcy prediction.
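
The classification step described above can be sketched as a nearest-prototype rule. This is only an illustration: the abstract does not give the similarity measure, so a weighted Euclidean distance is assumed, and the virtual companies and weights are fixed by hand here rather than searched with a genetic algorithm. All names, features, and numbers are hypothetical.

```python
import math


def weighted_distance(company, virtual_company, weights):
    """Weighted Euclidean distance (assumed similarity measure; smaller means more similar)."""
    return math.sqrt(
        sum(w * (a - b) ** 2 for a, b, w in zip(company, virtual_company, weights))
    )


def classify(company, bankrupt_virtual, healthy_virtual, weights):
    """Assign the label of the closer virtual company."""
    d_bankrupt = weighted_distance(company, bankrupt_virtual, weights)
    d_healthy = weighted_distance(company, healthy_virtual, weights)
    return "bankrupt" if d_bankrupt < d_healthy else "non-bankrupt"


def hit_ratio(companies, labels, bankrupt_virtual, healthy_virtual, weights):
    """Share of correctly classified companies: the quantity a genetic algorithm would maximize."""
    correct = sum(
        classify(c, bankrupt_virtual, healthy_virtual, weights) == y
        for c, y in zip(companies, labels)
    )
    return correct / len(companies)


# Hypothetical features: [debt ratio, profit margin]. Values are toy data.
bankrupt_virtual = [0.9, -0.1]
healthy_virtual = [0.4, 0.1]
weights = [0.7, 0.3]

companies = [[0.85, -0.05], [0.30, 0.20], [0.70, 0.00]]
labels = ["bankrupt", "non-bankrupt", "non-bankrupt"]
ratio = hit_ratio(companies, labels, bankrupt_virtual, healthy_virtual, weights)
print(f"hit ratio = {ratio:.3f}")
```

In a full implementation, a genetic algorithm would treat the two virtual companies and the weight vector as the chromosome and use `hit_ratio` on the training set as the fitness function.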

Comparative Analysis of the Binary Classification Model for Improving PM10 Prediction Performance (PM10 예측 성능 향상을 위한 이진 분류 모델 비교 분석)

  • Jung, Yong-Jin;Lee, Jong-Sung;Oh, Chang-Heon
    • Journal of the Korea Institute of Information and Communication Engineering / Vol. 25, No. 1 / pp. 56-62 / 2021
  • High forecast accuracy is required as social concern about particulate matter increases. Many attempts are therefore being made to improve the accuracy of particulate matter prediction with machine learning. However, because of the imbalanced distribution of concentrations and the varied characteristics of particulate matter, prediction models are not trained well. To address these problems, this paper proposes a binary classification model that divides particulate matter concentration into two classes at a threshold of 80㎍/㎥. Four classification algorithms were used for the binary classification of PM10: logistic regression, decision tree, SVM, and MLP. In a performance evaluation based on the confusion matrix, the MLP model showed the highest binary classification performance of the four models, with 89.98% accuracy.
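
The labeling and evaluation steps can be sketched as follows. The 80㎍/㎥ threshold comes from the abstract; whether a reading of exactly 80 falls in the high class is not stated, so the strict inequality below is an assumption, as are the function names and the toy readings.

```python
THRESHOLD = 80.0  # µg/m³, the class boundary named in the abstract


def to_binary_label(pm10):
    """Return 1 for the high-concentration class, 0 otherwise (strict '>' is an assumption)."""
    return 1 if pm10 > THRESHOLD else 0


def confusion_matrix(y_true, y_pred):
    """Count true/false positives and negatives for 0/1 labels."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            counts["tp"] += 1
        elif truth == 0 and pred == 1:
            counts["fp"] += 1
        elif truth == 0 and pred == 0:
            counts["tn"] += 1
        else:
            counts["fn"] += 1
    return counts


def accuracy(counts):
    """Accuracy = (TP + TN) / all samples."""
    total = sum(counts.values())
    return (counts["tp"] + counts["tn"]) / total


# Toy PM10 readings and toy model predictions (illustrative values only).
observed = [35.0, 92.0, 81.0, 60.0, 120.0, 79.0]
y_true = [to_binary_label(v) for v in observed]
y_pred = [0, 1, 0, 0, 1, 1]

cm = confusion_matrix(y_true, y_pred)
acc = accuracy(cm)
print(cm, f"accuracy = {acc:.3f}")
```

Because the abstract notes that the concentration distribution is imbalanced, the per-class counts in the confusion matrix are worth inspecting alongside accuracy, since accuracy alone can look high when the low class dominates.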

Customer Level Classification Model Using Ordinal Multiclass Support Vector Machines

  • Kim, Kyoung-Jae;Ahn, Hyun-Chul
    • Asia Pacific Journal of Information Systems / Vol. 20, No. 2 / pp. 23-37 / 2010
  • Conventional Support Vector Machines (SVMs) have been utilized as classifiers for binary classification problems. However, certain real-world problems, including corporate bond rating, cannot be addressed by binary classifiers because they are multiclass problems. For this reason, numerous studies have attempted to transform the original SVM into a multiclass classifier. These studies, however, have only considered nominal classification problems. Such approaches are therefore limited when the classes are ordinal rather than nominal, as in corporate bond rating and multiclass customer classification. In this study, we adopt a novel multiclass SVM that can address ordinal classification problems using ordinal pairwise partitioning (OPP). The proposed model may use fewer classifiers, yet it classifies more accurately because it takes the order of the classes into account. Although it can be applied to all kinds of ordinal multiclass classification problems, most prior studies have applied it to finance problems such as bond rating. This study therefore applies it to a real-world customer level classification case for implementing customer relationship management. The results show that the ordinal multiclass SVM model may also be effective for customer level classification.

Import Vector Voting Model for Multi-pattern Classification (다중 패턴 분류를 위한 Import Vector Voting 모델)

  • Choi, Jun-Hyeog;Kim, Dae-Su;Rim, Kee-Wook
    • Journal of the Korean Institute of Intelligent Systems / Vol. 13, No. 6 / pp. 655-660 / 2003
  • In general, the Support Vector Machine performs well in binary classification but is limited in multi-pattern classification. We therefore propose an Import Vector Voting model for classification with two or more labels. The model applies a kernel bagging strategy to the Import Vector Machine of Zhu. It uses a voting strategy that averages the optimal kernel function from many kernel functions. In experiments on both binary and multi-pattern classification problems, the proposed Import Vector Voting model showed good performance on the given machine learning data.

Discriminant Analysis of Binary Data with Multinomial Distribution by Using the Iterative Cross Entropy Minimization Estimation

  • Lee, Jung Jin
    • Communications for Statistical Applications and Methods / Vol. 12, No. 1 / pp. 125-137 / 2005
  • Many discriminant analysis models for binary data have been used in real applications, but none of the classification models dominates in all circumstances (Asparoukhov and Krzanowski, 2001). Lee and Hwang (2003) proposed a new classification model that uses the multinomial distribution with the maximum entropy estimation method. The model showed promising results with a small number of variables, but its performance was not satisfactory with a large number of variables. This paper explores using the iterative cross entropy minimization estimation method in place of maximum entropy estimation. Simulation experiments show that this method can compete with other well-known classification models.

Prediction of extreme PM2.5 concentrations via extreme quantile regression

  • Lee, SangHyuk;Park, Seoncheol;Lim, Yaeji
    • Communications for Statistical Applications and Methods / Vol. 29, No. 3 / pp. 319-331 / 2022
  • In this paper, we develop a new statistical model to forecast the PM2.5 level in Seoul, South Korea. The proposed model is based on the extreme quantile regression model with lasso penalty. Various meteorological variables and air pollution variables are considered as predictors in the regression model, and the lasso quantile regression performs variable selection and solves the multicollinearity problem. The final prediction model is obtained by combining various extreme lasso quantile regression estimators and we construct a binary classifier based on the model. Prediction performance is evaluated through the statistical measures of the performance of a binary classification test. We observe that the proposed method works better compared to the other classification methods, and predicts 'very bad' cases of the PM2.5 level well.
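
For orientation, the standard form of quantile regression with a lasso penalty is shown below. The abstract does not give the paper's exact formulation, in particular the extreme-quantile step and the way the estimators are combined, so this is only the textbook objective. The symbols $\tau$ (quantile level), $\lambda$ (penalty weight), $x_i$ (predictor vector), and $y_i$ (observed PM2.5 level) are notation assumed here.

```latex
\hat{\beta}(\tau)
  = \arg\min_{\beta}\;
    \sum_{i=1}^{n} \rho_{\tau}\!\left(y_i - x_i^{\top}\beta\right)
    + \lambda \sum_{j=1}^{p} \lvert \beta_j \rvert,
\qquad
\rho_{\tau}(u) = u\,\bigl(\tau - \mathbf{1}\{u < 0\}\bigr)
```

The check loss $\rho_{\tau}$ weights under-prediction by $\tau$ and over-prediction by $1-\tau$, so a $\tau$ close to 1 targets the upper tail of the distribution, while the $\ell_1$ term shrinks some coefficients to exactly zero, which is how the penalty performs variable selection.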