Search | Korea Science

Baek, Jang-Sun
Journal of the Korean Data and Information Science Society / v.19 no.3 / pp.751-759 / 2008
Mixtures of factor analyzers (MFA) are useful for modeling the distribution of high-dimensional data in a much lower-dimensional space when the number of observations is not very large relative to their dimension. Mixtures of common factor analyzers (MCFA) further reduce the number of parameters in the specification of the component covariance matrices when the number of classes is not small. Moreover, the factor scores of MCFA can be displayed in a low-dimensional space to distinguish the groups. We propose the factor scores of MCFA as new low-dimensional features for the classification of high-dimensional data. Compared with conventional dimension reduction methods such as principal component analysis (PCA) and canonical covariates (CV), the proposed factor scores showed higher correct classification rates on three real data sets when used in parametric and nonparametric classifiers.
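For orientation, the factor scores mentioned in this abstract can be written out under one common formulation of MCFA. This is a sketch, not taken from the listing itself: all symbols are assumptions, namely the shared loading matrix $A$, the component factor means $\xi_i$ and covariances $\Omega_i$, the diagonal noise covariance $D$, and the mixing proportions $\pi_i$.

```latex
% Assumed MCFA model: with probability \pi_i, observation y_j (dimension p)
% is generated from q-dimensional factors u_{ij} through a common loading matrix A.
\begin{align}
  y_j &= A\,u_{ij} + e_{ij}, \qquad
  u_{ij} \sim N(\xi_i, \Omega_i), \quad e_{ij} \sim N(0, D), \\
  % Posterior mean of the factors given membership of component i
  \hat{u}_{ij} &= \xi_i + \Omega_i A^{\top}
      \left(A \Omega_i A^{\top} + D\right)^{-1} \left(y_j - A \xi_i\right), \\
  % Posterior probability that y_j belongs to component i
  \tau_i(y_j) &= \frac{\pi_i\, \phi\!\left(y_j;\, A\xi_i,\, A\Omega_i A^{\top} + D\right)}
      {\sum_{h=1}^{g} \pi_h\, \phi\!\left(y_j;\, A\xi_h,\, A\Omega_h A^{\top} + D\right)}, \\
  % Factor score used as the low-dimensional feature
  \hat{u}_j &= \sum_{i=1}^{g} \tau_i(y_j)\, \hat{u}_{ij}.
\end{align}
```

Under these assumptions, the $q$-dimensional vector $\hat{u}_j$ would play the role of the low-dimensional feature passed to a classifier, in place of PCA scores.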

Cho Dong-Yeon; Zhang Byoung-Tak
Journal of KIISE: Software and Applications / v.32 no.11 / pp.1071-1083 / 2005
Some researchers try to find the optimal solution more efficiently by estimating the probability distribution of the good solutions in the current population. Finite mixtures of distributions are particularly useful for complex problems. However, it is difficult to choose the number of components in a mixture model and to merge the superior partial solutions represented by each component. In this paper, we propose a new continuous evolutionary optimization algorithm whose distribution estimation uses variational Bayesian mixtures of factor analyzers. This technique can estimate the number of mixture components automatically and combine good sub-solutions by sampling new individuals with the latent variables. In a comparison with two probabilistic model-based evolutionary algorithms, the proposed scheme achieves superior performance on traditional benchmark function optimization problems. We also successfully estimate the parameters of an S-system for the dynamic modeling of biochemical networks.
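The general loop of estimating a distribution from good solutions and sampling the next population from it can be sketched as follows. This is a deliberately simplified illustration, not the paper's algorithm: it replaces the variational Bayesian mixture of factor analyzers with a single independent Gaussian per coordinate, and the function names, the sphere benchmark, and all parameter values are assumptions chosen for the example.

```python
import random
import statistics

def sphere(x):
    """Benchmark objective: sum of squares, with minimum 0 at the origin."""
    return sum(v * v for v in x)

def gaussian_eda(objective, dim=5, pop_size=100, elite_frac=0.3,
                 generations=40, seed=0):
    """Minimize `objective` with a simple estimation-of-distribution loop.

    Assumption: one independent Gaussian per coordinate stands in for the
    mixture model described in the abstract.
    """
    rng = random.Random(seed)
    # Initial population drawn uniformly from [-5, 5]^dim.
    pop = [[rng.uniform(-5.0, 5.0) for _ in range(dim)]
           for _ in range(pop_size)]
    n_elite = max(2, int(pop_size * elite_frac))
    best = min(pop, key=objective)
    history = [objective(best)]
    for _ in range(generations):
        # Select the good solutions of the current population.
        elite = sorted(pop, key=objective)[:n_elite]
        # Estimate the per-coordinate mean and standard deviation from them.
        means = [statistics.mean([ind[d] for ind in elite])
                 for d in range(dim)]
        stds = [max(statistics.stdev([ind[d] for ind in elite]), 1e-12)
                for d in range(dim)]
        # Sample the next population from the estimated distribution.
        pop = [[rng.gauss(means[d], stds[d]) for d in range(dim)]
               for _ in range(pop_size)]
        # Track the best solution seen so far.
        candidate = min(pop, key=objective)
        if objective(candidate) < objective(best):
            best = candidate
        history.append(objective(best))
    return best, history

best, history = gaussian_eda(sphere)
print(f"best objective value after {len(history) - 1} generations: {history[-1]:.3e}")
```

A single Gaussian cannot represent several promising regions at once, which is the limitation that mixture models with an automatically chosen number of components are meant to address.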

Lim, Su-Yeol; Baek, Jang-Sun
Journal of the Korean Data and Information Science Society / v.23 no.2 / pp.235-245 / 2012
Logistic discrimination is a useful statistical technique for quantitative analysis in the financial service industry. In particular, it is easy to implement and achieves a good classification rate. The generalized additive model is useful for credit scoring because it shares the advantages of logistic discrimination and can also account for nonlinear effects of the explanatory variables. However, it may need too many additive terms when the number of explanatory variables is very large, and dependencies may exist among the variables. Mixtures of factor analyzers can be used for dimension reduction of high-dimensional features. This study proposes using the low-dimensional factor scores of mixtures of factor analyzers as new features in the generalized additive model. Its application is demonstrated in the classification of some real credit scoring data. A comparison of the correct classification rates of competing techniques shows the superiority of the generalized additive model using factor scores.
https://doi.org/10.7465/jkdi.2012.23.2.235
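As a rough illustration of the model form this abstract describes, a logistic generalized additive model on factor scores could be written as below. The notation is an assumption made for this sketch and does not come from the listing: $\hat{u}_1, \dots, \hat{u}_q$ are the factor scores, $f_k$ are smooth functions, and $y = 1$ marks one of the two credit classes.

```latex
% Assumed form: additive logistic model whose inputs are the q factor scores
% rather than the original p explanatory variables (q much smaller than p).
\begin{equation}
  \log \frac{\Pr(y = 1 \mid \hat{u})}{1 - \Pr(y = 1 \mid \hat{u})}
    = \beta_0 + \sum_{k=1}^{q} f_k(\hat{u}_k)
\end{equation}
```

Ordinary logistic discrimination corresponds to the special case where each $f_k$ is linear, $f_k(\hat{u}_k) = \beta_k \hat{u}_k$. Using $q$ scores instead of $p$ original variables keeps the number of additive terms small.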

Cho Dong-Yeon; Zhang Byoung-Tak
Proceedings of the Korean Information Science Society Conference / 2005.07b / pp.697-699 / 2005
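Evolutionary computation for continuous function optimization has traditionally generated each new generation by introducing a probability distribution. Recently, research has sought to solve optimization problems more efficiently by estimating this probability distribution from the population. In this paper, we propose a method that solves continuous function optimization problems by estimating the distribution of the population with variational Bayesian mixtures of factor analyzers. Because this technique automates the estimation of the number of mixture components, it can maintain population diversity and thereby prevent premature convergence to local optima, and it can search efficiently by estimating the distribution within each subpopulation. We verified the superiority of the proposed method by comparing it with other estimation-of-distribution evolutionary algorithms on well-known benchmark functions.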