Improvement of Self Organizing Maps using Gap Statistic and Probability Distribution

  • Jun, Sung-Hae (Department of Bioinformatics & Statistics, Cheongju University)
  • Published: 2008.06.01

Abstract

Clustering is an unsupervised learning method. General clustering tools have relied on statistical methods and machine learning algorithms. One popular clustering algorithm based on machine learning is the self-organizing map (SOM), a neural network model for clustering. SOM and its extensions have been used in diverse classification and clustering fields such as data mining. However, SOM has difficulty determining the optimal number of clusters. In this paper, we propose an improvement of SOM using the gap statistic and probability distributions. The gap statistic was introduced to estimate the number of clusters in a dataset, and we use it to address this problem of SOM. In addition, in our approach the weights of the feature nodes are updated through probability distributions. After updating is completed according to the prior and posterior distributions, the weights of the SOM have probability distributions for optimal clustering. To verify the improved performance of our work, we carry out experiments comparing it with other learning algorithms on simulated data sets.
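
As a rough illustration of how the gap statistic estimates the number of clusters, the sketch below compares the log within-cluster dispersion of the data with its average over uniform reference samples and picks the smallest k for which Gap(k) >= Gap(k+1) - s(k+1). It is not the paper's procedure: it swaps in plain k-means for the SOM, draws reference data uniformly over the bounding box of the features, and the names `estimate_k_gap`, `k_max`, `n_refs`, and `n_restarts` are hypothetical choices made for this example.

```python
import numpy as np

def _init_centers(X, k, rng):
    # k-means++ style seeding (assumes X contains at least two distinct points).
    centers = [X[rng.integers(len(X))]]
    for _ in range(1, k):
        diff = X[:, None, :] - np.array(centers)[None, :, :]
        d2 = (diff ** 2).sum(axis=2).min(axis=1)
        centers.append(X[rng.choice(len(X), p=d2 / d2.sum())])
    return np.array(centers)

def _kmeans_dispersion(X, k, rng, n_iter=50):
    # Lloyd's algorithm; returns the pooled within-cluster sum of squares W_k.
    centers = _init_centers(X, k, rng)
    for _ in range(n_iter):
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return d2.min(axis=1).sum()

def _best_dispersion(X, k, rng, n_restarts=5):
    # Keep the best of several restarts to reduce the effect of local minima.
    return min(_kmeans_dispersion(X, k, rng) for _ in range(n_restarts))

def estimate_k_gap(X, k_max=6, n_refs=10, seed=0):
    """Return (estimated k, array of Gap(1..k_max)) for data matrix X."""
    rng = np.random.default_rng(seed)
    lo, hi = X.min(axis=0), X.max(axis=0)
    gaps, s = [], []
    for k in range(1, k_max + 1):
        log_w = np.log(_best_dispersion(X, k, rng))
        # Reference samples: uniform over the bounding box (an assumption).
        ref = np.array([
            np.log(_best_dispersion(rng.uniform(lo, hi, size=X.shape), k, rng))
            for _ in range(n_refs)
        ])
        gaps.append(ref.mean() - log_w)
        s.append(ref.std() * np.sqrt(1.0 + 1.0 / n_refs))
    gaps, s = np.array(gaps), np.array(s)
    for k in range(1, k_max):
        # Smallest k with Gap(k) >= Gap(k+1) - s(k+1); index k-1 holds Gap(k).
        if gaps[k - 1] >= gaps[k] - s[k]:
            return k, gaps
    return k_max, gaps
```

Calling `estimate_k_gap(X)` on an `(n_samples, n_features)` array returns the estimated cluster count together with the gap curve, which can be inspected directly. In a SOM setting, the estimate could in principle guide the choice of the number of output nodes, although how the paper couples the two is not specified in the abstract.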
