http://dx.doi.org/10.5351/KJAS.2015.28.6.1147

A Divisive Clustering for Mixed Feature-Type Symbolic Data  

Kim, Jaejik (Department of Statistics, Sungkyunkwan University)
Publication Information
The Korean Journal of Applied Statistics / v.28, no.6, 2015, pp. 1147-1161
Abstract
Data analysis today covers not only classical data, expressed as points in p-dimensional Euclidean space, but also newer types of data such as signals, functions, images, and shapes. Symbolic data can be regarded as one of these newer types. Symbolic data take various formats, including intervals, histograms, lists, tables, distributions, and models. To date, studies of symbolic data have mainly focused on a single format at a time. This study extends the analysis to datasets containing both histogram-valued and multimodal-valued data, introduces a divisive clustering method for such mixed feature-type symbolic data, and applies it to industrial accident data.
Keywords
mixed feature-type symbolic data; cluster analysis; industrial accident
Citations & Related Records
  • Reference
1 Billard, L. and Diday, E. (2006). Symbolic Data Analysis: Conceptual Statistics and Data Mining, John Wiley and Sons, New Jersey.
2 Billard, L. and Kim, J. (2013). Clustering in contemporary mixed-valued data, In Proceedings of the 2013 World Statistics Congress, International Statistical Institute.
3 Bock, H. H. and Diday, E. (2000). Analysis of Symbolic Data: Exploratory Methods for Extracting Statistical Information from Complex Data, Springer-Verlag, New York.
4 Cha, S. H. and Srihari, S. N. (2002). On measuring the distance between histograms, Pattern Recognition, 35, 1355-1370.
5 Chavent, M. (1998). A monothetic clustering method, Pattern Recognition Letters, 19, 989-996.
6 Chavent, M. (2000). Criterion-based divisive clustering for symbolic data. In: Bock, H.H., Diday, E. (Eds.), Analysis of Symbolic Data, Exploratory Methods for Extracting Statistical Information from Complex Data, Springer, New York, 299-311.
7 Davies, D. L. and Bouldin, D. W. (1979). A cluster separation measure, IEEE Transactions on Pattern Analysis and Machine Intelligence, 1, 224-227.
8 De Carvalho, F. A. T. (1994). Proximity coefficients between boolean symbolic objects. In: Diday, E., Lechevallier, Y., Schader, M., Bertrand, P., (Eds.), New Approaches in Classification and Data Analysis, Springer-Verlag, Berlin, 387-394.
9 De Carvalho, F. A. T. (1998). Extension based proximity coefficients between constrained boolean symbolic objects. In: Hayashi, C., Ohsumi, N., Yajima, K., Tanaka, Y., Bock, H.-H., Baba, Y., (Eds.), In Proceedings of the Fifth Conference of the International Federation of Classification Societies (IFCS-96), Springer-Verlag, Berlin, 370-378.
10 De Carvalho, F. A. T., Brito, P. and Bock, H. H. (2006). Dynamic clustering for interval data based on $L_2$ distance, Computational Statistics, 2, 231-245.
11 De Carvalho, F. A. T. and Lechevallier, Y. (2009). Partitional clustering algorithms for symbolic interval data based on single adaptive distances, Pattern Recognition, 42, 1223-1236.
12 De Carvalho, F. A. T. and De Souza, R. M. C. R. (2010). Unsupervised pattern recognition models for mixed feature-type symbolic data, Pattern Recognition Letters, 31, 430-443.
13 De Souza, R. M. C. R. and De Carvalho, F. A. T. (2007). A clustering method for mixed feature-type symbolic data using adaptive squared Euclidean distances, The 7th International Conference on Hybrid Intelligent Systems, 168-173.
14 Diday, E. (1987). Introduction à l'approche symbolique en analyse des données, Premières Journées Symbolique-Numérique, CEREMADE, Université Paris IX, 21-56.
15 Dunn, J. C. (1974). Well separated clusters and optimal fuzzy partitions, Journal of Cybernetics, 4, 95-104.
16 Gowda, K. C. and Diday, E. (1991). Symbolic clustering using a new dissimilarity measure, Pattern Recognition, 24, 567-578.
17 Gowda, K. C. and Ravi, T. V. (1995a). Agglomerative clustering of symbolic objects using the concepts of both similarity and dissimilarity, Pattern Recognition Letters, 16, 647-652.
18 Irpino, A. and Verde, R. (2006). A new Wasserstein based distance for the hierarchical clustering of histogram symbolic data, IFCS 2006, 185-192.
19 Gowda, K. C. and Ravi, T. V. (1995b). Divisive clustering of symbolic objects using the concepts of both similarity and dissimilarity, Pattern Recognition, 28, 1277-1282.
20 Ichino, M. and Yaguchi, H. (1994). Generalized Minkowski metrics for mixed feature type data analysis, IEEE Transactions on Systems, Man, and Cybernetics, 24, 698-709.
21 Kim, J. and Billard, L. (2011). A polythetic clustering process and cluster validity indexes for histogram-valued objects, Computational Statistics & Data Analysis, 55, 2250-2262.
22 Kim, J. and Billard, L. (2012). Dissimilarity measures and divisive clustering for symbolic multimodal-valued data, Computational Statistics & Data Analysis, 56, 2795-2808.
23 Kim, J. and Billard, L. (2013). Dissimilarity measures for histogram-valued observations, Communications in Statistics - Theory and Methods, 42, 283-303.