http://dx.doi.org/10.5351/CSAM.2015.22.3.265

Exploratory Methods for Joint Distribution Valued Data and Their Application  

Igarashi, Kazuto (Graduate School of Information Science and Technology, Hokkaido University)
Minami, Hiroyuki (Information Initiative Center, Hokkaido University)
Mizuta, Masahiro (Information Initiative Center, Hokkaido University)
Publication Information
Communications for Statistical Applications and Methods / v.22, no.3, 2015, pp. 265-276
Abstract
In this paper, we propose hierarchical cluster analysis and multidimensional scaling for joint distribution valued data. Advances in information technology have increased the need for statistical methods that handle large and complex data. Symbolic Data Analysis (SDA) is an attractive framework for such data. In SDA, target objects are typically represented by aggregated data. Most SDA methods deal with objects represented as intervals or histograms. However, those methods cannot take into account information among variables, such as correlation. In contrast, objects represented as joint distributions retain information among variables. Therefore, we focus on methods for joint distribution valued data. We extend the two well-known exploratory methods by using dissimilarities among joint distribution valued data based on the Hall-type relative projection index. We present a simulation study and a real-data example of the proposed methods.
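The pipeline outlined in the abstract (density estimate per object, pairwise dissimilarities, then hierarchical clustering and multidimensional scaling) can be illustrated with a minimal sketch. It rests on several assumptions that are not taken from the paper: each object is a bivariate sample, its joint distribution is estimated with a Gaussian kernel density estimate, and the dissimilarity is a grid-approximated integrated squared difference between two estimated densities. This is a stand-in and is not claimed to be the paper's exact Hall-type relative projection index. The simulated objects, grid range, and linkage method are likewise illustrative choices.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform
from scipy.stats import gaussian_kde

rng = np.random.default_rng(0)


def sample_object(rho, n=300):
    """Draw one object: n bivariate normal points with correlation rho (assumed setup)."""
    cov = [[1.0, rho], [rho, 1.0]]
    return rng.multivariate_normal([0.0, 0.0], cov, size=n)


# Six simulated objects with identical marginals but opposite correlations,
# so only the joint distribution can tell the two groups apart.
objects = [sample_object(0.8) for _ in range(3)] + [sample_object(-0.8) for _ in range(3)]

# Regular evaluation grid for the density estimates.
axis = np.linspace(-4.0, 4.0, 61)
gx, gy = np.meshgrid(axis, axis)
grid = np.vstack([gx.ravel(), gy.ravel()])  # shape (2, number of grid points)
cell_area = (axis[1] - axis[0]) ** 2

# Kernel density estimate of each object's joint distribution on the grid.
densities = [gaussian_kde(obj.T)(grid) for obj in objects]

# Stand-in dissimilarity (assumption): square root of the integrated squared
# difference between two estimated densities, approximated by a Riemann sum.
m = len(objects)
D = np.zeros((m, m))
for i in range(m):
    for j in range(i + 1, m):
        diff = densities[i] - densities[j]
        D[i, j] = D[j, i] = np.sqrt(np.sum(diff ** 2) * cell_area)

# Hierarchical cluster analysis on the dissimilarity matrix (average linkage).
Z = linkage(squareform(D), method="average")
labels = fcluster(Z, t=2, criterion="maxclust")

# Classical (Torgerson) multidimensional scaling into two dimensions.
J = np.eye(m) - np.ones((m, m)) / m          # centering matrix
B = -0.5 * J @ (D ** 2) @ J                  # double-centered squared dissimilarities
eigvals, eigvecs = np.linalg.eigh(B)
top = np.argsort(eigvals)[::-1][:2]          # two largest eigenvalues
coords = eigvecs[:, top] * np.sqrt(np.maximum(eigvals[top], 0.0))

print(labels)
print(coords)
```

Because the two groups share the same marginal distributions, a comparison based only on per-variable histograms would see them as nearly identical. The joint density comparison is what separates them in this sketch.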
Keywords
Symbolic Data Analysis (SDA); cluster analysis; multidimensional scaling; projection index; kernel density estimation