Exploratory Methods for Joint Distribution Valued Data and Their Application

  • Received : 2015.01.23
  • Accepted : 2015.03.31
  • Published : 2015.05.31

Abstract

In this paper, we propose hierarchical cluster analysis and multidimensional scaling for joint distribution valued data. Advances in information technology have increased the need for statistical methods that can handle large and complex data. Symbolic Data Analysis (SDA) is an attractive framework for such data. In SDA, target objects are typically represented by aggregated data. Most SDA methods deal with objects represented as intervals or histograms. However, those methods cannot take into account information among variables, such as correlation. In contrast, objects represented as joint distributions retain information among variables. Therefore, we focus on methods for joint distribution valued data. We extend the two well-known exploratory methods by using dissimilarities between joint distribution valued data that are based on the Hall-type relative projection index. We present a simulation study and a real-data example of the proposed methods.
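The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the exact form of the Hall-type relative projection index, so the sketch assumes an integrated squared difference between two histogram density estimates as a stand-in dissimilarity. The function names (`joint_histogram`, `squared_l2`, `average_linkage`), the grid settings, and the simulated objects are all hypothetical. Six objects share the same standard normal marginals and differ only in correlation, which is exactly the information that interval or marginal histogram representations would lose.

```python
import random
from itertools import combinations

def sample(rho, n, rng):
    """Draw n points from a bivariate normal with unit variances and correlation rho."""
    pts = []
    for _ in range(n):
        z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
        pts.append((z1, rho * z1 + (1 - rho ** 2) ** 0.5 * z2))
    return pts

def joint_histogram(points, bins=8, lo=-4.0, hi=4.0):
    """Histogram density estimate on a fixed bins x bins grid over [lo, hi)^2."""
    width = (hi - lo) / bins
    counts = [[0] * bins for _ in range(bins)]
    n = 0
    for x, y in points:
        i, j = int((x - lo) // width), int((y - lo) // width)
        if 0 <= i < bins and 0 <= j < bins:  # points outside the grid are ignored
            counts[i][j] += 1
            n += 1
    cell = width * width
    return [[c / (n * cell) for c in row] for row in counts], cell

def squared_l2(f, g, cell):
    """Assumed stand-in dissimilarity: integrated squared difference of two densities."""
    return cell * sum((a - b) ** 2 for rf, rg in zip(f, g) for a, b in zip(rf, rg))

def average_linkage(d, k):
    """Agglomerative clustering with average linkage, stopped at k clusters."""
    clusters = [[i] for i in range(len(d))]

    def avg(p):
        a, b = clusters[p[0]], clusters[p[1]]
        return sum(d[i][j] for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > k:
        a, b = min(combinations(range(len(clusters)), 2), key=avg)
        clusters[a] += clusters[b]  # a < b, so deleting b keeps index a valid
        del clusters[b]
    return clusters

rng = random.Random(0)
rhos = [0.8, 0.8, 0.8, -0.8, -0.8, -0.8]  # same marginals, different correlation
hists = []
for rho in rhos:
    h, cell = joint_histogram(sample(rho, 2000, rng))
    hists.append(h)

# Pairwise dissimilarity matrix between the joint distribution valued objects
d = [[squared_l2(f, g, cell) for g in hists] for f in hists]
clusters = average_linkage(d, 2)
print(sorted(sorted(c) for c in clusters))
```

The same dissimilarity matrix `d` could equally be passed to a multidimensional scaling routine; only the clustering step is shown here to keep the sketch short.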
