• Title/Summary/Keyword: Symbolic Data Analysis (SDA)

Search Result 3, Processing Time 0.015 seconds

Exploratory Methods for Joint Distribution Valued Data and Their Application

  • Igarashi, Kazuto;Minami, Hiroyuki;Mizuta, Masahiro
    • Communications for Statistical Applications and Methods
    • /
    • v.22 no.3
    • /
    • pp.265-276
    • /
    • 2015
  • In this paper, we propose hierarchical cluster analysis and multidimensional scaling for joint distribution valued data. Information technology is increasing the necessity of statistical methods for large and complex data. Symbolic Data Analysis (SDA) is an attractive framework for the data. In SDA, target objects are typically represented by aggregated data. Most methods on SDA deal with objects represented as intervals and histograms. However, those methods cannot consider information among variables including correlation. In addition, objects represented as a joint distribution can contain information among variables. Therefore, we focus on methods for joint distribution valued data. We expanded the two well-known exploratory methods using the dissimilarities adopted Hall Type relative projection index among joint distribution valued data. We show a simulation study and an actual example of proposed methods.

Symbolic Cluster Analysis for Distribution Valued Dissimilarity

  • Matsui, Yusuke;Minami, Hiroyuki;Misuta, Masahiro
    • Communications for Statistical Applications and Methods
    • /
    • v.21 no.3
    • /
    • pp.225-234
    • /
    • 2014
  • We propose a novel hierarchical clustering for distribution valued dissimilarities. Analysis of large and complex data has attracted significant interest. Symbolic Data Analysis (SDA) was proposed by Diday in 1980's, which provides a new framework for statistical analysis. In SDA, we analyze an object with internal variation, including an interval, a histogram and a distribution, called a symbolic object. In the study, we focus on a cluster analysis for distribution valued dissimilarities, one of the symbolic objects. A hierarchical clustering has two steps in general: find out step and update step. In the find out step, we find the nearest pair of clusters. We extend it for distribution valued dissimilarities, introducing a measure on their order relations. In the update step, dissimilarities between clusters are redefined by mixture of distributions with a mixing ratio. We show an actual example of the proposed method and a simulation study.

Symbolic tree based model for HCC using SNP data (악성간암환자의 유전체자료 심볼릭 나무구조 모형연구)

  • Lee, Tae Rim
    • Journal of the Korean Data and Information Science Society
    • /
    • v.25 no.5
    • /
    • pp.1095-1106
    • /
    • 2014
  • Symbolic data analysis extends the data mining and exploratory data analysis to the knowledge mining, we can suggest the SDA tree model on clinical and genomic data with new knowledge mining SDA approach. Using SDA application for huge genomic SNP data, we can get the correlation the availability of understanding of hidden structure of HCC data could be proved. We can confirm validity of application of SDA to the tree structured progression model and to quantify the clinical lab data and SNP data for early diagnosis of HCC. Our proposed model constructs the representative model for HCC survival time and causal association with their SNP gene data. To fit the simple and easy interpretation tree structured survival model which could reduced from huge clinical and genomic data under the new statistical theory of knowledge mining with SDA.