http://dx.doi.org/10.7465/jkdi.2014.25.5.1095

Symbolic tree based model for HCC using SNP data  

Lee, Tae Rim (Department of Information Statistics, Korea National Open University)
Publication Information
Journal of the Korean Data and Information Science Society / v.25, no.5, 2014, pp. 1095-1106
Abstract
Symbolic data analysis (SDA) extends data mining and exploratory data analysis to knowledge mining. Using this approach, we propose an SDA tree model for clinical and genomic data. Applying SDA to large-scale genomic SNP data, we obtain correlations and show that SDA helps reveal the hidden structure of hepatocellular carcinoma (HCC) data. We confirm the validity of applying SDA to a tree-structured progression model and of quantifying clinical laboratory data and SNP data for the early diagnosis of HCC. The proposed model provides a representative model of HCC survival time and its causal association with patients' SNP data. Under the statistical theory of knowledge mining with SDA, we fit a simple, easily interpreted tree-structured survival model reduced from large clinical and genomic data.
Keywords
Hepatocellular carcinoma; SNP; symbolic data analysis; tree-structured model