Interactive Visualization for Patient-to-Patient Comparison

  • Nguyen, Quang Vinh (MARCS Institute & School of Computing, Engineering and Mathematics, University of Western Sydney)
  • Nelmes, Guy (The Kids Research Institute, The Children's Hospital at Westmead)
  • Huang, Mao Lin (School of Software, Faculty of Engineering & IT, University of Technology)
  • Simoff, Simeon (MARCS Institute & School of Computing, Engineering and Mathematics, University of Western Sydney)
  • Catchpoole, Daniel (The Kids Research Institute, The Children's Hospital at Westmead)
  • Received : 2013.12.20
  • Accepted : 2014.02.20
  • Published : 2014.03.31

Abstract

This paper presents a visual analysis methodology, implemented as an interactive tool, for analyzing large and complex integrated genomic and biomedical data: it extracts knowledge from genetic and clinical data and visualizes that knowledge in a meaningful and interpretable way. By integrating domain knowledge into the development and analysis processes, we have developed a tool that supports seamless patient-to-patient analysis, from an overview of the patient population in the similarity space down to detailed views of genes. The system consists of multiple components that together cover the complete analysis process: data mining, interactive visualization, analytical views, and gene comparison. In a case study of childhood cancer patients, we demonstrate how medical scientists use the tool to confirm existing hypotheses and to discover new scientific insights.
