http://dx.doi.org/10.5351/KJAS.2009.22.1.115

A Penalized Spline Based Method for Detecting the DNA Copy Number Alteration in an Array-CGH Experiment  

Kim, Byung-Soo (Dept. of Applied Statistics, Yonsei University)
Kim, Sang-Cheol (Dept. of Applied Statistics, Yonsei University)
Publication Information
The Korean Journal of Applied Statistics / v.22, no.1, 2009, pp. 115-127
Abstract
The purpose of statistical analysis of array-CGH data is to divide the whole genome into regions of equal copy number, to quantify the copy number in each region, and finally to evaluate whether that copy number differs significantly from two. Several statistical procedures have been proposed, including circular binary segmentation and a Gaussian-based local regression that detects breakpoints by estimating a piecewise constant function (GLAD). In this note we propose a penalized spline regression with a simultaneous confidence band (SCB) to evaluate the statistical significance of regions of genetic gain or loss. A region in which the simultaneous confidence band stays above 0 can be considered a region of genetic gain, and a region in which it stays below 0 a region of genetic loss. We compare the performance of the SCB procedure with the GLAD and hidden Markov model approaches in a simulation study in which the data were generated from an independence model and from AR(1) and AR(2) models, the latter two reflecting the spatial dependence of array-CGH data. We found that the SCB method is more sensitive in detecting low-level copy number alterations.
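The decision rule stated in the abstract (a band entirely above 0 indicates gain, a band entirely below 0 indicates loss) can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the lower and upper limits of a simultaneous confidence band have already been computed at each probe position, and the function name `call_regions` and the example values are hypothetical.

```python
from typing import List, Optional, Sequence, Tuple


def call_regions(
    lower: Sequence[float], upper: Sequence[float]
) -> List[Tuple[int, int, str]]:
    """Return (start, end, label) runs of probes whose band excludes 0.

    lower/upper are assumed (hypothetical inputs) to hold the lower and
    upper limits of a simultaneous confidence band at each probe index.
    """
    if len(lower) != len(upper):
        raise ValueError("lower and upper must have the same length")

    regions: List[Tuple[int, int, str]] = []
    start = 0
    current: Optional[str] = None  # label of the run being tracked

    for i, (lo, hi) in enumerate(zip(lower, upper)):
        if lo > 0:
            label: Optional[str] = "gain"  # whole band above 0
        elif hi < 0:
            label = "loss"  # whole band below 0
        else:
            label = None  # band covers 0: no call

        if label != current:
            # Close the previous run if it was a gain or loss run.
            if current is not None:
                regions.append((start, i - 1, current))
            start, current = i, label

    # Close a run that extends to the last probe.
    if current is not None:
        regions.append((start, len(lower) - 1, current))
    return regions


# Made-up band limits for six probes, for illustration only.
example_lower = [-0.2, 0.1, 0.2, -0.1, -0.9, -0.8]
example_upper = [0.3, 0.6, 0.7, 0.4, -0.3, -0.2]
example_calls = call_regions(example_lower, example_upper)
```

Because the band is simultaneous rather than pointwise, a whole run of probes can be called at once without a separate multiple-testing adjustment per probe; the sketch only encodes the final thresholding step, not the spline fit or the band construction.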
Keywords
DNA copy number alteration; gastric cancer; penalized spline; simultaneous confidence band