A Comparative Study of Microarray Data with Survival Times Based on Several Missing Mechanism

  • Kim Jee-Yun (Institute for Basic Science at Inha University) ;
  • Hwang Jin-Soo (Department of Statistics, Inha University) ;
  • Kim Seong-Sun (Division of Epidemic Intelligence Service, KCDC)
  • Published : 2006.04.01

Abstract

One of the most widely used methods for handling missing values in microarray data is the kNN (k-nearest neighbor) method. Li and Gui (2004) recently proposed the so-called PCR (Partial Cox Regression) method, which handles censored survival times and microarray data efficiently and relies on kNN imputation for the missing values. In this article, we show that the way missingness is treated affects the subsequent statistical analysis.
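As an illustration of the kNN imputation idea mentioned above, the following is a minimal sketch and not the implementation used in the article or in Li and Gui (2004). It assumes that rows are genes and columns are arrays, that missing values are coded as `None`, that distance is the root mean squared difference over columns observed in both rows, and that the imputed value is an unweighted average of the donors. The function name `knn_impute` and the toy matrix are hypothetical.

```python
import math


def knn_impute(rows, k=2):
    """Fill None entries with the mean of that column over the k nearest rows.

    Distance is the root mean squared difference over the columns that are
    observed in both rows (an assumption made for this sketch).
    """
    imputed = [list(r) for r in rows]  # work on a copy; the input is not modified
    for i, target in enumerate(rows):
        missing = [j for j, v in enumerate(target) if v is None]
        if not missing:
            continue
        # Rank every other row by its distance to the target row
        candidates = []
        for m, other in enumerate(rows):
            if m == i:
                continue
            shared = [j for j in range(len(target))
                      if target[j] is not None and other[j] is not None]
            if not shared:
                continue  # no overlap, so no distance can be computed
            sq = sum((target[j] - other[j]) ** 2 for j in shared)
            candidates.append((math.sqrt(sq / len(shared)), m))
        candidates.sort()
        for j in missing:
            # Only neighbors that actually observe column j can donate a value
            donors = [rows[m][j] for _, m in candidates
                      if rows[m][j] is not None][:k]
            if donors:
                imputed[i][j] = sum(donors) / len(donors)
    return imputed


# Toy expression matrix: 4 genes x 3 arrays, one missing entry
expr = [
    [1.0, 2.0, 3.0],
    [1.1, 2.1, None],
    [0.9, 1.9, 2.8],
    [5.0, 6.0, 7.0],
]
filled = knn_impute(expr, k=2)
```

In this toy matrix the missing entry of the second gene is filled from the first and third genes, its two nearest neighbors, while the distant fourth gene is ignored. A different choice of `k`, distance, or weighting would give different imputed values, which is the kind of sensitivity the abstract refers to.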

References

  1. Bishop, C.M. (1999). Variational principal components. In IEE Conference Publication on Artificial Neural Networks, 509-514
  2. Bo, T.H., Dysvik, B. and Jonassen, I. (2004). LSimpute: accurate estimation of missing values in microarray data with least squares methods. Nucleic Acids Research, Vol. 32, No. 3, e34 https://doi.org/10.1093/nar/gnh026
  3. Efron, B., Hastie, T., Johnstone, I. and Tibshirani, R. (2004). Least angle regression. The Annals of Statistics, Vol. 32, 407-499 https://doi.org/10.1214/009053604000000067
  4. Gui, J. and Li, H. (2004). Penalized Cox Regression Analysis in the High-Dimensional and Low-sample Size Settings with Applications to Microarray Gene Expression Data. Center for Bioinformatics & Molecular Biostatistics
  5. Hastie, T., Alter, O., Sherlock, G., Eisen, M., Tibshirani, R., Botstein, D. and Brown, P. (1999). Imputation of missing values in DNA microarrays. Technical report, Stanford University Statistics Department
  6. Kim, H., Golub, G.H. and Park, H. (2005). Missing value estimation for DNA microarray gene expression data: local least squares imputation. Bioinformatics, Vol. 21, 187-198 https://doi.org/10.1093/bioinformatics/bth499
  7. Kim, K.Y., Kim, B.J. and Yi, G.S. (2004). Reuse of imputed data in microarray analysis increases imputation efficiency. BMC Bioinformatics, Vol. 5, 160 https://doi.org/10.1186/1471-2105-5-160
  8. Li, H. and Gui, J. (2004). Partial Cox regression analysis for high-dimensional microarray gene expression data. Bioinformatics, Vol. 20, i208-i215 https://doi.org/10.1093/bioinformatics/bth900
  9. Li, H. and Luan, Y. (2003). Kernel Cox regression models for linking gene expression profiles to censored survival data. Pacific Symposium on Biocomputing, 65-76
  10. Oba, S., Sato, M., Takemasa, I., Monden, M., Matsubara, K. and Ishii, S. (2003). A Bayesian missing value estimation method for gene expression profile data. Bioinformatics, Vol. 19, 2088-2096 https://doi.org/10.1093/bioinformatics/btg287
  11. Park, P.J., Tian, L. and Kohane, I.S. (2002). Linking gene expression data with patient survival times using partial least squares. Bioinformatics, Vol. 18, S120-S127 https://doi.org/10.1093/bioinformatics/18.suppl_1.S120
  12. Rosenwald, A., Wright, G., Chan, W.C., Connors, J.M., Campo, E., Fisher, R.I., Gascoyne, R.D., Muller-Hermelink, H.K., Smeland, E.B. and Staudt, L.M. (2002). The use of molecular profiling to predict survival after chemotherapy for diffuse large-B-cell lymphoma. The New England Journal of Medicine, Vol. 346, 1937-1947 https://doi.org/10.1056/NEJMoa012914
  13. Rubin, D.B. (1977). Formalizing subjective notions about the effect of nonrespondents in sample surveys. Journal of the American Statistical Association, Vol. 72, 538-543 https://doi.org/10.2307/2286214
  14. Segal, M.R. (2005). Microarray gene expression data with linked survival phenotypes: Diffuse large-B-cell lymphoma revisited. Center for Bioinformatics & Molecular Biostatistics
  15. Tibshirani, R. (1997). The Lasso method for variable selection in the Cox model. Statistics in Medicine, Vol. 16, 385-395 https://doi.org/10.1002/(SICI)1097-0258(19970228)16:4<385::AID-SIM380>3.0.CO;2-3
  16. Troyanskaya, O., Cantor, M., Sherlock, G., Brown, P., Hastie, T., Tibshirani, R., Botstein, D. and Altman, R.B. (2001). Missing value estimation methods for DNA microarrays. Bioinformatics, Vol. 17, 520-525 https://doi.org/10.1093/bioinformatics/17.6.520
  17. Zou, H. and Hastie, T. (2005). Regularization and variable selection via the elastic net. Journal of the Royal Statistical Society, Series B, Vol. 67, 301-320 https://doi.org/10.1111/j.1467-9868.2005.00503.x