Comparison of Functions for Filtering Time Course Gene Expression Data with Flat Patterns

  • Kim, Kyung-Sook (Department of Statistics, Chonnam National University) ;
  • Oh, Mi-Ra (Department of Statistics, Chonnam National University) ;
  • Baek, Jang-Sun (Department of Statistics, Chonnam National University) ;
  • Son, Young-Sook (Department of Statistics, Chonnam National University)
  • Published: 2007.07.31

Abstract

Filtering out genes that do not appear to contribute to regulation before the statistical analysis of time course gene expression data reduces the dimension of the data and lowers the risk of misinterpretation caused by noise or lack of variation. In this paper, we compare six existing functions for filtering genes with flat patterns under two criteria: a percentile criterion on the observed sample and a percentile criterion on bootstrap samples. When applied to the yeast cell cycle data, the variance function gave the most similar results under the two criteria.
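The two percentile criteria can be illustrated with a minimal sketch that uses the variance function only. The abstract does not give the paper's exact procedure, so the details here are assumptions: the helper names (`percentile`, `variance_scores`, `keep_above_observed_cutoff`, `keep_above_bootstrap_cutoff`), the toy profiles, the cutoff level `q`, and in particular the bootstrap scheme, which resamples the gene-level scores and averages the resulting percentile cutoffs.

```python
import random
import statistics


def percentile(values, q):
    """q-th percentile (0 <= q <= 100) with linear interpolation."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (pos - lower)


def variance_scores(profiles):
    """Sample variance of each gene's expression profile across time points."""
    return {gene: statistics.variance(series) for gene, series in profiles.items()}


def keep_above_observed_cutoff(scores, q):
    """Keep genes scoring above the q-th percentile of the observed scores."""
    cutoff = percentile(list(scores.values()), q)
    return {gene for gene, s in scores.items() if s > cutoff}


def keep_above_bootstrap_cutoff(scores, q, n_boot=500, seed=0):
    """Keep genes scoring above the mean q-th percentile over bootstrap resamples.

    Assumption: the bootstrap resamples the gene scores with replacement;
    the paper's own bootstrap scheme may differ.
    """
    rng = random.Random(seed)
    values = list(scores.values())
    cutoffs = [
        percentile(rng.choices(values, k=len(values)), q) for _ in range(n_boot)
    ]
    cutoff = statistics.fmean(cutoffs)
    return {gene for gene, s in scores.items() if s > cutoff}


# Toy data (made up for illustration): two flat profiles and three that vary.
profiles = {
    "flat_a": [0.01, -0.02, 0.00, 0.01, -0.01],
    "flat_b": [0.03, 0.02, 0.04, 0.03, 0.02],
    "cyclic": [1.0, -1.0, 1.1, -0.9, 1.0],
    "rising": [-1.0, -0.5, 0.0, 0.6, 1.1],
    "pulse": [0.0, 0.1, 2.0, 0.1, 0.0],
}
scores = variance_scores(profiles)
kept_observed = keep_above_observed_cutoff(scores, q=40)
kept_bootstrap = keep_above_bootstrap_cutoff(scores, q=40)

print(sorted(kept_observed))
# Overlap between the two gene sets is one simple way to compare the criteria.
print(sorted(kept_observed & kept_bootstrap))
```

Any other filtering function (range or entropy, for example) could replace `variance_scores` as long as it returns one score per gene, which makes it straightforward to compare how closely the two criteria agree for each function.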
