http://dx.doi.org/10.5392/JKCA.10.10.102

Detection and Prediction of Alternative Splicing with One-leaf One-node Tree  

Park, Min-Seo (Department of Computer Science, University of Massachusetts)
Abstract
Alternative splicing is an important process in gene expression, and it can lead to mutations and diseases. Most studies detect alternatively spliced genes using ESTs (Expressed Sequence Tags). However, reliance on ESTs has weaknesses for predicting alternative splicing. ESTs are stored in libraries that are often not clearly organized or annotated, so erroneous ESTs may be selected. It is also difficult to predict whether alternative splicing occurs in genes for which no ESTs are available. To address these issues and to improve the quality of detection and prediction of alternative splicing, we propose the One-leaf One-node Tree Algorithm, which uses pre-mRNAs. The algorithm uses codons (nucleotide triplets) as attributes for each chromosome of Arabidopsis thaliana. The resulting decision tree shows that alternative and normal splicing follow different patterns of triplet nucleotides in each chromosome. Based on these patterns, alternative splicing of unlabeled genes can also be predicted.
Keywords
Alternative Splicing; One-leaf One-node Tree Algorithm; pre-mRNA;
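The abstract describes nucleotide triplets as the attributes fed to the decision tree. A minimal sketch of how such attributes might be derived from a pre-mRNA sequence is given below. It is an illustration only, not the paper's implementation: the function name `triplet_frequencies`, the use of relative frequencies rather than raw counts, and the choice between in-frame and overlapping triplets are all assumptions.

```python
from collections import Counter


def triplet_frequencies(seq: str, overlapping: bool = False) -> dict[str, float]:
    """Return the relative frequency of each nucleotide triplet in seq.

    Hypothetical helper: the abstract does not say whether triplets are
    read in frame (step 3) or with a sliding window (step 1), so both
    are offered here as an assumption.
    """
    # Normalize case and treat RNA 'U' as DNA 'T' so both alphabets work.
    seq = seq.upper().replace("U", "T")
    step = 1 if overlapping else 3
    # Collect every complete triplet; a trailing partial triplet is dropped.
    triplets = [seq[i:i + 3] for i in range(0, len(seq) - 2, step)]
    # Skip triplets containing ambiguous bases such as 'N'.
    triplets = [t for t in triplets if set(t) <= set("ACGT")]
    counts = Counter(triplets)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {t: c / total for t, c in counts.items()}


if __name__ == "__main__":
    # Toy sequence, not real Arabidopsis thaliana data.
    print(triplet_frequencies("ATGATGCCC"))
```

Each gene would then be represented by one such frequency vector, which a decision-tree learner could split on to separate alternatively spliced genes from normally spliced ones.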