http://dx.doi.org/10.5808/GI.2008.6.4.166

In Silico Functional Assessment of Sequence Variations: Predicting Phenotypic Functions of Novel Variations  

Won, Hong-Hee (Samsung Biomedical Research Institute, Samsung Medical Center)
Kim, Jong-Won (Department of Laboratory Medicine and Genetics, Sungkyunkwan University School of Medicine, Samsung Medical Center)
Abstract
A multitude of protein-coding sequence variations (CVs) in the human genome have been revealed by major initiatives, including the Human Variome Project, the 1000 Genomes Project, and the International Cancer Genome Consortium. This has naturally led to debate over how to assess the functional consequences of CVs accurately, as predicting their functional effects and their relevance to disease phenotypes becomes increasingly important. This article surveys and compares variation databases and in silico prediction programs that assess the effects of CVs on protein function. We also introduce a combinatorial approach that uses machine learning algorithms to improve prediction performance.
Keywords
sequence variation; amino acid substitution; nonsynonymous single nucleotide polymorphism; missense mutation; prediction; protein function