Identification of 1,531 cSNPs from Full-length Enriched cDNA Libraries of the Korean Native Pig Using in Silico Analysis

  • Published : 2009.06.30

Abstract

Sequences from clones of full-length enriched cDNA libraries are valuable resources for functional genomics studies, genome annotation, and SNP discovery. We analyzed 7,392 high-quality chromatograms (Phred value $\geq 30$) obtained by sequencing the 5' ends of clones from full-length enriched cDNA libraries of Korean native pigs, covering brainstem, liver, cerebellum, neocortex, and spleen. To identify coding single nucleotide polymorphisms (cSNPs) in silico, we combined these sequences with 50,000 EST sequence trace files obtained from GenBank. The process generated 11,324 contigs, of which 2,895 contained at least one SNP; of those, 610 included at least one sequence from Korean native pigs. From these 610 contigs, we randomly selected 262 for in silico cSNP identification. We identified 1,531 putative cSNPs, a detection frequency of one SNP per 465 bp. The large-scale sequence data from clones of full-length enriched cDNA libraries, together with the identified cSNPs, will serve as a useful resource for functional genomics projects, such as a pig HapMap project, in the near future.
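The filtering steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the `Contig` record, its field names, and the toy values are hypothetical and chosen only for illustration. The one fixed relationship is the standard Phred definition, Q = -10 log10(P), under which a Phred value of 30 corresponds to an expected base-calling error rate of 1 in 1,000.

```python
from dataclasses import dataclass
from typing import List


def phred_error_probability(q: float) -> float:
    """Standard Phred definition: Q = -10 * log10(P_error)."""
    return 10 ** (-q / 10)


@dataclass
class Contig:
    # Hypothetical record; field names are assumptions, not the paper's schema.
    name: str
    snp_count: int   # candidate SNPs detected in the contig
    knp_reads: int   # member reads that come from Korean native pig libraries
    length_bp: int   # contig length in base pairs


def select_candidate_contigs(contigs: List[Contig]) -> List[Contig]:
    """Keep contigs with at least one SNP and at least one Korean native pig read."""
    return [c for c in contigs if c.snp_count >= 1 and c.knp_reads >= 1]


def bp_per_snp(contigs: List[Contig]) -> float:
    """Average number of base pairs per detected SNP across the given contigs."""
    total_snps = sum(c.snp_count for c in contigs)
    if total_snps == 0:
        raise ValueError("no SNPs in the supplied contigs")
    return sum(c.length_bp for c in contigs) / total_snps


# Toy data for illustration only; these are not values from the study.
toy_contigs = [
    Contig("contig_a", snp_count=2, knp_reads=1, length_bp=900),
    Contig("contig_b", snp_count=0, knp_reads=3, length_bp=500),
    Contig("contig_c", snp_count=1, knp_reads=0, length_bp=700),
    Contig("contig_d", snp_count=1, knp_reads=2, length_bp=600),
]

selected = select_candidate_contigs(toy_contigs)
frequency = bp_per_snp(selected)
```

In this toy set, only `contig_a` and `contig_d` pass both filters. The reported figure of one SNP per 465 bp is the same kind of ratio, computed over the contigs the authors actually analyzed.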
