Functional Annotation and Analysis of Korean Patented Biological Sequences Using Bioinformatics

  • Lee, Byung Wook (Department of Biosystems, Korea Advanced Institute of Science and Technology) ;
  • Kim, Tae Hyung (National Genome Information Center, Korea Research Institute of Bioscience and Biotechnology) ;
  • Kim, Seon Kyu (National Genome Information Center, Korea Research Institute of Bioscience and Biotechnology) ;
  • Kim, Sang Soo (Department of Bioinformatics, Soongsil University) ;
  • Ryu, Gee Chan (National Genome Information Center, Korea Research Institute of Bioscience and Biotechnology) ;
  • Bhak, Jong (National Genome Information Center, Korea Research Institute of Bioscience and Biotechnology)
  • Received : 2005.12.27
  • Reviewed : 2006.03.03
  • Published : 2006.04.30

Abstract

A recent report of the Korean Intellectual Property Office (KIPO) showed that the number of biological sequence-based patents is rapidly increasing in Korea. We present biological features of Korean patented sequences through bioinformatic analysis. The analysis is divided into two steps. The first is an annotation step, in which the patented sequences were annotated with the Reference Sequence (RefSeq) database. The second is an association step, in which the patented sequences were linked to genes, diseases, pathways, and biological functions using the Entrez Gene, Online Mendelian Inheritance in Man (OMIM), Kyoto Encyclopedia of Genes and Genomes (KEGG), and Gene Ontology (GO) databases. Through the association analysis, we found that nearly 2.6% of human genes were associated with Korean patents, compared with 20% of human genes associated with U.S. patents. The association between biological functions and the patented sequences indicated that genes whose products act as hormones in defense responses in the extracellular environment were the most highly targeted for patenting. The analysis data are available at http://www.patome.net.
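The two-step analysis described above can be illustrated with a minimal sketch. It is not the authors' pipeline: the annotation step is assumed to have already produced one best RefSeq hit per patented sequence, and every identifier, mapping table, and function name below is a hypothetical placeholder chosen only to show how the association step and the "fraction of genes associated with patents" figure could be computed.

```python
from collections import Counter, defaultdict

# Hypothetical toy inputs (placeholders, not real accessions or results).
# Annotation step output: patented sequence -> best RefSeq hit (None = no hit).
refseq_hits = {
    "PAT_0001": "REFSEQ_1",
    "PAT_0002": "REFSEQ_2",
    "PAT_0003": "REFSEQ_1",
    "PAT_0004": None,
}
# Lookup tables standing in for Entrez Gene and GO annotations.
refseq_to_gene = {"REFSEQ_1": "GENE_A", "REFSEQ_2": "GENE_B"}
gene_to_terms = {
    "GENE_A": ["hormone activity"],
    "GENE_B": ["defense response", "hormone activity"],
}
all_genes = ["GENE_A", "GENE_B", "GENE_C", "GENE_D"]


def associate(hits, to_gene):
    """Association step: group patented sequences by the gene they map to."""
    gene_to_patents = defaultdict(set)
    for patent_id, accession in hits.items():
        gene = to_gene.get(accession)
        if gene is not None:  # skip sequences with no hit or no gene mapping
            gene_to_patents[gene].add(patent_id)
    return dict(gene_to_patents)


def patented_fraction(gene_to_patents, genes):
    """Fraction of the gene catalogue linked to at least one patented sequence."""
    return len(set(gene_to_patents) & set(genes)) / len(genes)


def count_terms(gene_to_patents, terms_by_gene):
    """Count how many patent-associated genes carry each functional term."""
    counts = Counter()
    for gene in gene_to_patents:
        counts.update(terms_by_gene.get(gene, []))
    return counts


associations = associate(refseq_hits, refseq_to_gene)
fraction = patented_fraction(associations, all_genes)
term_counts = count_terms(associations, gene_to_terms)
```

In this toy setting, ranking `term_counts` plays the role of identifying which biological functions are most often targeted; the same grouping could be repeated with disease or pathway lookup tables in place of `gene_to_terms`.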

Project Information

Supervising institution of the research project : Korean Ministry of Science and Technology
