Bioinformatics Resources of the Korean Bioinformation Center (KOBIC)

  • Accepted : 2010.12.02
  • Published : 2010.12.31


The Korean Bioinformation Center (KOBIC) is a national bioinformatics research center in Korea. We have developed many bioinformatics algorithms and applications to facilitate the biological interpretation of omics data. Here we introduce the major databases and tools developed at KOBIC. These resources fall into three main fields: genome, proteome, and literature. For genomic data, we constructed several pipelines for processing next-generation sequencing (NGS) data and developed analysis algorithms and web-based database servers, including miRGator, ESTpass, and CleanEST. We also built integrated databases and servers for microarray expression data, such as MDCDP. For proteomic data, the VnD database and the WDAC, Localizome, and CHARMM_HM web servers are available for various purposes. In the literature field, we constructed the IntoPub server and the Patome database. We continue to build and maintain this bioinformatics infrastructure and to develop new algorithms.


