Mutational Data Loading Routines for Human Genome Databases: the BRCA1 Case

  • Van Der Kroon, Matthijs (Centro de Investigacion en Metodos de Produccion de Software (PROS), Universidad Politecnica de Valencia)
  • Ramirez, Ignacio Lereu (Centro de Investigacion en Metodos de Produccion de Software (PROS), Universidad Politecnica de Valencia)
  • Levin, Ana M. (Centro de Investigacion en Metodos de Produccion de Software (PROS), Universidad Politecnica de Valencia)
  • Pastor, Oscar (Centro de Investigacion en Metodos de Produccion de Software (PROS), Universidad Politecnica de Valencia)
  • Brinkkemper, Sjaak (Department of Information and Computing Sciences, Utrecht University)
  • Submitted: 2010.11.08
  • Reviewed: 2010.12.15
  • Published: 2010.12.31

Abstract

Over the last decades, a large amount of research in the genomics domain has generated, and continues to generate, terabytes, if not exabytes, of information that is stored globally in a very fragmented way. Different databases store the same data in different ways, resulting in undesired redundancy and restrained information transfer. In addition, keeping existing databases consistent and maintaining data integrity is mainly left to human intervention, which is costly in both time and money, as well as error-prone. Identifying a fixed conceptual dictionary in the form of a conceptual model thus seems crucial. This paper presents an effort to integrate the mutational data from the established genomic data source HGMD into a conceptual-model-driven database, HGDB, thereby providing useful lessons for improving the existing conceptual model of the human genome.
