
Implementation of an Information Management System for Nucleotide Sequences based on BSML using Active Trigger Rules  

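Park Sung Hee (Department of Computer Science, Chungbuk National University)
Jung Kwang Su (Department of Computer Science, Chungbuk National University)
Ryu Keun Ho (School of Electrical and Computer Engineering, Chungbuk National University)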
Abstract
Biological data, including genome sequences, are heterogeneous and varied. Although the need for sequence management systems that reflect these biological characteristics has been raised, most current biological databases offer only limited functionality as repositories for biological data. This paper therefore describes a management system for nucleotide sequences at the level of an individual biological laboratory. It supports format conversion, editing, storage, and retrieval of nucleotide sequences collected from public databases, and it also handles sequences produced by experiments. It uses the XML-based BSML as a common format for extracting data fields and converting between heterogeneous sequence formats. Managing sequences and their changes requires a version management system for the original DNA sequences, one that detects when a modified sequence appears and triggers a database update. Our experimental results show that applying active trigger rules to sequence changes allows those changes to be stored in the database automatically.
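The idea of using an active trigger rule for sequence versioning can be illustrated with a minimal sketch. It assumes SQLite through Python's standard `sqlite3` module; the table names (`sequence`, `sequence_history`), the column names, and the helper functions are hypothetical and are not taken from the system described in the paper. The rule follows the usual event-condition-action pattern: on an update of the stored residues (event), if the residues actually differ (condition), the previous version is archived and the version number is incremented (action).

```python
import sqlite3

# Hypothetical schema: all table and column names are illustrative assumptions.
SCHEMA = """
CREATE TABLE sequence (
    accession TEXT PRIMARY KEY,
    residues  TEXT NOT NULL,
    version   INTEGER NOT NULL DEFAULT 1
);
CREATE TABLE sequence_history (
    accession   TEXT NOT NULL,
    version     INTEGER NOT NULL,
    residues    TEXT NOT NULL,
    archived_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
);
-- Event: residues is updated. Condition: the value really changed.
-- Action: archive the old row and bump the version number.
CREATE TRIGGER archive_on_change
AFTER UPDATE OF residues ON sequence
WHEN OLD.residues <> NEW.residues
BEGIN
    INSERT INTO sequence_history (accession, version, residues)
    VALUES (OLD.accession, OLD.version, OLD.residues);
    UPDATE sequence SET version = OLD.version + 1
    WHERE accession = OLD.accession;
END;
"""


def open_db():
    """Create an in-memory database with the illustrative schema and trigger."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def store_sequence(conn, accession, residues):
    """Insert a new sequence or update an existing one.

    The application code never touches the history table; the trigger does.
    """
    exists = conn.execute(
        "SELECT 1 FROM sequence WHERE accession = ?", (accession,)
    ).fetchone()
    if exists:
        conn.execute(
            "UPDATE sequence SET residues = ? WHERE accession = ?",
            (residues, accession),
        )
    else:
        conn.execute(
            "INSERT INTO sequence (accession, residues) VALUES (?, ?)",
            (accession, residues),
        )
    conn.commit()


def current_version(conn, accession):
    """Return (version, residues) for the current row, or None if absent."""
    return conn.execute(
        "SELECT version, residues FROM sequence WHERE accession = ?",
        (accession,),
    ).fetchone()


def history(conn, accession):
    """Return archived (version, residues) pairs, oldest first."""
    return conn.execute(
        "SELECT version, residues FROM sequence_history "
        "WHERE accession = ? ORDER BY version",
        (accession,),
    ).fetchall()


if __name__ == "__main__":
    db = open_db()
    store_sequence(db, "X1", "ACGT")  # first import
    store_sequence(db, "X1", "ACGA")  # edited sequence; the trigger archives version 1
    print(current_version(db, "X1"), history(db, "X1"))
```

Placing the rule in the database rather than in application code means every update path is versioned the same way, which is the general motivation for active rules; how the paper's system defines its own rules may differ from this sketch.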
Keywords
Bioinformatics; Nucleotide Sequence; Sequence Management; Sequence Version; BSML