http://dx.doi.org/10.3745/KIPSTD.2009.16D.5.643

Design of Efficient Storage Exploiting Structural Similarity in Microarray Data  

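Yun, Jong-Han (Department of Computer Engineering, Sejong University)
Shin, Dong-Kyu (Department of Computer Engineering, Sejong University)
Shin, Dong-Il (Department of Computer Engineering, Sejong University)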
Abstract
Microarrays, one of the standard techniques for acquiring biological information, have contributed greatly to the development of bioinformatics. Although the microarray is established as a core technology in the field, its data are difficult to share and store because experimental results are large and structurally complex. In this paper, we propose a new method that exploits the fact that microarray data in MAGE-ML, a standard format for data exchange, contain frequently recurring, structurally similar patterns. The method builds a compact database by simplifying the MAGE-ML schema, combining inlining techniques with a newly proposed classification technique based on the structural similarity of elements. As a result, the database structure becomes simpler, the number of table joins is reduced, and performance improves.
Keywords
Structural Similarity; MAGE-ML; Inlining Technique; Semi-Structured Data; Microarray Data; Bioinformatics;
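
The two ideas named in the abstract, inlining and grouping elements by structural similarity, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the schema dictionary is a hypothetical, heavily simplified stand-in for the MAGE-ML schema, the function names are invented, and exact structural equality is used as the similarity criterion because the abstract does not state the one the authors use.

```python
from collections import defaultdict

# Hypothetical toy schema (not the real MAGE-ML schema):
# element -> its attributes and its children with cardinality "1" or "*".
SCHEMA = {
    "Experiment": {"attrs": ["identifier", "name"],
                   "children": {"Description": "1", "BioAssay": "*"}},
    "Description": {"attrs": ["text"], "children": {}},
    "BioAssay": {"attrs": ["identifier", "name"], "children": {}},
    "ArrayDesign": {"attrs": ["identifier", "name"], "children": {}},
}


def inline(schema, root):
    """Return the columns of root's table, folding in children that occur once.

    A child with cardinality "1" needs no table of its own, so its columns
    are merged into the parent table, which saves one join per lookup.
    """
    cols = list(schema[root]["attrs"])
    for child, cardinality in schema[root]["children"].items():
        if cardinality == "1":
            cols += [f"{child}_{c}" for c in inline(schema, child)]
    return cols


def group_by_structure(schema):
    """Group elements whose attributes and child signatures are identical.

    Each group could share a single table with a discriminator column.
    Exact equality is an assumed stand-in for the paper's similarity measure.
    """
    groups = defaultdict(list)
    for name, spec in schema.items():
        signature = (tuple(sorted(spec["attrs"])),
                     tuple(sorted(spec["children"].items())))
        groups[signature].append(name)
    return [sorted(names) for names in groups.values() if len(names) > 1]


print(inline(SCHEMA, "Experiment"))
print(group_by_structure(SCHEMA))
```

In this toy schema, Description is folded into the Experiment table, while BioAssay and ArrayDesign share the same structure and are candidates for one shared table; both steps reduce the number of tables and therefore the joins needed to reassemble a document.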