http://dx.doi.org/10.3745/KIPSTD.2003.10D.4.603

Technique for extracting reusable XML Schema from schema-less XML Documents  

Cho, Jung-Gil (Department of Computer Science, Namseoul University)
Koo, Yeon-Seol (Department of Computer Science, Chungbuk National University)
Abstract
With the growth of the Web, the number of XML documents has been increasing. Consequently, much research is under way on validating XML data received from clients and on storing it in, or querying it from, a database efficiently. Validation, storage, and querying all require a DTD or an XML Schema for the XML documents. Schema-less XML documents, however, cannot be processed in this way because they have neither a DTD nor an XML Schema. In this paper, we extract an XML Schema from well-formed or schema-less XML documents so that the XML data can be validated and efficiently stored in or queried from a database. The proposed technique extracts a schema graph using simulation and the DataGuide, a technique for extracting the semistructured characteristics of XML data. We also propose a technique for extracting the XML Schema using pattern tables that take the schema graph and reusability into account.
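As a rough illustration of the kind of structural summary a DataGuide provides, the sketch below records each distinct label path of a well-formed XML document once, together with the minimum and maximum number of times each child tag occurs under that path. It is not the algorithm of this paper: the function name `summarize_paths` and the sample document are hypothetical, and the simulation step and pattern tables mentioned in the abstract are not modeled.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def summarize_paths(xml_text):
    """Return {label_path: {child_tag: (min_count, max_count)}} for an XML string.

    Hypothetical helper for illustration only; each distinct label path
    appears once in the result, in the spirit of a DataGuide.
    """
    root = ET.fromstring(xml_text)
    summary = {}
    seen = Counter()  # number of elements visited so far at each label path

    def visit(elem, path):
        counts = Counter(child.tag for child in elem)
        entry = summary.setdefault(path, {})
        first_visit = seen[path] == 0
        seen[path] += 1
        for tag in set(entry) | set(counts):
            n = counts.get(tag, 0)
            if tag in entry:
                lo, hi = entry[tag]
                entry[tag] = (min(lo, n), max(hi, n))
            else:
                # A tag first seen on a later visit was absent earlier,
                # so its minimum occurrence count is 0.
                entry[tag] = (n if first_visit else 0, n)
        for child in elem:
            visit(child, path + "/" + child.tag)

    visit(root, "/" + root.tag)
    return summary


# Assumed sample document, not taken from the paper.
sample = "<lib><book><title/><author/><author/></book><book><title/></book></lib>"
result = summarize_paths(sample)
for path, children in sorted(result.items()):
    print(path, children)
```

The (min, max) pairs correspond loosely to the `minOccurs` and `maxOccurs` attributes one might later write into an XML Schema; how the paper derives those values from its schema graph is not described in the abstract.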
Keywords
XML; well-formed; XML Schema; Dataguide; Simulation;