• Title/Summary/Keyword: XML Structure

A Clustering Technique using Common Structures of XML Documents (XML 문서의 공통 구조를 이용한 클러스터링 기법)

  • Hwang, Jeong-Hee;Ryu, Keun-Ho
    • Journal of KIISE:Databases / v.32 no.6 / pp.650-661 / 2005
  • As the Internet grows, the use of XML, a standard for semi-structured documents, is increasing, and work on the integration and retrieval of XML documents is ongoing. The basis of efficient integration and retrieval is clustering XML documents with similar structure. Conventional XML clustering approaches use a hierarchical clustering algorithm that produces the required number of clusters through repeated merging, but it is difficult to compute the similarity between XML documents and costly to compare similarity repeatedly. To address this problem, we use a clustering algorithm for transactional data that scales to large data sets. In this paper we use common structures from XML documents that do not have a DTD or schema. To use these common structures, we extract representative structures by decomposing the structure of a tree model that expresses the XML document, and we perform clustering with the extracted structures. We also show the efficiency of the proposed method by comparing and analyzing it against the previous method.
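
The idea of decomposing an XML tree into structural items and treating each document as a transaction can be illustrated with a minimal sketch. This is not the authors' algorithm: the choice of root-to-leaf tag paths as the "representative structures", the Jaccard measure, and the function names `root_to_leaf_paths` and `jaccard` are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET


def root_to_leaf_paths(xml_text):
    """Return the set of distinct root-to-leaf tag paths of an XML document.

    Assumption: each path stands in for one 'structure item' of the document.
    """
    root = ET.fromstring(xml_text)
    paths = set()

    def walk(node, prefix):
        path = prefix + "/" + node.tag
        children = list(node)
        if not children:
            paths.add(path)  # a leaf closes one root-to-leaf path
        for child in children:
            walk(child, path)

    walk(root, "")
    return paths


def jaccard(a, b):
    """Jaccard similarity of two item sets (hypothetical similarity measure)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


doc_a = "<book><title>T</title><author><name>N</name></author></book>"
doc_b = ("<book><title>U</title><author><name>M</name></author>"
         "<year>2005</year></book>")
doc_c = "<order><item><sku>1</sku></item></order>"

items_a = root_to_leaf_paths(doc_a)
items_b = root_to_leaf_paths(doc_b)
items_c = root_to_leaf_paths(doc_c)

# Documents sharing most paths score high; unrelated documents score zero.
print(jaccard(items_a, items_b))
print(jaccard(items_a, items_c))
```

Because each document is reduced to a set of items, the comparison involves only set operations, which is what makes a transactional clustering algorithm applicable in place of repeated tree-to-tree similarity computation.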

XML Schema Evolution Approach Assuring the Automatic Propagation to XML Documents (XML 문서에 자동 전파하는 XML 스키마 변경 접근법)

  • Ra, Young-Gook
    • The KIPS Transactions:PartD / v.13D no.5 s.108 / pp.641-650 / 2006
  • XML is self-describing and uses a DTD or an XML schema to constrain its structure. Even though XML Schema is still only at the Recommendation stage, it will be widely used because DTD is not itself XML and has limited expressive power. The structure defined by an XML schema, as well as the data of the XML documents, can change for complex reasons, such as errors in the XML schema design or new requirements from new applications. Thus, we propose XML schema evolution operators derived from an analysis of XML schema updates. These operators enable XML schema updates that would be impossible without supporting tools when a large number of XML documents comply with the XML schema. In addition, the operators automatically find the places to update in the XML documents registered with the XSE system and keep the XML documents valid against the XML schema rather than merely well-formed. This paper is the first attempt to update the XML schemas of XML documents and provides a comprehensive set of schema updating operations. Our work is necessary for XML application development and maintenance in that it helps update the structure of XML documents, as well as their data, in an easy and precise manner.

Adaptive Path Index for Efficient XML Query Processing (효율적인 XML 질의 처리를 위한 적응형 경로 인덱스)

  • 민준기;심규석;정진완
    • Journal of KIISE:Databases / v.31 no.1 / pp.61-71 / 2004
  • XML can describe a wide range of data, from regular to irregular and from flat to deeply nested. Because it supports efficient data exchange and integration, XML is rapidly emerging as the de facto standard format for Web documents. Several XML query languages have been proposed to retrieve data represented in XML. XML query languages such as XPath and XQuery use path expressions to traverse irregularly structured data composed of XML elements, and various path indexes have been proposed to evaluate these path expressions. However, traditional path indexes are constructed using only the XML data structure. Therefore, in this paper, we propose an adaptive path index that uses query workloads as well as the XML data structure. To improve query performance, the adaptive path index manages the frequently used paths and the structural summary of the XML data using a hash tree and a graph structure. Experimental results show that the adaptive path index typically improves query performance by a factor of 2 to 69 compared with existing indexes.
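
The workload-aware idea can be sketched in a few lines, under loud assumptions: the class `AdaptivePathIndexSketch`, its `threshold` parameter, and the use of a plain dictionary are hypothetical stand-ins and do not reproduce the hash tree and graph structure described in the paper. The sketch only supports simple absolute child paths such as `/lib/book/title` over a document that does not change.

```python
import xml.etree.ElementTree as ET
from collections import Counter


class AdaptivePathIndexSketch:
    """Toy index: paths queried often enough are answered from a lookup table."""

    def __init__(self, xml_text, threshold=2):
        self.root = ET.fromstring(xml_text)
        self.threshold = threshold      # queries needed before a path is indexed
        self.query_counts = Counter()   # observed workload: path -> query count
        self.hot_paths = {}             # indexed paths: path -> matching elements

    def _traverse(self, path):
        """Evaluate an absolute child path by walking the tree from the root."""
        steps = path.strip("/").split("/")
        if steps[0] != self.root.tag:
            return []
        current = [self.root]
        for tag in steps[1:]:
            current = [child for node in current
                       for child in node if child.tag == tag]
        return current

    def query(self, path):
        self.query_counts[path] += 1
        if path in self.hot_paths:      # frequent path: no traversal needed
            return self.hot_paths[path]
        result = self._traverse(path)
        if self.query_counts[path] >= self.threshold:
            self.hot_paths[path] = result   # adapt the index to the workload
        return result


index = AdaptivePathIndexSketch(
    "<lib><book><title>A</title></book><book><title>B</title></book></lib>"
)
first = [e.text for e in index.query("/lib/book/title")]   # traversal
second = [e.text for e in index.query("/lib/book/title")]  # traversal, then indexed
third = [e.text for e in index.query("/lib/book/title")]   # served from the index
print(first, second, third)
```

The design choice shown here is only that the index content depends on which paths the workload actually uses, rather than on the data structure alone.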

An XML Structure Translation System using Schema Structure Data Mapping (스키마 구조 데이타 매핑을 이용한 XML 구조변환 시스템)

  • 송종철;김창수;정회경
    • Journal of KIISE:Computing Practices and Letters / v.10 no.5 / pp.406-418 / 2004
  • In the past, various applications and systems were introduced individually into specific groups or enterprises for different purposes, without considering interoperability among them. However, the data processing environment is now changing rapidly, and there is a growing need to integrate and couple applications and systems at the process level for more flexible and faster data processing. When integrating these applications or systems, XML-based integration is recommended as a good method that can reduce the additional cost and satisfy the integration requirements. This is because XML is not only a device-independent data format that can be used on any platform, but it can also be used with XSLT, the document transformation standard established by the W3C, which allows data to be converted easily from one type to another on demand. This paper studies the design and implementation of a system that converts XML structure. The system presents the structure of the source side, which provides the data, and of the destination side, which processes the data, using XML Schema, which defines the structural information of an XML document. The system defines the structural relationship of the desired form by mapping structural information and data, and it creates an XSLT document that defines the conversion rules between the two structures based on the defined information. The generated XSLT document converts the data to fit the structure of the destination side. With this system, a document can be applied to various structures without regard to a specific system or platform, and an XSLT document can be constructed to which the meaning of the desired form is given. This paper aims to offer a conversion process between documents and to improve interoperability and scalability, thereby contributing to building an XML document processing environment.

A Unification Algorithm for DTDs of XML Documents having a Similar Structure (유사 구조를 가지는 XML 문서들의 DTD 통합 알고리즘)

  • 유춘식;우선미;김용성
    • Journal of KIISE:Software and Applications / v.31 no.10 / pp.1400-1411 / 2004
  • In many cases, XML documents have different DTDs even though they have a similar structure and are logically the same kind of document. As a result, these XML documents have different database schemas and are stored in different databases. In this paper, we propose an algorithm that unifies the DTDs of such XML documents using finite automata and a tree structure. Finite automata are suitable for representing the repetition operators and connectors of a DTD and provide a simple representation of a DTD. By using finite automata, we are able to reduce the complexity of the algorithm. We apply the proposed algorithm to unify the DTDs of science journals.

XML-based Modeling for Semantic Retrieval of Syslog Data (Syslog 데이터의 의미론적 검색을 위한 XML 기반의 모델링)

  • Lee Seok-Joon;Shin Dong-Cheon;Park Sei-Kwon
    • The KIPS Transactions:PartD / v.13D no.2 s.105 / pp.147-156 / 2006
  • Event logging plays an increasingly important role in system and network management, and syslog is a de facto standard for logging system events. However, due to the semi-structured nature of Common Log Format data, most studies on log analysis focus on frequent patterns. The Extensible Markup Language (XML) can provide a good representation scheme for structuring and searching the formatted data found in syslog messages. However, previous XML-formatted schemes and applications for system logging are not suitable for semantic approaches such as ranking-based search or similarity measurement over log data. In this paper, based on ranked keyword search techniques over XML documents, we propose an XML tree structure for syslog data through a new data modeling approach. Finally, we show the suitability of the proposed structure for semantic retrieval.

An Efficient Transformation Technique from Relational Schema to Redundancy Free XML Schema (관계형 스키마로부터 중복성이 없는 XML 스키마로의 효율적인 변환 기법)

  • Cho, Jung-Gil
    • Journal of Internet Computing and Services / v.11 no.6 / pp.123-133 / 2010
  • XML has become the new standard for publishing and exchanging data on the Web. However, most business data is still stored and maintained in relational database management systems. As such, there is an increasing need to efficiently publish relational data as XML data for Internet-based applications. The most important issue in the transformation is to reflect the structural and semantic relations of the relational database exactly in the XML schema. Many transformation approaches have been proposed to resolve this issue, but those methods have several problems. In this paper, we discuss an algorithm for transforming a relational database schema into a corresponding schema expressed in XML Schema. We aim to preserve both explicit and implicit referential integrity information and to achieve a high level of nesting while introducing no data redundancy in the transformed XML schema. To achieve these goals, we propose a transformation model that is redundancy free, and then we improve the XML Schema structure by exploring more nested structures.

XML Schema Transformation Considering Semantic Constraint (의미적 제약조건을 고려한 XML 스키마의 변환)

  • Cho, Jung-Gil
    • Journal of the Korea Society of Computer and Information / v.16 no.3 / pp.53-63 / 2011
  • Many techniques have been proposed to store and query XML data efficiently. One way to achieve this goal is to use a relational database by transforming XML data into relational format. It is important that the schema transformation preserve the content, the structure, and the semantic constraints of the XML document. Key constraints in particular are an important part of database theory. Therefore, the proposed technique considers the semantics of XML as expressed by primary keys and foreign keys, and it preserves not only XML data constraints but also the content, structure, and semantics of XML data through the transformation process. The information transformed comprises the content and structure of the document (the parent-child relationships), the functional dependencies, and the semantics of the document as captured by XML key and keyref constraints. Because the XML schema transformation preserves semantic constraints, an advantage of the technique is that it does not require stored procedures or triggers to ensure data integrity in the relational database. This paper does not consider the ID/IDREF keys supported in DTDs, inheritance relationships, or implicit referential integrity.

Design and Implementation of an Internet Bidding System Based on XML (XML 기반 인터넷 입찰시스템 설계 및 구현)

  • 박성은;이용규
    • The Journal of Society for e-Business Studies / v.7 no.2 / pp.69-81 / 2002
  • The problem with previous proprietary e-business systems is that they are not built on well-defined standards, which makes it difficult to extend the systems and to interoperate among them. Therefore, a new e-business standard, ebXML, and related XML standards such as SOAP and XML Signature have been recommended for e-business systems. In this paper, as an application of the new XML standards, we design and implement a new Internet bidding system. We use XML Schema to define the document structure of the bidding system. DOM is used for structure search and XSL is used to represent styles. We use SOAP to handle distributed objects and XML Signature to provide data integrity. Due to the adoption of e-business standards, the developed system has advantages in interoperability and extensibility compared to previous systems.

Clustering XML Documents Considering The Weight of Large Items in Clusters (클러스터의 주요항목 가중치 기반 XML 문서 클러스터링)

  • Hwang, Jeong-Hee
    • The KIPS Transactions:PartD / v.14D no.1 s.111 / pp.1-8 / 2007
  • As the number of Web documents in XML, a data exchange language for the advanced Internet, increases, Web documents are becoming a target of information retrieval. Accordingly, there is research on the structure, integration, and retrieval of XML documents. This paper proposes a clustering method for XML documents based on frequent structures, as basic research toward efficient query processing and retrieval. First, the trees representing XML documents are decomposed and frequent structures are extracted from them. Second, treating the frequent structures as items of transactions, we perform clustering that considers the weight of large items in order to adjust cluster creation and cluster cohesion. Third, we show the strength of our method through experiments that compare it with previous methods.