• Title/Summary/Keyword: Document structure

A Methodology for Automatic Hierarchy Definition of Sentences in Engineering Documents (엔지니어링 문서의 문장 자동 계층정의 방법론)

  • Park, Sang-Il;Kim, Bong-Geun;Kim, Kyeong-Hwan;Lee, Sang-Ho
    • Journal of the Computational Structural Engineering Institute of Korea / v.22 no.4 / pp.323-330 / 2009
  • This paper proposes a methodology for automatic hierarchy classification of subtitles in an engineering document, based on the fact that the heading symbols of subtitles represent the hierarchical structure of the document. The proposed methodology is composed of two methods: extracting subtitles from a plain text document and determining the hierarchical structure of the subtitles. The subtitles in a document are extracted by comparing heading symbol patterns with predefined heading symbol groups, and the depth levels of the subtitles are determined by analyzing the relative location of subtitles according to changes in the heading symbol patterns. A prototype module, which can transform a plain text document into a structured XML document in accordance with the hierarchical structure of subtitles, was developed based on the proposed methodology, and the performance of the module was analyzed with 20 engineering documents.
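
The two steps described in this abstract, matching heading symbols against predefined groups and deriving depth from changes in the symbol pattern, can be sketched roughly as follows. This is a minimal illustration and not the authors' module: the pattern groups in `HEADING_PATTERNS`, the function names, and the stack-based depth rule are all assumptions made for the example.

```python
import re

# Hypothetical heading-symbol groups; the groups used in the paper are not listed here.
HEADING_PATTERNS = [
    ("roman", re.compile(r"^[IVX]+\.\s+")),
    ("number", re.compile(r"^\d+\.\s+")),
    ("paren_number", re.compile(r"^\(\d+\)\s+")),
    ("letter", re.compile(r"^[a-z]\)\s+")),
]


def classify(line):
    """Return (pattern_name, title) if the line starts with a known heading symbol."""
    text = line.strip()
    for name, pattern in HEADING_PATTERNS:
        match = pattern.match(text)
        if match:
            return name, text[match.end():]
    return None


def assign_depths(lines):
    """Assign a depth level to each subtitle from changes in its heading pattern.

    Assumed rule: a pattern not yet open nests one level deeper; a pattern
    that is already open returns to the level where it was first seen.
    """
    stack = []   # open pattern names, outermost first
    result = []
    for line in lines:
        hit = classify(line)
        if hit is None:
            continue  # body text, not a subtitle
        name, title = hit
        if name in stack:
            del stack[stack.index(name) + 1:]
        else:
            stack.append(name)
        result.append((len(stack), title))
    return result
```

The resulting `(depth, title)` pairs could then be written out as nested XML elements, which is the transformation the prototype module is described as performing.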

Security Elevation of XML Document Using DTD Digital Signature (DTD 전자서명을 이용한 XML문서의 보안성 향상)

  • 김형균;오무송
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2002.11a / pp.592-596 / 2002
  • A DTD can be regarded as metadata that defines the meaning of the data expressed in an XML document. Therefore, if the DTD information is damaged, the security of any XML document that relies on it is at risk. Rather than attaching a digital signature to the XML document itself when it is sent and received, this research proposes a method of attaching a digital signature to the DTD. The DTD file is first read to the end and parsed, and the extracted element and attribute entities are stored in a hash table. Once parsing is finished, the hash table is read and a message digest is computed. A digital signature is then created from the digest with a private key. A problem arises during signing: because changes in ordering are not examined in the message digest process, an entirely different digest value can be produced. This is solved by creating the DTD's digital signature using DOM, which can represent the document as a tree with a standard structure.
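
The ordering problem mentioned in this abstract can be illustrated with a small sketch: if declarations are hashed in file order, two equivalent DTDs produce different digests, so the declarations need a canonical order first. The example below is an assumption-laden simplification and not the paper's method. It uses a regular expression and sorting instead of a DOM tree, the name `dtd_digest` is hypothetical, and it stops at the digest (signing the digest with a private key is left out because the Python standard library has no public-key signing).

```python
import hashlib
import re

# Simplified declaration matcher; assumes no '>' appears inside attribute defaults.
DECLARATION = re.compile(r"<!(ELEMENT|ATTLIST)\s+(.*?)>", re.DOTALL)


def dtd_digest(dtd_text):
    """Return a SHA-256 digest of a DTD that does not depend on declaration order."""
    declarations = set()  # plays the role of the hash table of parsed entities
    for kind, body in DECLARATION.findall(dtd_text):
        normalized = " ".join(body.split())  # collapse whitespace differences
        declarations.add((kind, normalized))
    # Sorting gives a canonical order, so reordering the file cannot change the digest.
    canonical = "\n".join(f"{kind} {body}" for kind, body in sorted(declarations))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

Under these assumptions, a receiver can recompute the digest from the DTD it received and compare it with the signed one. Any change to a declaration changes the digest, while a change in declaration order alone does not.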

An Efficient Application of XML Schema Matching Technique to Structural Calculation Document of Bridge (XML 스키마 매칭 기법의 교량 구조계산서 적용 방안)

  • Park, Sang Il;Kim, Bong-Geun;Lee, Sang-Ho
    • KSCE Journal of Civil and Environmental Engineering Research / v.32 no.1D / pp.51-59 / 2012
  • An efficient method of applying an XML schema matching technique to the document structure of the structural calculation document (SCD) of a bridge is proposed. Through 30 case studies, a parametric study was performed on the weightings of the name, sibling, child, and parent elements of the XML schema components used in the similarity measure of the XML schema matching technique, and weightings suitable for analyzing the document structure of SCDs are suggested. A simplified formula for quantifying similarity is also introduced to reduce computation time for large-scale SCD document structures. Numerical experiments show that the suggested method can increase the accuracy of XML schema matching by 10% with suitable weighting parameters, and can maintain almost the same accuracy as previous studies without weighting parameters. In addition, computation time can be reduced dramatically when the proposed simplified formula is used. In numerical experiments on 20 practical bridge SCDs, the suggested method was more accurate than previous studies in analyzing document structure and 4 to 460 times faster in computation time.
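
A similarity measure weighted over name, sibling, child, and parent components, as this abstract describes, might look roughly like the sketch below. Everything specific in it is an assumption for illustration: the `SchemaElement` type, the use of `difflib` string similarity and Jaccard overlap, and the default weights are not taken from the paper, and the paper's simplified formula is not reproduced.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher


@dataclass(frozen=True)
class SchemaElement:
    # Hypothetical, simplified view of one XML schema component.
    name: str
    parent: str = ""
    siblings: frozenset = frozenset()
    children: frozenset = frozenset()


def name_similarity(a, b):
    """Case-insensitive string similarity in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def set_similarity(a, b):
    """Jaccard overlap of two name sets; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def element_similarity(x, y, weights=(0.4, 0.2, 0.2, 0.2)):
    """Weighted similarity over (name, sibling, child, parent) components.

    The default weights are placeholders, not the values suggested in the paper.
    """
    w_name, w_sibling, w_child, w_parent = weights
    score = (
        w_name * name_similarity(x.name, y.name)
        + w_sibling * set_similarity(x.siblings, y.siblings)
        + w_child * set_similarity(x.children, y.children)
        + w_parent * name_similarity(x.parent, y.parent)
    )
    return score / (w_name + w_sibling + w_child + w_parent)
```

A parametric study of the kind described would vary the `weights` tuple over a set of documents and compare the matching accuracy obtained for each setting.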

Design and implementation of integrated e-catalog system based on web services (웹 서비스 기반의 통합형 전자 카탈로그 시스템 설계 및 구현)

  • Im San-Song;Na Cheol-Hun;Jung Hoe-Kyung
    • Journal of Internet Computing and Services / v.6 no.2 / pp.153-163 / 2005
  • We propose an XML-based electronic catalog document structure that can process product information structurally, and a standard electronic catalog format that supports various catalog document formats and structures. We designed and implemented an electronic catalog system in which users taking part in a transaction can use electronic catalog documents defined in an integrated electronic commerce format. The system makes it easier to obtain useful information, because it provides a common electronic catalog document for all business transactions through the interoperability of Web Services.

The structure conversion mechanism for XML document Interchange in CORBA-based Distributed Environment (CORBA기반 분산환경에서 XML문서 교환을 위한 구조변환기법)

  • 박민기;이재완
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2002.11a / pp.396-400 / 2002
  • In a network-based distributed environment, the necessary information can be obtained by sharing or exchanging information documents. However, it is difficult to share and distribute information documents because of the heterogeneity at the dist (equation omitted). In this paper, we propose (equation omitted) interchange structure based on CORBA for handling information documents, and provide a mechanism for converting IDL to DTD by designing a converter within it.

Document Clustering using Non-negative Matrix Factorization and Fuzzy Relationship (비음수 행렬 분해와 퍼지 관계를 이용한 문서군집)

  • Park, Sun;Kim, Kyung-Jun
    • Journal of Advanced Navigation Technology / v.14 no.2 / pp.239-246 / 2010
  • This paper proposes a new document clustering method using NMF and fuzzy relationships. The proposed method can improve the quality of document clustering for two reasons: documents are clustered using fuzzy relation values between semantic features and terms, which distinguishes dissimilar documents within clusters well, and the cluster label terms, selected using the semantic features from NMF, better represent the inherent structure of the document set. The experimental results demonstrate that the proposed method achieves better performance than other document clustering methods.

Analysis of Indexing Schemes for Structure-Based Retrieval (구조 기반 검색을 위한 색인 구조에 대한 분석)

  • 김영자;김현주;배종민
    • Journal of Korea Multimedia Society / v.7 no.5 / pp.601-616 / 2004
  • Information retrieval systems for structured documents provide multiple levels of retrieval capability by supporting structure-based queries. To process structure-based queries on structured documents, information about the structural nesting relationships between elements and about element sequence must be maintained. This paper presents four index structures that can process various types of structural queries, such as queries on structural relationships between elements or on element occurrence order. The proposed algorithms are based on the concept of the Global Document Instance Tree.

Document Clustering Method using Coherence of Cluster and Non-negative Matrix Factorization (비음수 행렬 분해와 군집의 응집도를 이용한 문서군집)

  • Kim, Chul-Won;Park, Sun
    • Journal of the Korea Institute of Information and Communication Engineering / v.13 no.12 / pp.2603-2608 / 2009
  • Document clustering is an important method for document analysis and is used in many different information retrieval applications. This paper proposes a new document clustering model that combines an NMF (non-negative matrix factorization) based clustering method with refinement of the documents in each cluster using cluster coherence. The proposed method can improve the quality of document clustering for two reasons: documents are re-assigned to clusters using cluster coherence based on the similarity between documents, and the semantic feature matrix and the semantic variable matrix used in document clustering better represent the inherent structure of the document set. The experimental results demonstrate that applying the proposed method to document clustering methods achieves better performance than those methods alone.

Design and Implementation of BADA-IV/XML Query Processor Supporting Efficient Structure Querying (효율적 구조 질의를 지원하는 바다-IV/XML 질의처리기의 설계 및 구현)

  • 이명철;김상균;손덕주;김명준;이규철
    • The Journal of Information Technology and Database / v.7 no.2 / pp.17-32 / 2000
  • As XML emerges as the next-generation standard language for electronic documents on the Internet, the number of XML documents containing vast amounts of information is increasing substantially, both through the transformation of existing documents into XML and through the appearance of new XML documents. Consequently, an XML document retrieval system becomes essential for searching through large quantities of XML documents that are stored in and managed by a DBMS. In this paper we describe the design and implementation of the BADA-IV/XML query processor, which supports content-based, structure-based, and attribute-based retrieval. We design an XML query language based on XQL (XML Query Language) of W3C and tightly coupled with OQL (a query language for object-oriented databases). XML documents are stored and maintained in BADA-IV, an object-oriented database management system developed by ETRI (Electronics and Telecommunications Research Institute). The storage data model is based on DOM (Document Object Model), so retrieval of XML documents is basically executed using DOM tree traversal. We improve search performance using a Node ID, which represents a node's hierarchy information in an XML document. Assuming that the DOM tree is a complete k-ary tree, we show that the Node ID technique is superior to DOM tree traversal in terms of node fetch counts.
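
The advantage of a Node ID in a complete k-ary tree is that hierarchy relationships can be computed by arithmetic on the IDs, without fetching any nodes. The abstract does not give the numbering scheme, so the sketch below assumes one common choice: level-order numbering with the root as 1, so that the children of node `i` are `k*(i-1)+2` through `k*i+1`. The function names are hypothetical.

```python
def parent_id(node_id, k):
    """Parent of a node under level-order numbering (root = 1); None for the root."""
    if node_id <= 1:
        return None
    return (node_id - 2) // k + 1


def child_id(node_id, k, j):
    """ID of the j-th child (0-based, 0 <= j < k) of a node."""
    return k * (node_id - 1) + 2 + j


def depth(node_id, k):
    """Number of edges from the root, computed without touching stored nodes."""
    steps = 0
    while node_id > 1:
        node_id = parent_id(node_id, k)
        steps += 1
    return steps


def is_ancestor(ancestor, descendant, k):
    """True if `ancestor` is a proper ancestor of `descendant`."""
    if ancestor == descendant:
        return False
    # A parent ID is always smaller than its child's ID, so climbing
    # can stop as soon as the ID drops to or below the candidate ancestor.
    while descendant > ancestor:
        descendant = parent_id(descendant, k)
    return descendant == ancestor
```

With plain DOM traversal, answering a structural query such as "is A an ancestor of B" means fetching each node on the path. Under the assumed numbering, the same check costs a handful of integer operations per level and zero node fetches.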

An Indexing Scheme for Efficient Retrieval and Update of Structured Documents Based on GDIT (GDIT를 기반으로 한 구조적 문서의 효율적 검색과 갱신을 위한 인덱스 설계)

  • Kim, Young-Ja;Bae, Jong-Min
    • The Transactions of the Korea Information Processing Society / v.7 no.2 / pp.411-425 / 2000
  • Information retrieval systems for structured documents written in SGML or XML support partial retrieval of documents. To process queries based on document structure efficiently, such systems require low memory overhead for indexing, quick query response times, support for powerful types of user queries, and minimal updates to the index structure when documents are updated. This paper suggests the Global Document Instance Tree (GDIT) and proposes an effective indexing scheme and query processing algorithms based on the GDIT. The indexing scheme maintains indexing and retrieval efficiency and also guarantees minimal updates to the index structure when document structures are updated.
