• Title/Summary/Keyword: XML document

Classification Techniques for XML Document Using Text Mining (텍스트 마이닝을 이용한 XML 문서 분류 기술)

  • Kim Cheon-Shik;Hong You-Sik
    • Journal of the Korea Society of Computer and Information / v.11 no.2 s.40 / pp.15-23 / 2006
  • Millions of documents are already on the Internet, and new ones are created all the time. Classifying them by the most suitable means is therefore an important problem for document management and querying. Most users, however, still rely on keyword-based classification, which does not classify documents efficiently and is weak for categories that depend on a document's meaning. Classification by a person can sometimes be very accurate and is often required. In this paper, we therefore classify documents using a neural network algorithm and the C4.5 algorithm. We used résumé data in XML format for the classification experiment. The results showed excellent potential for document categorization, and we expect the approach to be applicable to various document classification problems.

The Path Inverted Index Technique for XML Document Retrieval (XML 문서 검색을 위한 경로 역 색인 기법)

  • Moon, Kyung-Won;Hwang, Byung-Yeon
    • The KIPS Transactions:PartD / v.17D no.2 / pp.103-110 / 2010
  • Recently, many XML document management systems that build on the strengths of an RDBMS have been developed for the storage, processing, and retrieval of XML documents. However, partial pattern-matching queries such as LIKE operations cannot take advantage of RDBMS indexes, and their inefficient comparison processing degrades retrieval performance. This paper proposes a hierarchical XML storage technique that stores XML documents efficiently in an RDBMS, together with a path inverted index technique. The index treats each element of an XML document as a keyword and organizes a posting file with path identifiers and sequences to reduce the retrieval time of path-based queries. In simulations, the proposed methods performed about 60% better in search than the conventional RDBMS-based method.
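
The idea of a posting file keyed by element name can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not define the storage layout, so the in-memory dictionaries, the function names `build_path_index` and `find_by_path`, and the choice of document order as the "sequence" are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def build_path_index(xml_text):
    """Return (path_ids, postings) for one XML document.

    path_ids maps a root-to-element path such as "/lib/book/title" to an
    integer path identifier. postings maps an element name (the keyword)
    to a list of (path_id, sequence) pairs, where sequence is the
    element's position in document order (an assumed definition).
    """
    root = ET.fromstring(xml_text)
    path_ids = {}
    postings = defaultdict(list)
    sequence = 0

    def visit(element, parent_path):
        nonlocal sequence
        path = f"{parent_path}/{element.tag}"
        # Reuse the identifier if this path was seen before.
        path_id = path_ids.setdefault(path, len(path_ids) + 1)
        sequence += 1
        postings[element.tag].append((path_id, sequence))
        for child in element:
            visit(child, path)

    visit(root, "")
    return path_ids, dict(postings)


def find_by_path(path_ids, postings, path):
    """Answer an exact path query with lookups instead of string matching."""
    path_id = path_ids.get(path)
    if path_id is None:
        return []
    keyword = path.rsplit("/", 1)[-1]
    return [seq for pid, seq in postings.get(keyword, []) if pid == path_id]


doc = "<lib><book><title/><author/></book><book><title/></book></lib>"
path_ids, postings = build_path_index(doc)
hits = find_by_path(path_ids, postings, "/lib/book/title")
print(hits)
```

In this sketch a path query compares integer path identifiers inside one keyword's posting list, which is the kind of work a LIKE scan over stored path strings would otherwise do character by character.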

Design and Implementation of EDI Document Exchange system based on XML (XML에 기반한 EDI 문서교환 시스템 설계 및 구현)

  • Im, Young-Tae;Han, Woo-Yong;Jung, Hoe-Kyung
    • The Transactions of the Korea Information Processing Society / v.7 no.11S / pp.3603-3612 / 2000
  • This paper presents the design and implementation of an EDI document exchange system based on XML. To let users create customized documents of their choice, the system includes a transaction processor and a template manager, and a converter function is included for compatibility with existing EDI. The system stores the EDI message structures needed for exchange in XML format and manipulates them through the DOM API so that users can keep using their previous systems. It also provides an interface through which users can create template files with the converter and transfer only the elements they select. In this way, the system offers an XML-based solution for representing the structure information of EDI documents and converting them, a mechanism that other systems have not adequately provided so far.

A Study on Resolution of Validity in XML Document (XML 문서의 유효성 문제 해결에 관한 연구)

  • Hong, Seong-Pyo;Song, Gi-Beom;Bang, Keug-In;Lee, Joon
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2003.05a / pp.564-567 / 2003
  • XML is vulnerable to document tampering and data deletion because it gives priority to presenting the data format. XML digital signatures, XML encryption, and XML access control have been provided to overcome these weaknesses. However, the violation of the validity of structured XML caused by XML encryption and the lack of protection against DTD attacks remain unsolved. In this paper, we propose an XML Schema that satisfies both validity and encryption. A DTD is unnecessary because XML Schema supports well-formed XML documents and also includes meta information. Because XML Schema can generate each XML document dynamically and has its own validity-checking rules, it has an advantage in the extensibility of DTD-based encryption of XML documents.

An Efficient Application of XML Schema Matching Technique to Structural Calculation Document of Bridge (XML 스키마 매칭 기법의 교량 구조계산서 적용 방안)

  • Park, Sang Il;Kim, Bong-Geun;Lee, Sang-Ho
    • KSCE Journal of Civil and Environmental Engineering Research / v.32 no.1D / pp.51-59 / 2012
  • This paper proposes an efficient method for applying an XML schema matching technique to the document structure of structural calculation documents (SCDs) of bridges. Through 30 case studies, a parametric study was performed on the weightings of the name, sibling, child, and parent elements of an XML schema component that are used in the similarity measure of the matching technique, and weightings suitable for analyzing the document structure of SCDs are suggested. A simplified formula for quantifying similarity is also introduced to reduce computation time for large-scale SCD document structures. Numerical experiments show that the suggested method can increase the accuracy of XML schema matching by 10% with suitable weighting parameters and can maintain almost the same accuracy as previous studies without weighting parameters. In addition, computation time is reduced dramatically when the proposed simplified formula is used. In numerical experiments on 20 practical bridge SCDs, the suggested method analyzed document structure more accurately than previous studies and was 4 to 460 times faster in computation time.
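
A weighted similarity measure over name, sibling, child, and parent elements could look roughly like the Python sketch below. The abstract does not give the paper's formula, so everything here is an assumption for illustration: the dictionary layout of a schema component, the use of `difflib.SequenceMatcher` for name similarity, Jaccard overlap for the three neighbor sets, and the normalized weighted sum in the hypothetical `element_similarity` function.

```python
from difflib import SequenceMatcher


def name_similarity(a, b):
    """String similarity of two element names in [0, 1] (assumed measure)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def jaccard(a, b):
    """Overlap of two collections of element names in [0, 1]."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0  # two empty neighborhoods are treated as identical
    return len(a & b) / len(a | b)


def element_similarity(e1, e2, weights):
    """Normalized weighted sum of the four partial similarities."""
    scores = {
        "name": name_similarity(e1["name"], e2["name"]),
        "sibling": jaccard(e1["siblings"], e2["siblings"]),
        "child": jaccard(e1["children"], e2["children"]),
        "parent": jaccard(e1["parents"], e2["parents"]),
    }
    total = sum(weights.values())
    return sum(weights[key] * scores[key] for key in scores) / total


# Hypothetical schema components from two structural calculation documents.
girder_a = {"name": "Girder", "siblings": ["Slab"],
            "children": ["Moment"], "parents": ["Superstructure"]}
girder_b = {"name": "Girder", "siblings": ["Slab"],
            "children": ["Shear"], "parents": ["Superstructure"]}

equal = {"name": 1, "sibling": 1, "child": 1, "parent": 1}
name_heavy = {"name": 2, "sibling": 1, "child": 1, "parent": 1}

score_equal = element_similarity(girder_a, girder_b, equal)
score_name_heavy = element_similarity(girder_a, girder_b, name_heavy)
print(score_equal, score_name_heavy)
```

With these two components, only the child sets differ, so raising the name weight raises the overall score. This is the sort of effect a parametric study of weightings would examine.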

Design and Implementation of XML Document Search System for the Cyber University (가상대학 XML 문서 검색시스템의 설계 및 구현)

  • Kong Beom-Yong;Hwang Byung-Kon;Cho Sae-Hong
    • Journal of Digital Contents Society / v.3 no.2 / pp.131-142 / 2002
  • This paper observes that, with the emergence of Web-based cyber universities for distance education brought about by changes in education, the introduction of multimedia content systems using computers and the Internet has become an important issue. For this reason, the paper proposes a search system for managing cyber university documents. For document management in a cyber university to succeed, document search is regarded as being as important as administrative support and the document management system itself. The paper designs and implements a cyber university XML document search system to lay a foundation for more efficient administrative work using XML documents. The system is expected to stand out as a representative application field and to improve administrative efficiency when applied with a multimedia document authoring system for cyber university XML documents, where the effect of applying new computer technology to administrative work is clearly visible.

Development of Ontology for Intelligent Document Transformation System (지능형 문서변환시스템을 위한 온톨로지구축)

  • Lim, Sung-Shin;Lee, Seok-Yong;Park, Nam-Kyu;Seo, Chang-Gab
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / v.9 no.2 / pp.1128-1131 / 2005
  • Document transformation systems are increasingly used to transform business documents efficiently in diverse organizations. Existing research on document transformation systems has focused mainly on XML; in real import and export processes, however, documents are transformed not only into XML but also into EDI and local formats. In particular, most completed research on document transformation used an ontology to remove the inefficiency of connecting XML schemas manually. However, that research lacks features for constructing and modifying the domain ontology automatically, and the ontologies were not large enough for practical use. In this paper, we study the development of an ontology and a basic system, which are critical to an intelligent document conversion system. We also develop an ontology, together with an editor through which users can modify and supplement it, and apply it to a real import and export business process.

An Experimental Study on the Performance of Element-based XML Document Retrieval (엘리먼트 기반 XML 문서검색의 성능에 관한 실험적 연구)

  • Yoon, So-Young;Moon, Sung-Been
    • Journal of the Korean Society for Information Management / v.23 no.1 s.59 / pp.201-219 / 2006
  • This experimental study proposes an element-based XML document retrieval method that returns highly relevant elements. The models compared are the divergence and smoothing methods and the hierarchical language model. The hierarchical language model proved most effective for element-based XML document retrieval, improving exhaustivity at some cost to specificity.

Automatic Generation of XML Documents Using Rule-Based Document Classifier (규칙기반 문서 분류기를 이용한 XML 문서의 자동생성)

  • 김효정;민미경
    • Proceedings of the Korea Multimedia Society Conference / 2000.11a / pp.125-128 / 2000
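  • As society becomes an Internet-centered information society, most conventional documents are being replaced by electronic documents. XML (eXtensible Markup Language) has been designated as the standard for Web documents to support compatibility and standardization among electronic documents, but the documents in use to date are not in XML form and must be converted manually, which is difficult. In this paper, we design a Rule-Based Document Classifier that automatically classifies and groups documents of various forms. We then present an automatic XML converter that generates a DTD (Document Type Definition) automatically from the grouped documents and uses the generated DTD to convert the documents into XML form automatically. This approach classifies documents automatically, can still classify a document with similar documents when its form changes, and reduces duplicate DTD generation when documents are reclassified.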

Extracting Logical Structure from Web Documents (웹 문서로부터 논리적 구조 추출)

  • Lee Min-Hyung;Lee Kyong-Ho
    • Journal of Korea Multimedia Society / v.7 no.10 / pp.1354-1369 / 2004
  • This paper presents a logical structure analysis method that transforms Web documents into XML documents. The proposed method consists of three phases: visual grouping, element identification, and logical grouping. To produce a logical structure more accurately, the method defines a document model that can describe the logical structure information of a topic-specific document class. Because the method is based on the visual structure produced by the visual grouping phase as well as on a document model that describes the logical structure information of a document type, it supports sophisticated structure analysis. Experimental results with HTML documents from the Web show that the method performs logical structure analysis successfully compared with previous work. In particular, the method generates XML documents as the result of structure analysis, which enhances the reusability of documents.
