• Title/Summary/Keyword: dynamic document


Index Graph : An IR Index Structure for Dynamic Document Database (인덱스 그래프 : 동적 문서 데이터베이스를 위한 IR 인덱스 구조)

  • 박병권
    • The Journal of Information Systems / v.10 no.1 / pp.257-278 / 2001
  • An IR (information retrieval) index for a dynamic document database, where documents are frequently inserted, deleted, and updated, must itself be updated frequently. The conventional IR index structure, however, is designed for retrieval and handles dynamic updates inefficiently. In this paper, we propose a new IR index structure, called the Index Graph, which is organized by connecting multiple indexes into a graph. Through analysis and experiments, we show that the Index Graph outperforms the conventional IR index structure in document insertion, deletion, and update as well as in information retrieval.


A Prime Numbering Scheme with Sibling-Order Value for Efficient Labeling in Dynamic XML Documents (동적 XML 문서에서 효과적인 레이블링을 위해 형제순서 값을 갖는 프라임 넘버링 기법)

  • Lee, Kang-Woo;Lee, Joon-Dong
    • Journal of the Korea Society of Computer and Information / v.12 no.5 / pp.65-72 / 2007
  • Labeling schemes that do not account for frequent updates in dynamic XML documents require relabeling to reflect the changed label information whenever the XML document tree is updated. This is costly for dynamic XML documents, where updates can occur frequently. The prime number labeling scheme addresses this problem because it does not require relabeling. However, the prime number labeling scheme does not account for the need to update the sibling order of nodes in the XML document tree. That update is costly because most of the XML document tree must be searched again and rewritten. In this paper, we propose a prime number labeling scheme with a sibling-order value that maintains the sibling order without searching again or rewriting the XML document tree.

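The divisibility idea behind prime number labeling, which the abstract above builds on, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: each node receives a unique prime, its label is that prime multiplied by its parent's label, and ancestry is tested by divisibility. The class and method names (`PrimeLabeler`, `add`, `is_ancestor`) are hypothetical, and the sibling-order value proposed in the paper is not modeled because the abstract does not specify it.

```python
from itertools import count


def primes():
    """Yield 2, 3, 5, ... by trial division (adequate for a small sketch)."""
    found = []
    for n in count(2):
        if all(n % p for p in found):
            found.append(n)
            yield n


class PrimeLabeler:
    """Hypothetical helper: label = parent's label * the node's own unique prime."""

    def __init__(self):
        self._primes = primes()
        self.label = {}  # node name -> product label

    def add(self, name, parent=None):
        # A new node only consumes a fresh prime; no existing label changes.
        own_prime = next(self._primes)
        parent_label = self.label[parent] if parent is not None else 1
        self.label[name] = parent_label * own_prime
        return self.label[name]

    def is_ancestor(self, ancestor, node):
        # The ancestor's label divides the label of every descendant.
        return ancestor != node and self.label[node] % self.label[ancestor] == 0


tree = PrimeLabeler()
tree.add("a")            # root, label 2
tree.add("b", "a")       # label 2 * 3 = 6
tree.add("c", "a")       # label 2 * 5 = 10
tree.add("d", "b")       # label 6 * 7 = 42
before = dict(tree.label)
tree.add("e", "a")       # later insertion, label 2 * 11 = 22
```

Inserting `e` leaves every earlier label untouched, which is the "no relabeling" property. Labels are products of primes, so they grow quickly with depth; a practical system would need a big-integer or compressed representation.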

Dynamic Text Categorizing Method using Text Mining and Association Rule

  • Kim, Young-Wook;Kim, Ki-Hyun;Lee, Hong-Chul
    • Journal of the Korea Society of Computer and Information / v.23 no.10 / pp.103-109 / 2018
  • In this paper, we propose a dynamic document classification method that departs from existing methods, which rely on artificial, supplier-oriented categorization rules, and instead lets the categorization rules change according to users' needs or social trends. The core of the method is that it creates classification criteria in real time using topic modeling techniques, without standardized category rules, so users are not forced into unnecessary frames. It can also search details through relevance analysis, calculating relationships between words that are difficult to grasp from word frequency alone. The proposed method is better suited to situation analysis and information retrieval over unstructured data that does not fit existing classification categories, such as VOC (Voice of Customer), SNS, and customer reviews of Internet shopping malls, than to logical and systematic documents, and it responds flexibly to users' needs. In addition, suppliers do not need to select classification rules, and misclassifications require no manual correction, which reduces unnecessary workload.
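The word-relationship analysis mentioned in the abstract above can be illustrated with the standard association rule measures, support and confidence, computed over word co-occurrence. This is only a sketch of those generic measures; the abstract does not give the authors' actual procedure, and the function name `word_rules`, the `min_support` parameter, and the sample documents are assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def word_rules(docs, min_support=0.5):
    """docs: list of sets of words.

    Returns {(a, b): (support, confidence)} for the rule a -> b, keeping
    only word pairs that co-occur in at least min_support of the documents.
    """
    n = len(docs)
    single = Counter(w for d in docs for w in d)
    pair = Counter(p for d in docs for p in combinations(sorted(d), 2))
    rules = {}
    for (a, b), c in pair.items():
        support = c / n
        if support >= min_support:
            rules[(a, b)] = (support, c / single[a])  # P(b | a)
            rules[(b, a)] = (support, c / single[b])  # P(a | b)
    return rules


# Hypothetical customer-review snippets, already reduced to keyword sets.
docs = [
    {"delivery", "late"},
    {"delivery", "late", "refund"},
    {"delivery", "fast"},
    {"refund"},
]
rules = word_rules(docs)
```

Here "late" always appears together with "delivery" (confidence 1.0), while "delivery" implies "late" in only two of three documents, a relationship that raw word frequency alone would not reveal.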

A Study on Development of SGML Repository System Based on DTD-dependent Schema (DTD 의존 스키마에 기반한 SGML 문서 저장 시스템 개발에 관한 연구)

  • Kim, Hyeon-Gi;No, Dae-Sik;Gang, Hyeon-Gyu
    • The Transactions of the Korea Information Processing Society / v.6 no.5 / pp.1153-1165 / 1999
  • In various fields of information technology, there is a growing need for dynamic content management systems that store and manage SGML (Standard Generalized Markup Language) documents in a database system. In this paper, we consider the issue of storing SGML documents, which have a complex hierarchical structure, in a database system, and propose a data model based on the ODMG (Object Database Management Group) object model that stores SGML documents without loss of information. Because the proposed data model reflects the physical element structure and logical entity structure of SGML documents, it can store an SGML document in a database system at element-level granularity without any information loss. The proposed data model can also be applied across ODMG-compliant object database management systems. Finally, we discuss the implementation details of an SGML repository system that supports automatic database schema creation for any DTD (Document Type Definition), storage of SGML documents, dynamic document assembly from stored database objects into SGML documents, and indexing and searching of database objects.


An Efficient Updates Processing Using Labeling Scheme In Dynamic Ordered XML Trees (동적 순서 XML 트리에서 레이블링 기법을 이용한 효율적인 수정처리)

  • Lee, Kang-Woo
    • Journal of the Korea Institute of Information and Communication Engineering / v.12 no.12 / pp.2219-2225 / 2008
  • Labeling schemes that do not account for frequent updates in dynamic XML documents require relabeling to reflect the changed label information whenever the XML document tree is updated. This is costly for dynamic XML documents, where updates can occur frequently. The prime number labeling scheme addresses this problem because it does not require relabeling. However, the prime number labeling scheme does not account for the need to update the sibling order of nodes in the XML tree of a document. That update is costly because most of the XML tree must be relabeled and recalculated. In this paper, we propose a prime number labeling scheme with a sibling-order value that maintains the sibling order without relabeling or recalculating the XML tree of a document.

Design of E-Document Management System Using Dynamic Group Key based on OOXML (OOXML기반의 동적 그룹키를 이용한 전자문서 관리 시스템의 설계)

  • Lee, Young-Gu;Kim, Hyun-Chul;Jung, Taik-Yeong;Jun, Moon-Seog
    • The Journal of Korean Institute of Communications and Information Sciences / v.34 no.12B / pp.1407-1417 / 2009
  • We propose an e-document management system that provides segmented page information of a document according to different levels of authority in an access control environment. The proposed system creates hierarchy identifiers using a one-way hash chain and therefore does not need to hold key information for every user, as existing systems do. In addition, by creating group keys that combine a hash chain hierarchy identifier with a randomly generated group identifier, the system responds flexibly to dynamic changes in group membership while resolving the key generation and management problems of document encryption techniques that use a symmetric key for each page. Finally, in an experimental comparison with existing e-document management systems, the proposed system was superior in the efficiency of document encryption and decryption and in the speed of page-level encryption and decryption.
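The combination of a one-way hash chain with a random group identifier described above can be sketched roughly as follows. This is an illustrative assumption, not the paper's construction: SHA-256 stands in for the unspecified hash function, simple concatenation stands in for "combining", the chain is assumed to run from the highest authority level downward, and the names `hierarchy_ids` and `group_key` are hypothetical.

```python
import hashlib
import secrets


def hierarchy_ids(seed: bytes, levels: int) -> list[bytes]:
    """Build a one-way hash chain: ids[0] = H(seed), ids[k] = H(ids[k-1]).

    Assumption: index 0 is the highest authority level. A holder of ids[k]
    can compute every later identifier but cannot invert the hash to
    recover an earlier one.
    """
    ids = [hashlib.sha256(seed).digest()]
    for _ in range(levels - 1):
        ids.append(hashlib.sha256(ids[-1]).digest())
    return ids


def group_key(level_id: bytes, group_id: bytes) -> bytes:
    """Derive a 32-byte group key from a hierarchy identifier and a group identifier."""
    # level_id is always 32 bytes, so plain concatenation is unambiguous here.
    return hashlib.sha256(level_id + group_id).digest()


ids = hierarchy_ids(secrets.token_bytes(32), levels=3)
group_a = secrets.token_bytes(16)   # random group identifier
group_b = secrets.token_bytes(16)   # replacement identifier after a membership change
key_a = group_key(ids[1], group_a)
key_b = group_key(ids[1], group_b)
```

Under these assumptions, a membership change only requires issuing a new random group identifier; the hierarchy identifiers themselves stay fixed. A production design would more likely use a dedicated key derivation function than a bare hash.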

A Document Summarization System Using Dynamic Connection Graph (동적 연결 그래프를 이용한 자동 문서 요약 시스템)

  • Song, Won-Moon;Kim, Young-Jin;Kim, Eun-Ju;Kim, Myung-Won
    • Journal of KIISE:Software and Applications / v.36 no.1 / pp.62-69 / 2009
  • The purpose of document summarization is to provide easy and quick understanding of documents by extracting summarized information from documents produced by various application programs. In this paper, we propose a document summarization method that creates and analyzes a connection graph representing the similarity between the keyword lists of the sentences in a document, taking into account the mean sentence length (the number of keywords) of the document. We implemented a system that automatically generates a summary of a document using the proposed method. To evaluate the method, we used a set of 20 documents paired with their correct summaries and measured precision, recall, and F-measure. The experimental results show that the proposed method is more efficient than existing methods.
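A connection graph over sentence keyword lists, as outlined in the abstract above, might look like the following sketch. The specifics are assumptions made for illustration, since the abstract does not state them: similarity is taken as the keyword overlap divided by the document's mean keyword count, an edge is added when that value reaches a threshold, and sentences are ranked by their number of connections. The function name `summarize` and its parameters are hypothetical.

```python
def summarize(sentences, keywords, k=1, threshold=0.5):
    """sentences: list of str; keywords: list of sets, parallel to sentences.

    Returns the k most-connected sentences in their original order.
    """
    n = len(sentences)
    if n == 0:
        return []
    mean_len = sum(len(kw) for kw in keywords) / n
    if mean_len == 0:
        return []
    degree = [0] * n
    for i in range(n):
        for j in range(i + 1, n):
            # Assumed similarity: shared keywords normalized by mean sentence length.
            sim = len(keywords[i] & keywords[j]) / mean_len
            if sim >= threshold:
                degree[i] += 1
                degree[j] += 1
    # Rank by degree, breaking ties in favor of earlier sentences.
    top = sorted(range(n), key=lambda i: (-degree[i], i))[:k]
    return [sentences[i] for i in sorted(top)]


sentences = [
    "Index structures speed up retrieval.",
    "Dynamic index updates slow retrieval.",
    "The weather was nice.",
]
keywords = [{"index", "retrieval"}, {"index", "update", "retrieval"}, {"weather"}]
summary = summarize(sentences, keywords, k=1)
```

In this toy input the first two sentences share two keywords and become connected, while the third stays isolated and is never selected before them.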

A Study on Dynamic Formatting Method (동적 포맷팅 방식에 관한 연구)

  • 임광택;이수연
    • The Journal of Korean Institute of Communications and Information Sciences / v.18 no.5 / pp.730-738 / 1993
  • This paper proposes a dynamic formatting method for processing large volumes of documents in a device-independent manner. The method is well suited to cross-referencing among pages in a single document and to presenting multiple pages simultaneously. It can also be applied to hypertext applications, such as establishing links and cross-references among pages across multiple documents. We implemented a WYSIWYG electronic publishing system using the X Window System and the Motif graphical user interface.


A Three-Step Preprocessing Algorithm for Enhanced Classification of E-Mail Recommendation System (이메일 추천 시스템의 분류 향상을 위한 3단계 전처리 알고리즘)

  • Jeong Ok-Ran;Cho Dong-Sub
    • The Transactions of the Korean Institute of Electrical Engineers D / v.54 no.4 / pp.251-258 / 2005
  • The results of automatic document classification may differ significantly depending on the characteristics of the documents being classified as well as on the classifier's performance. This research identifies the characteristics of e-mail documents and applies a three-step preprocessing algorithm that minimizes their atypical characteristics. In the first stage, an uncertainty-based sampling algorithm using Mean Absolute Deviation (MAD) addresses the selection of training documents for rule generation at classification time. In the second stage, an attribute-based weight assignment method increases the discriminating power of the terms that appear in the e-mail title. In the third and last stage, per-category classification accuracy is increased by using the dynamic threshold of the Naive Bayesian algorithm. We also implemented an e-mail recommendation system using the three-step preprocessing algorithm, which enables direct and optimal classification by recommending the applicable category to the user when a mail arrives.