• Title/Abstract/Keywords: Document Model


Workflow based storing model for XML documents (워크플로우 기반의 XML문서의 변경저장 모델)

  • 배혜림
    • The Journal of Society for e-Business Studies
    • /
    • Vol. 9, No. 1
    • /
    • pp.139-154
    • /
    • 2004
  • Recent business environments require a company to communicate frequently with other companies. This makes it essential to use XML (eXtensible Markup Language) documents to integrate information systems. Especially in e-Business environments, change management is important for recording and tracing the history of documents. In this paper, we propose a new method of storing a document that detects changes automatically and reconstructs a document version when a user requests it. In addition, based on this method, we propose a model for recovering a document to a previous state even when system errors occur. A prototype system is implemented on top of a workflow system, and an experiment is carried out to demonstrate the efficiency of our method. Our approach provides efficient use of storage space and convenient management of documents.


Document ranking methods using term dependencies from a thesaurus (시소러스의 연관성 정보를 이용한 문서의 순위 결정 방법)

  • 이준호
    • Journal of the Korean Society for Information Management
    • /
    • Vol. 10, No. 2
    • /
    • pp.3-22
    • /
    • 1993
  • In recent years, various document ranking methods such as Relevance, R-Distance and K-Distance have been developed which can be used in thesaurus-based boolean retrieval systems. They give high quality document rankings in many cases by using term dependence information from a thesaurus. However, they suffer from several problems resulting from inefficient and ineffective evaluation of the boolean operators AND, OR and NOT. In this paper we propose new thesaurus-based document ranking methods called KB-FSM and KB-EBM by exploiting the enhanced fuzzy set model and the extended boolean model. The proposed methods overcome the problems of the previous methods and use term dependencies from a thesaurus effectively. We also show through performance comparison that KB-FSM and KB-EBM provide higher retrieval effectiveness than Relevance, R-Distance and K-Distance.


Improving the Performance of Document Similarity by using GPU Parallelism (GPU 병렬성을 이용한 문서 유사도 계산 성능 개선)

  • Park, Il-Nam;Bae, Byung-Gurl;Im, Eun-Jin;Kang, Seung-Shik
    • The KIPS Transactions:PartB
    • /
    • Vol. 19B, No. 4
    • /
    • pp.243-248
    • /
    • 2012
  • In information retrieval tasks such as vector model implementation and document clustering, document similarity calculation accounts for a large part of the overall system performance. In this paper, GPU parallelism is explored to speed up document similarity calculation in a CUDA framework. The proposed method makes similarity calculation almost 15 times faster than a typical CPU-based framework. It is also 5.2 and 3.4 times faster than methods using CUBLAS and Thrust, respectively.
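
For readers unfamiliar with the computation the abstract above refers to, the following is a minimal CPU-side sketch of pairwise document similarity in a vector model. It is not the authors' CUDA implementation. Cosine similarity over term-frequency vectors, whitespace tokenization, and the names `term_vector` and `cosine_similarity` are all assumptions made for illustration.

```python
import math
from collections import Counter


def term_vector(text):
    """Build a term-frequency vector; whitespace tokenization is an assumption."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    # Only terms present in both vectors contribute to the dot product.
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty document is treated as similar to nothing
    return dot / (norm_a * norm_b)


# Hypothetical toy corpus, for illustration only.
docs = [
    "xml document storage model",
    "xml document editing system",
    "gpu parallel similarity calculation",
]
vectors = [term_vector(d) for d in docs]

# All-pairs similarity matrix: the quadratic step that parallel hardware can speed up.
similarity = [[cosine_similarity(u, v) for v in vectors] for u in vectors]
```

Each cell of the all-pairs matrix is independent of the others, which is why this kind of computation lends itself to parallel execution.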

Biomedical Ontologies and Text Mining for Biomedicine and Healthcare: A Survey

  • Yoo, Ill-Hoi;Song, Min
    • Journal of Computing Science and Engineering
    • /
    • Vol. 2, No. 2
    • /
    • pp.109-136
    • /
    • 2008
  • In this survey paper, we discuss biomedical ontologies and major text mining techniques applied to biomedicine and healthcare. Biomedical ontologies such as UMLS are currently being adopted in text mining approaches because they provide domain knowledge. In addition, biomedical ontologies enable us to resolve many linguistic problems that arise when text mining approaches handle biomedical literature. As the first example of text mining, document clustering is surveyed. Because a document set normally covers multiple topics, text mining approaches use document clustering as a preprocessing step to group similar documents. Additionally, document clustering is able to inform the biomedical literature searches required for the practice of evidence-based medicine. We introduce Swanson's UnDiscovered Public Knowledge (UDPK) model, which generates biomedical hypotheses from biomedical literature such as MEDLINE by discovering novel connections among logically related biomedical concepts. Another important area of text mining is document classification. Document classification is a valuable tool for biomedical tasks that involve large amounts of text. We survey well-known classification techniques in biomedicine. As the last example of text mining in biomedicine and healthcare, we survey information extraction. Information extraction is the process of scanning text for information relevant to some interest, including extracting entities, relations, and events. We also address techniques and issues in evaluating text mining applications in biomedicine and healthcare.

A Design and Implementation of the Tree-based Document Editing System for XML Application (XML 어플리케이션을 위한 트리 기반 문서 편집 시스템의 설계 및 구현)

  • Kim, Young-Chul;Kang, Chun-Kil
    • The KIPS Transactions:PartD
    • /
    • Vol. 11D, No. 4
    • /
    • pp.959-966
    • /
    • 2004
  • This paper describes the design and implementation of a tree-based document editing system for XML applications, usable in a structure-oriented environment. The system converts a DTD to an ASTD (Syntax Tree Definition) to support syntax-directed editing of valid documents, is designed to be extensible so that new tools can be added, and supports a multiple-entry parser for real-time document validation. This work is expected to contribute a development model for document editing systems for XML applications.

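Analysis of the Current State of Document Delivery Services in University Libraries and Measures to Strengthen Them (대학도서관 문헌제공봉사의 현황분석과 강화방안)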

  • 윤희윤
    • Journal of Korean Library and Information Science Society
    • /
    • Vol. 29
    • /
    • pp.27-63
    • /
    • 1998
  • The purpose of this study is to analyze the document delivery service (DDS) of academic libraries in Korea and to suggest a model for its improvement. DDS means providing copies of requested information in any format and from any source. DDS is gaining in importance as libraries turn to 'just-in-time' access rather than 'just-in-case' collection to meet user information needs. Fortunately, rising journal subscription prices, declining financial resources, the cancellation of some journal subscriptions, electronic transmission technologies, and the rise of commercial document delivery services have allowed libraries to begin delivering articles to users in a much more rapid and acceptable time frame. Therefore, the library paradigm for the 2000s must be the creation of new document delivery structures which capitalize on the access tools and structures created by librarians during the past generations. First of all, library-based document service requires a close review of existing library-to-library delivery mechanisms, the application of technology to the transfer of facsimiles of materials, and facilitated use of existing fee-based document sources. The ideal document delivery system would feature a transparent, seamless electronic service incorporating searching and browsing, identification and marking of desired items, and transmission and fulfillment of requests. Requested items would be supplied from library collections, commercial suppliers, or other sources. DDS will succeed in the future when physical resources, policies, personnel, and practices are organized to provide timely information delivery to users.


Design and Implementation of Standard Document Management System (XML을 적용한 표준 문서 관리 시스템의 설계 및 구현)

  • 이준섭;유정연;권석훈;나재열;이규철;구경철;박기식;박치항
    • Journal of the Korean Society for Library and Information Science
    • /
    • Vol. 35, No. 1
    • /
    • pp.77-99
    • /
    • 2001
  • The demand for information exchange is increasing because of rapid advances in science and technology. However, differing system environments have caused many problems in information exchange. XML-based information exchange is a solution to this problem. It is effective in standard document management applications, in which many researchers cooperate to produce standard documents. This paper presents the design and implementation of a system model for efficiently exchanging, storing, searching, and managing XML-based documents in the course of establishing standard documents.


Acceptance of The National Tax Service Electronic Document System of Tax Officials (세무공무원의 국세청전자문서시스템 수용에 관한 연구)

  • Hong, Soon-Bok
    • The Journal of the Korea Contents Association
    • /
    • Vol. 12, No. 11
    • /
    • pp.174-182
    • /
    • 2012
  • The purpose of this study is to help establish a successful National Tax Service electronic document system and to suggest how it can be applied, by investigating the relationships among perceived availability, perceived usefulness, and system acceptance suggested by the Technology Acceptance Model (TAM). The results show that perceived availability and perceived usefulness acted as mediators in the process of accepting a tax electronic document system, and that the information quality of the electronic document system influenced both perceived availability and perceived usefulness. Perceived availability also influenced perceived usefulness. This suggests that the more comfortable tax officials feel using a tax electronic document system, the more useful they perceive it to be.

Word-Level Embedding to Improve Performance of Representative Spatio-temporal Document Classification

  • Byoungwook Kim;Hong-Jun Jang
    • Journal of Information Processing Systems
    • /
    • Vol. 19, No. 6
    • /
    • pp.830-841
    • /
    • 2023
  • Tokenization is the process of segmenting input text into smaller units, and it is a preprocessing task performed mainly to improve the efficiency of the machine learning process. Various tokenization methods have been proposed for natural language processing, but studies have primarily focused on segmenting text efficiently. Few studies have explored which tokenization methods are suitable for document classification tasks in Korean. In this paper, an exploratory study was performed to find the most suitable tokenization method for improving the performance of a representative spatio-temporal document classifier in Korean. For the experiment, a convolutional neural network model was used, and for the final performance comparison, document classification tasks were selected in which performance depends largely on the tokenization method. The commonly used Jamo, Character, and Word units were adopted as the tokenization methods for the comparative experiments. The experiment confirmed that word-unit tokenization showed excellent performance on the representative spatio-temporal document classification task, where the semantic embedding ability of the token itself is important.
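
To make the three units named in the abstract above concrete, here is a minimal sketch of Word, Character, and Jamo tokenization for Korean text. It is not the authors' preprocessing code. The function names are hypothetical, word tokens are assumed to be whitespace-separated, and Jamo tokens are produced with the standard Unicode arithmetic for precomposed Hangul syllables, emitted as compatibility jamo.

```python
# Jamo tables in Unicode decomposition order (compatibility jamo characters).
CHOSEONG = "ㄱㄲㄴㄷㄸㄹㅁㅂㅃㅅㅆㅇㅈㅉㅊㅋㅌㅍㅎ"        # 19 initial consonants
JUNGSEONG = "ㅏㅐㅑㅒㅓㅔㅕㅖㅗㅘㅙㅚㅛㅜㅝㅞㅟㅠㅡㅢㅣ"    # 21 vowels
JONGSEONG = [""] + list("ㄱㄲㄳㄴㄵㄶㄷㄹㄺㄻㄼㄽㄾㄿㅀㅁㅂㅄㅅㅆㅇㅈㅊㅋㅌㅍㅎ")  # 28, first = none

HANGUL_BASE = 0xAC00   # first precomposed syllable
HANGUL_LAST = 0xD7A3   # last precomposed syllable


def word_tokens(text):
    """Word units: split on whitespace (an assumption for this sketch)."""
    return text.split()


def char_tokens(text):
    """Character units: one token per non-whitespace character."""
    return [c for c in text if not c.isspace()]


def jamo_tokens(text):
    """Jamo units: decompose each Hangul syllable into its component letters."""
    tokens = []
    for c in char_tokens(text):
        code = ord(c)
        if HANGUL_BASE <= code <= HANGUL_LAST:
            index = code - HANGUL_BASE
            tokens.append(CHOSEONG[index // 588])          # 588 = 21 * 28
            tokens.append(JUNGSEONG[(index % 588) // 28])
            final = JONGSEONG[index % 28]
            if final:                                      # skip the empty final
                tokens.append(final)
        else:
            tokens.append(c)  # non-Hangul characters pass through unchanged
    return tokens


sample = "한글 문서"
words = word_tokens(sample)
chars = char_tokens(sample)
jamos = jamo_tokens(sample)
```

The same input yields progressively longer and less meaningful token sequences as the unit shrinks from word to character to jamo, which is the trade-off the abstract evaluates.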

Methodology for Search Intent-based Document Recommendation

  • Lee, Donghoon;Kim, Namgyu
    • Journal of the Korea Society of Computer and Information
    • /
    • Vol. 26, No. 6
    • /
    • pp.115-127
    • /
    • 2021
  • It is not easy for a user to find, in one attempt, the documents they actually want among a vast number of search results. For this reason, various methods have been proposed that recommend documents by taking the user's preferences into consideration based on the user's document browsing history. However, recommendation based on document browsing history is limited in that it uses only the information the user has viewed and does not fully use the user's intent in searching for the document. Therefore, we propose a document recommendation method based on the user's search intent, which uses information on "why" the user reads the document instead of information on "who" reads the document. To confirm the feasibility of the proposed methodology, an experiment was conducted by analyzing 239,438 actual user search histories from one of the most popular e-commerce platform companies in Korea. As a result, our methodology showed superior performance compared to existing content-based or simple browsing history-based recommendation models.