• Title/Abstract/Keywords: Document Model

A Study on eDocument Management Using Professional Terminologies

  • 김명옥
    • 한국전자거래학회지 / Vol. 7, No. 2 / pp. 21-38 / 2002
  • Document retrieval (DR) has long been a serious issue in the field of Office Information Management. Our daily work now depends heavily on information collected from the internet, and DR methods on the Web have become an important issue, studied by many researchers more than any other topic. The main purpose of this study is to develop a model for managing business documents by integrating three major methodologies used in the fields of electronic libraries and information retrieval: Metadata, Thesaurus, and Index/Reversed Index. In addition, we add a new concept, the eDocument, which consists of metadata about unit documents and/or the unit documents themselves. The eDocument is introduced as a way to utilize existing document sources. The core concepts and structures of the model are introduced, and an architecture for the eDocument management system is proposed. Test (simulation) results for the model and directions for future studies are also discussed.

Document Summarization Model Based on General Context in RNN

  • Kim, Heechan;Lee, Soowon
    • Journal of Information Processing Systems / Vol. 15, No. 6 / pp. 1378-1391 / 2019
  • In recent years, automatic document summarization has been widely studied in the field of natural language processing thanks to the remarkable developments made using deep learning models. When decoding a word, existing models for abstractive summarization usually represent the context of a document using the weighted hidden states of each input word. Because the weights change at each decoding step, they reflect only the local context of a document. It is therefore difficult to generate a summary that reflects the overall context of a document. To solve this problem, we introduce the notion of a general context and propose a summarization model based on it. The general context reflects the overall context of the document and is independent of each decoding step. Experimental results using the CNN/Daily Mail dataset show that the proposed model outperforms existing models.
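
The contrast drawn in this abstract, between a context whose weights change at every decoding step and one that does not, can be illustrated with a small sketch. This is not the authors' model: the dot-product scoring and the use of a plain average as the step-independent context are assumptions made only for illustration, and all function names are hypothetical.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def local_context(hidden_states, decoder_state):
    """Attention-style context: the weights depend on the current decoder state,
    so the result changes from one decoding step to the next."""
    weights = softmax([dot(h, decoder_state) for h in hidden_states])
    dim = len(hidden_states[0])
    return [sum(w * h[i] for w, h in zip(weights, hidden_states)) for i in range(dim)]

def general_context(hidden_states):
    """Step-independent context. ASSUMPTION: a plain average of the hidden states;
    the abstract does not say how the paper computes its general context."""
    n = len(hidden_states)
    dim = len(hidden_states[0])
    return [sum(h[i] for h in hidden_states) / n for i in range(dim)]

# Toy encoder hidden states for three input words (hypothetical values).
hidden = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

ctx_step_a = local_context(hidden, [2.0, 0.0])  # decoder state at one step
ctx_step_b = local_context(hidden, [0.0, 2.0])  # decoder state at another step
ctx_general = general_context(hidden)           # same for every step
```

The two local contexts differ because their weights follow the decoder state, while `ctx_general` is computed once from the encoder side only.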

Document Classification Model Using Web Documents for Balancing Training Corpus Size per Category

  • Park, So-Young;Chang, Juno;Kihl, Taesuk
    • Journal of information and communication convergence engineering / Vol. 11, No. 4 / pp. 268-273 / 2013
  • In this paper, we propose a document classification model using Web documents as a part of the training corpus in order to resolve the imbalance of the training corpus size per category. For the purpose of retrieving the Web documents closely related to each category, the proposed document classification model calculates the matching score between word features and each category, and generates a Web search query by combining the higher-ranked word features and the category title. Then, the proposed document classification model sends each combined query to the open application programming interface of the Web search engine, and receives the snippet results retrieved from the Web search engine. Finally, the proposed document classification model adds these snippet results as Web documents to the training corpus. Experimental results show that the method that considers the balance of the training corpus size per category exhibits better performance in some categories with small training sets.
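
The query-generation step described in this abstract (score word features against each category, then combine the higher-ranked words with the category title) might look roughly like the sketch below. The matching score used here, the share of a word's occurrences that fall in the category, is an assumption, since the abstract does not give the formula; the function names and sample data are hypothetical, and the call to a Web search API is deliberately left out.

```python
from collections import Counter

def matching_scores(docs_by_category):
    """Score each (category, word) pair.
    ASSUMPTION: score = share of the word's occurrences that fall in the category."""
    counts = {
        cat: Counter(w for doc in docs for w in doc.lower().split())
        for cat, docs in docs_by_category.items()
    }
    totals = Counter()
    for cat_counts in counts.values():
        totals.update(cat_counts)
    return {
        cat: {w: n / totals[w] for w, n in cat_counts.items()}
        for cat, cat_counts in counts.items()
    }

def build_query(category, scores, top_k=2):
    """Combine the category title with its top_k highest-scoring words.
    Ties are broken alphabetically so the result is deterministic."""
    ranked = sorted(scores[category].items(), key=lambda kv: (-kv[1], kv[0]))
    top_words = [w for w, _ in ranked[:top_k]]
    return " ".join([category] + top_words)

# Hypothetical, deliberately unbalanced training corpus.
corpus = {
    "sports": ["football match score", "football league"],
    "finance": ["stock market score"],
}
scores = matching_scores(corpus)
sports_query = build_query("sports", scores)    # "sports football league"
finance_query = build_query("finance", scores)  # "finance market stock"
```

In the workflow the abstract describes, each such query would then be sent to a search engine and the returned snippets added to the smaller categories' training data.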

The study on a plan for applying UNeDocs to Maritime Logistics to achieve its paperless logistics

  • 안경림
    • 디지털산업정보학회논문지 / Vol. 5, No. 2 / pp. 199-208 / 2009
  • Most export/import cargo moves by maritime transport. Korea has pursued a system automation project using EDI documents since the mid-1990s. However, this automation covers only about 40-50% of the overall maritime business process, and manual or paper-based document processing persists. The international e-business environment is also shifting from paper-based transactions to electronic document transactions. UN/CEFACT, an international standardization organization, proposed UNeDocs for paperless transactions. UNeDocs is a specification that defines an XML data model as well as electronic forms. With UNeDocs, there is no need to generate duplicate data, and it offers user convenience and flexibility. This paper defines a UNeDocs data model for EDI and off-line processing in current maritime business. The XML syntax and structure of the defined data model are then checked with a document quality check system. The paper also explains a plan for applying the defined UNeDocs data model. By defining a UNeDocs-based standard data model and using it to produce paper, XML, and EDI documents, paperless transactions can be supported.

Query Space Exploration Model Using Genetic Algorithm

  • Lee, Jae-Hoon;Lee, Sung-Joo
    • International Journal of Fuzzy Logic and Intelligent Systems / Vol. 3, No. 2 / pp. 222-226 / 2003
  • Information retrieval must be able to find, within a document set, the documents that best suit the user's need. If document relevance is predicted only from the degree of similarity to the QL (Query Language), documents that the searcher does not require are also retrieved. This paper shows that a genetic algorithm can search the whole document space for the documents best suited to the user's request, and uses knowledge-based operators to solve the problems of various models.

Fine-Grained Mobile Application Clustering Model Using Retrofitted Document Embedding

  • Yoon, Yeo-Chan;Lee, Junwoo;Park, So-Young;Lee, Changki
    • ETRI Journal / Vol. 39, No. 4 / pp. 443-454 / 2017
  • In this paper, we propose a fine-grained mobile application clustering model using retrofitted document embedding. To automatically determine the clusters and their numbers with no predefined categories, the proposed model initializes the clusters based on title keywords and then merges similar clusters. For improved clustering performance, the proposed model distinguishes between an accurate clustering step with titles and an expansive clustering step with descriptions. During the accurate clustering step, an automatically tagged set is constructed as a result. This set is utilized to learn a high-performance document vector. During the expansive clustering step, more applications are then classified using this document vector. Experimental results showed that the purity of the proposed model increased by 0.19, and the entropy decreased by 1.18, compared with the K-means algorithm. In addition, the mean average precision improved by more than 0.09 in a comparison with a support vector machine classifier.

A probabilistic information retrieval model by document ranking using term dependencies

  • 유현조;이정진
    • 응용통계연구 / Vol. 32, No. 5 / pp. 763-782 / 2019
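  • In information retrieval over a collection of text documents, the relevance probability of each document for a given query is computed, and the documents are ranked from highest to lowest probability and presented to the user. The model most often used to compute this relevance probability estimates it under the assumption that words are stochastically independent. This model is widely used because computing the joint probability of words is difficult in practice, but it ignores the fact that the words used in a query are usually related to one another. In this paper, to compute document relevance probabilities while accounting for the dependency structure of word features, we assume a multinomial distribution model for the probabilities of word co-occurrence patterns and propose an information retrieval model that ranks documents by estimating these probabilities with the maximum entropy method. Simulation experiments under various multinomial settings show better estimates than the model that assumes independent variables. A document ranking experiment on the real LETOR OHSUMED data also yields better retrieval results.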
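
A toy example may help show why the independence assumption discussed in this abstract can misestimate the probability of a term co-occurrence pattern. The sketch below compares a product-of-marginals estimate with an estimate over whole patterns. It is only an illustration: plain empirical frequencies are swapped in for the paper's maximum entropy estimation, and the function names and data are hypothetical.

```python
from collections import Counter

def independent_estimate(patterns):
    """Return a function giving P(pattern) as a product of per-term marginals,
    i.e. under the assumption that terms occur independently."""
    n = len(patterns)
    k = len(patterns[0])
    marginals = [sum(p[i] for p in patterns) / n for i in range(k)]

    def prob(pattern):
        result = 1.0
        for bit, m in zip(pattern, marginals):
            result *= m if bit else (1.0 - m)
        return result

    return prob

def joint_estimate(patterns):
    """Return a function giving P(pattern) from the frequency of the whole pattern.
    ASSUMPTION: raw empirical frequencies, not the paper's maximum entropy fit."""
    n = len(patterns)
    counts = Counter(patterns)
    return lambda pattern: counts[pattern] / n

# Each tuple marks presence (1) or absence (0) of two query terms in a document.
# In this hypothetical sample the two terms always appear together or not at all.
observed = [(1, 1), (1, 1), (0, 0), (0, 0)]

p_indep = independent_estimate(observed)((1, 1))  # 0.5 * 0.5 = 0.25
p_joint = joint_estimate(observed)((1, 1))        # 2 / 4 = 0.5
```

With strongly dependent terms, the independence-based estimate is half the pattern's observed frequency, which is the kind of gap a dependency-aware model aims to close.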

Document Clustering with Relational Graph Of Common Phrase and Suffix Tree Document Model

  • 조윤호;이상근
    • 한국콘텐츠학회논문지 / Vol. 9, No. 2 / pp. 142-151 / 2009
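  • The existing document clustering method NSTC measures inter-document similarity using TF-IDF within the clustering process. This paper proposes a new inter-document similarity measure that uses a relational graph of common phrases instead of TF-IDF. The method assigns weights to common phrases through a relational graph that represents the relationships among the common phrases in the document set. Experiments comparing it with NSTC show that the proposed similarity measure is more effective for document clustering.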
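
For reference, the TF-IDF baseline that this abstract contrasts against can be sketched as follows. This is a generic word-level TF-IDF with cosine similarity, not NSTC itself (which works on suffix-tree phrases) and not the proposed relational-graph weighting; the weighting variant, function names, and sample documents are assumptions for illustration.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Build sparse TF-IDF vectors (dicts) using relative term frequency
    and idf = log(N / document frequency)."""
    tokenized = [doc.lower().split() for doc in docs]
    n = len(tokenized)
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append(
            {t: (c / len(tokens)) * math.log(n / df[t]) for t, c in tf.items()}
        )
    return vectors

def cosine(u, v):
    """Cosine similarity between two sparse vectors; 0.0 if either is all zeros."""
    numerator = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return numerator / (norm_u * norm_v)

# Hypothetical mini-collection.
docs = [
    "suffix tree clustering",
    "suffix tree document model",
    "genetic algorithm search",
]
vecs = tfidf_vectors(docs)
sim_related = cosine(vecs[0], vecs[1])    # shares "suffix" and "tree"
sim_unrelated = cosine(vecs[0], vecs[2])  # no shared terms
```

Here similarity comes only from shared individual terms and their corpus-wide rarity, which is the behavior the paper's phrase-graph weighting is meant to replace.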

A Study on Electronic Document Delivery Service in Academic Libraries

  • 이화연
    • 정보관리연구 / Vol. 28, No. 1 / pp. 34-61 / 1997
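  • Based on an analysis of the current state of electronic document delivery in domestic university libraries and in overseas projects, this paper presents a stage-by-stage model of the electronic document delivery that university libraries should carry out. The model consists of a user request stage, an electronic document construction stage, and an electronic document transmission stage. By implementing electronic document delivery stage by stage, university libraries can meet the information needs of users who require up-to-date information.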

Electronic Document Automation System Model for Improving Productivity in maintenance work - in Inspection Process of Construction Equipment Maintenance -

  • 공명달
    • 대한안전경영과학회지 / Vol. 19, No. 3 / pp. 49-58 / 2017
  • This paper proposes a model that improves the interaction and interface between the MES (Manufacturing Execution System) server and the POP (Point of Production) terminal by means of an electronic document server, electronic pen, Bluetooth receiver, and form paper in disassembly and process inspection work. The proposed model shows that the electronic document automation system reduces processing time for maintenance work more efficiently than the current handwritten processing system. The effects of the proposed model are as follows: (a) the processing time per piece of equipment for maintenance fell from 300 minutes with the current method to 50 minutes with the new method; (b) the processing error ratio fell from 20% with the current method to 1% with the new method.