• Title/Summary/Keyword: engineering document

Deep Learning Document Analysis System Based on Keyword Frequency and Section Centrality Analysis

  • Lee, Jongwon;Wu, Guanchen;Jung, Hoekyung
    • Journal of information and communication convergence engineering / v.19 no.1 / pp.48-53 / 2021
  • Herein, we propose a document analysis system that analyzes papers or reports converted into XML (Extensible Markup Language) format. It reads the document specified by the user, extracts keywords from the document, and compares keyword frequencies to select the top three keywords. It keeps the paragraphs containing those keywords in their original order and removes duplicated paragraphs. The frequency of the top three keywords in the extracted paragraphs is then re-verified, and the paragraphs are partitioned into 10 sections. Subsequently, the importance of each section is calculated and compared. The system notifies the user of the sections with the highest frequency and the sections whose importance exceeds the average frequency, so that the user can read only the main content instead of the whole document. In addition, a deep learning model is used to predict the number of extracted paragraphs and the number of paragraphs in a section of high importance.
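
    The pipeline in this abstract (top-three keywords, ordered and deduplicated paragraphs, 10 sections, above-average sections) can be sketched roughly as follows. This is only an illustration under assumptions: the stop-word list, the tokenizer, and all function names are hypothetical, importance is approximated by raw keyword counts, and the deep learning prediction step is not covered.

```python
import re
from collections import Counter

# Hypothetical stop-word list; the paper's keyword extractor is not described here.
STOP_WORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "for", "on"}


def tokenize(text):
    """Lowercase the text and keep alphabetic words that are not stop words."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOP_WORDS]


def top_keywords(paragraphs, k=3):
    """Return the k most frequent words across all paragraphs."""
    counts = Counter(w for p in paragraphs for w in tokenize(p))
    return [w for w, _ in counts.most_common(k)]


def keyword_paragraphs(paragraphs, keywords):
    """Keep paragraphs containing a keyword, in original order, without duplicates."""
    seen, kept = set(), []
    for p in paragraphs:
        if p not in seen and any(w in keywords for w in tokenize(p)):
            seen.add(p)
            kept.append(p)
    return kept


def section_scores(paragraphs, keywords, n_sections=10):
    """Split paragraphs into contiguous sections and count keyword hits per section."""
    n_sections = max(1, min(n_sections, len(paragraphs)))
    scores = [0] * n_sections
    for i, p in enumerate(paragraphs):
        section = i * n_sections // len(paragraphs)
        scores[section] += sum(1 for w in tokenize(p) if w in keywords)
    return scores


def important_sections(scores):
    """Return indices of sections whose score is above the average score."""
    average = sum(scores) / len(scores)
    return [i for i, s in enumerate(scores) if s > average]
```

    A caller would chain the functions: `top_keywords`, then `keyword_paragraphs`, then `section_scores`, and finally `important_sections` to decide which sections to show the reader.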

Design and Implementation of the Document HTML System for Preserving Content Integrity

  • Hyun Cheon Hwang;Ji Su Park;Jin Gon Shon
    • Journal of Information Processing Systems / v.19 no.3 / pp.334-346 / 2023
  • PDF-based electronic documents have been widely used by enterprises to deliver personalized content to customers. However, PDF documents with a paper-style layout are not suitable for mobile environments because of their low readability and lack of interactivity. Even though HTML is an essential language in a mobile environment, PDF-based electronic documents are still used because they support content integrity verification with a digital signature. In other words, users sacrifice user experience in a mobile environment for content integrity by using paper-layout electronic documents. In this research, we design the Document HTML specification by defining Document HTML conformance, adding extended meta tags, and signing the message digest with a digital signature based on public key infrastructure (PKI). Furthermore, we implemented the Document HTML system, which provides REST API services to generate and verify Document HTML, and verified the approach experimentally. As a result, we confirmed that Document HTML offers both content integrity and a good user experience on mobile. Document HTML is therefore expected to become an alternative document format for delivering personalized content from an enterprise to a customer in a mobile environment, in place of paper-layout electronic documents such as PDF.
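
    The digest-in-meta-tag idea can be illustrated with a small sketch. It rests on several assumptions: the meta tag name `document-digest`, the choice of SHA-256, and the function names are hypothetical and not taken from the Document HTML specification. The PKI signing step is deliberately left out because it needs a third-party cryptography library; without a signature over the digest, this check only detects accidental changes, since anyone who alters the body can also recompute the digest.

```python
import hashlib
import hmac
import re

# Hypothetical meta tag name; the actual Document HTML meta tags are not given here.
META_NAME = "document-digest"


def digest_of(body_html):
    """Return the SHA-256 hex digest of the body markup."""
    return hashlib.sha256(body_html.encode("utf-8")).hexdigest()


def build_document(body_html):
    """Wrap the body in an HTML page that carries its digest in a meta tag."""
    return (
        "<html><head>"
        f'<meta name="{META_NAME}" content="{digest_of(body_html)}">'
        "</head><body>" + body_html + "</body></html>"
    )


def verify_document(document_html):
    """Recompute the body digest and compare it with the embedded one."""
    meta = re.search(
        rf'<meta name="{META_NAME}" content="([0-9a-f]{{64}})">', document_html
    )
    body = re.search(r"<body>(.*)</body>", document_html, re.DOTALL)
    if not meta or not body:
        return False
    # Constant-time comparison of the stored and recomputed digests.
    return hmac.compare_digest(meta.group(1), digest_of(body.group(1)))
```

    In a signed variant, the digest would additionally be signed with the issuer's private key and the verifier would check that signature against the issuer's certificate before trusting the digest.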

A Methodology for Automatic Hierarchy Definition of Sentences in Engineering Documents (엔지니어링 문서의 문장 자동 계층정의 방법론)

  • Park, Sang-Il;Kim, Bong-Geun;Kim, Kyeong-Hwan;Lee, Sang-Ho
    • Journal of the Computational Structural Engineering Institute of Korea / v.22 no.4 / pp.323-330 / 2009
  • This paper proposes a methodology for automatic hierarchy classification of subtitles in an engineering document, based on the fact that the heading symbols of subtitles represent the hierarchical structure of the document. The proposed methodology is composed of two methods: extracting subtitles from a plain-text document and determining the hierarchical structure of the subtitles. The subtitles in a document are extracted by comparing heading symbol patterns with predefined heading symbol groups, and the depth levels of the subtitles are determined by analyzing the relative location of subtitles according to changes in the heading symbol patterns. A prototype module, which can transform a plain-text document into a structured XML document in accordance with the hierarchical structure of its subtitles, is developed based on the proposed methodology, and the performance of the module is analyzed with 20 engineering documents.
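
    One plausible reading of the two steps (match heading symbols against predefined groups, then derive depth from how the symbol pattern changes) is sketched below. The three heading groups and the stack-based depth rule are assumptions chosen for illustration; the paper's actual symbol groups and rules are not given in this abstract.

```python
import re

# Hypothetical heading symbol groups, e.g. "1. Title", "(1) Title", "a) Title".
HEADING_GROUPS = [
    ("numeric", re.compile(r"^\d+\.\s+\S")),
    ("paren_numeric", re.compile(r"^\(\d+\)\s+\S")),
    ("alpha_paren", re.compile(r"^[a-z]\)\s+\S")),
]


def heading_group(line):
    """Return the name of the heading group the line matches, or None."""
    for name, pattern in HEADING_GROUPS:
        if pattern.match(line):
            return name
    return None


def assign_depths(lines):
    """Return (depth, subtitle) pairs for every line that looks like a subtitle.

    A heading group seen for the first time on the current path opens a deeper
    level; a group already on the path returns to the level where it appeared.
    """
    stack = []  # heading groups on the path from the top level to the current subtitle
    result = []
    for line in lines:
        text = line.strip()
        group = heading_group(text)
        if group is None:
            continue  # body text, not a subtitle
        if group in stack:
            del stack[stack.index(group) + 1:]
        else:
            stack.append(group)
        result.append((len(stack), text))
    return result
```

    The resulting depth values could then drive the nesting of elements when the plain text is written out as XML.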

PDF Publication Solution based on Web (웹을 기반으로 한 PDF 출판 솔류션에 관한 연구)

  • Lee Jae-Deuk
    • Journal of Korean Society of Industrial and Systems Engineering / v.28 no.2 / pp.109-116 / 2005
  • In the previous C/S publishing system, the editor or contributor can arbitrarily modify the document created by the author, in which case it is difficult to identify the changes made to the document. Another shortcoming is that when the document needs to be tracked or edited, the client must have the respective editing system. To solve this problem, the gist of the document must be preserved along with the document itself, and the process of handling the document must be standardized. Publishing on the web ensures a more stable and accurate result in processing documents. The significance of web publishing becomes clear when we consider the importance of information per se and the growing demand for immediate publication in the present day. There is a prominent need for a simple and straightforward Apache-based PDF publishing system that supports HTML and CSS and whose converting engine supports the standard PDF security features. Such a system provides a library with which one can create a PDF directly on Windows, Linux, or Unix without relying on a client, allowing high-speed PDF creation. The development of a web-accessed PDF converting engine forms the basis for e-transactions, online brochures, electronic B/L, and many other industrial applications.

Document Clustering Methods using Hierarchy of Document Contents (문서 내용의 계층화를 이용한 문서 비교 방법)

  • Hwang, Myung-Gwon;Bae, Yong-Geun;Kim, Pan-Koo
    • Journal of the Korea Institute of Information and Communication Engineering / v.10 no.12 / pp.2335-2342 / 2006
  • The current web is accumulating abundant information. In particular, text-based documents are a type that people use very easily and frequently. Numerous studies have therefore been carried out on retrieving text documents with methods based on probability, statistics, vector similarity, Bayesian approaches, and so on. These studies, however, could not consider both the subject and the semantics of documents. To overcome these problems, we propose a document similarity method for semantic retrieval of the documents users want. This is the core method of document clustering. The method first expresses the document content as a semantic hierarchy and gives weight to the important hierarchy domains of the document. With this, we can measure the similarity between documents using both the domain weights and the coincidence of concepts in the domain hierarchies.

Enhancing Document Clustering Using Term Re-weighting Based on Semantic Features (의미특징 기반의 용어 가중치 재산정을 이용한 문서군집의 성능 향상)

  • Park, Sun;Kim, Kyungjun;Kim, Kyung Ho;Lee, Seong Ro
    • Journal of the Korea Institute of Information and Communication Engineering / v.17 no.2 / pp.347-354 / 2013
  • In this paper, we propose a method for enhancing document clustering using term re-weighting based on expanded terms. The proposed method uses semantic features to extract the important terms of the documents in a cluster, which represent the topics of the documents well, and expands those terms using WordNet. In addition, the method improves the performance of document clustering by re-weighting terms based on the expanded terms. The experimental results demonstrate that applying the proposed method to document clustering methods achieves better performance than the normal document clustering methods.

XML document editing system that is creation for structural digital document (구조화된 전자문서 생성을 위한 사용자 중심의 XML 문서편집 시스템)

  • 최일선;이용준;정회경
    • Journal of the Korea Institute of Information and Communication Engineering / v.7 no.3 / pp.513-518 / 2003
  • XML was established by the W3C in February 1998 as a solution for document processing, exchange, and reusability, addressing shortcomings of the early web that arose from its use of unstructured documents. Existing electronic transactions are changing into a form of business-to-business electronic commerce based on XML message exchange. This has raised the need for a solution that can handle the structured, XML-based electronic documents used in electronic transactions between corporations. Accordingly, this paper studies an XML document editing system that integrates an XML Schema editor, for editing the XML Schema documents that define the structure of XML documents, with a user-centered XML instance editor for writing XML documents efficiently.

EDI Document Processing System for Port Logistics (항만 물류처리를 위한 EDI 문서 처리 시스템)

  • Ham, Jong-Wan;Ban, Tae-Hak;Jung, Hoe-Kyung
    • Journal of the Korea Institute of Information and Communication Engineering / v.15 no.5 / pp.1081-1086 / 2011
  • Recently, the number of requests handled through EDI (Electronic Data Interchange) document processing systems for port logistics has increased rapidly. However, existing systems process EDI documents with scripts, and because the scripts are complex to write and document processing is inefficient, these systems have not kept up with the growing processing demand. Therefore, we designed and implemented a system that replaces the script-based approach with a binary processing approach. We also developed the 12 types of EDI documents used in port logistics. As a result, document processing speed is improved twelvefold compared with existing methods, and the system is expected to be used for processing EDI documents in port logistics.

Data Model for Document Exchange of Construction Projects (건설 프로젝트 문서교환을 위한 데이터모델)

  • An Sun-Ju;Son Bo-Sik;Lee Hyun-Soo
    • Proceedings of the Korean Institute Of Construction Engineering and Management / autumn / pp.569-572 / 2003
  • A construction process involves many designers, engineers, contractors, consulting engineers, and government officials. Thus, it is essential to promote collaboration among these participants through effective document exchange. There have been efforts to improve the efficiency of document exchange through the Web, and XML/EDI is recommended as a method for doing so. The purpose of this research was therefore to establish a data model for document information management in document exchange among construction participants using web-based XML/EDI. This research proposed a method of modeling document information for the systematic, centralized management of document information exchanged via XML/EDI, and explained the concept of applying document information stored in a database. This research also classified construction documents according to their information relations.


Document Replacement Policy by Site Popularity in Web Cache (웹 캐시에서 사이트의 인기도에 의한 도큐먼트 교체정책)

  • Yoo, Hang-Suk;Jang, Tea-Mu
    • Journal of Korea Game Society / v.3 no.1 / pp.67-73 / 2003
  • Most web caches temporarily store documents and manage them on a per-document basis. When a requested document exists in the cache at the time of a user's request, the web cache sends that document to the user. When the document is not in the cache, the web cache requests it from the origin server, copies it into the cache, and then returns it to the user. When the cache capacity is exceeded, the web cache uses a replacement policy to replace existing documents with new ones. Typical replacement policies include document-based LRU and LFU techniques, and various other policies are used to replace the documents in the cache effectively. However, these replacement policies consider only the time and frequency of document requests, not the popularity of each web site. This paper presents a document replacement policy that takes into account both the request frequency of documents and the popularity of each web site. The policy is suited to recent network environments: it aims to enhance the cache hit ratio and to manage the cache contents efficiently by replacing intermittently requested documents with new ones.
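
    To make the contrast with plain LRU concrete, here is a minimal sketch of a cache that evicts by site popularity. It is an illustrative policy, not the one defined in the paper: popularity is assumed to be the number of requests seen per host, ties are broken by least recent use, and the class name, method names, and `fetch` callback are hypothetical.

```python
from collections import Counter, OrderedDict
from urllib.parse import urlsplit


class SitePopularityCache:
    """Cache that evicts a document from the least requested site first."""

    def __init__(self, capacity):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self.docs = OrderedDict()   # url -> body, least recently used first
        self.site_hits = Counter()  # host -> number of requests seen so far

    @staticmethod
    def site_of(url):
        """Use the host part of the URL as the site identifier."""
        return urlsplit(url).netloc

    def get(self, url, fetch):
        """Return the document for url, calling fetch(url) on a cache miss."""
        self.site_hits[self.site_of(url)] += 1
        if url in self.docs:
            self.docs.move_to_end(url)  # mark as most recently used
            return self.docs[url]
        body = fetch(url)
        if len(self.docs) >= self.capacity:
            self._evict()
        self.docs[url] = body
        return body

    def _evict(self):
        """Remove the document whose site is least popular (LRU breaks ties)."""
        # min() returns the first minimum, and iteration runs from least to
        # most recently used, so ties fall to the least recently used document.
        victim = min(self.docs, key=lambda u: self.site_hits[self.site_of(u)])
        del self.docs[victim]
```

    With a capacity of 2, a recently fetched document from a rarely requested site is evicted before an older document from a frequently requested site, which is where this policy departs from plain LRU.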
