• Title/Summary/Keyword: engineering document

A Design of Book Retrieval System for Electronic Commerce in based Web (웹 기반의 전자상거래를 위한 도서검색 시스템 설계)

  • Ha, Chu-Ja;Jeong, Jong-Geun;Park, Jong-Hun;Kim, Chul-Won
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / v.9 no.1 / pp.659-662 / 2005
  • XML is the standard for web documents and is used as a language for document data exchange. As more existing documents are converted to XML and more new documents are authored in XML, a system that can search XML documents efficiently is required. This paper describes the design and implementation of a query processing system that translates XML elements and data between XML documents and a relational database. The system consists of an XML-to-DB processor, a DB-to-XML processor, and an XML document management processor. On this basis, the paper describes the design and implementation of an efficient Java-based XML document search system using XQL, which has been proposed as a query language for XML documents.
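
The XML-to-DB and DB-to-XML translation described in this abstract can be illustrated with a minimal sketch. It is not the paper's Java/XQL system: the `book` table, the element names, the sample data, and the functions `xml_to_db` and `db_to_xml` are all hypothetical, and Python's standard `sqlite3` and `xml.etree.ElementTree` modules stand in for the relational database and the XML processor.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical sample document; element and attribute names are assumptions.
SAMPLE_XML = """
<books>
  <book id="1"><title>XML Basics</title><author>Kim</author></book>
  <book id="2"><title>Database Design</title><author>Park</author></book>
</books>
"""

def xml_to_db(xml_text, conn):
    """Shred each <book> element into one row of a relational table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS book "
        "(id INTEGER PRIMARY KEY, title TEXT, author TEXT)"
    )
    root = ET.fromstring(xml_text)
    for book in root.findall("book"):
        conn.execute(
            "INSERT INTO book (id, title, author) VALUES (?, ?, ?)",
            (int(book.get("id")), book.findtext("title"), book.findtext("author")),
        )
    conn.commit()

def db_to_xml(conn, title_keyword):
    """Rebuild an XML result document from rows whose title contains a keyword."""
    root = ET.Element("books")
    rows = conn.execute(
        "SELECT id, title, author FROM book WHERE title LIKE ? ORDER BY id",
        (f"%{title_keyword}%",),
    )
    for book_id, title, author in rows:
        book = ET.SubElement(root, "book", id=str(book_id))
        ET.SubElement(book, "title").text = title
        ET.SubElement(book, "author").text = author
    return ET.tostring(root, encoding="unicode")

conn = sqlite3.connect(":memory:")
xml_to_db(SAMPLE_XML, conn)
result_xml = db_to_xml(conn, "XML")
print(result_xml)
```

Storing elements as rows lets the search itself run as an ordinary SQL query, while the result is still returned to the caller as XML.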

Integration between XML-based Document Information and Bridge Information Model-based Structural Design Information (교량정보모델 기반의 설계정보와 XML 기반의 문서정보 통합)

  • Jeong Yeon-Suk;Kim Bong-Geun;Jeong Won-Seok;Lee Sang-Ho
    • Proceedings of the Computational Structural Engineering Institute Conference / 2006.04a / pp.208-215 / 2006
  • This study provides a new operation strategy that can guarantee the consistency of engineering information among various intelligent information systems. We present strategies for operating bridge engineering information and a methodology for constructing an integrated database. Two core standard techniques are adopted to construct the integrated database: the Standard for the Exchange of Product Model Data (STEP) for CAD/CAE information, and the Extensible Markup Language (XML) for engineering document information. The approach transforms a document into a data type for web-based application modules that assist end users in searching and retrieving engineering document data. In addition, a relaying algorithm is developed to integrate the two different kinds of information, i.e., CAD/CAE information and engineering document information. Pilot application modules for the management and maintenance of existing bridges are also developed to demonstrate the strategy.

Development of Similarity-Based Document Clustering System (유사성 계수에 의한 문서 클러스터링 시스템 개발)

  • Woo Hoon-Shik;Yim Dong-Soon
    • Proceedings of the Society of Korea Industrial and System Engineering Conference / 2002.05a / pp.119-124 / 2002
  • Clustering of data is of great interest in many data mining applications. In the field of document clustering, a document is represented as a data point in a high-dimensional space. Document clustering can therefore be accomplished with general data clustering techniques. In this paper, we introduce a document clustering system based on similarity among documents. The developed system has three functions: 1) gathering documents using a search agent; 2) determining similarity coefficients between any two documents from term frequencies; 3) clustering documents using the similarity coefficients. In particular, the document clustering is accomplished by a hybrid algorithm that combines genetic and K-Means methods.
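
Step 2 of this abstract, computing similarity coefficients from term frequencies, can be sketched as follows. The abstract does not say which coefficient the authors use, so cosine similarity is an assumption here, as are the whitespace tokenizer, the sample documents, and the function names. The hybrid genetic/K-Means clustering step is not shown.

```python
import math
from collections import Counter

def term_frequencies(text):
    """Count how often each lowercase, whitespace-separated term occurs."""
    return Counter(text.lower().split())

def cosine_similarity(tf_a, tf_b):
    """Cosine of the angle between two term-frequency vectors (assumed coefficient)."""
    shared_terms = tf_a.keys() & tf_b.keys()
    dot = sum(tf_a[t] * tf_b[t] for t in shared_terms)
    norm_a = math.sqrt(sum(c * c for c in tf_a.values()))
    norm_b = math.sqrt(sum(c * c for c in tf_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty document is treated as similar to nothing
    return dot / (norm_a * norm_b)

# Hypothetical documents for illustration only.
docs = [
    "bridge design document",
    "bridge design calculation",
    "genetic algorithm clustering",
]
tfs = [term_frequencies(d) for d in docs]

# Pairwise similarity matrix; entry [i][j] compares document i with document j.
similarity = [[cosine_similarity(a, b) for b in tfs] for a in tfs]
for row in similarity:
    print([round(value, 3) for value in row])
```

A matrix like this is what a clustering step would consume: documents 0 and 1 share two of three terms and score well above document 2, which shares none.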

The Engineering Change Document Management using SGML in PDM (SGML을 활용한 PDM에서의 설계변경문서관리)

  • Kim, Joon-Oh;Kim, Sunn-Ho
    • IE interfaces / v.10 no.2 / pp.79-90 / 1997
  • Documents in traditional PDM (Product Data Management) systems have been managed as scanned document files or as electronic documents created with specific tools. Although each tool manages documents with its own systematic methods, this approach has drawbacks in data search, data integration, and data interchange. For this reason, we propose an efficient document management system for PDM using SGML (Standard Generalized Markup Language), one of the CALS and ISO standards for document interchange. Among the documents to be managed in PDM, the engineering change notification (ECN) is considered. A DTD (Document Type Definition) has been constructed based on a logical analysis of the document format. In addition, based on the DTD, DB classes have been designed using an object-oriented paradigm, and a prototype for document input/output and search has been developed using UniSQL ORDBMS (Object-Relational DBMS) and PowerBuilder in a client/server environment.

A Method for Automatic Check of Omitted Design Item in Structural Calculation Document of Steel Box Bridges (강박스 교량을 대상으로 한 구조계산서의 누락된 설계항목 검토 자동화 방법론)

  • Park, Sang-Il;An, Hyun-Jung;Kim, Bong-Geun;Lee, Sang-Ho
    • Proceedings of the Computational Structural Engineering Institute Conference / 2007.04a / pp.813-818 / 2007
  • A method for automatically checking for omitted design items in structural calculation documents of steel box bridges is proposed. Information processing for the proposed method is divided into two steps: automatic generation of the document structure in XML Schema Definition (XSD) format, and extraction of omitted design items using an XML Schema matching technique. An automatic omitted-element filter is developed on the basis of the proposed method, and the accuracy of the developed module is examined through a case study on existing structural calculation document samples.
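
The idea of finding omitted items by matching a document's structure against a reference structure can be sketched as a set difference over element paths. This is a simplified stand-in, not the authors' XSD generation or schema matching technique: the reference and sample documents, the element names, and the function `element_paths` are hypothetical.

```python
import xml.etree.ElementTree as ET

def element_paths(root):
    """Collect the slash-separated path of every element in the tree."""
    paths = set()

    def walk(node, prefix):
        path = f"{prefix}/{node.tag}"
        paths.add(path)
        for child in node:
            walk(child, path)

    walk(root, "")
    return paths

# Hypothetical reference structure listing every expected design item.
REFERENCE = (
    "<calculation>"
    "<loads><dead/><live/></loads>"
    "<sections><girder/><stiffener/></sections>"
    "</calculation>"
)

# Hypothetical document under review; the live-load item is missing.
DOCUMENT = (
    "<calculation>"
    "<loads><dead/></loads>"
    "<sections><girder/><stiffener/></sections>"
    "</calculation>"
)

required = element_paths(ET.fromstring(REFERENCE))
present = element_paths(ET.fromstring(DOCUMENT))

# Items that the reference structure expects but the document does not contain.
omitted = sorted(required - present)
print(omitted)
```

Comparing full paths rather than bare tag names keeps two items with the same name under different parents distinct.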

Document Layout Analysis Based on Fuzzy Energy Matrix

  • Oh, KangHan;Kim, SooHyung
    • International Journal of Contents / v.11 no.2 / pp.1-8 / 2015
  • In this paper, we describe a novel method for document layout analysis that is based on a Fuzzy Energy Matrix (FEM). A FEM is a two-dimensional matrix that contains the likelihood of text and non-text and is generated through the use of Fuzzy theory. The key idea is to define an Energy map for the document to categorize text and non-text. The proposed mechanism is designed for execution with a low-resolution document image, and hence our method has a fast processing speed. The proposed method has been tested on public ICDAR 2009 datasets to conduct a comparison against other state-of-the-art methods, and it was also tested with Korean documents. The results of the experiment indicate that this scheme achieves superior segmentation accuracy, in terms of both precision and recall, and also requires less time for computation than other state-of-the-art document image analysis methods.

Document Classification Model Using Web Documents for Balancing Training Corpus Size per Category

  • Park, So-Young;Chang, Juno;Kihl, Taesuk
    • Journal of information and communication convergence engineering / v.11 no.4 / pp.268-273 / 2013
  • In this paper, we propose a document classification model using Web documents as a part of the training corpus in order to resolve the imbalance of the training corpus size per category. For the purpose of retrieving the Web documents closely related to each category, the proposed document classification model calculates the matching score between word features and each category, and generates a Web search query by combining the higher-ranked word features and the category title. Then, the proposed document classification model sends each combined query to the open application programming interface of the Web search engine, and receives the snippet results retrieved from the Web search engine. Finally, the proposed document classification model adds these snippet results as Web documents to the training corpus. Experimental results show that the method that considers the balance of the training corpus size per category exhibits better performance in some categories with small training sets.

Stroke Width-Based Contrast Feature for Document Image Binarization

  • Van, Le Thi Khue;Lee, Gueesang
    • Journal of Information Processing Systems / v.10 no.1 / pp.55-68 / 2014
  • Automatic segmentation of foreground text from the background in degraded document images is essential for smooth reading of the document content and for machine recognition tasks. In this paper, we present a novel approach to the binarization of degraded document images. The proposed method uses a new local contrast feature extracted based on the stroke width of text. First, pre-processing is carried out for noise removal. Text boundary detection is then performed on the image constructed from the contrast feature. Local estimation follows to extract text from the background. Finally, a refinement procedure is applied to the binarized image as a post-processing step to improve the quality of the final results. Experiments on extracting text from degraded handwritten and machine-printed document images, with comparisons against some well-known binarization algorithms, demonstrate the effectiveness of the proposed method.

Joint Hierarchical Semantic Clipping and Sentence Extraction for Document Summarization

  • Yan, Wanying;Guo, Junjun
    • Journal of Information Processing Systems / v.16 no.4 / pp.820-831 / 2020
  • Extractive document summarization aims to select a few sentences from a given document while preserving its main information, but current extractive methods do not consider the problem of repeated sentence information, especially in news document summarization. Given both the importance and the redundancy of news text information, we propose a neural extractive summarization approach with joint sentence semantic clipping and selection, which can effectively address sentence repetition in news text summaries. Specifically, a hierarchical selective encoding network is constructed for both sentence-level and document-level representations, and data containing important information is extracted from the news text; a sentence extractor strategy is then adopted for joint scoring and redundant information clipping. In this way, our model strikes a balance between important information extraction and redundant information filtering. Experimental results on both the CNN/Daily Mail dataset and a Court Public Opinion News dataset that we built show the effectiveness of the proposed approach in terms of ROUGE metrics, especially for redundant information filtering.

Document Summarization using Semantic Feature and Hadoop (하둡과 의미특징을 이용한 문서요약)

  • Kim, Chul-Won
    • Journal of the Korea Institute of Information and Communication Engineering / v.18 no.9 / pp.2155-2160 / 2014
  • In this paper, we propose a new document summarization method that uses semantic features extracted by Hadoop-based distributed parallel processing. The proposed method can represent the inherent structure of documents well by using semantic features obtained through non-negative matrix factorization (NMF). In addition, it can summarize big data documents using Hadoop. The experimental results demonstrate that the proposed method can summarize big data documents that a single computer cannot.