• Title/Summary/Keyword: Related document

620 search results

Investigation on Uncertainty in Construction Bid Documents

  • Shrestha, Rabin;Lee, JeeHee
    • International conference on construction engineering and project management
    • /
    • 2022.06a
    • /
    • pp.67-73
    • /
    • 2022
  • Construction bid documents contain various errors or discrepancies that give rise to uncertainty. If the errors, discrepancies, and ambiguities in a bid document are not identified and clarified before the bid, they may cause disputes and conflicts between the contracting parties. Given that the bid document is a primary resource for estimating construction costs, inaccurate information in it can result in over- or under-estimation. Thus, any questions from bidders related to errors in the bid document should be clarified by employers before bid submission. This study examines pre-bid queries, i.e., pre-bid requests for information (RFIs), from state DOTs in the United States to investigate the error types most frequently encountered in bid documents. Approximately 200 pre-bid RFIs were collected from state DOTs and classified into several error types (e.g., coordination errors, errors in drawings). The analysis showed that errors in the bill of quantities are the most frequent errors in bid documents, followed by errors in drawings. The findings identify the uncertainty types in construction bid documents that should be checked during the bid process and, in a broader sense, contribute to the construction management body of knowledge by clarifying and classifying bid risk factors at an early stage of construction projects.


Automated networked knowledge map using keyword-based document networks (키워드 기반 문서 네트워크를 이용한 네트워크형 지식지도 자동 구성)

  • Yoo, Keedong
    • Knowledge Management Research
    • /
    • v.19 no.3
    • /
    • pp.47-61
    • /
    • 2018
  • A knowledge map, a taxonomy of knowledge repositories, must support and enhance the knowledge user's activity of searching for and selecting the proper knowledge for problem-solving. Conventional knowledge maps, however, are hierarchically categorized and cannot support such activity, which must coincide with the user's cognitive process of knowledge utilization. This paper therefore aims to verify and develop a methodology for building a networked knowledge map that supports the user's activity of searching for and retrieving proper knowledge through referential navigation between content-relevant knowledge items. The paper uses keywords as the semantic information between knowledge items, because keywords can represent the overall content of a given document and can serve as semantic information on the link between related documents. By aggregating the links between documents, a document network can be formulated, and a keyword-based networked knowledge map can finally be built. A domain-expert validation test conducted on a networked knowledge map of 50 research papers confirmed the outstanding precision and recall of the proposed methodology.
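
The linking step the abstract describes — documents become nodes, and shared keywords become labeled links — can be sketched in a few lines. This is an illustration of the general idea, not the paper's implementation; the document IDs and keywords below are invented for the example.

```python
from itertools import combinations

# Hypothetical toy corpus: document id -> extracted keywords.
docs = {
    "D1": {"knowledge map", "taxonomy", "retrieval"},
    "D2": {"knowledge map", "network", "navigation"},
    "D3": {"clustering", "retrieval"},
}

def build_document_network(docs):
    """Link every pair of documents that share at least one keyword;
    the shared keywords serve as the semantic label of the link."""
    network = {doc_id: {} for doc_id in docs}
    for a, b in combinations(docs, 2):
        shared = docs[a] & docs[b]
        if shared:
            network[a][b] = shared
            network[b][a] = shared
    return network

net = build_document_network(docs)
# D1 links to D2 via "knowledge map" and to D3 via "retrieval";
# D2 and D3 share no keyword, so no link is created between them.
```

Referential navigation then amounts to following a node's labeled links to content-relevant neighbors.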

A Syntax-Directed XML Document Editor using Abstract Syntax Tree (추상구문트리를 이용한 구문지향 XML 문서 편집기)

  • Kim Young-Chul;You Do Kyu
    • Journal of Internet Computing and Services
    • /
    • v.6 no.2
    • /
    • pp.117-126
    • /
    • 2005
  • Current text-based XML document systems edit plain text and do not perform syntax checking. As a result, whether an edited XML document is well-formed or valid cannot be determined until it is parsed. This paper describes the design and implementation of a syntax-directed editing system for XML documents. Because the system is tree-based, XML documents are easy to extend, and the system is designed to validate XML documents in real time. This work is expected to contribute to the development of XML-related applications.

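
The well-formedness check that a plain-text workflow defers until parse time can be illustrated with the standard library; a tree-based editor can run such a check on every change instead. This is only a minimal sketch of the check, not the paper's editor.

```python
import xml.etree.ElementTree as ET

def is_well_formed(xml_text):
    """Return True if the XML text parses, i.e. is well-formed."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False

# A text editor only discovers the mismatched </a> below at parse time;
# a syntax-directed editor would never let the tree reach this state.
ok = is_well_formed("<doc><item/></doc>")      # True
bad = is_well_formed("<doc><item></doc>")      # False: unclosed <item>
```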

The Study on the Application of Electronic Document Standards in Construction CALS/EC (건설CALS/EC에서의 전자문서 표준 적용방안 연구)

  • Jung, Sung-Yoon;Choi, Won-Sik;Ok, Hyun;Kim, Sung-Jin
    • Proceedings of the CALSEC Conference
    • /
    • 2003.09a
    • /
    • pp.90-96
    • /
    • 2003
  • The purpose of this study is to develop standards for construction documents. The standards will facilitate the efficient exchange and sharing of construction documents among participants such as owners, designers, constructors, and other related parties throughout the construction work process. To this end, this study defines the requirements for establishing an XML-based electronic document standardization system and prepares a plan for developing XML electronic documents in a consistent way.


Document Clustering Using LSI in Information Retrieval (LSI를 이용한 문서 클러스터링)

  • 고지현;최영란;유준현;박순철
    • Proceedings of the Korea Society for Industrial Systems Conference
    • /
    • 2002.06a
    • /
    • pp.330-335
    • /
    • 2002
  • The most critical issue in an information retrieval system is returning results that adequately match user requests. When all documents related to a user's query are retrieved, it is difficult for the user to find the exact document wanted. Therefore, clustering methods that group related documents have been widely used. In this paper, we cluster documents on the basis of meaning rather than the index terms of the existing documents, applying an LSI (Latent Semantic Indexing) method for this purpose. Furthermore, we identify and analyze the differences from document clustering with the widely used K-Means algorithm.

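
The core of LSI is a truncated SVD of the term-document matrix, which projects documents into a low-rank "semantic" space where similarity reflects shared latent topics rather than exact index terms. The toy matrix below is invented for illustration; the paper's corpus and parameters are not shown here.

```python
import numpy as np

# Toy term-document matrix: rows = terms, columns = documents.
# Documents 0 and 1 share vocabulary; document 2 uses different terms.
terms = ["car", "engine", "wheel", "poem", "verse"]
A = np.array([
    [2, 1, 0],   # car
    [1, 2, 0],   # engine
    [1, 1, 0],   # wheel
    [0, 0, 2],   # poem
    [0, 0, 1],   # verse
], dtype=float)

# LSI: keep only the top-k singular triplets of A.
U, s, Vt = np.linalg.svd(A, full_matrices=False)
k = 2                                     # latent dimensions to keep
doc_vecs = (np.diag(s[:k]) @ Vt[:k]).T    # one k-dim vector per document

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

# In the latent space, documents 0 and 1 are far more similar to each
# other than either is to document 2, so a clusterer (e.g. K-Means)
# applied to doc_vecs would group them together.
```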

Related Documents Classification System by Similarity between Documents (문서 유사도를 통한 관련 문서 분류 시스템 연구)

  • Jeong, Jisoo;Jee, Minkyu;Go, Myunghyun;Kim, Hakdong;Lim, Heonyeong;Lee, Yurim;Kim, Wonil
    • Journal of Broadcast Engineering
    • /
    • v.24 no.1
    • /
    • pp.77-86
    • /
    • 2019
  • This paper proposes using machine-learning technology to analyze and classify collected documents. Data are collected based on keywords associated with a specific domain, and non-conceptual elements such as special characters are removed. Each word in the collected documents is then tagged with its part of speech (nouns, verbs, etc.) using a Korean-language morphological analyzer. The documents are embedded using a Doc2Vec model, which converts documents into vectors. The similarity between documents is measured through the embedded model, and document classifiers are trained using machine-learning algorithms. Among the classification models compared, the support vector machine performed best, with an F1-score of 0.83.
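
The classify-by-similarity idea can be sketched without the full pipeline. As a stand-in for Doc2Vec embeddings, the example below uses plain bag-of-words vectors and assigns each new document the label of its most similar training document; the texts and labels are invented for illustration.

```python
import math
from collections import Counter

# Labeled toy corpus (hypothetical; stands in for an embedded corpus).
train = [
    ("economy market stock price", "finance"),
    ("stock trade market investor", "finance"),
    ("game player team score", "sports"),
    ("team match player goal", "sports"),
]

def bow(text):
    """Bag-of-words vector: a stand-in for a learned document embedding."""
    return Counter(text.split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def classify(text):
    """Assign the label of the most similar training document."""
    vec = bow(text)
    return max(train, key=lambda pair: cosine(vec, bow(pair[0])))[1]
```

In the paper's setting, the bag-of-words step would be replaced by the trained Doc2Vec embedding and the nearest-neighbor rule by a trained classifier such as an SVM.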

Extension of legacy gear design systems using XML and XSLT (XML과 XSLT를 이용한 레거시 기어 설계 시스템의 확장에 관한 연구)

  • 정태형;박승현
    • Proceedings of the Korean Society of Machine Tool Engineers Conference
    • /
    • 2001.10a
    • /
    • pp.257-262
    • /
    • 2001
  • As computer-related technologies have developed, legacy design systems have become unsuited to the new computing environment. Most of them therefore need to be either modified or newly developed, which requires considerable cost and time. This paper presents a method for extending a legacy design system without modification, using XML and XSLT. To apply the developed method, a representative legacy design system, the AGMA gear rating system, has been extended to suit the distributed computing environment. An XML document for the AGMA gear rating process is defined. It is transformed into the input document format of the AGMA gear rating system by an XSLT processor, according to the transformation rules defined in the AGMA gear rating XSLT document. The AGMA gear rating system then reads this input document and generates an output document on the server. These operations are executed automatically by the external legacy system controller without user interaction. Using these operations, an AGMA gear rating web service has been developed based on SOAP and WSDL to provide the functions of the legacy AGMA gear rating system over a distributed network. Any system or user can perform the AGMA gear rating process via the Internet, independently of platform type and without implementing it themselves, by simply referencing the AGMA gear rating web service.

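
The pivotal step in the pipeline above is transforming a structured XML document into the flat input format a legacy batch program expects. The sketch below illustrates that step with the standard library in place of an XSLT processor; the element names are invented, and the actual AGMA document schema and transformation rules are not shown here.

```python
import xml.etree.ElementTree as ET

# Hypothetical gear-specification document (element names invented
# for illustration only).
gear_xml = """
<gear>
  <teeth>24</teeth>
  <module>2.5</module>
</gear>
"""

def to_legacy_input(xml_text):
    """Stand-in for the XSLT step: flatten the XML document into the
    key=value text format a legacy batch program might expect."""
    root = ET.fromstring(xml_text)
    return "\n".join(f"{child.tag}={child.text}" for child in root)

legacy_input = to_legacy_input(gear_xml)
# The legacy system would read this text, run its rating computation,
# and write an output document for the controller to return.
```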

Fast, Flexible Text Search Using Genomic Short-Read Mapping Model

  • Kim, Sung-Hwan;Cho, Hwan-Gue
    • ETRI Journal
    • /
    • v.38 no.3
    • /
    • pp.518-528
    • /
    • 2016
  • Searching an extensive document database for documents that are locally similar to a given query document, and subsequently detecting the similar regions between such documents, is an essential task in information retrieval and data management. In this paper, we present a framework for this task. The proposed framework employs the method of short-read mapping, which is used in bioinformatics to reveal similarities between genomic sequences. Documents are treated as biological objects; consequently, edit operations between locally similar documents are viewed as an evolutionary process. Accordingly, we can apply the method of evolution tracing to the detection of similar regions between documents. In addition, we propose heuristic methods to address issues arising at the different stages of the framework, for example, a frequency-based fragment ordering method and a locality-aware interval aggregation method. Extensive experiments covering various scenarios of searching an extensive document database for documents locally similar to a query document indicate that the proposed framework outperforms existing methods.
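
The short-read analogy can be made concrete with a minimal sketch: the query is cut into fixed-length fragments (the "reads"), each fragment is mapped to its exact occurrences in the target text, and nearby hit positions are merged into candidate similar regions. This is a simplified illustration of the idea, not the paper's framework; the fragment length and gap threshold below are arbitrary.

```python
def fragment_hits(query, target, k=4):
    """Split the query into non-overlapping k-length fragments (the
    'short reads') and record where each occurs exactly in the target."""
    hits = []
    for i in range(0, len(query) - k + 1, k):
        frag = query[i:i + k]
        start = target.find(frag)
        while start != -1:
            hits.append(start)
            start = target.find(frag, start + 1)
    return sorted(hits)

def aggregate(hits, gap=8):
    """Merge hit positions within `gap` of each other into intervals,
    a simple form of locality-aware interval aggregation."""
    intervals = []
    for pos in hits:
        if intervals and pos - intervals[-1][1] <= gap:
            intervals[-1][1] = pos
        else:
            intervals.append([pos, pos])
    return [tuple(iv) for iv in intervals]
```

A real mapper would also tolerate mismatches and indels (the "edit operations" the paper views as an evolutionary process); exact matching keeps the sketch short.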

The Chosun Governor General Office's Administration regarding Official Documents (조선총독부 공문서(公文書) 제도 -기안(起案)에서 성책(成冊)까지의 과정을 중심으로-)

  • Lee, Seung-il
    • The Korean Journal of Archival Studies
    • /
    • no.9
    • /
    • pp.3-40
    • /
    • 2004
  • This article examines the elements usually included in the official documents issued by the Chosun Governor General office, the process by which a document was put together and legally authorized, and its path of circulation and preservation. To create an official document of the Governor General office with legal authorization, a draft bill had to go through several discussions and a subsequent agreement before it was finally approved. Personnel involved in the discussion stage had the authority to ask for modifications and retouching of the draft, and all modifications were recorded in order to make clear who was responsible for a given change or who objected to what at any stage of the process. The approved version of an official document was called the 'completed document' (成案), and it was issued after its contents were turned into a fair copy by the office that originated the draft. With the original finalized version left in the custody of that office, the fair copy was handed over to the Document department, which was responsible for issuing outgoing documents. After a document was issued and the orders it contained were carried out, the originally involved offices classified the documents for safekeeping according to their own standards and measures, but the Document department was mainly responsible for document preservation. The Document department classified documents according to related offices, the nature of the documents (편찬류별), and the most suitable preservation methods (보존종별). The documents were bound into books, and documents marked for permanent disposal were handed over to the Account office, where they were destroyed. The document-processing practices of the Chosun Governor General office were in fact a modified version of those of the Japanese government, adapted to the situation and environment of Chosun society. The office's managing process was inherited by the Chosun government after the Liberation and had a significant impact on the document-managing practices of the Korean authorities. The official document administration of the Chosun Governor General office marked both the beginning of colonial document administration and the beginning of a modernized document-managing system.

Multi-Vector Document Embedding Using Semantic Decomposition of Complex Documents (복합 문서의 의미적 분해를 통한 다중 벡터 문서 임베딩 방법론)

  • Park, Jongin;Kim, Namgyu
    • Journal of Intelligence and Information Systems
    • /
    • v.25 no.3
    • /
    • pp.19-41
    • /
    • 2019
  • In response to the rapidly increasing demand for text data analysis, research on and investment in text mining are being actively conducted not only in academia but also in various industries. Text mining is generally conducted in two steps. In the first step, the text of the collected documents is tokenized and structured to convert the original documents into a computer-readable form. In the second step, tasks such as document classification, clustering, and topic modeling are conducted according to the purpose of the analysis. Until recently, text mining studies focused on applications in the second step, such as document classification, clustering, and topic modeling. However, with the discovery that the text-structuring process substantially influences the quality of the analysis results, various embedding methods have been actively studied to improve that quality by preserving the meaning of words and documents when text data are represented as vectors. Unlike structured data, which can be directly fed into a variety of operations and traditional analysis techniques, unstructured text must first be structured into a form the computer can understand. "Embedding" refers to mapping arbitrary objects into a space of a specific dimension while maintaining their algebraic properties, in order to structure text data. Recently, attempts have been made to embed not only words but also sentences, paragraphs, and entire documents in various ways. In particular, as the demand for document embedding grows rapidly, many algorithms have been developed to support it. Among them, doc2Vec, which extends word2Vec and embeds each document into one vector, is the most widely used. However, the traditional document embedding method represented by doc2Vec generates a vector for each document using all the words the document contains. As a result, the document vector is affected not only by core words but also by miscellaneous words. Additionally, traditional document embedding schemes usually map each document to a single vector, so it is difficult to accurately represent a complex document with multiple subjects as a single vector. In this paper, we propose a new multi-vector document embedding method to overcome these limitations. This study targets documents that explicitly separate body content and keywords. For a document without keywords, the method can be applied after keywords are extracted through various analysis methods; since keyword extraction is not the core subject of the proposed method, we describe the process of applying it to documents whose keywords are predefined in the text. The proposed method consists of (1) parsing, (2) word embedding, (3) keyword vector extraction, (4) keyword clustering, and (5) multiple-vector generation. The specific process is as follows. All text in a document is tokenized, and each token is represented through word embedding as a vector of N-dimensional real values. Then, to overcome the limitation of traditional document embedding, in which the document vector is affected by miscellaneous as well as core words, the vectors corresponding to each document's keywords are extracted to form a set of keyword vectors for that document. Next, clustering is conducted on each document's keyword set to identify the multiple subjects included in the document. Finally, a multi-vector is generated from the vectors of the keywords constituting each cluster. Experiments on 3,147 academic papers revealed that the single-vector-based traditional approach cannot properly map complex documents because of interference among the subjects within each vector. With the proposed multi-vector-based method, we ascertained that complex documents can be vectorized more accurately by eliminating this interference among subjects.
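
Steps (4) and (5) — cluster the keyword vectors, then emit one vector per cluster — can be sketched as follows. The greedy threshold clustering here is a simple stand-in for whatever clustering algorithm the paper actually uses, and the 2-dimensional keyword vectors are invented to make the two-subject case visible.

```python
import math

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def cluster_keywords(vectors, threshold=0.8):
    """Greedy clustering (a stand-in for the paper's clustering step):
    each keyword vector joins the first cluster whose centroid it is
    close to; otherwise it starts a new cluster."""
    clusters = []
    for vec in vectors:
        for cluster in clusters:
            centroid = [sum(c) / len(cluster) for c in zip(*cluster)]
            if cosine(vec, centroid) >= threshold:
                cluster.append(vec)
                break
        else:
            clusters.append([vec])
    return clusters

def multi_vector(vectors, threshold=0.8):
    """One document vector per detected subject: the mean of each cluster."""
    return [[sum(c) / len(cl) for c in zip(*cl)]
            for cl in cluster_keywords(vectors, threshold)]

# Toy keyword vectors for one document covering two distinct subjects.
kw = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
doc_vectors = multi_vector(kw)   # two vectors, one per subject
```

A single-vector embedding would average all four keyword vectors into one direction between the two subjects; the multi-vector output keeps the subjects separate, which is the interference the paper aims to eliminate.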