• Title/Summary/Keyword: Automatic Document Generation

Automatic Generation of Information Extraction Rules Through User-interface Agents (사용자 인터페이스 에이전트를 통한 정보추출 규칙의 자동 생성)

  • 김용기;양재영;최중민
    • Journal of KIISE:Software and Applications / v.31 no.4 / pp.447-456 / 2004
  • Information extraction is the process of recognizing and fetching particular information fragments from a document. To extract information uniformly from many heterogeneous information sources, a set of information extraction rules, called a wrapper, must be produced for each source. Previous methods of information extraction can be categorized into manual wrapper generation and automatic wrapper generation. In the manual method, a human expert analyzes documents and writes the rules by hand, so the precision of the wrapper is very high, but the approach suffers from poor scalability and efficiency. In the automatic method, an agent program analyzes a set of example documents and produces a wrapper through learning. Although this method is very scalable, it has difficulty generating correct rules, and the generated rules are sometimes unreliable. This paper combines the manual and automatic methods by proposing a new method of learning information extraction rules. We adopt a supervised learning scheme in which a user-interface agent obtains information from the user about what to extract from a document, and XML-based information extraction rules are then generated through learning from these inputs. The interface agent is used not only to generate new extraction rules but also to modify and extend existing ones to improve the precision and recall of the extraction system. We conducted a series of experiments to test the system, and the results are very promising. We hope that our system can be applied to practical systems such as information-mediator agents.

Automatic Generation of Interactive 3D PDF Document in a 3D Viewer Environment (CAD 뷰어 기반 대화형 3D PDF 문서 생성 자동화)

  • Park, Kyeong-Ho;Choi, Young;Yang, Sang-Wook;Song, In-Ho
    • Journal of the Korean Society for Precision Engineering / v.25 no.4 / pp.77-85 / 2008
  • PDF is widely accepted as a standard document format, and it now supports 3D content as well. In engineering applications, this 3D feature can be used to share 3D documents and thus to support collaboration between engineering departments, suppliers, and partners. In this paper, we describe a system that automatically generates formatted engineering documents that include 3D data converted from 3D applications such as a commercial 3D CAD viewer. The system consists of two major modules: a U3D conversion module and a PDF conversion module. The U3D conversion module extracts geometry, view data, and assembly and disassembly information from the 3D viewer and converts it to U3D format, currently in the IDTF text file format. The PDF conversion module generates a PDF file and inserts the U3D data, various annotation information, and scripts for custom operations such as assembly and disassembly into the PDF document.

Automatic Test case Generation Mechanism from the Decision Table of Requirement Specification Techniques based on Metamodel (메타모델 기반 요구사항 명세 기법인 의사 결정표를 통한 자동 테스트 케이스 생성 메커니즘)

  • Hyun Seung Son
    • Journal of Advanced Navigation Technology / v.27 no.2 / pp.228-234 / 2023
  • With the increasing demand for high-quality software, industry faces a strong requirement for quality certification against international standards such as industrial functional safety (IEC 61508), automotive (ISO 26262), and embedded software guidelines for weapon systems. Startups, ventures, and other small companies find it very difficult to acquire such certification systematically because of the cost and manpower involved. For these companies, automatic test case generation is considered a core technique for evaluating and improving software quality. This paper proposes an automatic test case generation method based on the design decision table for system and software design verification. We apply the proposed method using OMG's standard techniques of metamodeling and model transformation to generate test cases automatically. To do this, we design metamodels of the design decision table (Model) and the test case document (Text) and define a model transformation that generates test cases automatically, which is expected to make MC/DC coverage easier to achieve.
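
The idea of deriving test cases from a decision table can be illustrated with a minimal Python sketch. It is not the paper's metamodel or model transformation: the `Rule` class, the condition names, and the actions are hypothetical, and the sketch only expands each rule (including "don't care" entries) into concrete inputs with expected actions. It makes no claim about MC/DC coverage.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Rule:
    """One column of a decision table (hypothetical structure)."""
    conditions: dict  # condition name -> True, False, or None (don't care)
    actions: tuple    # actions expected when the rule fires


def expand_rule(rule, condition_names):
    """Expand don't-care entries into every concrete input combination."""
    choices = []
    for name in condition_names:
        value = rule.conditions.get(name)
        choices.append([True, False] if value is None else [value])
    for combo in product(*choices):
        yield dict(zip(condition_names, combo))


def generate_test_cases(condition_names, rules):
    """Turn each decision-table rule into one or more test case records."""
    cases = []
    for rule_index, rule in enumerate(rules, start=1):
        for inputs in expand_rule(rule, condition_names):
            cases.append({
                "id": f"TC{len(cases) + 1}",
                "rule": rule_index,
                "inputs": inputs,
                "expected": list(rule.actions),
            })
    return cases


# Illustrative table only; these names do not come from the paper.
CONDITIONS = ["door_closed", "speed_zero"]
RULES = [
    Rule({"door_closed": True, "speed_zero": True}, ("allow_departure",)),
    Rule({"door_closed": False, "speed_zero": None}, ("hold_train",)),
]
TEST_CASES = generate_test_cases(CONDITIONS, RULES)
```

Keeping the table as data and the expansion as a separate function mirrors the model-to-text split described in the abstract, although a real transformation would work on metamodel instances rather than dictionaries.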

Automatic Document Title Generation with RNN and Reinforcement Learning (RNN과 강화 학습을 이용한 자동 문서 제목 생성)

  • Cho, Sung-Min;Kim, Wooseng
    • Journal of Information Technology Applications and Management / v.27 no.1 / pp.49-58 / 2020
  • Recently, a large amount of textual data has poured out of the Internet, and technology to refine it is needed. Most of these data are long texts and often have no title. In this paper, we therefore propose a technique that combines the RNN sequence-to-sequence model with the REINFORCE algorithm to generate titles for long texts automatically. In addition, the TextRank algorithm is applied to extract a summarized text that minimizes information loss, compensating for the weakness of the sequence-to-sequence model, which loses information when long texts are used. Experiments show that the proposed techniques are superior to existing ones.

Improving Abstractive Summarization by Training Masked Out-of-Vocabulary Words

  • Lee, Tae-Seok;Lee, Hyun-Young;Kang, Seung-Shik
    • Journal of Information Processing Systems / v.18 no.3 / pp.344-358 / 2022
  • Text summarization is the task of producing a shorter version of a long document while accurately preserving the main contents of the original text. Abstractive summarization generates novel words and phrases using a language generation method through text transformation and prior-embedded word information. However, newly coined words or out-of-vocabulary words decrease the performance of automatic summarization because they are not pre-trained in the machine learning process. In this study, we demonstrated an improvement in summarization quality through the contextualized embedding of BERT with out-of-vocabulary masking. In addition, by explicitly providing precise pointing and an optional copy instruction along with the BERT embedding, we achieved higher accuracy than the baseline model. The recall-based word-generation metric ROUGE-1 score was 55.11 and the word-order-based ROUGE-L score was 39.65.
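
To show what the two reported metrics measure, here is a minimal Python sketch of recall-style ROUGE-1 (unigram overlap) and ROUGE-L (longest common subsequence). It assumes whitespace tokenization and recall only; the abstract does not state the paper's tokenization or scoring configuration, so this is an illustration rather than a way to reproduce the reported scores.

```python
from collections import Counter


def rouge_1_recall(candidate, reference):
    """Fraction of reference unigrams that also appear in the candidate."""
    cand = Counter(candidate.split())
    ref = Counter(reference.split())
    if not ref:
        return 0.0
    # Clip each token's credit at the number of times the candidate has it.
    overlap = sum(min(count, cand[token]) for token, count in ref.items())
    return overlap / sum(ref.values())


def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_recall(candidate, reference):
    """LCS length divided by reference length, so word order matters."""
    cand, ref = candidate.split(), reference.split()
    if not ref:
        return 0.0
    return lcs_length(cand, ref) / len(ref)
```

A candidate that reuses the right words in the wrong order keeps its ROUGE-1 score but loses ROUGE-L, which is why the two numbers are usually reported together.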

An Efficient Design Pattern Framework for Automatic Code Generation based on XML (코드 자동 생성을 위한 XML 기반의 효율적인 디자인패턴 구조)

  • Kim, Un-Yong;Kim, Yeong-Cheol;Ju, Bok-Gyu;Choe, Yeong-Geun
    • The KIPS Transactions:PartD / v.8D no.6 / pp.753-760 / 2001
  • Design patterns are design knowledge for solving issues of extensibility and maintainability that are independent of the application's problem domain. Despite vast interest in design patterns, however, their specification and application are generally assumed to rely on manual implementation. As a result, software development takes a long time, not only because patterns are difficult to analyze and apply consistently, but also because programming faults occur frequently. In this paper, we propose an XML-based notation for describing design patterns and a framework that uses them. We also suggest a source code generation support system and show an example application of this notation and framework. By applying structured XML documentation, a more stable system can be constructed and compact source code can be generated for the user.

Automatic Generation of Explanatory 2D Vector Drawing from 3D CAD Data for Technical Documents (기술문서 작성을 위한 3 차원 CAD 데이터의 도해저작 알고리즘)

  • Shim H.S.;Yang S.W.;Choi Y.;Cho S.W.
    • Proceedings of the Korean Society of Precision Engineering Conference / 2005.06a / pp.177-180 / 2005
  • Three-dimensional shaded images are the standard visualization method for CAD models on the computer screen. Therefore, much of the effort in the visualization of CAD models has focused on how conveniently and realistically CAD models can be displayed on the screen. However, shaded 3D CAD images captured from the screen may not be suitable for some application areas. Technical documents, in either paper or electronic form, can describe the shape and annotate parts of the model more clearly by using projected 2D line drawings viewed from a user-defined view direction. This paper describes an efficient method for generating such 2D line drawing data in vector format. The algorithm is composed of silhouette line detection, hidden line removal, and cleaning processes.
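
The first stage named in the abstract, silhouette line detection, can be sketched in Python with the textbook test for closed triangle meshes: an edge lies on the silhouette when one adjacent face looks toward the viewer and the other looks away. Whether the paper uses this exact test is an assumption; the tetrahedron, the view direction, and the function names are illustrative, and hidden line removal and cleaning are not covered.

```python
from collections import defaultdict


def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def silhouette_edges(vertices, faces, view_dir):
    """Return edges shared by one front-facing and one back-facing triangle.

    Faces must be wound so that their normals point outward; view_dir is the
    direction from the viewer toward the model (parallel projection).
    """
    edge_faces = defaultdict(list)
    front_facing = []
    for index, (i, j, k) in enumerate(faces):
        normal = cross(sub(vertices[j], vertices[i]),
                       sub(vertices[k], vertices[i]))
        front_facing.append(dot(normal, view_dir) < 0)
        for a, b in ((i, j), (j, k), (k, i)):
            edge_faces[tuple(sorted((a, b)))].append(index)
    return sorted(
        edge for edge, adjacent in edge_faces.items()
        if len(adjacent) == 2
        and front_facing[adjacent[0]] != front_facing[adjacent[1]]
    )


# Illustrative tetrahedron with outward-facing triangles.
VERTICES = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1)]
FACES = [(0, 2, 1), (0, 1, 3), (0, 3, 2), (1, 2, 3)]
EDGES = silhouette_edges(VERTICES, FACES, view_dir=(-1, -1, -1))
```

Seen from this direction only the slanted face is visible, so its three edges form the outline that would be projected into the 2D drawing.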

Development of CBTC Car-borne Software with Model-Based Design and Its Applications (모델기반 설계를 통한 CBTC 차상장치 소프트웨어 개발 및 적용)

  • Quan, Zhong-Hua;Choi, Sun-Ah;Choi, Dong-Hyuk;Cho, Chan-Ho;Park, Gie-Soo;Ryou, Myung-Seon
    • Proceedings of the KSR Conference / 2011.05a / pp.910-917 / 2011
  • CBTC (Communication Based Train Control) car-borne equipment, a part of the communication-based train control system, mainly consists of automatic train protection (ATP) functions, automatic train operation (ATO) functions, and interface functions with other equipment, including CBTC wayside equipment and the train control management system. The CBTC car-borne software implementing the ATP/ATO functions is real-time embedded software requiring a high level of safety and reliability. To satisfy these requirements, model-based design techniques are applied with SCADE (Safety-Critical Application Development Environment) to the development of the CBTC car-borne software. In this paper, we illustrate the process of modeling the car-borne ATP/ATO functions to satisfy the system requirement specification, using the system requirement management, modeling, and document generation tools supported by SCADE. In addition, the developed models corresponding to the ATP/ATO functions are applied to a train with CBTC car-borne equipment through EN-50128 standards-compliant C code generated by the code generator. The test results show that the ATP/ATO models developed with SCADE work well while the trains run in driverless operation mode.

Design and Implementation of USN Middleware using DTD Generation Technique (DTD 자동 생성 기법을 이용한 USN 미들웨어 설계 및 구현)

  • Nam, Si-Byung;Kwon, Ki-Hyeon;Yu, Myung-Han
    • Journal of the Korea Society of Computer and Information / v.17 no.3 / pp.41-50 / 2012
  • Monitoring systems based on web service applications face problems such as code reproduction, difficult scalability, and error recovery, all arising from frequent changes to the data structure. We therefore propose a monitoring technique based on automatic DTD (Document Type Definition) generation. The technique uses a dynamic server-side script to cope with changes in the sensor data structure and generates the DTD dynamically. It also adopts AJAX (Asynchronous JavaScript and XML) for XML data parsing, so it can support mass data transmission and exception handling for data loss and damage. Compared with the conventional XML method, the technique reduces recovery time by about 44.8 ms in the case of temporary data failure.
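
Generating a DTD from the current shape of the sensor data can be sketched in a few lines of Python. The abstract does not name the paper's server-side scripting language or its element names, so the function `generate_dtd`, the `readings`/`reading` elements, and the sample fields below are all hypothetical. The sketch assumes a flat record whose keys are valid XML names.

```python
def generate_dtd(root, record_name, sample):
    """Build DTD text from the field names of one sample sensor record.

    Assumes a flat record: every field becomes a text-only child element,
    in the order the fields appear in the sample.
    """
    fields = list(sample.keys())
    # A record with no fields is declared EMPTY, since "()" is not valid DTD.
    content = f"({', '.join(fields)})" if fields else "EMPTY"
    lines = [
        f"<!ELEMENT {root} ({record_name}*)>",
        f"<!ELEMENT {record_name} {content}>",
    ]
    lines += [f"<!ELEMENT {field} (#PCDATA)>" for field in fields]
    return "\n".join(lines)


# Illustrative record; a new sensor field changes the DTD without code edits.
SAMPLE = {"node_id": 7, "temperature": 21.5, "humidity": 40}
DTD_TEXT = generate_dtd("readings", "reading", SAMPLE)
```

Because the DTD is derived from the data on each request, adding or removing a sensor field changes the declared structure without touching the generator itself, which is the property the abstract emphasizes.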

Automatic Generation of Structured Hyperdocuments from Multi-Column Document Images (복잡환 다단 문서 영상으로부터 구조화된 하이퍼문서의 자동 생성)

  • 이지연;강희중;이성환
    • Proceedings of the Korean Information Science Society Conference / 1999.10b / pp.458-460 / 1999
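  • This paper proposes a method for converting multi-column document images containing various objects into HTML documents that closely resemble the originals. In addition, for multi-page input such as papers, manuals, or book chapters, we propose a method that analyzes the logical structure of the document and arranges section titles such as chapters and sections into a hierarchy, so that a structured table-of-contents page is generated automatically alongside the conversion of the multi-column document. When the proposed multi-column document conversion algorithm was applied to unstructured documents such as magazines, newspapers, flyers, and manuals, the documents were converted with little change to the form and structure of the originals, and the logical structure analysis and the hierarchical arrangement of section titles were performed accurately, making automatic generation of a structured table-of-contents page possible.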
