• Title/Summary/Keyword: document processing automation (문서 처리 자동화)

Search Result 114, Processing Time 0.283 seconds

Web Page Classification System based upon Ontology (온톨로지 기반의 웹 페이지 분류 시스템)

  • Choi Jaehyuk;Seo Haesung;Noh Sanguk;Choi Kyunghee;Jung Gihyun
    • The KIPS Transactions:PartB / v.11B no.6 / pp.723-734 / 2004
  • In this paper, we present an automated Web page classification system based on ontology. First, to identify the representative terms for a given set of classes, we compute the product of term frequency and document frequency. Second, we rank each term by its information gain, which reflects how well the term discriminates between classes. We then compile pairs of selected terms and Web page classes into rules using machine learning algorithms. The compiled rules classify any Web page into categories defined in a domain ontology. In the experiments, 78 of 240 terms were identified as representative features for a given set of Web pages. The resulting classification accuracy was 83.52% on average.
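
The term-ranking step in the abstract above can be illustrated with a minimal sketch of information gain. The abstract does not give the authors' exact formulation, so the binary term-presence split, the function names, and the toy pages below are all assumptions made for illustration.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(docs, labels, term):
    """Entropy reduction obtained by splitting documents on presence of `term`."""
    with_term = [y for d, y in zip(docs, labels) if term in d]
    without_term = [y for d, y in zip(docs, labels) if term not in d]
    total = len(labels)
    remainder = (len(with_term) / total) * entropy(with_term) \
        + (len(without_term) / total) * entropy(without_term)
    return entropy(labels) - remainder


# Hypothetical toy data: each page is represented as a set of terms.
pages = [{"goal", "league"}, {"goal", "coach"}, {"stock", "market"}, {"market", "bank"}]
classes = ["sports", "sports", "finance", "finance"]

# Rank every term by information gain, highest first.
vocabulary = {t for p in pages for t in p}
ranked = sorted(vocabulary, key=lambda t: information_gain(pages, classes, t), reverse=True)
```

Terms that occur in exactly one class, such as "goal" or "market" in the toy data, receive the highest gain and would be the first candidates for representative features.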

Traceability Management Technique for Software Artifacts which Comprise Software Release (소프트웨어 릴리스를 구성하는 산출물들의 추적성 관리 기법)

  • Kim, Dae Yeob;Youn, Cheong
    • KIPS Transactions on Software and Data Engineering / v.2 no.7 / pp.461-470 / 2013
  • The ability to trace relationships among the artifacts created at each phase of software development is essential for software quality management. A software release delivers a set of newly created or changed artifacts to customers. The relationships among the artifacts that make up a release must be traced so that work on customers' change requests and functional enhancements can be carried out effectively. Release management can be realized effectively by integrating configuration management and change management. This paper proposes a technique that supports change management of artifacts and traces the relationships among the artifacts that make up a software release, through an integrated environment of a personal workspace and a configuration management system. In the proposed environment, a visualized version graph and an automated tagging function are used to trace the relationships among artifacts.

Development of a Framework for Semi-automatic Building Test Collection Specialized in Evaluating Relation Extraction between Technical Terminologies (기술용어 간 관계추출의 성능평가를 위한 반자동 테스트 컬렉션 구축 프레임워크 개발)

  • Jeong, Chang-Hoo;Choi, Sung-Pil;Lee, Min-Ho;Choi, Yun-Soo
    • The Journal of the Korea Contents Association / v.10 no.2 / pp.481-489 / 2010
  • As interest in relation extraction systems grows, constructing test collections to assess their performance has become an important task. In this paper, we propose a semi-automatic framework capable of constructing test collections for relation extraction on a large scale. Based on this framework, we develop a test collection that can assess the performance of various approaches to extracting relations between technical terms in scientific literature. The framework can minimize the cost of constructing such collections and reduce the fluctuations that arise from differences among collection developers. Furthermore, by controlling the selection of seed documents and terms, the framework lets us construct balanced and objective collections.

Design and Implementation of Secure E-Procurement System based on XML (XML기반의 안전한 E-Procurement 시스템 설계 및 구현)

  • Moon, Tae-Soo;Song, You-Jin
    • The KIPS Transactions:PartD / v.9D no.6 / pp.1043-1054 / 2002
  • This paper proposes an XML-based secure E-Procurement system, designed using the Unified Modeling Language (UML), as an application system for the domestic automobile industry. Applying a UML-based Component-Based Development (CBD) methodology, we analyzed the procurement workflow of the automobile industry and implemented a prototype of an efficient E-Procurement system by developing XML/EDI and XML signature components. Object-oriented CBD is employed to minimize life-cycle risk and to reuse software, addressing the limitations of information engineering methodology. The system interoperates with ERP (Enterprise Resource Planning) as the corporate legacy system. It offers a solution that covers workflow analysis and design, component development, interoperability with corporate information systems, and XML signatures for the integrity and authentication of electronic documents.

Development of Real-Time Scheduling System for OHT Mission Planning (OHT 작업 계획을 위한 실시간 스케줄링 시스템 개발)

  • Lee, Bok-Ju;Park, Hee-Mun;Kwon, Yong-Hwan;Han, Kyung-Ah;Seo, Kyung-Min
    • KIPS Transactions on Computer and Communication Systems / v.10 no.7 / pp.205-214 / 2021
  • For smart manufacturing, most semiconductor sites use automated material handling systems (AMHS). As one such AMHS, the OHT control system (OCS) manages overhead hoist transports (OHTs) that move along rails installed on the ceiling. This paper proposes a real-time scheduling system that efficiently allocates and controls OHTs in semiconductor logistics processes. The proposed system is an independent subsystem within the OCS that is interconnected with the OCS main subsystem, so it can be modified easily without affecting other systems. To develop the system, we first identify the functional requirements of the semiconductor logistics process and classify several types of OHT control scenarios. Next, based on the SEMI (Semiconductor Equipment and Materials International) standard, we design sequence diagrams and interface messages between the subsystems. The developed system interoperates with the OCS main subsystem and the database in real time and performs two major roles: 1) OHT dispatching and 2) pathfinding. Six integrated tests were carried out to verify the functions of the developed system. The system operated normally on six basic scenarios and two exception scenarios, and we showed that it is suitable for OHT mission planning.
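
The two roles named in the abstract above, dispatching and pathfinding, can be sketched as follows. The abstract does not say which algorithms the authors use, so this sketch substitutes Dijkstra's shortest-path search and a nearest-idle-vehicle rule purely as assumptions; the rail graph, vehicle names, and function names are hypothetical.

```python
import heapq


def shortest_path(rail, start, goal):
    """Dijkstra over a directed rail graph given as {node: {neighbor: distance}}."""
    queue = [(0.0, start, [start])]
    visited = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == goal:
            return cost, path
        if node in visited:
            continue
        visited.add(node)
        for nxt, dist in rail.get(node, {}).items():
            if nxt not in visited:
                heapq.heappush(queue, (cost + dist, nxt, path + [nxt]))
    return float("inf"), []  # goal is unreachable from start


def dispatch(rail, idle_vehicles, pickup):
    """Choose the idle vehicle with the shortest travel distance to `pickup`."""
    vehicle, position = min(
        idle_vehicles.items(),
        key=lambda kv: shortest_path(rail, kv[1], pickup)[0],
    )
    return vehicle, shortest_path(rail, position, pickup)


# Hypothetical one-way rail loop A -> B -> C -> A with segment lengths.
rail = {"A": {"B": 1.0}, "B": {"C": 2.0}, "C": {"A": 1.5}}
idle = {"oht1": "A", "oht2": "C"}
chosen, (distance, route) = dispatch(rail, idle, "B")
```

Because OHT rails are typically one-way, the graph is modeled as directed, which is why the vehicle at C must travel through A to reach B.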

Extracting Supporting Evidence with High Precision via Bi-LSTM Network (양방향 장단기 메모리 네트워크를 활용한 높은 정밀도의 지지 근거 추출)

  • Park, ChaeHun;Yang, Wonsuk;Park, Jong C.
    • Annual Conference on Human and Language Technology / 2018.10a / pp.285-290 / 2018
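  • For an argument to be highly persuasive, it needs sufficient supporting evidence. Automating the extraction of evidence that can logically support the claims in an argument is a prerequisite for developing and commercializing applications such as automated debate systems and decision support for policy votes. However, a system that extracts supporting evidence from Web documents requires two problems to be addressed first, which makes a high-performance system difficult to build: 1) a broad search scope, to obtain information that is only loosely related to the topic of the argument but can still serve as supporting evidence, and 2) the cognitive ability to identify, within the collected information, evidence that clearly supports the claim. To extract supporting evidence with high precision and extensibility, this study proposes a staged extraction system: 1) selection of related documents based on TF-IDF similarity, 2) first-pass extraction of supporting evidence using semantic similarity, and 3) second-pass extraction of supporting evidence using a neural network classifier. To validate the proposed system, we conducted an experiment that extracted supporting evidence for claims in 4,008 editorials from 845,675 news articles on the Web. Evaluated against annotated claims and supporting evidence, the proposed staged system achieved precisions of 0.41 and 0.70 in the first- and second-pass extraction, respectively. A subsequent analysis of the extracted evidence confirmed that supporting evidence can be extracted on the basis of an appropriate understanding of the argument.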
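
The first stage of the pipeline in the abstract above, selecting related documents by TF-IDF similarity, might look roughly like the sketch below. The whitespace tokenizer, the smoothed IDF formula, the cosine measure, and the 0.1 threshold are assumptions chosen for illustration; the abstract does not specify them, and the later semantic-similarity and neural-classifier stages are not covered here.

```python
import math
from collections import Counter


def tfidf_vectors(token_lists):
    """Build sparse TF-IDF vectors (dicts) using a smoothed IDF."""
    n = len(token_lists)
    df = Counter(t for tokens in token_lists for t in set(tokens))
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    return [{t: tf * idf[t] for t, tf in Counter(tokens).items()} for tokens in token_lists]


def cosine(u, v):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm = math.sqrt(sum(w * w for w in u.values())) * math.sqrt(sum(w * w for w in v.values()))
    return dot / norm if norm else 0.0


def select_related(claim, articles, threshold=0.1):
    """Return articles whose TF-IDF cosine similarity to the claim meets the threshold."""
    docs = [claim.lower().split()] + [a.lower().split() for a in articles]
    vecs = tfidf_vectors(docs)
    scored = [(cosine(vecs[0], v), a) for v, a in zip(vecs[1:], articles)]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [a for score, a in scored if score >= threshold]


# Hypothetical claim and candidate news articles.
claim = "minimum wage increase reduces employment"
articles = [
    "study finds minimum wage increase lowered employment in retail",
    "local team wins the championship game",
]
related = select_related(claim, articles)
```

Only the first article shares vocabulary with the claim, so it is the only one passed on to the later stages.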


Automating Model Building Processes for Simulation of Complex Manufacturing and Logistics Systems (복잡한 제조 및 물류 시스템에서의 시뮬레이션을 위한 자동 모델 생성 프로세스)

  • Seo, Jeong Hoon;Kim, Kap Hwan
    • Journal of the Korea Society for Simulation / v.27 no.2 / pp.125-137 / 2018
  • Simulations have been used to evaluate the efficiency of logistics and manufacturing systems and to predict their outcomes. New simulation models are needed to evaluate new alternative plants and layouts during the resource design process. Although minor parameter changes are easy to handle by modifying a simulation model, altering the layout takes simulation modelers considerable time and effort because it involves changes to many simulation sub-models. Therefore, this study proposes a method to transfer the information in an AutoCAD layout to the simulation model automatically. This study also defines a standard document as an Excel form and suggests a method to transfer information on the basic layouts, processes, resources, and workers to a simulation model automatically. The simulation tool Tecnomatix Plant Simulation 9.0 was used for this study. The proposed approach was applied to a semiconductor wafer factory as a case study.

Re-defining Named Entity Type for Personal Information De-identification and A Generation method of Training Data (개인정보 비식별화를 위한 개체명 유형 재정의와 학습데이터 생성 방법)

  • Choi, Jae-hoon;Cho, Sang-hyun;Kim, Min-ho;Kwon, Hyuk-chul
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2022.05a / pp.206-208 / 2022
  • As the big data industry has grown significantly in recent years, concern about privacy violations caused by personal information leakage has increased. There have been attempts to automate de-identification through named entity recognition in natural language processing. In this paper, named entity recognition data is constructed semi-automatically by identifying sentences that contain de-identification information in Korean Wikipedia. Compared with using general named entity recognition data, this can reduce the cost of learning information that is not subject to de-identification. It also has the advantage of minimizing the additional rule-based and statistical systems needed to classify de-identification information in the output. The named entity recognition data proposed in this paper is classified into twelve categories, which include de-identification information such as medical records and family relationships. In experiments using the generated dataset, KoELECTRA achieved a performance of 0.87796 and RoBERTa 0.88.


An SAO-based Text Mining Approach for Technology Roadmapping Using Patent Information (기술로드맵핑을 위한 특허정보의 SAO기반 텍스트 마이닝 접근 방법)

  • Choi, Sung-Chul;Kim, Hong-Bin;Yoon, Jang-Hyeok
    • Journal of Technology Innovation / v.20 no.1 / pp.199-234 / 2012
  • Technology roadmaps (TRMs) are considered an essential tool for strategic technology planning and management. Rapidly evolving technological trends and severe technological competition are making TRMs more important than ever, because a TRM serves as a "map" that aligns organizational objectives with the relevant technologies. However, constructing and managing TRMs is costly and time-consuming because it relies on the qualitative and intuitive knowledge of human experts. Therefore, improving the productivity of TRM development is a major concern in technology planning. In this regard, this paper proposes a technology roadmapping approach based on functions, a concept that covers the objectives, structures, and effects of a technology; functions are represented as Subject-Action-Object (SAO) structures that can be extracted from patent text using natural language processing. We expect the proposed method to broaden experts' technological horizons in the technology planning process and to help construct TRMs efficiently with reduced time and cost.


Automatic Recognition and Normalization System of Korean Time Expression using the individual time units (시간의 단위별 처리를 이용한 자동화된 한국어 시간 표현 인식 및 정규화 시스템)

  • Seon, Choong-Nyoung;Kang, Sang-Woo;Seo, Jung-Yun
    • Korean Journal of Cognitive Science / v.21 no.4 / pp.447-458 / 2010
  • Time expressions are an important form of information in many types of data, so recognizing them is an important part of information extraction. However, most previously designed systems consider only a specific domain, because time expressions do not have a regular form and frequently involve various kinds of ellipsis. We present a two-level recognition method consisting of extraction and transformation phases to achieve generality and portability. In the extraction phase, time expressions are extracted as atomic time units for extensibility. Then, in the transformation phase, omitted information is restored using a basis time and prior knowledge. Finally, every complete atomic time unit is transformed into a normalized form. The proposed system can be used as a general-purpose system because it has a language- and domain-independent architecture. In addition, the system performs robustly on noisy data such as SMS data, which contain various errors. On SMS data, the proposed system achieves accuracies of 93.8% for time-expression extraction and 93.2% for time-expression normalization. On the basis of these experimental results, we conclude that the proposed system performs well on noisy data.
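
The two phases described in the abstract above can be sketched as a small example: extract atomic time units, then restore omitted units from a basis time and emit a normalized form. The regular expressions, the rule that missing date fields come from the basis time while missing clock fields default to zero, and the ISO 8601 output are all assumptions for illustration, not the authors' rules.

```python
import re
from datetime import datetime

# Hypothetical patterns for Korean atomic time units (year, month, day, hour, minute).
UNIT_PATTERNS = {
    "year": re.compile(r"(\d{4})년"),
    "month": re.compile(r"(\d{1,2})월"),
    "day": re.compile(r"(\d{1,2})일"),
    "hour": re.compile(r"(\d{1,2})시"),
    "minute": re.compile(r"(\d{1,2})분"),
}


def extract_units(text):
    """Extraction phase: collect each atomic time unit found in the text."""
    units = {}
    for name, pattern in UNIT_PATTERNS.items():
        match = pattern.search(text)
        if match:
            units[name] = int(match.group(1))
    return units


def normalize(units, basis):
    """Transformation phase: fill omitted date units from the basis time."""
    return datetime(
        units.get("year", basis.year),
        units.get("month", basis.month),
        units.get("day", basis.day),
        units.get("hour", 0),
        units.get("minute", 0),
    ).isoformat(timespec="minutes")


# "3월 5일 2시에 만나" means "let's meet on March 5 at 2 o'clock"; the year is omitted.
basis = datetime(2010, 12, 1, 9, 30)
units = extract_units("3월 5일 2시에 만나")
normalized = normalize(units, basis)
```

Handling each unit separately means a message that mentions only an hour, or only a day, still yields a complete timestamp once the basis time fills the gaps.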
