• Title/Summary/Keyword: Document Analysis


Keyword Weight based Paragraph Extraction Algorithm (키워드 가중치 기반 문단 추출 알고리즘)

  • Lee, Jongwon;Joo, Sangwoong;Lee, Hyunju;Jung, Hoekyung
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2017.10a / pp.504-505 / 2017
  • Existing morpheme analyzers classify the words used in a document, and systems that extract sentences and paragraphs on top of a morpheme analyzer are under development. However, very few systems condense a document by extracting its important paragraphs. The algorithm proposed in this paper calculates the weights of the keywords that appear in a document and extracts the paragraphs containing those keywords. Users can understand the document in less time by reading only the paragraphs that contain the keywords rather than the entire document. In addition, because the number of extracted paragraphs varies with the number of keywords used in the search, the user can search with more varied patterns than in existing systems.
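
A minimal Python sketch of keyword-weighted paragraph extraction as this abstract describes it. The abstract does not give the weighting formula, so the weight used here (a keyword's share of all tokens in the document) is an assumption, as are the function names `tokenize`, `keyword_weights`, and `extract_paragraphs`. The regular-expression tokenizer is only a stand-in for a real morpheme analyzer.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase word tokens; a stand-in for a real morpheme analyzer."""
    return re.findall(r"[a-z0-9]+", text.lower())


def keyword_weights(document, keywords):
    """Assumed weighting: each keyword's share of all tokens in the document."""
    tokens = tokenize(document)
    counts = Counter(tokens)
    total = len(tokens) or 1
    return {k.lower(): counts[k.lower()] / total for k in keywords}


def extract_paragraphs(document, keywords):
    """Return (score, paragraph) pairs for paragraphs containing a keyword."""
    weights = keyword_weights(document, keywords)
    results = []
    # Paragraphs are assumed to be separated by blank lines.
    for paragraph in re.split(r"\n\s*\n", document.strip()):
        counts = Counter(tokenize(paragraph))
        score = sum(weights[k] * counts[k] for k in weights)
        if score > 0:
            results.append((score, paragraph))
    # Highest score first; the stable sort keeps document order for ties.
    return sorted(results, key=lambda pair: -pair[0])
```

Searching with more keywords can only add paragraphs to the result, which matches the abstract's remark that the number of extracted paragraphs depends on the number of keywords.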


Document Summarization Based on Sentence Clustering Using Graph Division (그래프 분할을 이용한 문장 클러스터링 기반 문서요약)

  • Lee Il-Joo;Kim Min-Koo
    • The KIPS Transactions:PartB / v.13B no.2 s.105 / pp.149-154 / 2006
  • The main purpose of document summarization is to reduce the complexity of documents that consist of several sub-themes and to produce a summary that covers those sub-themes. This paper proposes a summarization system that extracts salient sentences for each sub-theme by using graph division. A document is represented as a graph of representative terms, which are chosen through term-relatedness analysis based on co-occurrence information. This graph is then divided along its connectivity information so that each subgraph represents a sub-theme. Each subgraph thus corresponds to a cluster of closely related sentences. Extracting salient sentences from the subgraphs yields a summary composed of the core sentences of each sub-theme, which is expected to improve summarization quality.
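
The pipeline in this abstract can be sketched in Python roughly as follows. This is an illustration under stated assumptions, not the authors' system: the representative terms are supplied as a ready-made `vocabulary`, "graph division" is approximated by dropping weak co-occurrence edges and taking connected components, and the salient sentence of each component is simply the one sharing the most terms with it. The names `sentence_terms`, `summarize`, and `min_cooccurrence` are hypothetical.

```python
import re
from collections import Counter, defaultdict
from itertools import combinations


def sentence_terms(sentence, vocabulary):
    """Representative terms (from the given vocabulary) found in a sentence."""
    words = re.findall(r"[a-z]+", sentence.lower())
    return {w for w in words if w in vocabulary}


def summarize(sentences, vocabulary, min_cooccurrence=1):
    term_sets = [sentence_terms(s, vocabulary) for s in sentences]

    # Count how often each pair of terms occurs in the same sentence.
    pair_counts = Counter()
    for terms in term_sets:
        pair_counts.update(combinations(sorted(terms), 2))

    # Keep only sufficiently strong edges (assumed division criterion).
    graph = defaultdict(set)
    for (a, b), count in pair_counts.items():
        if count >= min_cooccurrence:
            graph[a].add(b)
            graph[b].add(a)

    # Each connected component stands for one sub-theme.
    seen, components = set(), []
    for start in sorted(graph):
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node not in component:
                component.add(node)
                stack.extend(graph[node] - component)
        seen |= component
        components.append(component)

    # Pick the sentence that shares the most terms with each sub-theme.
    chosen = set()
    for component in components:
        best = max(range(len(sentences)),
                   key=lambda i: len(term_sets[i] & component))
        chosen.add(best)
    return [sentences[i] for i in sorted(chosen)]
```

Raising `min_cooccurrence` removes weak edges, so a graph that was one component can split into several sub-themes.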

A Study on the Establishing Quality Records System in Quality Management (품질경영 체제에서의 품질기록 시스템 확보 방안)

  • 박상필;박건우
    • Journal of Korean Society of Industrial and Systems Engineering / v.19 no.37 / pp.137-143 / 1996
  • Quality records may seem less important than document control. However, quality records provide objective evidence for a given product, so the requirements for quality records are more severe than those for design documents. Quality record control clearly promotes the accumulation of know-how. The purpose of this study is to present possible implementation methods through an analysis of the Code requirements. This paper suggests points to consider when establishing a quality records control system.


A Study on Intelligence Emergency Guide Line System (지능형 피난유도선 시스템에 대한 연구)

  • Park, Yong-Gyu;Kim, Suk-Eun;Kang, Kyung-Sik
    • Proceedings of the Safety Management and Science Conference / 2010.04a / pp.107-116 / 2010
  • Governments and companies are carrying out greenhouse gas reduction activities to mitigate the effects of global warming, and verification services based on greenhouse gas inventory construction are spreading widely. Greenhouse gas verification proceeds through document examination, risk analysis, and field survey. The document examination covers emission information, calculation standards, the emission report, and the data management system. A field verification plan is then established from the risk assessment results. This study examines risk assessment in greenhouse gas inventory verification with the aim of reducing verification risk.


A Study on Risk Assessment of GHG Inventory Verification (온실가스 인벤토리 검증의 위험성평가에 대한 연구)

  • Lee, Kang-Bok;Kim, Geon-Ho;Lee, Seung-Hwan;Lee, Eun-Sook
    • Proceedings of the Safety Management and Science Conference / 2009.11a / pp.203-208 / 2009
  • Governments and companies are carrying out greenhouse gas reduction activities to mitigate the effects of global warming, and verification services based on greenhouse gas inventory construction are spreading widely. Greenhouse gas verification proceeds through document examination, risk analysis, and field survey. The document examination covers emission information, calculation standards, the emission report, and the data management system. A field verification plan is then established from the risk assessment results. This study examines risk assessment in greenhouse gas inventory verification with the aim of reducing verification risk.


Table Detection from Document Image using Vertical Arrangement of Text Blocks

  • Tran, Dieu Ni;Tran, Tuan Anh;Oh, Aran;Kim, Soo Hyung;Na, In Seop
    • International Journal of Contents / v.11 no.4 / pp.77-85 / 2015
  • Table detection is a challenging problem and plays an important role in document layout analysis. In this paper, we propose an effective method to identify table regions in document images. First, regions of interest (ROIs) are recognized as table candidates. In each ROI, we locate text components and extract text blocks. We then check all text blocks to determine whether they are arranged horizontally or vertically, and compare the height of each text block with the average height. If the text blocks satisfy a series of rules, the ROI is regarded as a table. Experiments on the ICDAR 2013 dataset give encouraging results, which supports the effectiveness of the proposed method.
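
A rough Python illustration of the kind of rule check this abstract describes for one ROI. The abstract does not state the actual rules, so everything specific here is hypothetical: the `Block` type, the left-edge grouping in `column_groups`, and the thresholds `x_tolerance`, `height_ratio`, `min_columns`, and `min_rows`.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Block:
    """Bounding box of one text block inside an ROI (hypothetical type)."""
    x: float
    y: float
    width: float
    height: float


def column_groups(blocks, x_tolerance):
    """Group blocks whose left edges roughly line up into columns."""
    columns = []
    for block in sorted(blocks, key=lambda b: b.x):
        if columns and abs(columns[-1][0].x - block.x) <= x_tolerance:
            columns[-1].append(block)
        else:
            columns.append([block])
    return columns


def looks_like_table(blocks, x_tolerance=5.0, height_ratio=1.5,
                     min_columns=2, min_rows=2):
    """Assumed rules: no unusually tall block, and several stacked columns."""
    if not blocks:
        return False
    # Height rule: compare every block with the average block height.
    average_height = mean(b.height for b in blocks)
    if any(b.height > height_ratio * average_height for b in blocks):
        return False
    # Vertical-arrangement rule: enough columns with enough stacked blocks.
    stacked = [c for c in column_groups(blocks, x_tolerance)
               if len(c) >= min_rows]
    return len(stacked) >= min_columns
```

A grid of similarly sized blocks passes both rules, while a single paragraph block or a region containing one very tall block does not.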

Index Graph : An IR Index Structure for Dynamic Document Database (인덱스 그래프 : 동적 문서 데이터베이스를 위한 IR 인덱스 구조)

  • 박병권
    • The Journal of Information Systems / v.10 no.1 / pp.257-278 / 2001
  • An IR (information retrieval) index for a dynamic document database, where documents are frequently inserted, deleted, and updated, must itself be updated frequently. However, because the conventional IR index structure is designed for retrieval, it handles dynamic updates inefficiently. In this paper, we propose a new IR index structure, called Index Graph, which is organized by connecting multiple indexes into a graph structure. Through analysis and experiment, we show that Index Graph outperforms the conventional IR index structure in the insertion, deletion, and update of documents as well as in information retrieval.


A Study on the Systems Engineering Management Plan for the Railway Safety Project (철도안전프로젝트에 적용한 시스템엔지니어링 관리계획서 작성에 관한 연구)

  • Choi Yo-Chul;Cho Yun-Ok
    • Journal of the Korean Society for Railway / v.9 no.4 s.35 / pp.482-486 / 2006
  • The Systems Engineering Management Plan (SEMP) is the primary, top-level technical management document for integrating all engineering activities at the project planning phase. It defines the activities needed to plan, control, and perform overall engineering integration. To develop the SEMP for the Railway Safety System, several standards were reviewed and analyzed, and common requirements for SEMP preparation were derived from the results. A practical SEMP example was then applied to the Railway Safety System. Whereas SEMPs have so far focused on controlling technical program management, this study derives detailed SEMP contents that emphasize project management and relates project management to technical engineering management. Finally, to manage the items and contents of the SEMP continuously, a database management and automatic document generation system based on a Computer-Aided Systems Engineering (CASE) tool is presented.

A Study on the Systems Engineering Management Plan for the Railway Safety System (철도안전시스템에 적용한 시스템 엔지니어링 관리 계획 작성사례 연구)

  • Choi Yo-Chul;Park Young-Won;Cho Yun-Ok
    • Proceedings of the KSR Conference / 2005.05a / pp.64-69 / 2005
  • The Systems Engineering Management Plan (SEMP) is the primary, top-level technical management document for integrating all engineering activities at the project planning phase. It defines the activities needed to plan, control, and perform overall engineering integration. To develop the SEMP for the Railway Safety System, several standards were reviewed and analyzed, and common requirements for SEMP preparation were derived from the results. A practical SEMP example was then applied to the Railway Safety System. Whereas SEMPs have so far focused on controlling technical program management, this study derives detailed SEMP contents that emphasize project management and relates project management to technical engineering management. Finally, to manage the items and contents of the SEMP continuously, a database management and automatic document generation system based on a Computer-Aided Systems Engineering (CASE) tool is presented.


Nonnegative Matrix Factorization with Orthogonality Constraints

  • Yoo, Ji-Ho;Choi, Seung-Jin
    • Journal of Computing Science and Engineering / v.4 no.2 / pp.97-109 / 2010
  • Nonnegative matrix factorization (NMF) is a popular method for multivariate analysis of nonnegative data that decomposes a data matrix into a product of two factor matrices whose entries are all restricted to be nonnegative. NMF has been shown to be useful for clustering (especially document clustering), but in some cases it produces results that are ill-suited to clustering problems. In this paper, we present an algorithm for orthogonal nonnegative matrix factorization, in which an orthogonality constraint is imposed on the nonnegative decomposition of a term-document matrix. The result of orthogonal NMF can be clearly interpreted for clustering problems, and its clustering performance is usually better than that of NMF. We develop multiplicative updates directly from the true gradient on the Stiefel manifold, whereas existing algorithms consider additive orthogonality constraints. Experiments on several document data sets show that our orthogonal NMF algorithms perform better at clustering than standard NMF and an existing orthogonal NMF.
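
For reference, the two optimization problems contrasted in this abstract can be written out as below. The notation is assumed here and may differ from the paper's: $X$ is the nonnegative $m \times n$ term-document matrix, $U$ ($m \times k$) and $V$ ($n \times k$) are the factors, and the orthogonality constraint is placed on $V$. The last three lines are a sketch, derived from the stated objective, of how a multiplicative rule can follow from a gradient on the Stiefel manifold; the fraction denotes element-wise division and $\odot$ the element-wise product.

```latex
% Standard NMF
\min_{U \ge 0,\; V \ge 0} \; \tfrac{1}{2}\,\lVert X - U V^{\top} \rVert_F^2

% Orthogonal NMF: same objective with an orthogonality constraint on V
\min_{U \ge 0,\; V \ge 0} \; \tfrac{1}{2}\,\lVert X - U V^{\top} \rVert_F^2
\quad \text{subject to} \quad V^{\top} V = I

% Euclidean gradient of the objective with respect to V
\nabla_V = V U^{\top} U - X^{\top} U

% Gradient on the Stiefel manifold, simplified using V^{\top} V = I
\tilde{\nabla}_V = \nabla_V - V \nabla_V^{\top} V
                 = V U^{\top} X V - X^{\top} U

% Multiplicative update: negative part over positive part of the gradient
V \leftarrow V \odot \frac{X^{\top} U}{V U^{\top} X V}
```

Because the constraint is built into the gradient itself, no penalty term for orthogonality is added to the objective, which is the contrast the abstract draws with additive orthogonality constraints.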