• Title/Abstract/Keywords: Document Analysis

Search results: 1,179 items (processing time: 0.028 s)

키워드 가중치 기반 문단 추출 알고리즘 (Keyword Weight based Paragraph Extraction Algorithm)

  • 이종원;주상웅;이현주;정회경
    • 한국정보통신학회:학술대회논문집 / 한국정보통신학회 2017 Fall Conference / pp.504-505 / 2017
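  • Existing morphological analyzers classify the words used in a document. Systems that extract sentences and paragraphs on this basis are being developed, but systems that condense a document by extracting its key paragraphs remain very scarce. The algorithm proposed in this paper computes the weights of the keywords used in a document and extracts the paragraphs that contain those keywords. This reduces the time needed to understand a document, because the reader reads only the paragraphs containing the keywords rather than the whole document. In addition, because the number of extracted paragraphs varies with the number of keywords used in the search, users can search in more varied patterns than with existing systems.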
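
The idea in this abstract can be illustrated with a minimal sketch. The abstract does not give the weighting formula, so the relative term frequency used here is an assumption, the regex tokenizer is only a stand-in for a morphological analyzer, and the function names (`tokenize`, `keyword_weights`, `extract_paragraphs`) are hypothetical.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase word tokenizer (a stand-in for a morphological analyzer)."""
    return re.findall(r"\w+", text.lower())


def keyword_weights(paragraphs, keywords):
    """Assumed weighting: relative frequency of each keyword in the document."""
    counts = Counter(tok for p in paragraphs for tok in tokenize(p))
    total = sum(counts.values()) or 1
    return {k.lower(): counts[k.lower()] / total for k in keywords}


def extract_paragraphs(paragraphs, keywords):
    """Return (score, paragraph) pairs for paragraphs containing any keyword."""
    weights = keyword_weights(paragraphs, keywords)
    hits = []
    for p in paragraphs:
        present = set(tokenize(p)) & set(weights)
        if present:
            hits.append((sum(weights[k] for k in present), p))
    # Highest-scoring paragraphs first; ties keep document order.
    return sorted(hits, key=lambda pair: pair[0], reverse=True)


doc = [
    "Morphological analyzers classify the words in a document.",
    "Keyword weights rank paragraphs so readers can skip the rest.",
    "The weather was pleasant that day.",
]
one_keyword = extract_paragraphs(doc, ["keyword"])
two_keywords = extract_paragraphs(doc, ["keyword", "document"])
```

In this toy document, searching with one keyword selects a single paragraph and adding a second keyword selects two, which mirrors the abstract's point that the number of extracted paragraphs depends on the number of search keywords.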

그래프 분할을 이용한 문장 클러스터링 기반 문서요약 (Document Summarization Based on Sentence Clustering Using Graph Division)

  • 이일주;김민구
    • 정보처리학회논문지B / Vol. 13B, No. 2 / pp.149-154 / 2006
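  • The goal of document summarization is, for a document composed of several subtopics, to generate a summary that covers all of the subtopics while reducing the document's complexity. This paper proposes a summarization system that uses graph partitioning to extract important sentences for each subtopic. The document is represented as a graph using representative words selected through an analysis of word association based on sentence-level co-occurrence information. The graph is partitioned by its connectivity into subgraphs, each representing a subtopic, and each subgraph is a cluster of closely related sentences. Extracting important sentences from each subgraph yields a summary made up only of the core content of each subtopic, which improves summarization performance.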
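
The pipeline described above (sentence graph, partition into subgraphs, one important sentence per subgraph) might be sketched as follows. This is not the authors' method: the representative words are passed in by hand rather than derived from co-occurrence analysis, the partition is plain connected components, and "important" is assumed to mean the highest node degree. All function names are hypothetical.

```python
import re
from itertools import combinations


def build_graph(sentences, rep_words):
    """Link two sentences when they share at least one representative word."""
    bags = [set(re.findall(r"\w+", s.lower())) & rep_words for s in sentences]
    adj = {i: set() for i in range(len(sentences))}
    for i, j in combinations(range(len(sentences)), 2):
        if bags[i] & bags[j]:
            adj[i].add(j)
            adj[j].add(i)
    return adj


def components(adj):
    """Split the graph into connected components with an iterative DFS."""
    seen, parts = set(), []
    for start in adj:
        if start in seen:
            continue
        stack, part = [start], []
        seen.add(start)
        while stack:
            node = stack.pop()
            part.append(node)
            for nxt in adj[node] - seen:
                seen.add(nxt)
                stack.append(nxt)
        parts.append(sorted(part))
    return parts


def summarize(sentences, rep_words):
    """Pick the best-connected sentence from each component (assumed rule)."""
    adj = build_graph(sentences, rep_words)
    picks = [max(part, key=lambda i: len(adj[i])) for part in components(adj)]
    return [sentences[i] for i in sorted(picks)]


sentences = [
    "The graph links sentences by shared words",
    "Each graph partition holds one subtopic",
    "Summaries shorten long reports",
    "Short summaries save reading time",
    "A partition groups related sentences in the graph",
]
rep_words = {"graph", "partition", "summaries"}
parts = components(build_graph(sentences, rep_words))
summary = summarize(sentences, rep_words)
```

Here the five sentences fall into two subgraphs, so the summary contains one sentence per subtopic.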

품질경영 체제에서의 품질기록 시스템 확보 방안 (A Study on the Establishing Quality Records System in Quality Management)

  • 박상필;박건우
    • 산업경영시스템학회지 / Vol. 19, No. 37 / pp.137-143 / 1996
  • Quality records tend to be regarded as less important than document control. However, quality records provide objective evidence for a given product, so the requirements for quality records are more severe than those for design documents. It is clear that quality record control promotes the accumulation of know-how. The purpose of this study is to identify possible implementation methods through an analysis of Code requirements. This paper suggests considerations for establishing a quality records control system.

지능형 피난유도선 시스템에 대한 연구 (A Study on Intelligence Emergency Guide Line System)

  • 박용규;김석은;강경식
    • 대한안전경영과학회:학술대회논문집 / 대한안전경영과학회 2010 Spring Conference / pp.107-116 / 2010

온실가스 인벤토리 검증의 위험성평가에 대한 연구 (A Study on Risk Assessment of GHG Inventory Verification)

  • 이강복;김건호;이승환;이은숙
    • 대한안전경영과학회:학술대회논문집 / 대한안전경영과학회 2009 Fall Conference / pp.203-208 / 2009
  • Governments and companies are carrying out greenhouse gas reduction activities to prevent the effects of global warming. Verification work based on building greenhouse gas inventories is also spreading. Greenhouse gas verification proceeds through document examination, risk analysis, and a field survey. The document examination investigates emission information, calculation standards, the emission report, and the data management system. The field verification plan is then established from the risk assessment results. By studying risk assessment in greenhouse gas inventory verification, this work aims to reduce the risk of verification.

Table Detection from Document Image using Vertical Arrangement of Text Blocks

  • Tran, Dieu Ni;Tran, Tuan Anh;Oh, Aran;Kim, Soo Hyung;Na, In Seop
    • International Journal of Contents / Vol. 11, No. 4 / pp.77-85 / 2015
  • Table detection is a challenging problem and plays an important role in document layout analysis. In this paper, we propose an effective method to identify the table region from document images. First, the regions of interest (ROIs) are recognized as the table candidates. In each ROI, we locate text components and extract text blocks. After that, we check all text blocks to determine if they are arranged horizontally or vertically and compare the height of each text block with the average height. If the text blocks satisfy a series of rules, the ROI is regarded as a table. Experiments on the ICDAR 2013 dataset show that the results obtained are very encouraging. This proves the effectiveness and superiority of our proposed method.
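
A rule-based check of the kind described above could look like the sketch below. The abstract does not list the actual rules or thresholds, so the column grouping by left edge, the height tolerance, and every parameter name (`x_tol`, `h_ratio`, `min_cols`, `min_rows`) are assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass
class Block:
    x: float  # left edge of the text block
    y: float  # top edge of the text block
    w: float  # width
    h: float  # height


def group_columns(blocks, x_tol=5.0):
    """Group blocks whose left edges lie within x_tol of a column's first block."""
    columns = []
    for b in sorted(blocks, key=lambda blk: blk.x):
        if columns and abs(columns[-1][0].x - b.x) <= x_tol:
            columns[-1].append(b)
        else:
            columns.append([b])
    return columns


def looks_like_table(blocks, x_tol=5.0, h_ratio=0.5, min_cols=2, min_rows=2):
    """Assumed rules: uniform block heights and several vertically stacked columns."""
    if not blocks:
        return False
    avg_h = sum(b.h for b in blocks) / len(blocks)
    # Reject the region if any block height strays too far from the average.
    if any(abs(b.h - avg_h) > h_ratio * avg_h for b in blocks):
        return False
    stacked = [c for c in group_columns(blocks, x_tol) if len(c) >= min_rows]
    return len(stacked) >= min_cols


grid = [Block(10, 10, 40, 12), Block(10, 30, 40, 12),
        Block(100, 10, 40, 12), Block(100, 30, 40, 12)]
paragraph = [Block(10, 10, 300, 12), Block(10, 30, 300, 12)]
grid_is_table = looks_like_table(grid)
paragraph_is_table = looks_like_table(paragraph)
```

A two-by-two grid of equally tall blocks passes the check, while two full-width lines of running text form a single column and are rejected.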

인덱스 그래프 : 동적 문서 데이터베이스를 위한 IR 인덱스 구조 (Index Graph : An IR Index Structure for Dynamic Document Database)

  • 박병권
    • 한국정보시스템학회지:정보시스템연구 / Vol. 10, No. 1 / pp.257-278 / 2001
  • An IR (information retrieval) index for a dynamic document database, where documents are frequently inserted, deleted, and updated, must itself be updated frequently. However, the conventional IR index structure is designed for retrieval, so it handles dynamic updates inefficiently. In this paper, we propose a new IR index structure, called Index Graph, which is organized by connecting multiple indexes into a graph structure. Through analysis and experiment, we show that the Index Graph outperforms the conventional IR index structure in document insertion, deletion, and update as well as in information retrieval.
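
To make the three dynamic operations concrete, here is a small sketch of an inverted index that supports insertion, deletion, and update. It is not the Index Graph, whose structure the abstract does not detail; it is an ordinary inverted index paired with a forward map so that a delete touches only the affected postings. The class name `DynamicIndex` and its methods are hypothetical.

```python
from collections import defaultdict


class DynamicIndex:
    """Inverted index plus a forward map (illustrative baseline only)."""

    def __init__(self):
        self.postings = defaultdict(set)  # term -> ids of documents containing it
        self.forward = {}                 # document id -> set of its terms

    def insert(self, doc_id, text):
        # Re-inserting an existing id replaces the old version.
        if doc_id in self.forward:
            self.delete(doc_id)
        terms = set(text.lower().split())
        self.forward[doc_id] = terms
        for term in terms:
            self.postings[term].add(doc_id)

    def delete(self, doc_id):
        # The forward map tells us exactly which postings to touch.
        for term in self.forward.pop(doc_id, set()):
            self.postings[term].discard(doc_id)
            if not self.postings[term]:
                del self.postings[term]

    def update(self, doc_id, text):
        self.insert(doc_id, text)

    def search(self, term):
        return sorted(self.postings.get(term.lower(), set()))


idx = DynamicIndex()
idx.insert(1, "index graph")
idx.insert(2, "graph structure")
before_update = idx.search("graph")
idx.update(1, "inverted index")
after_update = idx.search("graph")
idx.delete(2)
after_delete = idx.search("graph")
```

Without the forward map, a delete would have to scan every posting list, which is one simple way to see why a retrieval-oriented structure can be slow under frequent updates.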

철도안전프로젝트에 적용한 시스템엔지니어링 관리계획서 작성에 관한 연구 (A Study on the Systems Engineering Management Plan for the Railway Safety Project)

  • 최요철;조연옥
    • 한국철도학회논문집 / Vol. 9, No. 4 / pp.482-486 / 2006
  • The Systems Engineering Management Plan (SEMP) is the primary, top-level technical management document for integrating all engineering activities at the project planning phase. It defines the activities needed to plan, control, and perform overall engineering integration. To develop the SEMP for the Railway Safety System, several standards were reviewed and analyzed, and a common set of requirements for SEMP preparation was derived from the results. A practical SEMP example is also applied to the Railway Safety System. SEMPs to date have focused on controlling technical program management; in this study, the detailed SEMP contents are derived with an emphasis on project management, and project management is related to technical engineering management. Finally, to manage the items and contents of the SEMP continuously, a database management and automatic document generation system is presented using a Computer-Aided Systems Engineering (CASE) tool.

철도안전시스템에 적용한 시스템 엔지니어링 관리 계획 작성사례 연구 (A Study on the Systems Engineering Management Plan for the Railway Safety System)

  • 최요철;박영원;조연옥
    • 한국철도학회:학술대회논문집 / 한국철도학회 2005 Spring Conference Proceedings / pp.64-69 / 2005
  • The Systems Engineering Management Plan (SEMP) is the primary, top-level technical management document for integrating all engineering activities at the project planning phase. It defines the activities needed to plan, control, and perform overall engineering integration. To develop the SEMP for the Railway Safety System, several standards were reviewed and analyzed, and a common set of requirements for SEMP preparation was derived from the results. A practical SEMP example is also applied to the Railway Safety System. SEMPs to date have focused on controlling technical program management; in this study, the detailed SEMP contents are derived with an emphasis on project management, and project management is related to technical engineering management. Finally, to manage the items and contents of the SEMP continuously, a database management and automatic document generation system is presented using a Computer-Aided Systems Engineering (CASE) tool.

Nonnegative Matrix Factorization with Orthogonality Constraints

  • Yoo, Ji-Ho;Choi, Seung-Jin
    • Journal of Computing Science and Engineering / Vol. 4, No. 2 / pp.97-109 / 2010
  • Nonnegative matrix factorization (NMF) is a popular method for multivariate analysis of nonnegative data that decomposes a data matrix into a product of two factor matrices whose entries are all restricted to be nonnegative. NMF has been shown to be useful for clustering (especially document clustering), but in some cases it produces results that are inappropriate for clustering problems. In this paper, we present an algorithm for orthogonal nonnegative matrix factorization, in which an orthogonality constraint is imposed on the nonnegative decomposition of a term-document matrix. The result of orthogonal NMF can be clearly interpreted for clustering problems, and its clustering performance is usually better than that of standard NMF. We develop multiplicative updates directly from the true gradient on the Stiefel manifold, whereas existing algorithms consider additive orthogonality constraints. Experiments on several document data sets show that our orthogonal NMF algorithms perform better at clustering than standard NMF and an existing orthogonal NMF.
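
The step from a gradient on the Stiefel manifold to a multiplicative update can be sketched as a short derivation. The abstract gives no formulas, so the notation is an assumption: $X$ is the nonnegative term-document matrix, $X \approx U V^\top$, and the orthogonality constraint is placed on $V$. The paper's own symbols and final update rules may differ.

```latex
% Assumed setting: X \ge 0, factors U \ge 0, V \ge 0, with V^\top V = I.
\begin{align}
  J(U, V) &= \tfrac{1}{2}\,\lVert X - U V^\top \rVert_F^2,
  \qquad \text{subject to } V^\top V = I, \\
  \nabla_V J &= V U^\top U - X^\top U. \\
\intertext{The gradient on the Stiefel manifold (canonical metric) is
  $\tilde{\nabla}_V J = \nabla_V J - V (\nabla_V J)^\top V$. Substituting and
  using $V^\top V = I$, the terms $V U^\top U$ cancel:}
  \tilde{\nabla}_V J
    &= \bigl(V U^\top U - X^\top U\bigr) - \bigl(V U^\top U - V U^\top X V\bigr)
     = V U^\top X V - X^\top U. \\
\intertext{Dividing the negative part of this gradient by its positive part,
  elementwise, gives a multiplicative update for $V$; $U$ keeps the standard
  NMF update:}
  V &\leftarrow V \odot \frac{X^\top U}{V U^\top X V},
  \qquad
  U \leftarrow U \odot \frac{X V}{U V^\top V}.
\end{align}
```

Because every factor in the ratios is nonnegative, the updates keep $U$ and $V$ nonnegative, and the orthogonality constraint enters through the gradient itself rather than through an added penalty term.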