• Title/Summary/Keyword: document structure


An Automated Creation of Document Model for Logical Structure Analysis of Document Images (문서 영상의 논리적인 구조 분석을 위한 문서 모델의 자동 생성)

  • Lee, Kyong-Ho;Choy, Yoon-Chul;Cho, Sung-Bae;Koh, Kyun
    • Proceedings of the Korea Multimedia Society Conference / 2000.11a / pp.103-106 / 2000
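  • This paper proposes the automatic creation of a document model, together with an incremental learning technique, to efficiently support the logical structure analysis needed to generate electronic documents automatically from document images. To this end, we define a document model that can effectively describe the logical structure information and geometric characteristics of a document type. In particular, because the proposed method generates an SGML DTD and a DSSSL style sheet from the resulting document model, it supports the reusability and compatibility of documents.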


Traditional Medicine Resources and Traditional Medicine Information Shown on The Chosun Dynasty Geological Document (조선시대(朝鮮時代) '지리지(地理志)' 류(類)에 나타난 한약자원(韓藥資源) 및 고전의학(古傳醫學) 정보(情報))

  • Lee, Jeong-Hwa;Kim, Seong-Su
    • Korean Journal of Oriental Medicine / v.13 no.3 / pp.69-78 / 2007
  • King Se-Jong, who put much effort into maintaining the national structure, published geographical documents to reinforce his authority, and through these documents he could see the nationwide distribution of medicinal resources. The regional geographical documents published afterwards in each region included information on the local medical environment of the time. This study focuses on how to understand Oriental medicine resources and the content of traditional medicine in concrete terms by applying the relevant geographical information.


An XML Data Management System Using an Object-Relational Database

  • Nam, S.H.;Jung, T.S.;Kim, T.K.;Kim, K.R.;Zahng, H.K.;Yoo, J.S.;Cho, W.S.
    • Proceedings of the Korea Society for Industrial Systems Conference / 2007.02a / pp.163-167 / 2007
  • We propose an XML document storage system called XDMS (XML Document Management System) that uses an object-relational DBMS. XDMS generates an object database schema from an XML Schema and stores XML documents in an object-relational database. A SAX parser is used to determine the structure of the XML documents, and XDMS transforms the documents into objects in the database. Experiments show that object-relational databases provide a more efficient storage and query model than relational databases.

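
As a rough illustration of the SAX step described in the XDMS abstract above, the following Python sketch walks an XML document with the standard-library SAX parser and records each distinct element path, the kind of structural summary a storage layer might map to object types. The names `PathCollector`, `element_paths`, and `SAMPLE` are hypothetical and the sample document is invented; the abstract does not describe XDMS's actual code or schema mapping.

```python
import xml.sax


class PathCollector(xml.sax.ContentHandler):
    """Collects the distinct root-to-element paths seen while parsing."""

    def __init__(self):
        super().__init__()
        self._stack = []   # names of the currently open elements
        self.paths = []    # distinct paths, in first-seen order

    def startElement(self, name, attrs):
        self._stack.append(name)
        path = "/" + "/".join(self._stack)
        if path not in self.paths:
            self.paths.append(path)

    def endElement(self, name):
        self._stack.pop()


def element_paths(xml_text):
    """Return the distinct element paths of an XML string (hypothetical helper)."""
    handler = PathCollector()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.paths


# Invented sample document, used only to exercise the sketch.
SAMPLE = (
    "<book><title>XML</title>"
    "<author><name>A</name></author>"
    "<author><name>B</name></author></book>"
)

if __name__ == "__main__":
    for path in element_paths(SAMPLE):
        print(path)
```

Because SAX is event-driven, the handler never holds the whole document in memory; it only tracks the stack of open elements, which is one plausible reason to choose it for a loader that processes many documents.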

A Study on XML/EDI System Security using XML Signature (XML 전자서명을 이용한 XML/EDI 시스템 보안에 관한 연구)

  • 이경록;서장훈;박명규
    • Journal of the Korea Safety Management & Science / v.4 no.1 / pp.57-68 / 2002
  • As the Internet spreads rapidly, the industrial structure is shifting to a new paradigm. Existing EDI systems were required to change, and Internet-based variants such as Web EDI, e-mail EDI, and FTP EDI appeared. More recently, XML/EDI, which uses XML documents, has emerged. XML/EDI takes into account the advantages and disadvantages of VAN/EDI and Internet-based EDI. An EDI system must also assure safe exchange between sender and receiver, but the Internet has security problems because it uses the open TCP/IP protocol. Although there are many security methods, security is now being studied in connection with XML. In this paper, we propose an XML/EDI system model with XML Signature and build a procedure for electronic signature and document delivery between sender and receiver.

General MFD Structure for UPnP Bridge (UPnP 브리지를 위한 범용 MFD 구조)

  • Choi, Yong-Soon;Kang, Jung-Seok;Park, Hong-Seong
    • Proceedings of the KIEE Conference / 2007.10a / pp.289-290 / 2007
  • A UPnP bridge that supports diverse network interfaces has to meet standard requirements in order to connect with legacy devices. For IEEE1394 and USB, the bridge can provide or bridge a service description and a device description according to a specification, because these interfaces have such standard requirements. RS232C, however, supports only serial communication and packet transfer, which makes the attached device difficult to identify. A UPnP device with such an interface therefore needs a document that gives a standard definition of its communication protocol, so that the device and packet types can be understood from it. This paper defines the MFD (Message Field Description) and builds a UPnP message converter. This will serve as a basis for standardizing support for various legacy devices.


Design and Implementation of an XML Document Management System Based on $O_2$ ($O_2$기반의 XML 문서관리 시스템 설계 및 구현)

  • 유재수
    • The Journal of Information Technology and Database / v.7 no.1 / pp.27-39 / 2000
  • In this paper, we design and implement an XML management system based on an OODBMS that supports structured information retrieval of XML documents. We also propose an object-oriented model for storing and fetching XML documents, managing image data, and supporting versioning in the XML document management system (XMS). The XMS consists of a repository manager that maintains the interfaces for external application programs, an XML instance storage manager that stores XML documents in the database, an XML instance manager that fetches XML documents stored in the database, an XML index manager that creates indexes for the structure information and the contents of documents, and a query processor that processes various queries.


Judging Translated Web Document & Constructing Bilingual Corpus (웹 번역문서 판별과 병렬 말뭉치 구축)

  • Jee-hyung, Kim;Yill-byung, Lee
    • Proceedings of the Korean Information Science Society Conference / 2004.10a / pp.787-789 / 2004
  • People frequently feel the need for a general search tool that frees them from language barriers when they look for information on the Internet. A multilingual parallel corpus is therefore needed, so that a search can use both the search keyword and its corresponding word in another language. A multilingual parallel corpus can be built and reused effectively through several processes: judgment of web documents, sentence alignment, and word alignment. To build such a corpus, a multilingual dictionary should be constructed for each language and the HTML should be simplified. By understanding the meaning and the statistics of document structure, translated web documents are judged and the retrieved web pages are aligned at the sentence level.


An Efficient Search Method For XML document

  • Qian, Xie;Cho, Dong-Sub
    • Proceedings of the Korea Information Processing Society Conference / 2011.04a / pp.1287-1290 / 2011
  • Because of the rapid development of the Internet, more and more documents are stored in XML-based formats. When there are a great many XML documents, how to retrieve the valuable information becomes an important question. This paper proposes an effective XML document search method that searches both the text contents and the structures of XML documents. We build a keyword matrix for the text contents and structure matrices for the structures in XML documents to reduce query time. When there are a great many XML documents, the proposed search method can substantially reduce query time.
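
One way to picture the keyword and structure matrices mentioned in the abstract above is as 0/1 incidence tables with one column per document. The Python sketch below is a minimal illustration under that assumption; the layout, the function names `build_matrices` and `search`, and the sample documents are all hypothetical, and the paper's actual matrix definitions may differ.

```python
import xml.etree.ElementTree as ET


def build_matrices(docs):
    """Build keyword and structure incidence matrices for a list of XML strings.

    Returns (keywords, paths). Each maps a keyword or an element path to a
    list of 0/1 flags, one per document. Only element text is indexed; tail
    text and attributes are ignored to keep the sketch short.
    """
    keywords, paths = {}, {}
    n = len(docs)

    def visit(elem, prefix, i):
        path = prefix + "/" + elem.tag
        paths.setdefault(path, [0] * n)[i] = 1
        for word in (elem.text or "").lower().split():
            keywords.setdefault(word, [0] * n)[i] = 1
        for child in elem:
            visit(child, path, i)

    for i, text in enumerate(docs):
        visit(ET.fromstring(text), "", i)
    return keywords, paths


def search(keywords, paths, word, path, n):
    """Return indices of documents that contain both the keyword and the path.

    The two conditions are checked per document, not per element: the keyword
    does not have to occur under the given path.
    """
    kw_row = keywords.get(word.lower(), [0] * n)
    st_row = paths.get(path, [0] * n)
    return [i for i in range(n) if kw_row[i] and st_row[i]]


# Invented sample collection.
DOCS = [
    "<paper><title>XML search</title></paper>",
    "<paper><abstract>XML index</abstract></paper>",
    "<book><title>Database systems</title></book>",
]

if __name__ == "__main__":
    kw, st = build_matrices(DOCS)
    print(search(kw, st, "xml", "/paper/title", len(DOCS)))
```

With the matrices built once, a combined content-and-structure query reduces to reading two rows and intersecting them, which is the kind of saving in query time the abstract refers to.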

An Efficient Information Retrieval System for Unstructured Data Using Inverted Index

  • Abdullah Iftikhar;Muhammad Irfan Khan;Kulsoom Iftikhar
    • International Journal of Computer Science & Network Security / v.24 no.7 / pp.31-44 / 2024
  • An inverted index is a combination of keywords and their associated posting lists, used for indexing documents. In the modern age, heavy use of technology has increased data volume at a very high rate, and big data is a great concern for researchers. Efficient document indexing in big data has become a major challenge. All organizations and web engines have limited resources such as space and storage, which are crucial to the data management of an information retrieval system. An information retrieval system needs to be very efficient. This research introduces an inverted indexing technique to minimize the delay in retrieving data in an information retrieval system. The inverted index is illustrated, and its issues are then discussed and resolved by implementing a scalable inverted index. The existing inverted index algorithm is then compared with the naïve inverted index. The interval list of the inverted index is stored in primary storage instead of auxiliary memory. This research proposes an efficient information retrieval system architecture, particularly for unstructured data, which has no predefined structure, format, or data volume.
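
The abstract above describes an inverted index as keywords paired with posting lists. A minimal Python sketch of that basic (naive) structure follows; it is not the scalable index the paper proposes, and the function names `build_inverted_index` and `lookup` and the sample documents are hypothetical.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each keyword to a sorted posting list of document ids.

    `docs` is a dict of {doc_id: text}. Tokenization is a plain whitespace
    split with surrounding punctuation stripped, which is an assumption made
    for brevity.
    """
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in text.lower().split():
            index[token.strip(".,;:!?")].add(doc_id)
    # Drop the empty term left behind by punctuation-only tokens.
    return {term: sorted(ids) for term, ids in index.items() if term}


def lookup(index, query):
    """Return ids of documents that contain every query term (AND semantics)."""
    postings = [set(index.get(term, ())) for term in query.lower().split()]
    if not postings:
        return []
    return sorted(set.intersection(*postings))


# Invented sample collection.
DOCS = {
    1: "Inverted index for retrieval.",
    2: "Big data retrieval",
    3: "Inverted lists and big data",
}

if __name__ == "__main__":
    idx = build_inverted_index(DOCS)
    print(idx["retrieval"])
    print(lookup(idx, "big data"))
```

A query touches only the posting lists of its own terms rather than scanning every document, which is where the reduction in retrieval delay comes from.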

Font Classification of English Printed Character using Non-negative Matrix Factorization (NMF를 이용한 영문자 활자체 폰트 분류)

  • Lee, Chang-Woo;Kang, Hyun;Jung, Kee-Chul;Kim, Hang-Joon
    • Journal of the Institute of Electronics Engineers of Korea CI / v.41 no.2 / pp.65-76 / 2004
  • Today, most documents are produced electronically and their paleography is digitized by imaging, resulting in a tremendous number of electronic documents in the form of images. To process these document images, many methods of document structure analysis and recognition have already been proposed, including font classification. Accordingly, this paper proposes a font classification method for document images that uses non-negative matrix factorization (NMF), which is able to learn part-based representations of objects. In the proposed method, spatially total features of font images are automatically extracted using NMF, and the appropriateness of the features specifying each font is then investigated. The proposed method is expected to improve the performance of optical character recognition (OCR), document indexing, and retrieval systems when such systems adopt a font classifier as a preprocessor.
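
To make the NMF step in the abstract above concrete, the following pure-Python sketch factors a small non-negative matrix V into non-negative factors W and H using the multiplicative update rules of Lee and Seung, a common NMF algorithm. Whether the paper uses these particular updates is an assumption, and the toy matrix merely stands in for real font images (for example, one flattened image per column); all function names are hypothetical.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*m)]


def frobenius_error(v, w, h):
    """Squared Frobenius norm of V - WH."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2
               for i in range(len(v)) for j in range(len(v[0])))


def nmf(v, rank, iterations=200, seed=0, eps=1e-9):
    """Factor a non-negative n x m matrix V into W (n x rank) and H (rank x m)."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    # Strictly positive random initialization keeps every update well defined.
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    for _ in range(iterations):
        # H <- H * (W^T V) / (W^T W H), element-wise
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[a][j] * num[a][j] / (den[a][j] + eps) for j in range(m)]
             for a in range(rank)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][a] * num[i][a] / (den[i][a] + eps) for a in range(rank)]
             for i in range(n)]
    return w, h


# Toy rank-2 matrix standing in for image data.
V = [[1, 0, 1, 0],
     [0, 1, 0, 1],
     [1, 1, 1, 1]]

if __name__ == "__main__":
    w, h = nmf(V, rank=2)
    print(frobenius_error(V, w, h))
```

Because the updates only multiply by non-negative ratios, W and H stay non-negative throughout, which is what lets the learned basis be read as additive parts rather than as signed components.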