• Title/Summary/Keyword: Document Image

Search Result 300, Processing Time 0.034 seconds

Block Classification of Document Images by Block Attributes and Texture Features (블록의 속성과 질감특징을 이용한 문서영상의 블록분류)

  • Jang, Young-Nae;Kim, Joong-Soo;Lee, Cheol-Hee
    • Journal of Korea Multimedia Society
    • /
    • v.10 no.7
    • /
    • pp.856-868
    • /
    • 2007
  • We propose an effective method for block classification in a document image. The gray level document image is converted to the binary image for a block segmentation. This binary image would be smoothed to find the locations and sizes of each block. And especially during this smoothing, the inner block heights of each block are obtained. The gray level image is divided to several blocks by these location informations. The SGLDM(spatial gray level dependence matrices) are made using the each gray-level document block and the seven second-order statistical texture features are extracted from the (0,1) direction's SGLDM which include the document attributes. Document image blocks are classified to two groups, text and non-text group, by the inner block height of the block at the nearest neighbor rule. The seven texture features(that were extracted from the SGLDM) are used for the five detail categories of small font, large font, table, graphic and photo blocks. These document blocks are available not only for structure analysis of document recognition but also the various applied area.

  • PDF

Keyword Spotting on Hangul Document Images Using Image-to-Image Matching (영상 대 영상 매칭을 이용한 한글 문서 영상에서의 단어 검색)

  • Park Sang Cheol;Son Hwa Jeong;Kim Soo Hyung
    • The KIPS Transactions:PartB
    • /
    • v.12B no.3 s.99
    • /
    • pp.357-364
    • /
    • 2005
  • In this paper, we propose an accurate and fast keyword spotting system for searching user-specified keyword in Hangul document images by using two-level image-to-image matching. The system is composed of character segmentation, creating a query image, feature extraction, and matching procedure. Two different feature vectors are used in the matching procedure. An experiment using 1600 Hangul word images from 8 document images, downloaded from the website of Korea Information Science Society, demonstrates that the proposed system is superior to conventional image-based document retrieval systems.

A study on secure transmission system for document image using mixing algorithm (합성 알고리즘을 이용한 안전한 문서화상 전송체계에 관한 연구)

  • 박일남;이대영
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.22 no.11
    • /
    • pp.2552-2562
    • /
    • 1997
  • This paepr presents a secure transmission system for document image using mixing algorithm. For this, we apply DM and RDM algorithm propoposed before. The transmitter embeds secretly the signature onto secure document, embeds it to non-secure document and transfers it to the receiver. The receiver makes a check of any forgery on the signature and the document. The total amount of data transmitted and the image quallity are about the same to that of the original document. Thus, a third party can not notice the fact that signatures and secure document is embedded on the document.

  • PDF

Development of an image processing algorithm for korean document recognition (인식률을 향상한 한글문서 인식 알고리즘 개발)

  • 김희식;김영재;이평원
    • 제어로봇시스템학회:학술대회논문집
    • /
    • 1997.10a
    • /
    • pp.1391-1394
    • /
    • 1997
  • This paper proposes a new image processing algorithm to recognize korean documents. It take out the region of text area form input image, then it makes esgmentation of lines, words and characters in the text. A precision segmentation is very important to recognize the input document. The input image has 8-bit gray scaled resolution. Not only the histogram but also brightness dispersion graph are used for segmentation. The result shows a higher accuracy of document recognition.

  • PDF

History Document Image Background Noise and Removal Methods

  • Ganchimeg, Ganbold
    • International Journal of Knowledge Content Development & Technology
    • /
    • v.5 no.2
    • /
    • pp.11-24
    • /
    • 2015
  • It is common for archive libraries to provide public access to historical and ancient document image collections. It is common for such document images to require specialized processing in order to remove background noise and become more legible. Document images may be contaminated with noise during transmission, scanning or conversion to digital form. We can categorize noises by identifying their features and can search for similar patterns in a document image to choose appropriate methods for their removal. In this paper, we propose a hybrid binarization approach for improving the quality of old documents using a combination of global and local thresholding. This article also reviews noises that might appear in scanned document images and discusses some noise removal methods.

A Study On Runlength Distance Mixing Algorithm For Document Image (문서화상에 대한 차분부호장 혼합 합성 알고리즘)

  • 박일남
    • The Journal of Information Technology
    • /
    • v.5 no.1
    • /
    • pp.1-12
    • /
    • 2002
  • This paper presents a composition method for document image using RDM algorithm. It is possible to compose about double quantity of document image in same document space compared with RL or DM algorithm, if it used. RDM algorithm is available to compose secret document as well as digital signatures onto non-secure document. In this case, secure transmissin of document will be realize because the third party do not recognize secure transmission.

  • PDF

Design and Implementation of Digital Jikin using Smartphone Application

  • Hong, Daewon;Kang, Miju;Chun, Junchul
    • Journal of Internet Computing and Services
    • /
    • v.18 no.5
    • /
    • pp.87-94
    • /
    • 2017
  • Due to the recent advances of IT industry, many companies and institutions have been used electronic documents rather than original paper copies. However, the characteristic of electronic document allows it to be readily damaged from proscribed copying, counterfeit, and falsification. These can cause the serious security problems for electronic documents. Conventional security methods for digital documents involve adding a separated image or marker, but these methods can reduce the readability of document. Therefore, we proposed a digital Jikin (Korean traditional stamp) which is normally used to identify the source or author of a document in asia. The proposed digital Jikin can preserve the readability of electronic document while protecting the document from proscribed copying, counterfeit, or falsification using image processing approach. In this paper, a digital Jikin application is designed and implemented under android platform and it converts the critical information of document onto the digital Jikin. The proposed digital Jikin contains important information in the boundary of Jikin not only about the author of documents or source, but also keywords, number of images, and many more. Therefore, the authenticity of document or whether the document has been altered or not by other person can be evaluated by the server. The proposed digital Jikin can be sent to a server through the wireless networks and can be stored using PHP and MySQL. We believe that the proposed method can offer the better and simple solution for strengthening the security of electronic document.

EXTRACTION OF CHARACTERS FROM THE QUADTREE ENCODE DOCUMENT IMAGE OF HANGUL (쿼드트리로 구성된 한글 문서 영상에서의 문자추출에 관한 연구)

  • Park, Eun-Kyoung;Cho, Dong-Sub
    • Proceedings of the KIEE Conference
    • /
    • 1991.11a
    • /
    • pp.201-204
    • /
    • 1991
  • In this paper the method of representing the document image by the quadtree data structure, and extracting each character seperately from the constructed quadtree are described. The document image is represented by a binary encoded quadtree and the segmentation is performed according to the information of each leaf node of the quadtree. Then, each character is extracted by the relation of positions of segments. This method enables to extract characters without examining every pixel in the image and the required storage of document image is decreased.

  • PDF

Texture-based PCA for Analyzing Document Image (텍스처 정보 기반의 PCA를 이용한 문서 영상의 분석)

  • Kim, Bo-Ram;Kim, Wook-Hyun
    • Proceedings of the IEEK Conference
    • /
    • 2006.06a
    • /
    • pp.283-284
    • /
    • 2006
  • In this paper, we propose a novel segmentation and classification method using texture features for the document image. First, we extract the local entropy and then segment the document image to separate the background and the foreground using the Otsu's method. Finally, we classify the segmented regions into each component using PCA(principle component analysis) algorithm based on the texture features that are extracted from the co-occurrence matrix for the entropy image. The entropy-based segmentation is robust to not only noise and the change of light, but also skew and rotation. Texture features are not restricted from any form of the document image and have a superior discrimination for each component. In addition, PCA algorithm used for the classifier can classify the components more robustly than neural network.

  • PDF

Automatic Title Detection by Spatial Feature and Projection Profile for Document Images (공간 정보와 투영 프로파일을 이용한 문서 영상에서의 타이틀 영역 추출)

  • Park, Hyo-Jin;Kim, Bo-Ram;Kim, Wook-Hyun
    • Journal of the Institute of Convergence Signal Processing
    • /
    • v.11 no.3
    • /
    • pp.209-214
    • /
    • 2010
  • This paper proposes an algorithm of segmentation and title detection for document image. The automated title detection method that we have developed is composed of two phases, segmentation and title area detection. In the first phase, we extract and segment the document image. To perform this operation, the binary map is segmented by combination of morphological operation and CCA(connected component algorithm). The first phase provides segmented regions that would be detected as title area for the second stage. Candidate title areas are detected using geometric information, then we can extract the title region that is performed by removing non-title regions. After classification step that removes non-text regions, projection is performed to detect a title region. From the fact that usually the largest font is used for the title in the document, horizontal projection is performed within text areas. In this paper, we proposed a method of segmentation and title detection for various forms of document images using geometric features and projection profile analysis. The proposed system is expected to have various applications, such as document title recognition, multimedia data searching, real-time image processing and so on.