Search | Korea Science

Document Image Segmentation and Classification using Texture Features and Structural Information (텍스쳐 특징과 구조적인 정보를 이용한 문서 영상의 분할 및 분류)

Park, Kun-Hye;Kim, Bo-Ram;Kim, Wook-Hyun
- Journal of the Institute of Convergence Signal Processing, v.11 no.3, pp.215-220, 2010
In this paper, we propose a new texture-based page segmentation and classification method that automatically identifies the table, background, image and text regions in a given document image. The proposed method consists of two stages: document segmentation and content classification. In the first stage, we segment the document image; in the second stage, we classify the contents of the document. The proposed classification method is based on texture analysis. Each type of content in the document is treated as a region with a distinct texture, so the problem of classifying document contents can be posed as a texture segmentation and analysis problem. Two-dimensional Gabor filters are used to extract texture features for each of these regions. Our method does not assume any a priori knowledge about the content or language of the document. Experimental results show that our method performs well in both document segmentation and content classification. The proposed system is expected to be applicable to areas such as multimedia data search and real-time image processing.
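
As a rough illustration of Gabor-based texture features of the kind this abstract mentions, the sketch below builds a small bank of real-valued 2-D Gabor kernels and uses the mean squared response as a per-region feature. The kernel size, wavelength, sigma, the four orientations and the synthetic test regions are all assumptions made for this example; the abstract does not give the authors' filter parameters.

```python
import math

def gabor_kernel(size, wavelength, theta, sigma):
    """Real (cosine) part of a 2-D Gabor filter as a size x size list of lists."""
    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            # Rotate coordinates so the sinusoid varies along direction theta.
            xr = x * math.cos(theta) + y * math.sin(theta)
            yr = -x * math.sin(theta) + y * math.cos(theta)
            envelope = math.exp(-(xr * xr + yr * yr) / (2.0 * sigma * sigma))
            row.append(envelope * math.cos(2.0 * math.pi * xr / wavelength))
        kernel.append(row)
    return kernel

def gabor_energy(image, kernel):
    """Mean squared filter response over positions where the kernel fits fully."""
    k = len(kernel)
    h, w = len(image), len(image[0])
    total, count = 0.0, 0
    for i in range(h - k + 1):
        for j in range(w - k + 1):
            resp = sum(kernel[a][b] * image[i + a][j + b]
                       for a in range(k) for b in range(k))
            total += resp * resp
            count += 1
    return total / count

# Synthetic regions (assumed test data): vertical stripes of period 4 and a flat patch.
stripes = [[1.0 if (x // 2) % 2 == 0 else 0.0 for x in range(16)] for _ in range(16)]
flat = [[0.5] * 16 for _ in range(16)]

# Hypothetical filter bank: four orientations, one wavelength.
thetas = [0.0, math.pi / 4, math.pi / 2, 3 * math.pi / 4]
bank = [gabor_kernel(7, 4.0, t, 2.0) for t in thetas]
stripe_features = [gabor_energy(stripes, k) for k in bank]
flat_features = [gabor_energy(flat, k) for k in bank]
```

In this toy setting the striped region responds far more strongly to the filter whose orientation and wavelength match it than the flat region does, which is the kind of separation a texture classifier could exploit.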

Document Layout Analysis Using Coarse/Fine Strategy (Coarse/fine 전략을 이용한 문서 구조 분석)

박동열;곽희규;김수형
- Proceedings of the IEEK Conference, 2000.06d, pp.198-201, 2000
We propose a method for analyzing document structure. The method consists of two processes, segmentation and classification. The segmentation first divides a low-resolution image, and then finely splits the original document image using projection profiles. The classification determines whether each segmented region is text, line, table or image. An experiment with 238 document images shows that the segmentation accuracy is 99.1% and the classification accuracy is 97.3%.
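
Splitting with projection profiles can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the `min_gap` parameter and the toy page are assumptions. The profile counts foreground pixels per row or column, and runs of non-empty entries separated by sufficiently wide gaps become candidate regions.

```python
def projection_profile(binary, axis):
    """Count foreground (1) pixels per row (axis=0) or per column (axis=1)."""
    if axis == 0:
        return [sum(row) for row in binary]
    return [sum(col) for col in zip(*binary)]

def split_on_gaps(profile, min_gap=1):
    """Return (start, end) pairs of non-empty runs, end exclusive.
    Runs separated by fewer than min_gap empty entries are merged."""
    runs, start, gap = [], None, 0
    for i, v in enumerate(profile):
        if v > 0:
            if start is None:
                start = i
            gap = 0
        elif start is not None:
            gap += 1
            if gap >= min_gap:
                runs.append((start, i - gap + 1))
                start, gap = None, 0
    if start is not None:
        runs.append((start, len(profile) - gap))
    return runs

# Toy binary page (assumed data): two side-by-side blocks above one wide block.
page = [
    [0, 0, 0, 0, 0, 0],
    [1, 1, 0, 0, 1, 1],
    [1, 1, 0, 0, 1, 1],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1, 1],
]
row_bands = split_on_gaps(projection_profile(page, 0), min_gap=2)
col_bands = split_on_gaps(projection_profile(page[1:3], 1), min_gap=2)
```

Here `row_bands` is `[(1, 3), (5, 6)]` and, within the first band, `col_bands` is `[(0, 2), (4, 6)]`. Alternating horizontal and vertical cuts in this way gives progressively finer regions.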

Development of an image processing algorithm for korean document recognition (인식률을 향상한 한글문서 인식 알고리즘 개발)

김희식;김영재;이평원
- Proceedings of the Institute of Control, Robotics and Systems Conference (제어로봇시스템학회:학술대회논문집), 1997.10a, pp.1391-1394, 1997
This paper proposes a new image processing algorithm for recognizing Korean documents. It extracts the text region from the input image, then segments the text into lines, words and characters. Precise segmentation is very important for recognizing the input document. The input image has 8-bit grayscale resolution. Both the histogram and a brightness dispersion graph are used for segmentation. The results show a higher document recognition accuracy.

An Efficient Block Segmentation and Classification of a Document Image Using Edge Information (문서영상의 에지 정보를 이용한 효과적인 블록분할 및 유형분류)

박창준;전준형;최형문
- Journal of the Korean Institute of Telematics and Electronics B, v.33B no.10, pp.120-129, 1996
This paper presents an efficient block segmentation and classification method using the edge information of the document image. We extract four prominent features from the edge gradient and orientation, all of which, and thereby the block classifications, are insensitive to the background noise and the brightness variation of the image. Using these four features, we can efficiently classify a document image into seven categories of blocks: small-size letters, large-size letters, tables, equations, flow-charts, graphs, and photographs. The first five are text blocks, which are character-recognizable, and the last two are non-character blocks. By introducing the column interval and text line intervals of the document in the determination of the run length of CRLA (constrained run length algorithm), we can obtain an efficient block segmentation with reduced memory size. The simulation results show that the proposed algorithm can rigidly segment and classify the blocks of the documents into the above-mentioned seven categories, and that classification performance is high enough for all the categories except for graphs with too much variation.
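
The run-length smearing idea behind CRLA can be sketched as below. This is a generic run-length smoothing example under stated assumptions, not the paper's algorithm: the thresholds `h_gap` and `v_gap` are hypothetical parameters that, following the abstract, would be derived from the column interval and text line interval of the document.

```python
def smear_row(row, max_gap):
    """Fill runs of 0s no longer than max_gap that lie between two 1s."""
    out = list(row)
    last_one = None
    for i, v in enumerate(row):
        if v == 1:
            if last_one is not None and 0 < i - last_one - 1 <= max_gap:
                for j in range(last_one + 1, i):
                    out[j] = 1
            last_one = i
    return out

def smear_image(binary, h_gap, v_gap):
    """Horizontal and vertical smearing combined with a logical AND."""
    horiz = [smear_row(r, h_gap) for r in binary]
    vert_cols = [smear_row(list(c), v_gap) for c in zip(*binary)]
    vert = [list(r) for r in zip(*vert_cols)]
    return [[a & b for a, b in zip(hr, vr)] for hr, vr in zip(horiz, vert)]

# A gap of 2 is bridged, a gap of 4 is left open (threshold assumed to be 2).
smeared = smear_row([1, 0, 0, 1, 0, 0, 0, 0, 1], 2)
```

`smeared` is `[1, 1, 1, 1, 0, 0, 0, 0, 1]`: nearby characters merge into one block while the wider gap, standing in for a column gutter, stays open.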

Automatic Title Detection by Spatial Feature and Projection Profile for Document Images (공간 정보와 투영 프로파일을 이용한 문서 영상에서의 타이틀 영역 추출)

Park, Hyo-Jin;Kim, Bo-Ram;Kim, Wook-Hyun
- Journal of the Institute of Convergence Signal Processing, v.11 no.3, pp.209-214, 2010
This paper proposes an algorithm for segmentation and title detection in document images. The automated title detection method that we have developed is composed of two phases: segmentation and title area detection. In the first phase, we extract and segment the document image; to do this, the binary map is segmented by a combination of morphological operations and CCA (connected component algorithm). The first phase provides segmented regions that serve as candidate title areas for the second phase. Candidate title areas are detected using geometric information, and the title region is then extracted by removing non-title regions. After a classification step that removes non-text regions, projection is performed to detect the title region. Because the largest font in a document is usually used for the title, horizontal projection is performed within text areas. In this paper, we proposed a method of segmentation and title detection for various forms of document images using geometric features and projection profile analysis. The proposed system is expected to have various applications, such as document title recognition, multimedia data searching and real-time image processing.
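
The connected-component step and a simple geometric cue can be sketched as follows. The 4-connectivity, the bounding-box representation and the "tallest box" heuristic are assumptions made for illustration; the abstract only states that CCA and geometric information are used and that titles usually have the largest font.

```python
from collections import deque

def connected_components(binary):
    """Bounding boxes (top, left, bottom, right), inclusive, of 4-connected components."""
    h, w = len(binary), len(binary[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y in range(h):
        for x in range(w):
            if binary[y][x] != 1 or seen[y][x]:
                continue
            # Breadth-first flood fill from an unvisited foreground pixel.
            seen[y][x] = True
            queue = deque([(y, x)])
            top, left, bottom, right = y, x, y, x
            while queue:
                cy, cx = queue.popleft()
                top, bottom = min(top, cy), max(bottom, cy)
                left, right = min(left, cx), max(right, cx)
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if 0 <= ny < h and 0 <= nx < w and binary[ny][nx] == 1 and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes

def tallest_box(boxes):
    """Pick the box with the greatest height as a title candidate (assumed heuristic)."""
    return max(boxes, key=lambda b: b[2] - b[0])

# Toy binary map (assumed data): one tall block above two short ones.
page = [
    [1, 1, 1, 0, 0, 0],
    [1, 1, 1, 0, 0, 0],
    [1, 1, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 1, 0],
]
boxes = connected_components(page)
title_candidate = tallest_box(boxes)
```

On this toy page the three components have boxes `(0, 0, 2, 2)`, `(4, 1, 4, 2)` and `(4, 4, 4, 4)`, and the tallest one, `(0, 0, 2, 2)`, is returned as the candidate.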

DP-LinkNet: A convolutional network for historical document image binarization

Xiong, Wei;Jia, Xiuhong;Yang, Dichun;Ai, Meihui;Li, Lirong;Wang, Song
- KSII Transactions on Internet and Information Systems (TIIS), v.15 no.5, pp.1778-1797, 2021
Document image binarization is an important pre-processing step in document analysis and archiving. The state-of-the-art models for document image binarization are variants of encoder-decoder architectures, such as FCN (fully convolutional network) and U-Net. Despite their success, they still suffer from three limitations: (1) reduced feature map resolution due to consecutive strided pooling or convolutions, (2) multiple scales of target objects, and (3) reduced localization accuracy due to the built-in invariance of deep convolutional neural networks (DCNNs). To overcome these three challenges, we propose an improved semantic segmentation model, referred to as DP-LinkNet, which adopts the D-LinkNet architecture as its backbone, with the proposed hybrid dilated convolution (HDC) and spatial pyramid pooling (SPP) modules between the encoder and the decoder. Extensive experiments are conducted on recent document image binarization competition (DIBCO) and handwritten document image binarization competition (H-DIBCO) benchmark datasets. Results show that our proposed DP-LinkNet outperforms other state-of-the-art techniques by a large margin. Our implementation and the pre-trained models are available at https://github.com/beargolden/DP-LinkNet.
https://doi.org/10.3837/tiis.2021.05.011
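
To make the role of dilated convolution concrete, the sketch below shows a 1-D dilated convolution and the receptive field of a stack of such layers. It is a generic illustration, not the DP-LinkNet HDC module: the kernel size and the dilation rates used here are assumptions, since the abstract does not list them.

```python
def dilated_conv1d(signal, kernel, dilation):
    """'Valid' 1-D dilated convolution (cross-correlation form, no padding)."""
    span = (len(kernel) - 1) * dilation
    return [sum(kernel[k] * signal[i + k * dilation] for k in range(len(kernel)))
            for i in range(len(signal) - span)]

def receptive_field(kernel_size, dilations):
    """Receptive field of stacked stride-1 dilated convolutions."""
    rf = 1
    for d in dilations:
        rf += (kernel_size - 1) * d
    return rf

# With dilation 2, a 3-tap kernel reads samples 0, 2 and 4 of the input.
out = dilated_conv1d([1, 2, 3, 4, 5], [1, 0, -1], 2)

# Three 3-tap layers: plain versus dilated with assumed rates 1, 2, 4.
plain_rf = receptive_field(3, [1, 1, 1])
dilated_rf = receptive_field(3, [1, 2, 4])
```

Here `out` is `[-4]`, `plain_rf` is 7 and `dilated_rf` is 15: dilation widens the context each output sees without any pooling or striding, which is why it helps preserve feature map resolution.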

Texture-based PCA for Analyzing Document Image (텍스처 정보 기반의 PCA를 이용한 문서 영상의 분석)

Kim, Bo-Ram;Kim, Wook-Hyun
- Proceedings of the IEEK Conference, 2006.06a, pp.283-284, 2006
In this paper, we propose a novel segmentation and classification method using texture features for document images. First, we extract the local entropy and then segment the document image to separate the background and the foreground using Otsu's method. Finally, we classify the segmented regions into components using a PCA (principal component analysis) algorithm based on texture features extracted from the co-occurrence matrix of the entropy image. The entropy-based segmentation is robust not only to noise and changes in lighting, but also to skew and rotation. The texture features are not restricted to any particular form of document image and discriminate well between components. In addition, the PCA algorithm used as the classifier can classify the components more robustly than a neural network.
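
Otsu's method, which the abstract uses to separate foreground from background, can be sketched as follows. This is a minimal version under assumptions: the input is a flat list of integer levels (for example an entropy map quantised to 0..255, which is an assumption about how it would be applied here), and pixels less than or equal to the threshold form the first class.

```python
def otsu_threshold(values, levels=256):
    """Threshold t maximising between-class variance; values <= t form class 0."""
    hist = [0] * levels
    for v in values:
        hist[v] += 1
    total = len(values)
    sum_all = sum(i * c for i, c in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0, sum0 = 0, 0
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue          # class 0 still empty
        w1 = total - w0
        if w1 == 0:
            break             # class 1 empty from here on
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        between = w0 * w1 * (mean0 - mean1) ** 2
        if between > best_var:
            best_var, best_t = between, t
    return best_t

# Toy bimodal data (assumed): dark values near 10, bright values near 200.
threshold = otsu_threshold([10, 12, 11, 200, 205, 198])
```

For this toy input the threshold is 12, the first level that separates the dark cluster from the bright one.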

Document Layout Analysis Based on Fuzzy Energy Matrix

Oh, KangHan;Kim, SooHyung
- International Journal of Contents, v.11 no.2, pp.1-8, 2015
In this paper, we describe a novel method for document layout analysis that is based on a Fuzzy Energy Matrix (FEM). A FEM is a two-dimensional matrix that contains the likelihood of text and non-text and is generated through the use of Fuzzy theory. The key idea is to define an Energy map for the document to categorize text and non-text. The proposed mechanism is designed for execution with a low-resolution document image, and hence our method has a fast processing speed. The proposed method has been tested on public ICDAR 2009 datasets to conduct a comparison against other state-of-the-art methods, and it was also tested with Korean documents. The results of the experiment indicate that this scheme achieves superior segmentation accuracy, in terms of both precision and recall, and also requires less time for computation than other state-of-the-art document image analysis methods.
https://doi.org/10.5392/IJoC.2015.11.2.001

Block Classification of Document Images by Block Attributes and Texture Features (블록의 속성과 질감특징을 이용한 문서영상의 블록분류)

Jang, Young-Nae;Kim, Joong-Soo;Lee, Cheol-Hee
- Journal of Korea Multimedia Society, v.10 no.7, pp.856-868, 2007
We propose an effective method for block classification in a document image. The gray-level document image is converted to a binary image for block segmentation. This binary image is smoothed to find the location and size of each block, and during this smoothing the inner block height of each block is obtained. The gray-level image is divided into several blocks using this location information. SGLDMs (spatial gray level dependence matrices) are built from each gray-level document block, and seven second-order statistical texture features are extracted from the (0,1)-direction SGLDM, which includes the document attributes. Document image blocks are classified into two groups, text and non-text, from the inner block height using the nearest neighbor rule. The seven texture features extracted from the SGLDM are used to classify blocks into five detailed categories: small font, large font, table, graphic and photo. These document blocks can be used not only for structure analysis in document recognition but also in various other application areas.
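
A spatial gray level dependence (co-occurrence) matrix for the (0,1) direction can be sketched as below. The abstract does not name its seven features, so the three computed here (energy, contrast, entropy) are common second-order statistics chosen for illustration, and the non-symmetric counting is likewise an assumption.

```python
import math

def sgldm(image, levels, dy=0, dx=1):
    """Normalised co-occurrence matrix for pixel pairs at displacement (dy, dx)."""
    counts = [[0] * levels for _ in range(levels)]
    h, w = len(image), len(image[0])
    n = 0
    for y in range(h):
        for x in range(w):
            y2, x2 = y + dy, x + dx
            if 0 <= y2 < h and 0 <= x2 < w:
                counts[image[y][x]][image[y2][x2]] += 1
                n += 1
    return [[c / n for c in row] for row in counts]

def texture_features(p):
    """Three illustrative second-order statistics of a normalised matrix p."""
    energy = sum(v * v for row in p for v in row)
    contrast = sum((i - j) ** 2 * v for i, row in enumerate(p) for j, v in enumerate(row))
    entropy = -sum(v * math.log2(v) for row in p for v in row if v > 0)
    return {"energy": energy, "contrast": contrast, "entropy": entropy}

# Toy 2-level block (assumed data): left half dark, right half bright.
block = [[0, 0, 1, 1],
         [0, 0, 1, 1]]
features = texture_features(sgldm(block, levels=2))
```

For this block the horizontal pairs (0,0), (0,1) and (1,1) each occur with probability 1/3, so energy and contrast are both 1/3 and entropy is log2(3). A uniform block would instead give energy 1 and zero contrast.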

A Study on Extraction of Character String in Document Image Using Morphology (Morphology를 이용한 문서화상내의 문자열 추출에 관한 연구)

장희돈;김동현;김석태;남궁재찬
- The Journal of Korean Institute of Communications and Information Sciences, v.18 no.1, pp.123-132, 1993
This paper presents the segmentation of sentence areas and diagram areas from a document image. To extract the sentence area, we apply dilation, a basic operation of morphology, to the document image and obtain a smeared document image. After the smeared document image is divided into blocks, we determine the writing form from the vertical and horizontal characteristics of the document image and calculate the skew from it. We then relocate the document image and extract the character strings from the relocated document. Eleven document images of three classes are considered, and the character strings were extracted well from the 11 document images.
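
Smearing by dilation can be sketched as follows. The rectangular structuring element and its size are assumptions for illustration; the abstract does not specify the element used.

```python
def dilate(binary, se_height, se_width):
    """Binary dilation with a rectangular structuring element centred on each pixel."""
    h, w = len(binary), len(binary[0])
    ry, rx = se_height // 2, se_width // 2
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if binary[y][x] == 1:
                # Stamp the element around every foreground pixel, clipped to the image.
                for ny in range(max(0, y - ry), min(h, y + ry + 1)):
                    for nx in range(max(0, x - rx), min(w, x + rx + 1)):
                        out[ny][nx] = 1
    return out

# One text row (assumed data): two close strokes and one distant stroke.
row = [[1, 0, 0, 1, 0, 0, 0, 0, 1]]
smeared = dilate(row, 1, 3)
```

`smeared` is `[[1, 1, 1, 1, 1, 0, 0, 1, 1]]`: the two close strokes fuse into one string-like blob, while the distant stroke remains separate.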
