• Title/Summary/Keyword: printed text

Machine Printed and Handwritten Text Discrimination in Korean Document Images

  • Trieu, Son Tung;Lee, Guee Sang
    • Smart Media Journal
    • /
    • v.5 no.3
    • /
    • pp.30-34
    • /
    • 2016
  • Many Korean documents contain text that must be identified as either machine-printed or handwritten. Early identification methods use structural features, which are simple and easy to apply to text in a specific font, but their performance depends on the font type and the characteristics of the text. More recently, the bag-of-words model has been used for this identification, as it can be invariant to changes in font size, distortions, or modifications to the text. A method based on the bag-of-words model consists of three steps: word segmentation using connected component grouping, feature extraction, and classification using an SVM (Support Vector Machine). This paper proposes a bag-of-words method that uses SURF (Speeded Up Robust Features) to identify machine-printed and handwritten text in Korean documents. The experiments show that the proposed method outperforms methods based on structural features.
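
The feature-encoding step of a bag-of-words pipeline like the one summarized above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the 2-D descriptors stand in for SURF descriptors, and the codebook, the function name `bow_histogram`, and the sample values are all assumptions made for the example. The resulting histogram is the kind of fixed-length vector that would then be passed to an SVM.

```python
from math import dist


def bow_histogram(descriptors, codebook):
    """Quantize local descriptors against a codebook; return a normalized histogram."""
    counts = [0] * len(codebook)
    for d in descriptors:
        # Index of the nearest visual word by Euclidean distance.
        nearest = min(range(len(codebook)), key=lambda k: dist(d, codebook[k]))
        counts[nearest] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * len(codebook)


# Hypothetical codebook of three visual words (toy 2-D vectors, not real SURF).
codebook = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
# Hypothetical descriptors extracted from one segmented word image.
word_descriptors = [(0.1, 0.0), (0.9, 1.2), (1.1, 0.8), (4.8, 5.1)]

hist = bow_histogram(word_descriptors, codebook)
print(hist)
```

Because the histogram is normalized, words with different numbers of detected keypoints yield vectors on the same scale.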

The Contrast between Traditional Printed Text and Hypertext Reading Comprehension (전통 인쇄텍스트와 하이퍼텍스트 독해력 비교)

  • Hong, Sung-Ryong
    • Journal of Digital Contents Society
    • /
    • v.10 no.4
    • /
    • pp.537-542
    • /
    • 2009
  • Developments in computer technology, which has been identified as a revolutionary force, have lifted the constraints of printed text. Hypertext can be simply defined as electronic text that is found online and read in a non-linear manner. In contrast to traditional printed text, electronic writing depends on an emergent technology that is still subject to transformation. More research is needed on the experiences readers have when reading documents in hypertext formats for the purpose of knowledge retention. This study examines the contrast between traditional printed texts and hypertexts. It also explores areas where the literature has been relatively silent, such as the experiences subjects have in reading hypertexts and printed texts. The study found that the format of the text significantly influences the recall comprehension level of readers in the Printed Text and Hypertext groups.

Augmenting Text Document by Controlling Its IR-Reflectance (적외선 반사 특성 제어를 통한 텍스트 문서 증강)

  • Park, Hanhoon;Moon, Kwang-Seok
    • Journal of Korea Multimedia Society
    • /
    • v.20 no.6
    • /
    • pp.882-892
    • /
    • 2017
  • Locally Likely Arrangement Hashing (LLAH) is a method that describes image features based on the geometry of their neighbors. It has therefore been preferred for implementing augmented reality on poorly textured objects such as text documents. However, LLAH requires that image features be detected with high repeatability and located at a distance from one another. To meet this requirement for text documents, this paper proposes a method that facilitates word detection in the infrared (IR) range by adjusting the IR reflectance of words. Specifically, the words are printed with two different black inks: one uses only the K (carbon black) ink, and the other mixes the C (cyan), M (magenta), and Y (yellow) inks. Since only the words printed with the K ink are visible in the IR range, a subset of the words is selected in advance to serve as features and printed with the K ink. The selected words can be robustly detected with high repeatability in the IR range, which makes it possible to implement augmented reality on text documents with high fidelity. The validity of the proposed method was verified through experiments.
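
The abstract does not say how the K-ink words are chosen. As one possible illustration of the spacing requirement, the sketch below greedily keeps a word only if its center lies at least `min_distance` from every word already kept. The greedy rule, the function name `select_feature_words`, and the coordinates are assumptions for the example, not the paper's procedure.

```python
from math import dist


def select_feature_words(word_centers, min_distance):
    """Greedily pick indices of words whose centers are mutually well separated."""
    selected = []
    for i, center in enumerate(word_centers):
        # Keep this word only if it is far enough from all words kept so far.
        if all(dist(center, word_centers[j]) >= min_distance for j in selected):
            selected.append(i)
    return selected


# Hypothetical word centers (x, y) in page coordinates.
centers = [(0, 0), (10, 0), (40, 0), (45, 5), (80, 0)]
k_ink_words = select_feature_words(centers, min_distance=30)
print(k_ink_words)
```

Words not selected would be printed with the mixed C, M, Y black, so they stay visible to the reader but not in the IR image.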

Adaptive Character Segmentation to Improve Text Recognition Accuracy on Mobile Phones (모바일 시스템에서 텍스트 인식 위한 적응적 문자 분할)

  • Kim, Jeong Sik;Yang, Hyung Jeong;Kim, Soo Hyung;Lee, Guee Sang;Do, Luu Ngoc;Kim, Sun Hee
    • Smart Media Journal
    • /
    • v.1 no.4
    • /
    • pp.59-71
    • /
    • 2012
  • As mobile phones have become common communication devices, their applications are increasingly important in daily life. Using a smartphone camera to collect information about the everyday environment is a goal of many applications, such as text recognition, object recognition, and context awareness. Studies have been conducted to provide important information by recognizing text that is artificially or naturally included in images and videos acquired from mobile phones. This study proposes a character segmentation method that improves character recognition accuracy in images obtained from mobile phone cameras. The proposed method first classifies the text in a given image into printed letters and handwritten letters, since the segmentation approaches for them differ. For printed letters, a rough segmentation is performed, and then the segmented regions are integrated, deleted, and re-segmented. Segmentation of handwritten letters is performed after skews are corrected and the characters are classified by integrating them. The experimental results show that the method performs well on both printed and handwritten letters, reaching 95.9% and 84.7%, respectively.

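
The "rough segmentation, then integration" idea for printed letters can be illustrated with a vertical projection profile. This is a generic sketch under stated assumptions, not the method from the paper: the split-on-empty-columns rule, the merge-narrow-regions rule, and the names `rough_segments` and `merge_narrow` are all hypothetical.

```python
def rough_segments(binary_rows):
    """Return (start, end) column spans containing ink; end is exclusive."""
    width = len(binary_rows[0])
    # Vertical projection profile: amount of ink in each column.
    profile = [sum(row[x] for row in binary_rows) for x in range(width)]
    spans, start = [], None
    for x, ink in enumerate(profile):
        if ink and start is None:
            start = x
        elif not ink and start is not None:
            spans.append((start, x))
            start = None
    if start is not None:
        spans.append((start, width))
    return spans


def merge_narrow(spans, min_width):
    """Merge each span narrower than min_width into the span before it."""
    merged = []
    for start, end in spans:
        if merged and (end - start) < min_width:
            merged[-1] = (merged[-1][0], end)
        else:
            merged.append((start, end))
    return merged


# Toy binary text line: 1 = ink, 0 = background.
line = [
    [1, 1, 0, 1, 0, 1, 1, 1],
    [1, 1, 0, 1, 0, 1, 0, 1],
]
spans = rough_segments(line)
merged = merge_narrow(spans, min_width=2)
print(spans, merged)
```

Merging narrow pieces matters for scripts such as Hangul, where the components of one syllable block can be separated by empty columns.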

Real-time Printed Text Detection System using Deep Learning Model (딥러닝 모델을 활용한 실시간 인쇄물 문자 탐지 시스템)

  • Ye-Jun Choi;Song-Won Kim;Mi-Kyeong Moon
    • The Journal of the Korea institute of electronic communication sciences
    • /
    • v.19 no.3
    • /
    • pp.523-530
    • /
    • 2024
  • Online media such as web pages and digital documents let users search for specific words or phrases in real time. In printed materials such as books and reference books, finding specific words or phrases in real time is often difficult. This paper describes the development of a deep learning model for detecting text and a real-time character detection system that uses OCR to recognize text. The study proposes a method that detects text using the EAST model, recognizes the detected text using EasyOCR, and compares the recognized text with the word or phrase the user wants to find, marking matches with a bounding box. With this system, users are expected to be able to find specific words or phrases in printed materials such as books and reference books in real time, and to find the information they need easily and quickly.
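
The final matching step described above (compare recognized text with the user's query and mark hits with a bounding box) might look like the sketch below. The input is hard-coded in the `(corner points, text, confidence)` shape that EasyOCR's `readtext` returns, so the example runs without the library. The function name `find_matches` and the case-insensitive substring rule are assumptions, not details from the paper.

```python
def find_matches(ocr_results, query):
    """Return axis-aligned boxes (x_min, y_min, x_max, y_max) for text containing query."""
    needle = query.casefold()
    boxes = []
    for corners, text, _confidence in ocr_results:
        if needle in text.casefold():
            xs = [x for x, _ in corners]
            ys = [y for _, y in corners]
            boxes.append((min(xs), min(ys), max(xs), max(ys)))
    return boxes


# Hypothetical OCR output for one camera frame.
ocr_results = [
    ([[10, 10], [90, 10], [90, 30], [10, 30]], "Printed text", 0.98),
    ([[10, 40], [120, 40], [120, 60], [10, 60]], "handwritten note", 0.91),
    ([[10, 70], [80, 70], [80, 90], [10, 90]], "TEXT detection", 0.88),
]
matches = find_matches(ocr_results, "text")
print(matches)
```

In a live system the returned boxes would be drawn over the frame, for example with a rectangle-drawing call from an image library.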

A Study on Optical Changes and Sequence Discrimination of Toner-printed Text and Writing Text (토너 출력문자와 필기구류 기재문자 간 광학적 변화와 선후관계에 관한 연구)

  • Lee, Ka Young;Yoon, Do-Young;Lee, Joong
    • Korean Chemical Engineering Research
    • /
    • v.55 no.1
    • /
    • pp.135-140
    • /
    • 2017
  • This paper studies the discrimination of relative sequence, one of the most actively discussed topics in forensic document examination. It describes the application of the visual spectral comparator and the infinite focus microscope as observation methods for regions where printed and written lines overlap. As a result, we could categorize images of the overlapping regions and identify the sequence of printed and written lines for various inks.

Implementation of a Web-Based Electronic Text for High School's Probability and Statistics Education

  • Choi, Sook-Hee
    • Communications for Statistical Applications and Methods
    • /
    • v.11 no.2
    • /
    • pp.329-343
    • /
    • 2004
  • With the advancement of computers and networks, the World Wide Web (WWW) has become a common medium of information communication in many fields. In education, the WWW is increasingly used as an alternative medium to classroom teaching and printed matter. In this article, we demonstrate a web-based electronic text on 'probability and statistics', one of the six fields of mathematics in the 7th curriculum. This text emphasizes comprehension of the concepts of probability and statistics as an applied science.

Recognition of Characters Printed on PCB Components Using Deep Neural Networks (심층신경망을 이용한 PCB 부품의 인쇄문자 인식)

  • Cho, Tai-Hoon
    • Journal of the Semiconductor & Display Technology
    • /
    • v.20 no.3
    • /
    • pp.6-10
    • /
    • 2021
  • Recognizing characters printed or marked on PCB components from camera images is an important task in PCB component inspection systems. Conventional optical character recognition (OCR) of PCB components typically consists of two stages: character segmentation and classification of each segmented character. However, character segmentation often fails due to corrupted characters, low image contrast, and similar problems. OCR without character segmentation is therefore desirable and is increasingly performed with deep neural networks. A typical segmentation-free implementation based on deep neural networks consists of a convolutional neural network followed by a recurrent neural network (RNN). One disadvantage of this approach is slow execution due to the RNN layers. LPRNet is a segmentation-free character recognition network whose excellent accuracy has been proven in license plate recognition. LPRNet uses a wide convolution instead of an RNN, which enables fast inference. In this paper, LPRNet was adapted to recognize characters printed on PCB components with fast execution and high accuracy. Initial training with synthetic images followed by fine-tuning on real text images yielded accurate recognition. The network can be further optimized for Intel CPUs using the OpenVINO toolkit. The optimized version of the network can run in real time, even faster than on a GPU.
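
The abstract does not describe how a segmentation-free network's per-position outputs become a character string. One common choice, assumed here purely for illustration, is CTC-style greedy decoding: take the best class at each position, collapse repeats, and drop the blank symbol. The alphabet, the scores, and the function name `greedy_decode` are hypothetical.

```python
def greedy_decode(scores, alphabet, blank=0):
    """Collapse per-position class scores into a string (CTC-style greedy decoding)."""
    # Best class index at each horizontal position.
    best = [max(range(len(step)), key=step.__getitem__) for step in scores]
    chars, prev = [], blank
    for idx in best:
        # Emit a character only when it is not blank and not a repeat.
        if idx != blank and idx != prev:
            chars.append(alphabet[idx])
        prev = idx
    return "".join(chars)


# Index 0 is the blank symbol; the rest is a toy alphabet for component markings.
alphabet = ["", "R", "1", "0"]
scores = [
    [0.10, 0.80, 0.05, 0.05],  # R
    [0.10, 0.70, 0.10, 0.10],  # R again (collapsed)
    [0.90, 0.03, 0.03, 0.04],  # blank
    [0.10, 0.10, 0.70, 0.10],  # 1
    [0.10, 0.10, 0.10, 0.70],  # 0
    [0.80, 0.10, 0.05, 0.05],  # blank separates the two zeros
    [0.10, 0.10, 0.10, 0.70],  # 0
]
decoded = greedy_decode(scores, alphabet)
print(decoded)
```

The blank between the last two positions is what allows a genuinely doubled character to survive the repeat-collapsing step.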

Stroke Width-Based Contrast Feature for Document Image Binarization

  • Van, Le Thi Khue;Lee, Gueesang
    • Journal of Information Processing Systems
    • /
    • v.10 no.1
    • /
    • pp.55-68
    • /
    • 2014
  • Automatic segmentation of foreground text from the background in degraded document images is essential for smooth reading of the document content and for machine recognition tasks. In this paper, we present a novel approach to the binarization of degraded document images. The proposed method uses a new local contrast feature extracted based on the stroke width of the text. First, pre-processing is carried out for noise removal. Text boundary detection is then performed on the image constructed from the contrast feature. Local estimation follows to extract the text from the background. Finally, a refinement procedure is applied to the binarized image as a post-processing step to improve the quality of the final results. Experiments on extracting text from degraded handwritten and machine-printed document images, with comparisons against several well-known binarization algorithms, demonstrate the effectiveness of the proposed method.
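
The abstract does not define how stroke width is measured. A simple estimate that is often used for this purpose, assumed here only as an illustration, is the most frequent horizontal run length of foreground pixels in a rough binarization. The function name `estimate_stroke_width` and the toy image are hypothetical, and this is not the paper's contrast feature itself.

```python
from collections import Counter


def estimate_stroke_width(binary_rows):
    """Estimate stroke width as the most frequent horizontal run of ink pixels."""
    runs = Counter()
    for row in binary_rows:
        length = 0
        # A trailing 0 acts as a sentinel that closes a run at the row end.
        for pixel in list(row) + [0]:
            if pixel:
                length += 1
            elif length:
                runs[length] += 1
                length = 0
    return runs.most_common(1)[0][0] if runs else 0


# Toy rough binarization of the letter "H": 1 = ink, 0 = background.
image = [
    [0, 1, 1, 0, 0, 1, 1, 0],
    [0, 1, 1, 0, 0, 1, 1, 0],
    [0, 1, 1, 1, 1, 1, 1, 0],
    [0, 1, 1, 0, 0, 1, 1, 0],
]
stroke_width = estimate_stroke_width(image)
print(stroke_width)
```

An estimate like this could set the window size for a local contrast computation, so that the window spans roughly one stroke and its surrounding background.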

Destination Address Block Location on Machine-printed and Handwritten Korean Mail Piece Images (인쇄 및 필기 한글 우편영상에서의 수취인 주소 영역 추출 방법)

  • 정선화;장승익;임길택;남윤석
    • Journal of KIISE:Software and Applications
    • /
    • v.31 no.1
    • /
    • pp.8-19
    • /
    • 2004
  • In this paper, we propose an efficient method for locating the destination address block on both machine-printed and handwritten Korean mail piece images. The proposed method extracts connected components from the binary mail piece image, generates text lines by merging them, and then groups the text lines into nine clusters. The destination address block is determined by selecting some of the clusters. Considering the geometric characteristics of address information on Korean mail pieces, we split a mail piece image into nine areas of equal size. The nine clusters are initialized with the center coordinates of each area. A modified Manhattan distance function is used to compute the distance between text lines and clusters; the modification lets the distance reflect the aspect ratio of the mail piece. An experiment with live Korean mail piece images demonstrated the superiority of the proposed method. The success rate for 1,988 test images was about 93.56%.
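
The cluster initialization and a single assignment step described above can be sketched as follows. The abstract does not give the exact form of the modified Manhattan distance, so the version here, which scales the vertical offset by the width-to-height ratio, is an assumption, as are the function names and the sample coordinates.

```python
def grid_centers(width, height):
    """Centers of the nine equal areas of a 3x3 grid, in row-major order."""
    return [((col + 0.5) * width / 3, (row + 0.5) * height / 3)
            for row in range(3) for col in range(3)]


def weighted_manhattan(p, q, aspect_ratio):
    """Manhattan distance with the vertical offset scaled by the aspect ratio.

    Assumed form of the modification; the paper's exact function is not given.
    """
    return abs(p[0] - q[0]) + aspect_ratio * abs(p[1] - q[1])


def assign_clusters(line_centers, width, height):
    """Assign each text-line center to the nearest of the nine initial clusters."""
    centers = grid_centers(width, height)
    ratio = width / height
    return [min(range(9), key=lambda k: weighted_manhattan(c, centers[k], ratio))
            for c in line_centers]


# Hypothetical 900x300 mail piece image with three text-line centers.
labels = assign_clusters([(140, 60), (460, 140), (700, 260)], 900, 300)
print(labels)
```

A full clustering would repeat the assignment and update the cluster centers until they stop changing; only the first iteration is shown.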