• Title/Abstract/Keywords: Handwritten Document

Search results: 16 items, processing time 0.018 s

Text Line Segmentation using AHTC and Watershed Algorithm for Handwritten Document Images

  • Oh, KangHan;Kim, SooHyung;Na, InSeop;Kim, GwangBok
    • International Journal of Contents
    • /
    • Vol. 10, No. 3
    • /
    • pp.35-40
    • /
    • 2014
  • Text line segmentation is a critical task in handwritten document recognition. In this paper, we propose a novel text-line-segmentation method using baseline estimation and watershed. The baseline-detection algorithm estimates the baseline using Adaptive Head-Tail Connection (AHTC) on the document. Then, the watershed method segments the line region using the baseline-detection result. Finally, the text lines are separated using the watershed result, and a post-processing algorithm refines the line boundaries. The scheme segments text lines with 97% accuracy on the handwritten document images in the ICDAR database.

Language Identification in Handwritten Words Using a Convolutional Neural Network

  • Tung, Trieu Son;Lee, Gueesang
    • International Journal of Contents
    • /
    • Vol. 13, No. 3
    • /
    • pp.38-42
    • /
    • 2017
  • Documents from the last few decades often contain more than one language, so classifying the language of each word is essential, especially for English and Korean in handwritten documents. Traditional methods mostly rely on conventional structural or stroke features, but these often fail to capture many characteristics of words because of the complexity introduced by handwriting. As a result, traditional methods make the task considerably more complicated and can yield poor results. In this study, a convolutional neural network (CNN) is used to classify English and Korean handwritten words in text documents. Experimental results show that the proposed method works effectively compared to previous methods.

Machine Printed and Handwritten Text Discrimination in Korean Document Images

  • Trieu, Son Tung;Lee, Guee Sang
    • 스마트미디어저널
    • /
    • Vol. 5, No. 3
    • /
    • pp.30-34
    • /
    • 2016
  • There are now many Korean documents whose text often needs to be identified as either printed or handwritten. Early identification methods use structural features, which are simple and easy to apply to text in a specific font, but their performance depends on the font type and the characteristics of the text. Recently, the bag-of-words model has been used for this identification; it can be invariant to changes in font size, distortions, or modifications to the text. The bag-of-words method consists of three steps: word segmentation using connected component grouping, feature extraction, and finally classification using an SVM (Support Vector Machine). In this paper, a bag-of-words method using SURF (Speeded Up Robust Features) is proposed for identifying machine-printed and handwritten text in Korean documents. The experiment shows that the proposed method outperforms methods based on structural features.
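The feature-extraction step of a bag-of-words pipeline can be illustrated with a minimal sketch. It assumes local descriptors have already been extracted from a segmented word and that a codebook of visual words is available. The function names, the toy 2-D descriptors, and the codebook below are hypothetical and are not taken from the paper; real SURF descriptors have 64 or 128 dimensions, and the resulting histogram would be passed to an SVM.

```python
from math import dist


def nearest_codeword(descriptor, codebook):
    """Return the index of the codebook entry closest to the descriptor."""
    return min(range(len(codebook)), key=lambda i: dist(descriptor, codebook[i]))


def bow_histogram(descriptors, codebook):
    """Build a normalized visual-word histogram for one word image."""
    counts = [0] * len(codebook)
    for descriptor in descriptors:
        counts[nearest_codeword(descriptor, codebook)] += 1
    total = sum(counts)
    if total == 0:
        return [0.0] * len(codebook)
    return [count / total for count in counts]


# Hypothetical toy data: a 3-word codebook and 4 descriptors in 2-D.
codebook = [(0.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
descriptors = [(0.1, 0.0), (0.9, 1.1), (1.0, 0.8), (0.1, 0.9)]

histogram = bow_histogram(descriptors, codebook)
print(histogram)  # [0.25, 0.5, 0.25]
```

Because the histogram is normalized by the number of descriptors, it does not depend on how many keypoints a word produces, which is one reason such a representation can tolerate changes in font size.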

디지털펜과 필기체인식 기술을 이용한 수기문서 전자화 프레임워크 (A Framework for Digitalizing Handwritten Document using Digital Pen and Handwriting Recognition Technology)

  • 손봉기;김학준
    • 한국산학기술학회논문지
    • /
    • Vol. 12, No. 3
    • /
    • pp.1417-1426
    • /
    • 2011
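  • Many business settings still acquire information on paper documents because of the nature of the work or legal constraints. Such handwritten documents must be digitized and converted into digital documents so that IT systems can process and manage the information in real time. Existing document digitization systems digitize handwritten documents through scanning and post-processing, which makes continuous business processing difficult. This paper proposes LiveForm, a framework for digitizing handwritten documents using a digital pen and handwriting recognition technology. To demonstrate the applicability of the proposed framework, we also implement a LiveForm-based distribution service for industrial specialty gases and analyze the effects of applying it. When a paper document printed with absolute coordinate values is filled in with a digital pen, LiveForm generates a digital image identical to the written document, converts the recorded information into digital text through handwriting recognition, and enters it into the business system automatically. In work that relies heavily on paper-based information acquisition, a LiveForm-based application system can reflect the acquired information in the business system automatically, without scanning or manual data entry for document digitization, and can therefore greatly improve the business process.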

수기문서 전자화 프레임워크 기반의 교육시설 하자관리 시스템 (A Handwritten Document Digitalization Framework based Defect Management System in Educational Facilities)

  • 손봉기
    • 교육녹색환경연구
    • /
    • Vol. 9, No. 3
    • /
    • pp.1-11
    • /
    • 2010
  • In the construction industry, IT-based information systems have been applied in diverse ways to increase productivity. Although IT devices such as PDAs, RFID, barcodes, wireless networks, and web cameras have been introduced to gather information at construction sites, their effect is limited because they create additional work for engineers. In this paper, we propose a defect management system based on a handwritten document digitalization framework to show the applicability of a new IT device, the digital pen. With the proposed system, defect information can be gathered and entered into the defect management system effectively using a digital pen and paper in the conventional way. Applying the digital pen as a data-gathering device to defect management can increase productivity by improving the work process and by building and using a high-quality defect information database.

수기정보 전자화 기술 기반의 농축산물 생산이력정보 수집 시스템 (A Production Traceability Information Gathering System based on Handwritten Data Digitalization Technology in Agro-livestock Products)

  • 손봉기
    • 한국산학기술학회논문지
    • /
    • Vol. 12, No. 10
    • /
    • pp.4632-4641
    • /
    • 2011
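  • This paper proposes a production traceability information gathering system for agro-livestock products, based on handwritten data digitalization technology, that can efficiently collect production history information, a key foundation for the successful introduction and expansion of the agro-livestock product traceability system. Simply by filling in a paper-based management ledger with a digital pen, the proposed system generates a digital image identical to the written ledger and stores the recorded content in a database through handwriting recognition. Compared with information-gathering devices such as PCs, PDAs, and touch screens, the proposed system is superior in mobility, ease of use, and data entry speed, and it is well suited to harsh agricultural and livestock working environments, so farms that lack computer skills and spare time can efficiently collect high-quality production history information. The handwritten data digitalization technology can also be applied to paper-based information acquisition in the processing, distribution, and sales stages, and can be linked with RFID/USN-based systems to build advanced traceability management systems.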

Document Structure Understanding on Subjects Registration Table

  • Ito, Yuichi;Ohno, Masanaga;Tsuruoka, Shinji;Yoshikawa, Tomohiro;Tsuyoshi, Shinogi
    • 한국지능시스템학회:학술대회논문집
    • /
    • 한국퍼지및지능시스템학회, ISIS 2003 (2003)
    • /
    • pp.571-574
    • /
    • 2003
  • This research aims to automate the generation of a database from paper-based table forms. The registration table has a complicated table structure, and in this research we used registration tables as an example of general table structure understanding. We propose a table structure understanding system for several table types, which works in several steps. In the first step, the paper document images are read with an image scanner. In the second step, each document image is segmented into tables. In the third step, the character strings are extracted using image processing technology and the property of each character string is determined. The structured database is then generated automatically. The proposed system consists of two subsystems. The "Master document generation system" is used for the table form definition and does not involve handwritten characters. The "Structure analysis system for completed table" is used for the written form and analyzes the table form filled in with handwritten characters. We implemented the system using MS Visual C++ on Windows, and it achieved a correct extraction rate of 98% on 51 registration tables written by different students.

Text Line Segmentation of Handwritten Documents by Area Mapping

  • Boragule, Abhijeet;Lee, GueeSang
    • 스마트미디어저널
    • /
    • Vol. 4, No. 3
    • /
    • pp.44-49
    • /
    • 2015
  • Text line segmentation is a preprocessing step in OCR that can significantly influence the accuracy of document analysis applications. This paper proposes a novel methodology for the text line segmentation of handwritten documents. First, the average width of the connected components is used to form a 1-D Gaussian kernel, and a smoothing operation is then applied to the input binary image. The adaptive binarization of the smoothed image forms the final text lines. In this work, the segmentation method involves two stages: first, the large connected components are labelled as unique text lines using text line area mapping; second, the final refinement of the segmentation is performed using the Euclidean distance between the text line and small connected components. The group of uniquely labelled text candidates achieves promising segmentation results. The proposed approach works well on Korean and English handwritten documents captured using a camera.
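The smoothing step described in this abstract can be sketched in a few lines. This is only an illustration under stated assumptions: the average length of horizontal foreground runs stands in for the average connected-component width, the kernel's standard deviation is set to half that width, and a fixed threshold replaces the adaptive binarization. None of these choices, nor the function names, come from the paper.

```python
from math import exp


def average_run_width(image):
    """Mean length of horizontal foreground runs (stand-in for component width)."""
    runs = []
    for row in image:
        length = 0
        for value in row:
            if value:
                length += 1
            elif length:
                runs.append(length)
                length = 0
        if length:
            runs.append(length)
    return sum(runs) / len(runs) if runs else 0.0


def gaussian_kernel(width):
    """Normalized 1-D Gaussian kernel; sigma = width / 2 is an assumed rule."""
    sigma = max(width / 2.0, 1e-6)
    radius = max(int(round(width)), 1)
    weights = [exp(-(x * x) / (2 * sigma * sigma)) for x in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def smooth_rows(image, kernel):
    """Convolve each row with the kernel; pixels outside the image count as 0."""
    radius = len(kernel) // 2
    smoothed = []
    for row in image:
        n = len(row)
        smoothed.append([
            sum(kernel[k + radius] * row[x + k]
                for k in range(-radius, radius + 1) if 0 <= x + k < n)
            for x in range(n)
        ])
    return smoothed


def binarize(image, threshold):
    """Fixed-threshold binarization (a simplification of adaptive binarization)."""
    return [[1 if value >= threshold else 0 for value in row] for row in image]


# Two short strokes on one row, separated by a two-pixel gap.
image = [[0, 1, 1, 0, 0, 1, 1, 0]]
kernel = gaussian_kernel(average_run_width(image))
line_mask = binarize(smooth_rows(image, kernel), threshold=0.32)
print(line_mask)  # [[0, 1, 1, 1, 1, 1, 1, 0]]
```

Horizontal smoothing bridges the gap between the two strokes, so they fall into a single candidate line region, while the empty margins stay background.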

문서 입출력 시스템의 구성에 관한 연구 (A Study on the Construction of a Document Input/Output system)

  • 함영국;도상윤;정홍규;김우성;박래홍;이창범;김상중
    • 전자공학회논문지B
    • /
    • Vol. 29B, No. 10
    • /
    • pp.100-112
    • /
    • 1992
  • In this paper, an integrated document input/output system is developed that constructs a graphic document from a text file, converts the document into encoded facsimile data, and recognizes printed and handwritten alphanumerics and Korean characters in a facsimile or graphic document. For the output system, we develop a method that generates bit-map patterns from a document consisting of KSC5601 and ASCII codes. The binary graphic image is, if necessary, encoded by the G3 coding scheme for facsimile transmission. For a user-friendly input system for documents consisting of alphanumerics and Korean characters obtained from a facsimile or scanner, we propose a document recognition algorithm that uses several special features (partial projection, cross point, and distance features) and the membership function of fuzzy set theory. In summary, we develop an integrated document input/output system and demonstrate its performance via computer simulation.

정비작업의 생산성 향상을 위한 전자문서자동화시스템 모형 - 건설장비 정비작업을 중심으로 - (Electronic Document Automation System Model for Improving Productivity in maintenance work - in Inspection Process of Construction Equipment Maintenance -)

  • 공명달
    • 대한안전경영과학회지
    • /
    • Vol. 19, No. 3
    • /
    • pp.49-58
    • /
    • 2017
  • This paper proposes a model that improves the interaction and interface between an MES (Manufacturing Execution System) server and a POP (Point of Production) terminal by means of an electronic document server, an electronic pen, a Bluetooth receiver, and form paper in disassembly and process inspection work. The proposed model shows that the new method, based on an electronic document automation system, reduces the processing time for maintenance work compared with the current handwritten processing system. The effects of the proposed model are as follows: (a) the processing time per piece of equipment for maintenance was 300 minutes with the current method and 50 minutes with the new method; (b) the processing error ratio was 20% with the current method and 1% with the new method.
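To put the reported figures in relative terms, the short calculation below derives the reductions implied by the numbers in the abstract. The helper name is hypothetical, and the derived percentages are simple arithmetic on the stated values, not results reported by the paper.

```python
def relative_reduction(before, after):
    """Fraction by which a quantity shrinks when going from `before` to `after`."""
    return (before - after) / before


# Figures stated in the abstract.
time_before_min, time_after_min = 300, 50
error_before_pct, error_after_pct = 20, 1

time_reduction = relative_reduction(time_before_min, time_after_min)
speedup = time_before_min / time_after_min
error_reduction = relative_reduction(error_before_pct, error_after_pct)
error_drop_points = error_before_pct - error_after_pct

print(f"Processing time reduced by {time_reduction:.1%} ({speedup:.0f}x faster)")
print(f"Error ratio reduced by {error_reduction:.0%} ({error_drop_points} percentage points)")
```

In other words, the stated figures correspond to roughly an 83% cut in processing time per piece of equipment and a 95% relative cut in the error ratio.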