• Title/Summary/Keyword: Handwritten Data

Improved Handwritten Hangeul Recognition using Deep Learning based on GoogLenet (GoogLenet 기반의 딥 러닝을 이용한 향상된 한글 필기체 인식)

  • Kim, Hyunwoo;Chung, Yoojin
    • The Journal of the Korea Contents Association / v.18 no.7 / pp.495-502 / 2018
  • The advent of deep learning has brought rapid progress in handwritten character recognition in many languages. Handwritten Chinese recognition has reached 97.2% accuracy, and handwritten Japanese recognition has approached 99.53% accuracy. Handwritten Hangeul contains many similar-looking characters owing to the structure of the script, and the small amount of available data has made recognition difficult. Earlier work on handwritten Hangeul recognition using Hybrid Learning used a shallow model based on LeNet and reported 96.34% accuracy on the handwritten Hangeul database PE92. In this paper, we obtain 98.64% accuracy by building a deep CNN (Convolutional Neural Network) for handwritten Hangeul recognition. We designed a new network for handwritten Hangeul data based on GoogLeNet, without using the data augmentation or multitasking techniques used in Hybrid Learning.

HANDWRITTEN HANGUL RECOGNITION MODEL USING MULTI-LABEL CLASSIFICATION

  • HANA CHOI
    • Journal of the Korean Society for Industrial and Applied Mathematics / v.27 no.2 / pp.135-145 / 2023
  • With the recent development of deep learning, various deep learning techniques have been introduced into handwriting recognition and have greatly improved performance. The accuracy of handwritten Hangul recognition has also improved significantly, but prior research has focused on recognizing 520 or 2,350 Hangul characters using the SERI95 or PE92 data. In the past, 2,350 Hangul characters were enough for most expressions, but as globalization progresses and information and communication technology develops, there are many cases where various foreign words need to be written in Hangul. In this paper, we propose a model that uses multi-label classification to recognize and combine the initial consonant, medial vowel, and final consonant of a Korean syllable, and it achieves a high recognition accuracy of 98.38% when trained on PE92, a public data set of Korean handwritten characters. In addition, although the model was trained on only 2,350 Hangul characters, it can recognize characters that are not included in that set.
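The "recognize and combine" step in the abstract above can be illustrated with the Unicode composition rule for precomposed Hangul syllables. This is a minimal sketch of the combining step only, not the paper's model: it assumes a classifier has already produced one initial, one medial, and one final label, and the names `compose` and `compose_jamo` are hypothetical.

```python
# Unicode orders the 19 * 21 * 28 = 11,172 precomposed Hangul syllables
# by (initial, medial, final) index, starting at U+AC00.
INITIALS = "ㄱㄲㄴㄷㄸㄹㅁㅂㅃㅅㅆㅇㅈㅉㅊㅋㅌㅍㅎ"                    # 19 initial consonants
MEDIALS = "ㅏㅐㅑㅒㅓㅔㅕㅖㅗㅘㅙㅚㅛㅜㅝㅞㅟㅠㅡㅢㅣ"                 # 21 medial vowels
FINALS = [""] + list("ㄱㄲㄳㄴㄵㄶㄷㄹㄺㄻㄼㄽㄾㄿㅀㅁㅂㅄㅅㅆㅇㅈㅊㅋㅌㅍㅎ")  # 28, index 0 = no final

HANGUL_BASE = 0xAC00


def compose(initial: int, medial: int, final: int = 0) -> str:
    """Combine three predicted label indices into one Hangul syllable."""
    if not (0 <= initial < len(INITIALS)):
        raise ValueError("initial index out of range")
    if not (0 <= medial < len(MEDIALS)):
        raise ValueError("medial index out of range")
    if not (0 <= final < len(FINALS)):
        raise ValueError("final index out of range")
    return chr(HANGUL_BASE + (initial * len(MEDIALS) + medial) * len(FINALS) + final)


def compose_jamo(initial: str, medial: str, final: str = "") -> str:
    """Same as compose(), but takes the jamo characters themselves."""
    return compose(INITIALS.index(initial), MEDIALS.index(medial), FINALS.index(final))


print(compose(18, 0, 4))              # ㅎ + ㅏ + ㄴ
print(compose_jamo("ㄱ", "ㅡ", "ㄹ"))  # ㄱ + ㅡ + ㄹ
```

Because the three labels are predicted separately, any of the 11,172 combinations can be produced, which is consistent with the abstract's claim that syllables outside the 2,350-character training set remain reachable.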

A Production Traceability Information Gathering System based on Handwritten Data Digitalization Technology in Agro-livestock Products (수기정보 전자화 기술 기반의 농축산물 생산이력정보 수집 시스템)

  • Son, Bong-Ki
    • Journal of the Korea Academia-Industrial cooperation Society / v.12 no.10 / pp.4632-4641 / 2011
  • Detailed production traceability information is fundamental to the successful introduction and revitalization of a traceability system. In this paper, we propose a production traceability information gathering system for agro-livestock products based on handwritten data digitalization technology. With the proposed system, detailed production traceability information can be gathered effectively simply by filling in a paper management ledger with a digital pen. The system's server generates a digital image identical to the ledger and converts the handwritten data into digital text for insertion into the database. Because the system is superior to data gathering systems based on PCs, PDAs, and touch screens in mobility, usability, data input speed, and suitability for the agro-livestock environment, users can gather high-quality traceability information effectively even if they have limited IT skills and little time for data entry. We expect that handwritten data digitalization technology can be used to gather document-based information in the manufacturing, distribution, and marketing stages. In addition, this technology can be applied to implementing advanced traceability systems together with RFID/USN-based systems.

A Dataset of Online Handwritten Assamese Characters

  • Baruah, Udayan;Hazarika, Shyamanta M.
    • Journal of Information Processing Systems / v.11 no.3 / pp.325-341 / 2015
  • This paper describes the Tezpur University dataset of online handwritten Assamese characters. The online data acquisition process involves the capturing of data as the text is written on a digitizer with an electronic pen. A sensor picks up the pen-tip movements, as well as pen-up/pen-down switching. The dataset contains 8,235 isolated online handwritten Assamese characters. Preliminary results on the classification of online handwritten Assamese characters using the above dataset are presented in this paper. The use of the support vector machine classifier and the classification accuracy for three different feature vectors are explored in our research.

A Study of Construction of Character Image Data for Recognition Handwritten Text (필기체 문자 인식을 위한 문자 영상 데이터 구축에 관한 연구)

  • Lee, H.R.;Ko, K.C.;Lee, M.R.
    • Annual Conference on Human and Language Technology / 2000.10d / pp.63-67 / 2000
  • Developing a character recognition system requires, as essential preliminary work, the collection of standard image data. For this purpose, digitized images of handwritten characters were collected. The collected image data cover Korean characters, Chinese characters, numerals, English characters, special characters, and so on. This paper describes a handwritten character image database, which was designed and built with a storage structure different from that of general large-capacity multimedia storage.

Oversampling-Based Ensemble Learning Methods for Imbalanced Data (불균형 데이터 처리를 위한 과표본화 기반 앙상블 학습 기법)

  • Kim, Kyung-Min;Jang, Ha-Young;Zhang, Byoung-Tak
    • KIISE Transactions on Computing Practices / v.20 no.10 / pp.549-554 / 2014
  • Handwritten character recognition data is usually imbalanced because it is collected from natural language sentences written by different writers. Imbalanced data can seriously degrade the performance of most machine learning algorithms. However, this problem is typically ignored in handwritten character recognition, because most of the difficulty is attributed to the high variance in the data set and the similar shapes of characters. We propose oversampling-based ensemble learning methods to solve the imbalanced data problem in handwritten character recognition and to improve recognition accuracy. We also show empirically that the proposed methods improve the recognition accuracy of minority classes as well as overall recognition accuracy.
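The general idea of an oversampling-based ensemble can be sketched as follows. The abstract does not specify the oversampling scheme or the base learner, so everything here is an assumption made for illustration: random duplication of minority samples, a nearest-centroid classifier as a placeholder base learner, and majority voting. All function names are hypothetical.

```python
import random
from collections import Counter, defaultdict


def oversample(samples, rng):
    """Randomly duplicate minority-class samples until every class matches the largest one."""
    by_label = defaultdict(list)
    for features, label in samples:
        by_label[label].append((features, label))
    target = max(len(group) for group in by_label.values())
    balanced = []
    for group in by_label.values():
        balanced.extend(group)
        balanced.extend(rng.choice(group) for _ in range(target - len(group)))
    return balanced


def fit_centroids(samples):
    """Placeholder base learner (assumption): one mean vector per class."""
    sums, counts = {}, Counter()
    for features, label in samples:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] += 1
    return {label: [v / counts[label] for v in acc] for label, acc in sums.items()}


def predict_centroid(centroids, features):
    """Return the label whose centroid is closest in squared Euclidean distance."""
    return min(
        centroids,
        key=lambda label: sum((c - f) ** 2 for c, f in zip(centroids[label], features)),
    )


def fit_ensemble(samples, n_models=5, seed=0):
    """Train each member on a differently oversampled copy of the data."""
    rng = random.Random(seed)
    return [fit_centroids(oversample(samples, rng)) for _ in range(n_models)]


def predict_ensemble(models, features):
    """Majority vote over the ensemble members."""
    votes = Counter(predict_centroid(model, features) for model in models)
    return votes.most_common(1)[0][0]


# Toy data: six samples of class "a", two of minority class "b".
data = [((0.0, 0.0), "a"), ((0.5, 0.1), "a"), ((0.2, 0.4), "a"),
        ((0.1, 0.3), "a"), ((0.4, 0.2), "a"), ((0.3, 0.0), "a"),
        ((5.0, 5.0), "b"), ((6.0, 5.0), "b")]
models = fit_ensemble(data)
print(predict_ensemble(models, (4.5, 5.0)))
```

Each member sees a balanced training set, but the random choice of which minority samples are duplicated makes the members differ, which is what gives the vote something to average over.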

A Study on Phoneme Extractions and Recognitions for Handwritten Korean Characters using Context-Free Grammar (CFG 방법을 이용한 필기체 한글에서의 자소추출과 인식에 관한 연구)

  • 김형래;박인갑;서동필;김에녹
    • Journal of the Korean Institute of Telematics and Electronics B / v.29B no.9 / pp.8-16 / 1992
  • This paper presents a method that recognizes handwritten Korean characters by using a context-free grammar. The input characters are thinned to reduce the amount of data, and the thinned characters are converted into one-dimensional strings according to six forms. When a point of contact between phonemes is found, the two phonemes are separated by placing an index mark (\) at that point. The context-free grammars for the input characters are classified into group grammars according to the similarity of phonemes, and the input characters are parsed using the pushdown automaton method. Because bent parts occur frequently in handwritten characters, we correct for them by using a parsing distance measure, which recognizes a character according to the minimum value obtained by measuring the weighted distance between two sentences. In the experiment, the recognition rate was 93.8% on 275 handwritten Korean characters.

A Handwritten Document Digitalization Framework based Defect Management System in Educational Facilities (수기문서 전자화 프레임워크 기반의 교육시설 하자관리 시스템)

  • Son, Bong-Ki
    • The Journal of Sustainable Design and Educational Environment Research / v.9 no.3 / pp.1-11 / 2010
  • In the construction industry, IT-based information systems have been applied in diverse ways to increase productivity. Although IT devices such as PDAs, RFID, barcodes, wireless networks, and web cameras have been introduced to gather information on construction sites, their effect is limited because they create additional work for engineers. In this paper, we propose a defect management system based on a handwritten document digitalization framework to demonstrate the applicability of a new IT device, the digital pen. With the proposed system, defect information can be gathered and entered into the defect management system effectively by using a digital pen and paper in the conventional way. Applying the digital pen as a data gathering device in defect management can increase productivity by improving the work process and by building up and utilizing a high-quality defect information database.

Recognition of Online Handwritten Digit using Zernike Moment and Neural Network (Zerinke 모멘트와 신경망을 이용한 온라인 필기체 숫자 인식)

  • Mun, Won-Ho;Choi, Yeon-Suk;Cha, Eui-Young
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2010.05a / pp.205-208 / 2010
  • We introduce a novel feature extraction scheme for online handwritten digits based on Zernike moments and an angulation feature. The time-sequential signal from mouse movement on the writing pad is described as a sequence of consecutive points on the x-y plane. Preprocessing therefore yields a data set of successive, time-sequential pixel positions. The preprocessed data are used for Zernike moment and angulation feature extraction. This feature is scale-, translation-, and rotation-invariant. The extracted feature is fed to a BP (backpropagation) neural network, which in turn classifies it as one of the nine digits. In experiments on 200 handwritten digit samples, the proposed method not only shows a high recognition rate but also needs less training data.
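One way an angle-based feature on a point sequence can be made scale-, translation-, and rotation-invariant is to use the turning angle between consecutive segments. The abstract does not define its angulation feature, so the sketch below is an assumption for illustration only, and it does not cover the Zernike moment part. The function name `turning_angles` is hypothetical.

```python
import math


def turning_angles(points):
    """Signed turning angle (radians) at each interior point of a stroke.

    Only differences between segment directions are used, so the result does
    not change under translation, uniform scaling, or rotation of the stroke.
    Repeated points should be removed beforehand, since a zero-length segment
    has no defined direction.
    """
    angles = []
    for (x0, y0), (x1, y1), (x2, y2) in zip(points, points[1:], points[2:]):
        incoming = math.atan2(y1 - y0, x1 - x0)
        outgoing = math.atan2(y2 - y1, x2 - x1)
        # Wrap the difference into the range [-pi, pi).
        delta = (outgoing - incoming + math.pi) % (2 * math.pi) - math.pi
        angles.append(delta)
    return angles


def transform(points, angle, scale, dx, dy):
    """Rotate, scale, and shift a stroke (used here only to check invariance)."""
    c, s = math.cos(angle), math.sin(angle)
    return [(scale * (c * x - s * y) + dx, scale * (s * x + c * y) + dy) for x, y in points]


stroke = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
original = turning_angles(stroke)
moved = turning_angles(transform(stroke, math.radians(30), 2.0, 3.0, -1.0))
print(original)
print(moved)
```

The two printed lists agree up to floating-point error, which is the invariance property the abstract attributes to its feature.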
