• Title/Summary/Keyword: handwritten


A Comprehensive Approach for Tamil Handwritten Character Recognition with Feature Selection and Ensemble Learning

  • Manoj K;Iyapparaja M
    • KSII Transactions on Internet and Information Systems (TIIS) / v.18 no.6 / pp.1540-1561 / 2024
  • This research proposes a novel approach to Tamil Handwritten Character Recognition (THCR) that combines feature selection with ensemble learning. The Tamil script is complex and highly variable, so it requires a robust and accurate recognition system. Feature selection reduces dimensionality while preserving discriminative features, which improves classification performance and lowers computational complexity. Several feature selection methods are compared, and individual classifiers (support vector machines, neural networks, and decision trees) are evaluated through extensive experiments. Ensemble learning techniques such as bagging and boosting are employed to leverage the strengths of multiple classifiers and enhance recognition accuracy. The approach is evaluated on the HP Labs Dataset, where an ensemble learning framework based on support vector machines achieves 95.56% accuracy. The dataset consists of 82,928 samples in 247 distinct classes, contributed by 500 participants from Tamil Nadu. It includes 40,000 characters with 500 user variations. The results surpass or rival existing methods, demonstrating the effectiveness of the approach. The research also offers insights for developing recognition systems for other complex scripts. Future work could integrate deep learning techniques and extend the approach to other Indic scripts and languages.
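The pipeline described in this abstract (select features, then combine several classifiers trained on bootstrap samples) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: the abstract does not say which feature selection methods were compared, so a simple variance ranking is used, and a nearest-centroid learner stands in for the support vector machines so that the example needs only the standard library.

```python
import random
from collections import Counter
from statistics import pvariance


def select_top_variance(X, k):
    """Return indices of the k highest-variance features (a simple filter method)."""
    n_features = len(X[0])
    variances = [pvariance([row[j] for row in X]) for j in range(n_features)]
    return sorted(range(n_features), key=lambda j: variances[j], reverse=True)[:k]


def fit_centroids(X, y):
    """Nearest-centroid base learner (a stand-in for the SVMs in the abstract)."""
    sums, counts = {}, Counter(y)
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(row))
        for j, value in enumerate(row):
            acc[j] += value
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}


def predict_centroid(centroids, row):
    """Predict the label whose centroid is closest in squared Euclidean distance."""
    return min(
        centroids,
        key=lambda c: sum((a - b) ** 2 for a, b in zip(centroids[c], row)),
    )


def bagging_fit(X, y, n_models=5, seed=0):
    """Train n_models base learners, each on a bootstrap sample of the data."""
    rng = random.Random(seed)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(len(X)) for _ in X]  # sample with replacement
        models.append(fit_centroids([X[i] for i in idx], [y[i] for i in idx]))
    return models


def bagging_predict(models, row):
    """Combine the base learners by majority vote."""
    votes = Counter(predict_centroid(m, row) for m in models)
    return votes.most_common(1)[0][0]
```

In use, `select_top_variance` would be applied first and the retained columns passed to `bagging_fit`; boosting, which the abstract also mentions, is not shown.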

A Study on Binarization of Handwritten Character Image (필기체 문자 영상의 이진화에 관한 연구)

  • 최영규;이상범
    • Journal of the Korea Computer Industry Society / v.3 no.5 / pp.575-584 / 2002
  • On-line handwritten character recognition has achieved successful results because neural networks can effectively segment letters using the temporal order and position of strokes. Off-line handwritten character recognition, however, suffers from incomplete preprocessing: it has no motion or timing information, letters frequently overlap, and noise is common. Off-line recognition therefore requires the study of various methods. This paper applies the watershed algorithm to preprocessing for off-line handwritten Hangul character recognition. It presents an effective four-step watershed method that takes into account both the execution time of the algorithm and the quality of the resulting image. Applying the watershed algorithm with this structure to preprocessing yields good image enhancement and binarization. The experiments compare the previous method with the proposed method in terms of execution time and image quality. The average execution time is 2.16 seconds for the previous method and 1.72 seconds for the proposed method. The proposed method also removes noise effectively where strokes overlap, whereas the previous method does not appear to.


An Efficient Slant Correction for Handwritten Hangul Strings using Structural Properties (한글필기체의 구조적 특징을 이용한 효율적 기울기 보정)

  • 유대근;김경환
    • Journal of KIISE:Software and Applications / v.30 no.1_2 / pp.93-102 / 2003
  • This paper presents a slant correction method for handwritten Korean strings based on an analysis of stroke distribution, which effectively reflects the structural properties of Korean characters. The method addresses problems frequently observed when conventional approaches developed for English and other European languages are applied to handwritten Korean strings. Strokes extracted from a line of text image are classified into two clusters by K-means clustering. Gaussian modeling is applied to each cluster, and the slant angle is estimated from the model that represents the vertical strokes. Experimental results support the effectiveness of the proposed method. In a performance comparison on 1,300 handwritten address string images, the proposed method outperformed other conventional approaches.
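The estimation steps in this abstract (two-cluster K-means, a Gaussian per cluster, slant read from the vertical-stroke model) can be sketched as follows. The input representation is an assumption: the abstract does not say which stroke measurement is clustered, so this sketch takes one angle per stroke, in degrees from the vertical axis, and treats the cluster whose mean is closest to 0 as the vertical strokes.

```python
from statistics import mean, pstdev


def two_means_1d(values, iterations=50):
    """Split 1-D values into two clusters with K-means (k=2), seeded at min and max."""
    c0, c1 = min(values), max(values)
    g0, g1 = list(values), []
    for _ in range(iterations):
        g0 = [v for v in values if abs(v - c0) <= abs(v - c1)]
        g1 = [v for v in values if abs(v - c0) > abs(v - c1)]
        if not g1:  # all values identical: only one cluster exists
            break
        n0, n1 = mean(g0), mean(g1)
        if (n0, n1) == (c0, c1):  # centroids stopped moving
            break
        c0, c1 = n0, n1
    return g0, g1


def estimate_slant(angles_from_vertical):
    """Fit a Gaussian (mean, std) to each cluster and return the near-vertical one."""
    clusters = [c for c in two_means_1d(angles_from_vertical) if c]
    models = [(mean(c), pstdev(c)) for c in clusters]
    # Assumption: the vertical-stroke cluster is the one whose mean is nearest 0 degrees.
    return min(models, key=lambda m: abs(m[0]))
```

The returned mean is the slant estimate and the standard deviation indicates how consistent the vertical strokes are; the subsequent shear that removes the slant is not shown.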

Recognition of Handwritten Numerals using Hybrid Features And Combined Classifier (복합 특징과 결합 인식기에 의한 필기체 숫자인식)

  • 박중조;송영기;김경민
    • Journal of the Korea Institute of Information and Communication Engineering / v.5 no.1 / pp.14-22 / 2001
  • Off-line handwritten numeral recognition is a very difficult task, and high recognition rates are hard to achieve with a single feature and a single classifier, because handwritten numerals contain many pattern variations that depend mostly on individual writing styles. In this paper, we propose a handwritten numeral recognition system that uses hybrid features and a combined classifier. To improve the recognition rate, we select mutually complementary features (directional features, a crossing point feature, and mesh features) and build three new hybrid feature sets from them. These hybrid feature sets capture both the local and global characteristics of the input numeral images. We implement the combined classifier by combining three neural network classifiers, using a fuzzy integral to fuse the network outputs. To verify the performance of the proposed system, we performed experiments on the unconstrained handwritten numeral database of Concordia University, Canada. Our method achieved a recognition rate of 97.85%.
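Two of the feature types named in this abstract can be illustrated on a binary image. The definitions below are common textbook readings and are assumptions here, since the abstract does not define them: a mesh feature is taken as the foreground density of each grid cell, and a crossing feature as the number of background-to-foreground transitions along each row. The image is assumed to be a list of rows of 0/1 pixels at least as large as the grid.

```python
def mesh_features(image, rows, cols):
    """Fraction of foreground pixels in each cell of a rows x cols grid."""
    h, w = len(image), len(image[0])
    feats = []
    for r in range(rows):
        for c in range(cols):
            y0, y1 = r * h // rows, (r + 1) * h // rows
            x0, x1 = c * w // cols, (c + 1) * w // cols
            cell = [image[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            feats.append(sum(cell) / len(cell))
    return feats


def crossing_counts(image):
    """Number of background-to-foreground transitions along each row."""
    counts = []
    for row in image:
        prev, n = 0, 0
        for px in row:
            if px and not prev:
                n += 1
            prev = px
        counts.append(n)
    return counts


def hybrid_features(image, rows, cols):
    """One possible hybrid set: mesh densities followed by row crossing counts."""
    return mesh_features(image, rows, cols) + crossing_counts(image)
```

Concatenation is only one way to form a hybrid set; the directional features and the fuzzy-integral fusion of the three networks are not shown.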


Triangulation Based Skeletonization and Trajectory Recovery for Handwritten Character Patterns

  • Phan, Dung;Na, In-Seop;Kim, Soo-Hyung;Lee, Guee-Sang;Yang, Hyung-Jeong
    • KSII Transactions on Internet and Information Systems (TIIS) / v.9 no.1 / pp.358-377 / 2015
  • In this paper, we propose a novel approach for trajectory recovery. Our system uses a triangulation procedure for skeletonization and graph theory to extract the trajectory. Skeletonization extracts the polyline skeleton from the polygonal contours of the handwritten characters; as a result, junctions become clear and touching characters are separated. Trajectory recovery uses graph theory to find the optimal path in the graph, that is, the path that best represents the trajectory. An undirected graph model consisting of one or more strokes is constructed from the polyline skeleton. Using the polyline skeleton accelerates the search for an optimal path. To evaluate performance, we built our own dataset, which includes test data and ground truth. The dataset consists of thousands of handwritten character and word images extracted from five handwritten documents. To show the relative advantage of our skeletonization method, we first compare its results against those of Zhang-Suen, a state-of-the-art skeletonization method. For trajectory recovery, we use the Root Mean Square Error (RMSE) and Dynamic Time Warping (DTW) to measure the error between the ground truth and the actual output. The comparison shows that our approach performs better in both the skeletonization stage and the trajectory recovery stage. Moreover, the processing time comparison shows that our system is faster than the existing systems.
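The two error measures named in this abstract can be sketched for trajectories given as sequences of (x, y) points. This is the textbook form of each measure, not necessarily the exact variant used in the paper; in particular, RMSE is assumed here to compare point i of one trajectory with point i of the other, so both must have the same length, while DTW allows different lengths.

```python
import math


def rmse(a, b):
    """Root mean square error between two equal-length point sequences."""
    if len(a) != len(b):
        raise ValueError("RMSE needs sequences of equal length")
    return math.sqrt(sum(math.dist(p, q) ** 2 for p, q in zip(a, b)) / len(a))


def dtw(a, b):
    """Dynamic time warping distance with Euclidean local cost."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = best cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]
```

DTW is the more forgiving of the two when a recovered trajectory samples the same path at a different rate, because repeated points can be aligned at zero cost.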

Oversampling-Based Ensemble Learning Methods for Imbalanced Data (불균형 데이터 처리를 위한 과표본화 기반 앙상블 학습 기법)

  • Kim, Kyung-Min;Jang, Ha-Young;Zhang, Byoung-Tak
    • KIISE Transactions on Computing Practices / v.20 no.10 / pp.549-554 / 2014
  • Handwritten character recognition data is usually imbalanced because it is collected from natural language sentences written by different writers. Imbalanced data can seriously degrade the performance of most machine learning algorithms. This problem is typically ignored in handwritten character recognition, however, because most of the difficulty is attributed to the high variance in the data set and to similar shapes between characters. We propose oversampling-based ensemble learning methods to address the imbalanced data problem in handwritten character recognition and to improve recognition accuracy. We also show empirically that the proposed method improves the recognition accuracy of minority classes as well as overall recognition accuracy.
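The idea of feeding oversampled data to an ensemble can be sketched with plain random oversampling. The abstract does not name the oversampling technique or how the ensemble members are built, so both choices below are assumptions: minority classes are topped up by drawing their own samples with replacement, and each ensemble member receives a differently seeded balanced copy of the training set.

```python
import random
from collections import defaultdict


def random_oversample(samples, labels, seed=0):
    """Duplicate minority-class samples at random until every class is equally large."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for sample, label in zip(samples, labels):
        by_class[label].append(sample)
    target = max(len(group) for group in by_class.values())
    out_samples, out_labels = [], []
    for label, group in by_class.items():
        extra = [rng.choice(group) for _ in range(target - len(group))]
        for sample in group + extra:
            out_samples.append(sample)
            out_labels.append(label)
    return out_samples, out_labels


def balanced_training_sets(samples, labels, n_members=3):
    """One balanced training set per (hypothetical) ensemble member."""
    return [random_oversample(samples, labels, seed=k) for k in range(n_members)]
```

Each balanced set would then train one classifier, and the members' predictions would be combined, for example by majority vote.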

Efficient Handwritten Character Verification Using an Improved Dynamic Time Warping Algorithm (개선된 동적 타임 워핑 알고리즘을 이용한 효율적인 필기문자 감정)

  • Jang, Seok-Woo;Park, Young-Jae;Kim, Gye-Young
    • Journal of the Korea Society of Computer and Information / v.15 no.7 / pp.19-26 / 2010
  • In this paper, we propose an efficient handwritten character verification method for on-line environments that automatically analyzes two input character strings and computes their degree of similarity. The proposed algorithm first applies the circular projection method to the input handwritten strings and extracts representative features such as shape and direction. It then calculates the similarity between the two character strings using an improved dynamic time warping (DTW) algorithm. We improve the conventional DTW algorithm by adopting the branch-and-bound policy, which is well known to produce good results in various optimization problems. Experimental results show that the proposed verification method runs faster than the existing DTW and DDTW algorithms.
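One way a bound can prune DTW is early abandoning, sketched below for 1-D feature sequences. This is a generic pruning scheme offered as an illustration and is not claimed to be the paper's branch-and-bound policy. It relies on local costs being non-negative: every warping path passes through each row of the cumulative cost table, so once the minimum of a row exceeds the bound, the final distance must exceed it too.

```python
def dtw_bounded(a, b, bound=float("inf")):
    """DTW distance between 1-D sequences; returns inf once the bound cannot be met."""
    inf = float("inf")
    prev = [0.0] + [inf] * len(b)  # row 0 of the cumulative cost table
    for x in a:
        cur = [inf] * (len(b) + 1)
        for j, y in enumerate(b, start=1):
            cur[j] = abs(x - y) + min(prev[j], cur[j - 1], prev[j - 1])
        if min(cur) > bound:  # no path through this row can finish within the bound
            return inf
        prev = cur
    return prev[-1]


def nearest(query, references):
    """Index and distance of the closest reference, pruning with the best distance so far."""
    best_index, best = -1, float("inf")
    for i, ref in enumerate(references):
        d = dtw_bounded(query, ref, best)
        if d < best:
            best_index, best = i, d
    return best_index, best
```

The saving comes from `nearest`: once a good match is found, later comparisons are cut short as soon as they can no longer beat it.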

A Study on Human Recognition Experiments with Handwritten Digit for Machine Recognition of Handwritten Digit (필기 숫자의 기계 인식을 위한 인간의 필기 숫자 인식 실험에 대한 고찰)

  • Yoon, Sung-Soo;Chung, Hyun-Sook;Yi, Kwang-Oh;Lee, Yill-Byeong;Lee, Sang-Ho
    • Journal of the Korean Institute of Intelligent Systems / v.18 no.3 / pp.373-380 / 2008
  • Much research has addressed machine recognition of handwritten digits, but it has not yet reached a level of performance that satisfies people. The dissatisfaction stems not only from low recognition accuracy but also from the dissimilarity between human and machine recognition results. To narrow this gap, we first conducted an experiment on human recognition of handwritten digits and then investigated the aspects of human recognition that make human results differ from machine results. By analyzing the experimental results, including uni- and bi-directionally confused pairs of digits, digits that are not confused with any other, and the redundancy of misrecognition, we identified the attributes that play an important role in the human recognition process. Based on these findings, we propose a direction for improving the accuracy of machine recognition and, furthermore, the similarity between human and machine recognition results.
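The notion of uni- and bi-directionally confused digit pairs can be made concrete with a small sketch over confusion counts. The definition used is an assumption, since the abstract does not give one: a pair is bi-directional when each digit is mistaken for the other at least `threshold` times, and uni-directional when only one direction reaches the threshold.

```python
def confused_pairs(confusion, threshold):
    """Split confused digit pairs into uni- and bi-directional sets.

    confusion maps (true_digit, recognized_digit) to a count.
    """
    uni, bi = set(), set()
    for (true, seen), count in confusion.items():
        if true == seen or count < threshold:
            continue
        reverse = confusion.get((seen, true), 0)
        if reverse >= threshold:
            bi.add(frozenset((true, seen)))  # unordered: confusion goes both ways
        else:
            uni.add((true, seen))  # ordered: true digit mistaken for seen digit
    return uni, bi
```

With made-up counts such as `{(4, 9): 7, (9, 4): 6, (1, 7): 5}` and a threshold of 3, the pair 4/9 is bi-directional and 1 mistaken for 7 is uni-directional.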

A Unicode based Deep Handwritten Character Recognition model for Telugu to English Language Translation

  • BV Subba Rao;J. Nageswara Rao;Bandi Vamsi;Venkata Nagaraju Thatha;Katta Subba Rao
    • International Journal of Computer Science & Network Security / v.24 no.2 / pp.101-112 / 2024
  • Telugu is considered the fourth most used language in India, especially in regions such as Andhra Pradesh, Telangana, and Karnataka, and it is also increasingly spoken in other countries. The language comprises dependent and independent vowels, consonants, and digits. Despite this, Telugu Handwritten Character Recognition (HCR) has not been widely developed. HCR is a neural network technique for converting a document image into editable text that can be used in many other applications, which saves the time and effort of starting over every time. In this work, a Unicode-based Handwritten Character Recognition (U-HCR) model is developed for translating handwritten Telugu characters into English. Using the Centre of Gravity (CG), the model can divide a compound character into individual characters with the help of Unicode values. Both online and offline Telugu character datasets are used to train the model. To extract features from the scanned image, we use a convolutional neural network (CNN) together with machine learning classifiers such as Random Forest and Support Vector Machine. Stochastic Gradient Descent (SGD), Root Mean Square Propagation (RMS-P), and Adaptive Moment Estimation (ADAM) optimizers are used to enhance the performance of U-HCR and to reduce the loss value of the CNN. On both online and offline datasets, the proposed model shows promising results, with accuracies of 90.28% for SGD, 96.97% for RMS-P, and 93.57% for ADAM.
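The three optimizers compared in this abstract differ only in how a gradient becomes a parameter update, which a one-parameter sketch can show. The objective f(w) = (w - 3)^2 and all hyperparameter values are assumptions chosen for illustration; the paper applies these optimizers to a CNN loss, which is not reproduced here.

```python
import math


def grad(w):
    """Gradient of the toy objective f(w) = (w - 3)^2."""
    return 2.0 * (w - 3.0)


def sgd(w, steps, lr=0.1):
    """Plain gradient descent: step against the gradient."""
    for _ in range(steps):
        w -= lr * grad(w)
    return w


def rmsprop(w, steps, lr=0.01, rho=0.9, eps=1e-8):
    """RMSProp: scale each step by a running average of squared gradients."""
    s = 0.0
    for _ in range(steps):
        g = grad(w)
        s = rho * s + (1 - rho) * g * g
        w -= lr * g / (math.sqrt(s) + eps)
    return w


def adam(w, steps, lr=0.05, b1=0.9, b2=0.999, eps=1e-8):
    """Adam: bias-corrected running averages of the gradient and its square."""
    m = v = 0.0
    for t in range(1, steps + 1):
        g = grad(w)
        m = b1 * m + (1 - b1) * g
        v = b2 * v + (1 - b2) * g * g
        m_hat = m / (1 - b1 ** t)
        v_hat = v / (1 - b2 ** t)
        w -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return w
```

Started from w = 0, all three approach the minimum at w = 3; which one reaches the best accuracy on a real network depends on the data and the hyperparameters, as the differing figures in the abstract suggest.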

An Implementation of Hangul Handwriting Correction Application Based on Deep Learning (딥러닝에 의한 한글 필기체 교정 어플 구현)

  • Jae-Hyeong Lee;Min-Young Cho;Jin-soo Kim
    • Journal of Korea Society of Industrial Information Systems / v.29 no.3 / pp.13-22 / 2024
  • With the proliferation of digital devices, the significance of handwritten text in daily life is gradually diminishing. As the use of keyboards and touch screens increases, a decline in Korean handwriting quality is being observed across a broad range of writers, from young students to adults. Korean handwriting nevertheless remains necessary for many documents, because it retains an individual's unique features while ensuring readability. This paper therefore implements an application designed to improve and correct the quality of handwritten Korean script. The application uses the CRAFT (Character-Region Awareness For Text Detection) model to detect handwriting areas and employs VGG-Feature-Extraction as a deep learning model to learn the features of the handwritten script. The application presents the reliability of the user's handwritten Korean script on a syllable-by-syllable basis as a recognition rate and suggests the most similar fonts among a set of candidate fonts. Various experiments confirm that the proposed application provides a recognition rate comparable to that of conventional commercial OCR systems.