• Title/Summary/Keyword: Script Recognition

An Arabic Script Recognition System

  • Alginahi, Yasser M.;Mudassar, Mohammed;Nomani Kabir, Muhammad
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.9 no.9
    • /
    • pp.3701-3720
    • /
    • 2015
  • A system for the recognition of machine-printed Arabic script is proposed. The Arabic script is shared by three languages, i.e., Arabic, Urdu and Farsi. The three languages have a decent amount of vocabulary in common, which compounds the identification problem. Therefore, in an ideal scenario, not only must the script be differentiated from other scripts, but the language of the script must also be recognized. The recognition process involves the segregation of Arabic-script documents from Latin, Han and other-script documents using horizontal and vertical projection profiles, followed by identification of the language. Identification mainly involves extracting connected components, which are subjected to a Principal Component Analysis (PCA) transformation to extract uncorrelated features. The traditional K-Nearest Neighbours (KNN) algorithm is then used for recognition. Experiments were carried out by varying the number of principal components and the number of connected components extracted per document to find the combination that gives the best accuracy. An accuracy of 100% is achieved with at least 18 connected components and 15 principal components. The proposed system would play a vital role in the automatic archiving of multilingual documents and in the selection of the appropriate Arabic script in multilingual Optical Character Recognition (OCR) systems.
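
The two building blocks named in this abstract, projection profiles and KNN classification, can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the toy image and the toy feature vectors are assumptions made for illustration, and the PCA step is omitted.

```python
import math
from collections import Counter

def projection_profiles(image):
    """Return (horizontal, vertical) ink counts of a binary image.

    image: list of equal-length rows, 1 = ink pixel, 0 = background.
    """
    horizontal = [sum(row) for row in image]        # one count per row
    vertical = [sum(col) for col in zip(*image)]    # one count per column
    return horizontal, vertical

def knn_predict(train, query, k=3):
    """Majority label among the k training vectors closest to query.

    train: list of (feature_vector, label) pairs (hypothetical features).
    """
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

# Toy data, not taken from the paper.
toy_image = [
    [0, 0, 0, 0, 0],
    [1, 1, 1, 1, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0],
]
h_profile, v_profile = projection_profiles(toy_image)

toy_train = [
    ((0.0, 0.0), "Arabic"), ((0.0, 1.0), "Arabic"),
    ((5.0, 5.0), "Urdu"), ((5.0, 6.0), "Urdu"), ((6.0, 5.0), "Urdu"),
]
predicted = knn_predict(toy_train, (0.2, 0.3), k=3)
```

In a fuller pipeline the feature vectors passed to `knn_predict` would be PCA-projected connected components rather than hand-written pairs.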

5-Year-Old Children's Script Knowledge According to Task Situation and Socioeconomic Status (과제 상황 및 계층에 따른 만 5세 유아의 스크립트 지식)

  • 성미영;이순형
    • Journal of the Korean Home Economics Association
    • /
    • v.40 no.11
    • /
    • pp.119-130
    • /
    • 2002
  • This study investigated preschool children's script knowledge according to task situation and socioeconomic status. Subjects were seventy-eight 5-year-old children (38 low- and 40 middle-income children; 36 boys and 42 girls) recruited from three day-care centers in Seoul. Each child participated in a script knowledge assessment session. The assessment consisted of a picture-recognition task and a picture-sequencing task. The data were analyzed using means, standard deviations, and repeated-measures ANOVA. Results showed that children's script knowledge scores were higher in the familiar task situation than in the unfamiliar one. Furthermore, middle-income children had higher script knowledge scores than low-income children. The findings indicate that script knowledge differs between low- and middle-income preschoolers.

Wine Label Character Recognition in Mobile Phone Images using a Lexicon-Driven Post-Processing (사전기반 후처리를 이용한 모바일 폰 영상에서 와인 라벨 문자 인식)

  • Lim, Jun-Sik;Kim, Soo-Hyung;Lee, Chil-Woo;Lee, Guee-Sang;Yang, Hyung-Jung;Lee, Myung-Eun
    • Journal of KIISE:Computing Practices and Letters
    • /
    • v.16 no.5
    • /
    • pp.546-550
    • /
    • 2010
  • In this paper, we propose a method for post-processing cursive script recognition results in wine label images. The proposed method consists of three main steps: combination matrix generation, character combination filtering, and string matching. First, the combination matrix generation step detects all possible combinations from the recognition result for each of the pieces. Second, unnecessary information in the combination matrix is removed by comparing it with the bigrams of the words in the lexicon. Finally, the string matching step identifies the result as the best-matching word in the lexicon based on the Levenshtein distance. Experimental results show a recognition accuracy of 85.8%.
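
The final string matching step can be sketched as follows. This is a minimal illustration of lexicon lookup by Levenshtein distance, not the authors' code: the function names and the three-word lexicon are assumptions, and the combination matrix and bigram filtering steps are not shown.

```python
def levenshtein(a, b):
    """Edit distance between strings a and b (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))          # distances from "" to prefixes of b
    for i, ca in enumerate(a, 1):
        curr = [i]                          # distance from a[:i] to ""
        for j, cb in enumerate(b, 1):
            curr.append(min(
                prev[j] + 1,                # delete ca
                curr[j - 1] + 1,            # insert cb
                prev[j - 1] + (ca != cb),   # substitute (free if equal)
            ))
        prev = curr
    return prev[-1]

def best_match(candidate, lexicon):
    """Return the lexicon word with the smallest edit distance to candidate."""
    return min(lexicon, key=lambda word: levenshtein(candidate, word))

# Hypothetical lexicon and OCR output, for illustration only.
toy_lexicon = ["merlot", "malbec", "shiraz"]
corrected = best_match("merl0t", toy_lexicon)
```

Here the misread string `merl0t` is one substitution away from `merlot`, so that word is chosen.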

A Study on the Neural Network for the Character Recognition (문자인식을 위한 신경망컴퓨터에 관한 연구)

  • 이창기;전병실
    • Journal of the Korean Institute of Telematics and Electronics B
    • /
    • v.29B no.8
    • /
    • pp.1-6
    • /
    • 1992
  • This paper proposes a neural computer architecture for learning recognition categories of script character patterns. An oriented filter with complex cells preprocesses the input script character and extracts its contour. The contour is normalized and input to the ART (Adaptive Resonance Theory) network. Top-down attentional and matching mechanisms are critical for self-stabilizing the code learning process. The architecture embodies a parallel search scheme that updates itself adaptively as the learning process unfolds. Once learning in the ART has self-stabilized, recognition time does not grow as a function of code complexity. The vigilance level indicates the similarity between learned patterns and new input patterns. The character recognition system is designed to be adaptable. Simulation of the system showed satisfactory results in the recognition of handwritten characters.
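
The role of the vigilance level can be illustrated with the match test used in ART1-style networks for binary inputs. The abstract does not say which ART variant the paper uses, so this is only a sketch under that assumption, and the function name and bit patterns are hypothetical.

```python
def passes_vigilance(input_bits, prototype_bits, vigilance):
    """ART1-style match test for binary vectors.

    The learned prototype is accepted for the input when the share of the
    input's active bits that the prototype also contains reaches the
    vigilance level (a value between 0 and 1).
    """
    overlap = sum(i & w for i, w in zip(input_bits, prototype_bits))
    active = sum(input_bits)
    return active > 0 and overlap / active >= vigilance

# Hypothetical patterns: the prototype shares 2 of the input's 3 active bits.
toy_input = [1, 1, 0, 1]
toy_prototype = [1, 1, 0, 0]
accepted_low = passes_vigilance(toy_input, toy_prototype, 0.6)
accepted_high = passes_vigilance(toy_input, toy_prototype, 0.8)
```

A higher vigilance value demands a closer match, so the same prototype can be accepted at 0.6 and rejected at 0.8, in which case an ART network would search for or create another category.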

Manchu Script Letters Dataset Creation and Labeling

  • Aaron Daniel Snowberger;Choong Ho Lee
    • Journal of information and communication convergence engineering
    • /
    • v.22 no.1
    • /
    • pp.80-87
    • /
    • 2024
  • The Manchu language holds historical significance, but a complete dataset of Manchu script letters for training optical character recognition machine-learning models is currently unavailable. Therefore, this paper describes the process of creating a robust dataset of extracted Manchu script letters. Rather than performing automatic letter segmentation based on whitespace or the thickness of the central word stem, an image of the Manchu script was manually inspected, and one copy of the desired letter was selected as a region of interest. This selected region of interest was used as a template to match all other occurrences of the same letter within the Manchu script image. Although the dataset in this study contained only 4,000 images of five Manchu script letters, these letters were collected from twenty-eight writing styles. A full dataset of Manchu letters is expected to be obtained through this process. The collected dataset was normalized and used to train a simple convolutional neural network to verify its effectiveness.
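
The idea of using one selected region of interest as a template to find the other occurrences of a letter can be sketched with a brute-force search. This is an illustrative assumption, not the authors' tooling: it scores each position by the sum of squared differences (SSD) on a tiny hypothetical binary image.

```python
def match_template(image, template, max_ssd=0):
    """Return (row, col) positions where template matches image.

    A position matches when the sum of squared pixel differences between
    the template and the image window is at most max_ssd.
    """
    ih, iw = len(image), len(image[0])
    th, tw = len(template), len(template[0])
    hits = []
    for y in range(ih - th + 1):
        for x in range(iw - tw + 1):
            ssd = sum(
                (image[y + dy][x + dx] - template[dy][dx]) ** 2
                for dy in range(th)
                for dx in range(tw)
            )
            if ssd <= max_ssd:
                hits.append((y, x))
    return hits

# Hypothetical page containing the same 2x2 "letter" twice.
toy_page = [
    [0, 1, 0, 0, 1, 0],
    [1, 1, 0, 1, 1, 0],
    [0, 0, 0, 0, 0, 0],
]
toy_letter = [[0, 1], [1, 1]]
occurrences = match_template(toy_page, toy_letter)
```

Raising `max_ssd` tolerates small variations between occurrences, which matters for handwritten or printed letters that are never pixel-identical.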

An Implementation of Hangul Handwriting Correction Application Based on Deep Learning (딥러닝에 의한 한글 필기체 교정 어플 구현)

  • Jae-Hyeong Lee;Min-Young Cho;Jin-soo Kim
    • Journal of Korea Society of Industrial Information Systems
    • /
    • v.29 no.3
    • /
    • pp.13-22
    • /
    • 2024
  • With the proliferation of digital devices, the significance of handwritten text in daily life is gradually diminishing. As the use of keyboards and touch screens increases, a decline in Korean handwriting quality is being observed across a broad spectrum of Korean documents, written by everyone from young students to adults. However, Korean handwriting remains necessary for many documents, as it retains individual, unique features while ensuring readability. To this end, this paper implements an application designed to improve and correct the quality of handwritten Korean script. The application uses the CRAFT (Character-Region Awareness For Text Detection) model to detect handwriting areas and employs VGG-Feature-Extraction as a deep learning model for learning features of the handwritten script. The application also presents the reliability of the user's handwritten Korean script on a syllable-by-syllable basis as a recognition rate, and suggests the most similar fonts among candidate fonts. Furthermore, various experiments confirm that the proposed application provides an excellent recognition rate comparable to that of conventional commercial OCR systems.

The Role of Script Type in Japanese Word Recognition: A Connectionist Model (일본어의 단어인지과정에서 표기형태의 역할:연결주의 모형)

  • ;阿部純
    • Korean Journal of Cognitive Science
    • /
    • v.2 no.2
    • /
    • pp.487-513
    • /
    • 1990
  • The present paper reviews experimental findings such as the kanji Stroop effect, the kana superiority effect in naming tasks, the kanji superiority effect in lexical decision tasks, and the different patterns of facilitatory priming in repetition priming tasks. Most of these findings indicate that kana script and kanji script are processed independently and modularly. These indications are also consistent with basic observations on Japanese dyslexics. A connectionist model named JIA (Japanese Interactive Activation) is proposed, which is a revision of the interactive activation model proposed by McClelland & Rumelhart (1981). The two models differ as follows. First, JIA has a kana module and a kanji module at the letter level. Second, JIA adopts script-specific interconnections between letter-level nodes and word-level nodes: word nodes receive larger activation from script-consistent letter-level nodes. JIA successfully explains all the experimental findings and many cases of Japanese dyslexia. A computer program that simulates the JIA model was written and run.

Matching Algorithm for Hangul Recognition Based on PDA

  • Kim Hyeong-Gyun;Choi Gwang-Mi
    • Journal of information and communication convergence engineering
    • /
    • v.2 no.3
    • /
    • pp.161-166
    • /
    • 2004
  • Electronic ink is data stored in the form of handwritten text or script, without conversion to ASCII by handwriting recognition, on pen-based computers and Personal Digital Assistants (PDAs) to support natural and convenient data input. One of the most important issues is how to search electronic ink in order to use it. We propose and implement a script matching algorithm for electronic ink. The proposed algorithm separates the input stroke into a set of primitive strokes using the curvature of the stroke curve. After determining the type of each separated stroke, it produces a stroke feature vector. It then calculates the distance between the stroke feature vector of the input strokes and that of strokes in the database using dynamic programming.
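
The abstract does not specify the dynamic programming recurrence, so the following is only a sketch that assumes a dynamic-time-warping style alignment between two sequences of primitive-stroke feature vectors. The function name and the two-dimensional toy features are hypothetical.

```python
import math

def dp_distance(seq_a, seq_b):
    """Alignment cost between two sequences of feature vectors.

    Uses a dynamic-time-warping style recurrence: each cell adds the
    Euclidean distance of the paired vectors to the cheapest of the three
    neighbouring alignments (skip in a, skip in b, or advance both).
    """
    inf = float("inf")
    n, m = len(seq_a), len(seq_b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(seq_a[i - 1], seq_b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]

# Hypothetical stroke features: the stored ink repeats its first primitive.
query_strokes = [(0, 0), (1, 0), (1, 1)]
stored_strokes = [(0, 0), (0, 0), (1, 0), (1, 1)]
same_shape = dp_distance(query_strokes, stored_strokes)
```

Because the alignment may pair one primitive with several, a query still matches stored ink at zero cost when the writer split a stroke into more pieces.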

Fuzzy-Membership Based Writer Identification from Handwritten Devnagari Script

  • Kumar, Rajiv;Ravulakollu, Kiran Kumar;Bhat, Rajesh
    • Journal of Information Processing Systems
    • /
    • v.13 no.4
    • /
    • pp.893-913
    • /
    • 2017
  • Handwriting-based person identification systems use structural properties of handwriting, as perceived by their designers, as features. In this paper, we present a system whose features are the structural properties that graphologists and expert handwriting analyzers use to determine a writer's personality traits and to make other assessments. The advantage of these features is that their definition is based on sound historical knowledge (i.e., the knowledge discovered by graphologists, psychiatrists, forensic experts, and experts of other domains in analyzing the relationships between handwritten stroke characteristics and the phenomena that embed individuality in strokes). Hence, each stroke characteristic reflects a personality trait. We measured the effectiveness of these features on a subset of the handwritten Devnagari and Latin script datasets from the Center for Pattern Analysis and Recognition (CPAR-2012), written by 100 people, each of whom wrote three samples of the Devnagari and Latin text that we designed for our experiments. The experiment yielded 100% correct identification on the training set. However, with 200 training samples and 100 test samples, we observed correct identification rates of 88% and 89% on handwritten Devnagari and Latin text, respectively. With the introduction of majority-voting-based rejection criteria, the identification accuracy increased to 97% on both script sets.
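
A majority-voting rejection rule of the kind mentioned at the end of this abstract might look like the sketch below. The paper's actual criterion is not given here, so the function name, the `min_share` threshold and the writer labels are assumptions.

```python
from collections import Counter

def vote_with_rejection(predictions, min_share=0.5):
    """Return the majority label, or None to reject the sample.

    predictions: per-feature or per-classifier writer labels for one sample.
    The sample is rejected when the most common label does not exceed
    min_share of all votes (a hypothetical threshold).
    """
    if not predictions:
        return None
    label, count = Counter(predictions).most_common(1)[0]
    return label if count / len(predictions) > min_share else None

# Hypothetical votes for two handwriting samples.
clear_case = vote_with_rejection(["writer_07", "writer_07", "writer_12"])
ambiguous_case = vote_with_rejection(["writer_07", "writer_12", "writer_31"])
```

Rejecting ambiguous samples trades coverage for accuracy: the reported accuracy is then computed only over the samples the system agrees to label.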

A Study on Input Pattern Generation of Neural-Networks for Character Recognition (문자인식 시스템을 위한 신경망 입력패턴 생성에 관한 연구)

  • Shin, Myong-Jun;Kim, Sung-Jong;Son, Young-Ik
    • Proceedings of the KIEE Conference
    • /
    • 2006.04a
    • /
    • pp.129-131
    • /
    • 2006
  • The performance of neural network systems depends mainly on the kind and number of input patterns used for training. Hence, both the kind and the number of input patterns are very important for a character recognition system that uses a back-propagation network. The more input patterns are used, the better the system recognizes various characters. However, training does not always succeed as the number of input patterns increases. Moreover, there is a limit to how many input patterns a recognition system for cursive script characters can take into account. In this paper, we present a new character recognition system using back-propagation neural networks. An additional neural network provides an input pattern generation method that increases the recognition ratio and makes training successful. We first introduce the structure of the proposed system. The character recognition system is then investigated through experiments.
