
A Study on Improvement of Korean OCR Accuracy Using Deep Learning (딥러닝을 이용한 한글 OCR 정확도 향상에 대한 연구)

  • Kang, Ga-Hyeon;Ko, Ji-Hyun;Kwon, Yong-Jun;Kwon, Na-Young;Koh, Seok-Ju
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2018.05a / pp.693-695 / 2018
  • In this paper, we propose a method for improving Hangul OCR accuracy through deep learning. OCR optically senses printed and handwritten characters and encodes them digitally. The most commonly used engine, Tesseract OCR, recognizes English with high accuracy. Hangul recognition is less accurate, however, because less training data is available for its complex structure. In this study, we therefore extract the character regions from the target image through image processing and use them as training data for deep learning. We expect that OCR, which so far has been developed only for alphanumeric characters and a few languages, can thereby be applied to a wider range of languages.


Text Region Extraction and OCR on Camera Based Images (카메라 영상 위에서의 문자 영역 추출 및 OCR)

  • Shin, Hyun-Kyung
    • The KIPS Transactions:PartD / v.17D no.1 / pp.59-66 / 2010
  • Traditional OCR engines are designed for documents scanned in a calibrated environment. Three-dimensional perspective distortion and smooth distortion are critical problems in images from uncalibrated devices such as smartphones. To meet the growing demand for recognizing text embedded in photos taken with uncalibrated handheld devices, we address the problem in three aspects: a rotation-invariant method of text region extraction, a scale-invariant method of text line segmentation, and three-dimensional perspective mapping. By integrating these methods, we developed an OCR system for camera-captured images.

Development of Smart Household Ledger based on OCR (OCR 기반 스마트 가계부 구현)

  • Chae, Sung-eun;Jung, Ki-seok;Lee, Jeong-yeol;Rho, Young-J.
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.18 no.6 / pp.269-276 / 2018
  • OCR (Optical Character Recognition) using computers has been developed for 20 years and applied to various fields, such as parking management based on license plate recognition. We used this technology to develop a smart OCR-based household ledger. To make it easier to enter purchase history into a smartphone-based household account book, the user photographs receipts with the smartphone camera and the purchase list is organized automatically. In this process, the character recognition rate on receipt images is not high enough with OCR technology alone. We improved the rate by applying image processing and adjusting the contrast of the receipt image. The rate improved from 89% to 92.5%.

Trends in Deep Learning-based Medical Optical Character Recognition (딥러닝 기반의 의료 OCR 기술 동향)

  • Sungyeon Yoon;Arin Choi;Chaewon Kim;Sumin Oh;Seoyoung Sohn;Jiyeon Kim;Hyunhee Lee;Myeongeun Han;Minseo Park
    • The Journal of the Convergence on Culture Technology / v.10 no.2 / pp.453-458 / 2024
  • Optical Character Recognition (OCR) is a technology that recognizes text in images and converts it into digital format. Because of its high recognition performance, deep learning-based OCR is used in many industries that hold large quantities of recorded data. To improve medical services, the medical industry has actively introduced deep learning-based OCR. In this paper, we discuss trends in OCR engines and medical OCR and provide a roadmap for the development of medical OCR. Current medical OCR has improved its recognition performance by applying natural language processing to the detected text. However, recognition performance remains limited, especially for non-standard handwriting and modified text. Developing advanced medical OCR requires building databases of medical data, image pre-processing, and natural language processing.

A Study on the Interference in Single Frequency Network and On Channel Repeater (SFN 및 OCR의 간섭영향에 관한 연구)

  • 최성웅;이형수;오우진
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2003.10a / pp.737-740 / 2003
  • SFN (Single Frequency Network) and OCR (On Channel Repeater) are often considered for efficient frequency allotment in digital TV. In this paper, we discuss their performance and evaluate several coverage criteria for SFN and OCR. We also propose a MATLAB simulator for coverage planning and estimation.


Novel Equalization On-Channel Repeater with Feedback Interference Canceller in Terrestrial Digital Multimedia Broadcasting System

  • Park, Sung-Ik;Eum, Ho-Min;Park, So-Ra;Kim, Geon;Lee, Yong-Tae;Kim, Heung-Mook;Oh, Wang-Rok
    • ETRI Journal / v.31 no.4 / pp.357-364 / 2009
  • In this paper, we propose a novel equalization on-channel repeater (OCR) with a feedback interference canceller (FIC) to relay terrestrial digital multimedia broadcasting signals in single frequency networks. The proposed OCR achieves high output power by using the FIC to cancel the feedback signals caused by insufficient antenna isolation. It also delivers better output signal quality than the conventional OCR by using an equalizer to remove the multipath signals between the main transmitter and the OCR. Computer simulations and laboratory test results demonstrate that the proposed OCR successfully cancels feedback signals, compensates for channel distortions, and provides a higher-quality transmitted signal with higher output power than conventional OCRs.

A Study on the Overcurrent Relay Modeling and Protective Coordination for Overload in Domestic AC Electrical Railway System (국내 교류 전기철도 급전계통 보호용 과전류 계전기 모델링 및 과부하 보호 협조에 관한 연구)

  • Kim, Hyun-Dong;Cho, Gyu-Jung;Huh, Seung-Hoon;Kim, Chul-Hwan
    • The Transactions of The Korean Institute of Electrical Engineers / v.65 no.7 / pp.1121-1127 / 2016
  • This paper describes the modeling of an overcurrent relay (OCR) that protects a domestic AC electric railway Auto Transformer (AT) feeding system, and analyzes its operating characteristics under overload conditions. The target system is an actual site where overload trips of the circuit breaker occur frequently, because this AT feeding system consists of parallel single tracks, each carrying its own load (an electric train), and is connected only to the T phase of the Scott transformer. In addition, the system is fed at 66 kV by KEPCO, not 154 kV. We focus on the protective coordination of the OCRs on the primary and secondary sides of the Scott transformer for a Korean single-track AC electric railway system currently in operation. We modeled the single-track AT feeding system and the OCR, and performed fault and overload analyses to verify the OCR setting values and the system model. The analysis was carried out with the PSCAD/EMTDC software tool.

Development an Android based OCR Application for Hangul Food Menu (한글 음식 메뉴 인식을 위한 OCR 기반 어플리케이션 개발)

  • Lee, Gyu-Cheol;Yoo, Jisang
    • Journal of the Korea Institute of Information and Communication Engineering / v.21 no.5 / pp.951-959 / 2017
  • In this paper, we design and implement an Android-based Hangul food menu recognition application that recognizes characters in images captured by a smartphone. Optical Character Recognition (OCR) is divided into preprocessing, recognition, and post-processing stages. In the preprocessing stage, characters are extracted using Maximally Stable Extremal Regions (MSER). In the recognition stage, Tesseract-OCR, a free OCR engine, is used to recognize the characters. In the post-processing stage, incorrect results are corrected using a dictionary DB for food menus. To evaluate the proposed method, experiments compared recognition performance using actual menu boards as the DB. Recognition rates were also measured against OCR Instantly Free, Text Scanner, and Text Fairy, character recognition applications available in the Google Play Store. The experimental results show that the proposed method achieves an average recognition rate 14.1% higher than the other techniques.
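
The dictionary-based post-processing stage described in this abstract can be illustrated with a minimal sketch. The abstract does not say how the dictionary lookup is done, so the fuzzy matching via Python's standard `difflib` module, the `MENU_DICTIONARY` entries, the `correct_with_dictionary` name, and the `cutoff` value are all assumptions made for illustration, not the authors' method.

```python
import difflib

# Hypothetical menu dictionary; the paper's actual dictionary DB is not
# given in the abstract.
MENU_DICTIONARY = ["김치찌개", "된장찌개", "비빔밥", "불고기", "냉면"]


def correct_with_dictionary(ocr_text, dictionary=MENU_DICTIONARY, cutoff=0.5):
    """Return the closest dictionary entry to an OCR result.

    If no entry reaches the similarity cutoff, the OCR text is returned
    unchanged. The cutoff of 0.5 is an assumed value, not one from the paper.
    """
    matches = difflib.get_close_matches(ocr_text, dictionary, n=1, cutoff=cutoff)
    return matches[0] if matches else ocr_text


if __name__ == "__main__":
    # One misrecognized syllable in the last position is mapped back
    # to the dictionary entry.
    print(correct_with_dictionary("김치찌게"))
```

A lookup of this kind only helps when the recognized text is expected to come from a closed vocabulary, which is the situation the abstract describes for food menus.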

Convolutional Neural Networks for Character-level Classification

  • Ko, Dae-Gun;Song, Su-Han;Kang, Ki-Min;Han, Seong-Wook
    • IEIE Transactions on Smart Processing and Computing / v.6 no.1 / pp.53-59 / 2017
  • Optical character recognition (OCR) automatically recognizes text in an image. OCR is still a challenging problem in computer vision. A successful solution to OCR has important device applications, such as text-to-speech conversion and automatic document classification. In this work, we analyze character recognition performance using three current state-of-the-art deep-learning structures: AlexNet, LeNet, and SPNet. For this, we built our own dataset containing digits and upper- and lower-case characters. We run experiments in the presence of salt-and-pepper noise or Gaussian noise and compare performance in terms of recognition error. Five-fold cross-validation results indicate that the SPNet structure (our approach) outperforms AlexNet and LeNet in recognition error.
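
The salt-and-pepper noise mentioned in this abstract can be sketched as follows. The abstract gives no noise parameters, so the function name `add_salt_and_pepper`, the per-pixel probability `amount`, the equal split between salt and pepper, and the list-of-rows image representation are assumptions chosen to keep the example self-contained.

```python
import random


def add_salt_and_pepper(image, amount=0.1, seed=None):
    """Return a noisy copy of a grayscale image.

    `image` is a list of rows of integer pixel values in 0..255. Each pixel
    is independently corrupted with probability `amount`; a corrupted pixel
    becomes 255 (salt) or 0 (pepper) with equal probability. The input image
    is left unmodified.
    """
    rng = random.Random(seed)
    noisy = [row[:] for row in image]
    for row in noisy:
        for x in range(len(row)):
            if rng.random() < amount:
                row[x] = 255 if rng.random() < 0.5 else 0
    return noisy


if __name__ == "__main__":
    gray = [[128] * 8 for _ in range(4)]
    for row in add_salt_and_pepper(gray, amount=0.25, seed=0):
        print(row)
```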

Keyword Spotting on Hangul Document Images Using Character Feature Models (문자 별 특징 모델을 이용한 한글 문서 영상에서 키워드 검색)

  • Park, Sang-Cheol;Kim, Soo-Hyung;Choi, Deok-Jai
    • The KIPS Transactions:PartB / v.12B no.5 s.101 / pp.521-526 / 2005
  • In this paper, we propose a keyword spotting system as an alternative search system for poor-quality Korean document images and compare the proposed system with an OCR-based document retrieval system. The system is composed of character segmentation, feature extraction for the query keyword, and word-to-word matching. In the character segmentation step, we propose an effective method for removing the connectivity between adjacent characters and a character segmentation method that minimizes the variance of character widths. In the query creation step, the feature vector for the query is constructed from a combination of per-typeface character models. In the matching step, word-to-word matching is applied based on character-to-character matching. We demonstrate that the proposed keyword spotting system is more efficient than the OCR-based one at searching for a keyword in Korean document images, especially when the document quality is poor and the point size is small.