• Title/Summary/Keyword: OCR text extraction


Development of an Automated ESG Document Review System using Ensemble-Based OCR and RAG Technologies

  • Eun-Sil Choi
    • Journal of the Korea Society of Computer and Information
    • /
    • v.29 no.9
    • /
    • pp.25-37
    • /
    • 2024
  • This study proposes a novel automation system that integrates Optical Character Recognition (OCR) and Retrieval-Augmented Generation (RAG) technologies to enhance the efficiency of the ESG (Environmental, Social, and Governance) document review process. The proposed system improves text recognition accuracy by applying an ensemble model-based image preprocessing algorithm and hybrid information extraction models in the OCR process. Additionally, the RAG pipeline optimizes information retrieval and answer generation reliability through the implementation of layout analysis algorithms, re-ranking algorithms, and ensemble retrievers. The system's performance was evaluated using certificate images from online portals and corporate internal regulations obtained from various sources, such as company websites. The results demonstrated an accuracy of 93.8% for certification reviews and 92.2% for company regulation reviews, indicating that the proposed system effectively supports human evaluators in the ESG assessment process.
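The abstract does not say how its ensemble retriever combines the individual retrievers, so as an illustration only, here is a minimal sketch of reciprocal rank fusion (RRF), one common way to merge ranked lists from, say, a keyword retriever and a dense retriever; the document IDs are invented:

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several retrievers' ranked document-ID lists into one ranking.

    Each document's fused score is the sum of 1 / (k + rank) over the
    rankings in which it appears (rank is 1-based); higher is better.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical example: a BM25-style retriever and a dense retriever
# partially disagree; fusion promotes documents both rank highly.
bm25_hits = ["doc_policy_A", "doc_cert_2021", "doc_policy_B"]
dense_hits = ["doc_policy_A", "doc_policy_B", "doc_misc"]
fused = reciprocal_rank_fusion([bm25_hits, dense_hits])
```

A re-ranking stage such as the one the paper mentions would then re-score only the top few fused candidates with a more expensive model.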

Arabic Words Extraction and Character Recognition from Picturesque Image Macros with Enhanced VGG-16 based Model Functionality Using Neural Networks

  • Ayed Ahmad Hamdan Al-Radaideh;Mohd Shafry bin Mohd Rahim;Wad Ghaban;Majdi Bsoul;Shahid Kamal;Naveed Abbas
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.17 no.7
    • /
    • pp.1807-1822
    • /
    • 2023
  • The innovation and rapidly increasing functionality of user-friendly smartphones have encouraged photographers to capture picturesque image macros in work environments or during travel. Formal signboards are placed with marketing objectives and are enriched with text to attract people. Extracting and recognizing text from natural images is an emerging research issue that needs consideration. Compared to conventional optical character recognition (OCR), the complex backgrounds, implicit noise, lighting, and orientation of these scenic text photos make this problem more difficult, and Arabic-language scene text extraction and recognition adds a number of further complications. The method described in this paper uses a two-phase methodology to extract Arabic text, with word-boundary awareness, from scenic images with varying text orientations. The first stage uses a convolutional autoencoder, and the second uses Arabic Character Segmentation (ACS), followed by traditional two-layer neural networks for recognition. This study presents how an Arabic training and synthetic dataset can be created to exemplify superimposed text in different scene images. For this purpose, a dataset of 10k cropped images in which Arabic text was found was created for the detection phase, along with a 127k Arabic character dataset for the recognition phase. The phase-1 labels were generated from an Arabic corpus of 15k quotes and sentences. The Arabic Word Awareness Region Detection (AWARD) approach, with its high flexibility in identifying complex Arabic scene-text images, is used to detect texts that are arbitrarily oriented, curved, or deformed. Our experiments show that the system achieves 91.8% word segmentation accuracy and 94.2% character recognition accuracy. We believe that future researchers will excel in the field of image processing, treating text images to improve or reduce noise in scene images in any language by enhancing the functionality of the VGG-16 based model using neural networks.

Correction of Signboard Distortion by Vertical Stroke Estimation

  • Lim, Jun Sik;Na, In Seop;Kim, Soo Hyung
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.7 no.9
    • /
    • pp.2312-2325
    • /
    • 2013
  • In this paper, we propose a preprocessing method that corrects the distortion of text areas in Korean signboard images as a preprocessing step to improve character recognition. Perspective distortion in Korean signboard text may cause a low recognition rate. The proposed method consists of four main steps and eight sub-steps: the main steps are potential vertical component detection, vertical component detection, text-boundary estimation, and distortion correction. First, potential vertical line component detection consists of four steps: edge detection for each connected component, pixel distance normalization along the edge, dominant-point detection on the edge, and removal of horizontal components. Second, vertical line component detection is composed of removal of diagonal components and extraction of vertical line components. Third, the outline estimation step is composed of left and right boundary line detection. Finally, the distortion of the text image is corrected by a bilinear transformation based on the estimated outline. We compared OCR recognition rates before and after applying the proposed algorithm: the recognition rates of the distortion-corrected signboard images are 29.63% and 21.9% higher at the character and text-unit levels, respectively, than those of the original images.
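The final correction step can be pictured as follows: once the left and right boundary lines give the four corners of the distorted text area, each pixel of the upright output rectangle is sampled from the source quadrilateral through a bilinear map of the corners. This is a generic sketch of that mapping, not the authors' exact implementation:

```python
def bilinear_map(corners, u, v):
    """Map normalized output coordinates (u, v) in [0, 1] x [0, 1] to a
    source-image point inside the quadrilateral given by corners, ordered
    (top-left, top-right, bottom-right, bottom-left).

    Interpolating the corners bilinearly in u and v yields the source
    location to sample for output pixel (u, v)."""
    (x0, y0), (x1, y1), (x2, y2), (x3, y3) = corners
    x = (1 - u) * (1 - v) * x0 + u * (1 - v) * x1 + u * v * x2 + (1 - u) * v * x3
    y = (1 - u) * (1 - v) * y0 + u * (1 - v) * y1 + u * v * y2 + (1 - u) * v * y3
    return x, y

# A hypothetical tilted signboard outline: the corrected image is built by
# sampling the source at bilinear_map(quad, u, v) for every output pixel.
quad = [(10, 5), (90, 0), (95, 60), (5, 55)]
center = bilinear_map(quad, 0.5, 0.5)
```

When the quadrilateral is already an axis-aligned rectangle, the map reduces to the identity resampling, which is a quick sanity check.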

Word Extraction from Table Regions in Document Images (문서 영상 내 테이블 영역에서의 단어 추출)

  • Jeong, Chang-Bu;Kim, Soo-Hyung
    • The KIPS Transactions:PartB
    • /
    • v.12B no.4 s.100
    • /
    • pp.369-378
    • /
    • 2005
  • A document image is segmented and classified into text, picture, and table regions by document layout analysis, and the words in table regions are significant for keyword spotting because they are more meaningful than words in other regions. This paper proposes a method to extract words from table regions in document images. As word extraction from a table region practically amounts to extracting words from the cell regions composing the table, the cells must be extracted correctly. In the cell extraction module, the table frame is first extracted by analyzing connected components, and then the intersection points are extracted from the table frame. We correct false intersections using the correlation between neighboring intersections, and extract the cells using the intersection information. Text regions in the individual cells are located using the connected-component information obtained during cell extraction, and they are segmented into text lines using projection profiles. Finally, we divide the segmented lines into words using gap clustering and special symbol detection. An experiment performed on table images extracted from Korean documents shows 99.16% accuracy of word extraction.
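The last two stages above, projection-profile line segmentation and gap clustering, can be sketched on a toy binary image; this is an illustrative simplification (real systems smooth the profile and pick the gap threshold adaptively), not the paper's code:

```python
def segment_lines(cell):
    """Split a binary cell image (rows of 0/1 pixels) into text lines.

    The horizontal projection profile counts ink pixels per row; a text
    line is a maximal run of rows with a non-zero count.  Returns a list
    of (first_row, last_row) pairs."""
    profile = [sum(row) for row in cell]
    lines, start = [], None
    for y, ink in enumerate(profile):
        if ink and start is None:
            start = y
        elif not ink and start is not None:
            lines.append((start, y - 1))
            start = None
    if start is not None:
        lines.append((start, len(profile) - 1))
    return lines

def split_words(gaps, threshold):
    """Gap clustering over the horizontal gaps between consecutive
    character boxes on one line: gaps wider than the threshold separate
    words, narrower ones separate characters within a word.  Returns
    (first_char, last_char) index pairs."""
    words, current = [], 0
    for i, gap in enumerate(gaps):
        if gap > threshold:
            words.append((current, i))
            current = i + 1
    words.append((current, len(gaps)))
    return words
```

For example, a line of four character boxes with inter-box gaps `[1, 5, 1]` and a threshold of 3 splits into two words of two characters each.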

Development of a Blocks Recognition Application for Children's Education using a Smartphone Camera (스마트폰 카메라 기반 아동 교육용 산수 블록 인식 애플리케이션 개발)

  • Park, Sang-A;Oh, Ji-Won;Hong, In-Sik;Nam, Yunyoung
    • Journal of Internet Computing and Services
    • /
    • v.20 no.4
    • /
    • pp.29-38
    • /
    • 2019
  • Currently, the information society is changing rapidly and demands innovation and creativity in various fields, so the importance of mathematics, which can be the basis of creativity and logic, is emphasized. The purpose of this paper is to develop a math education application that expands children's logical mathematical thinking and encourages voluntary learning, using readily available teaching aids to build motivation and interest in learning. This paper presents a math education application for children that uses a smartphone and blocks. The main function of the application is to photograph the blocks with the camera and show the calculated value: when a child arranges blocks into a formula and shoots them with the camera, the calculated value of the formula is displayed directly. The preprocessing, text extraction, and character recognition of the photographed images were implemented using the OpenCV and Tesseract-OCR libraries.
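After Tesseract-OCR returns the recognized formula as a string, the app still has to compute its value. As a hedged sketch of that final step (the function name `evaluate_formula` and the restriction to the four basic operators are my assumptions, not the paper's), evaluating via the `ast` module avoids running `eval()` on raw OCR output:

```python
import ast
import operator

# Only plain arithmetic is allowed; anything else in the OCR string
# (names, calls, etc.) raises ValueError instead of being executed.
_OPS = {ast.Add: operator.add, ast.Sub: operator.sub,
        ast.Mult: operator.mul, ast.Div: operator.truediv}

def evaluate_formula(text):
    """Safely evaluate an arithmetic expression recognized by OCR,
    e.g. "3+4" from a row of number and operator blocks."""
    def walk(node):
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        raise ValueError("not a plain arithmetic expression")
    return walk(ast.parse(text, mode="eval").body)
```

Usage: `evaluate_formula("3+4")` yields 7, which the app can overlay next to the photographed blocks.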

An Implementation of Hangul Handwriting Correction Application Based on Deep Learning (딥러닝에 의한 한글 필기체 교정 어플 구현)

  • Jae-Hyeong Lee;Min-Young Cho;Jin-soo Kim
    • Journal of Korea Society of Industrial Information Systems
    • /
    • v.29 no.3
    • /
    • pp.13-22
    • /
    • 2024
  • Currently, with the proliferation of digital devices, the significance of handwritten text in daily life is gradually diminishing. As the use of keyboards and touch screens increases, a decline in Korean handwriting quality is being observed across a broad spectrum of writers, from young students to adults. However, Korean handwriting remains necessary for many documents, as it retains an individual's unique features while ensuring readability. To this end, this paper implements an application designed to improve and correct the quality of handwritten Korean script. The implemented application utilizes the CRAFT (Character-Region Awareness For Text Detection) model for handwriting area detection and employs VGG feature extraction as a deep learning model for learning features of the handwritten script. The application presents the reliability of the user's handwritten Korean script as a syllable-by-syllable recognition rate and also suggests the most similar fonts among candidate fonts. Furthermore, various experiments confirm that the proposed application provides an excellent recognition rate comparable to conventional commercial OCR systems.
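The "most similar font" suggestion presumably compares the VGG feature vector of a handwritten syllable against precomputed feature vectors of the candidate fonts. A minimal sketch of that comparison, assuming cosine similarity and tiny made-up 3-dimensional vectors (real VGG features are much longer, and the paper does not state its distance measure):

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def most_similar_font(handwriting_vec, font_vecs):
    """Return the candidate font whose stored feature vector is closest
    (by cosine similarity) to the handwritten syllable's features.

    font_vecs maps font name -> feature vector; the names below are
    illustrative Korean font families, not the paper's candidate set."""
    return max(font_vecs, key=lambda name: cosine(handwriting_vec, font_vecs[name]))

fonts = {"batang": [1.0, 0.1, 0.9], "gothic": [0.1, 1.0, 0.0]}
best = most_similar_font([0.9, 0.2, 1.0], fonts)
```

The same similarity score, normalized per syllable, could double as the recognition-rate figure the application reports.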

The Development of Travel Data Sharing System using the Optical Character Reader. (광학문자 인식을 이용한 여행 정보 공유 시스템의 개발)

  • Park, Ju-Hyeon;Lee, Hyun-Dong;Kim, Dong-Hyun;Cho, Dae-soo
    • Proceedings of the Korean Society of Computer Information Conference
    • /
    • 2018.07a
    • /
    • pp.189-190
    • /
    • 2018
  • Recently, various kinds of travel information are widely shared. People often simply capture a screenshot of, or jot down in a memo app, a travel destination they want to remember while using social network services or browsing the web. The problem with this approach is that the data becomes difficult to manage as it accumulates over time. In this paper, considering user convenience, we developed a system that extracts the text from such photos using optical character recognition and stores it in the form of a post. The location of an attraction can also be saved conveniently through an autocomplete location-search library. Using this location data, the system was also implemented to suggest travel destinations near the user. To this end, the service can be used through the web, and a progressive web app was implemented that can be launched without entering a web address, supporting real-time search and notification events.


Implementation of Recipe Recommendation System Using Ingredients Combination Analysis based on Recipe Data (레시피 데이터 기반의 식재료 궁합 분석을 이용한 레시피 추천 시스템 구현)

  • Min, Seonghee;Oh, Yoosoo
    • Journal of Korea Multimedia Society
    • /
    • v.24 no.8
    • /
    • pp.1114-1121
    • /
    • 2021
  • In this paper, we implement a recipe recommendation system using ingredient-combination analysis based on recipe data. The proposed system receives an image of a grocery receipt and recommends ingredients and recipes to the user. It preprocesses the receipt image and extracts its text using an OCR algorithm. The system recommends recipes based on the combination data of ingredients: it collects recipe data to calculate the combination score for each food ingredient and extracts the food ingredients of the collected recipes as training data. It then acquires vector data by training a natural language processing algorithm and recommends recipes whose ingredients have high similarity. The proposed system can also recommend recipes using replaceable ingredients, improving the accuracy of the results through preprocessing and postprocessing. For evaluation, we created a random input dataset to measure the proposed recommendation system's performance and calculated the accuracy for each algorithm. In the performance evaluation, the Word2Vec algorithm achieved the highest accuracy.
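The paper trains Word2Vec on recipe ingredient lists; as a much simpler stand-in for the learned combination scores, one can count how often two ingredients co-occur in the same recipe and recommend the unseen ingredient with the highest affinity to what the user already has. This sketch (function names and the toy recipes are invented) illustrates that idea only, not the paper's Word2Vec pipeline:

```python
from collections import Counter
from itertools import combinations

def cooccurrence(recipes):
    """Count how often each unordered ingredient pair appears in the
    same recipe; a crude proxy for a learned combination score."""
    pairs = Counter()
    for ingredients in recipes:
        for a, b in combinations(sorted(set(ingredients)), 2):
            pairs[(a, b)] += 1
    return pairs

def recommend_ingredient(recipes, have):
    """Suggest the ingredient (not already owned) that co-occurs most
    often with the ingredients the user has, e.g. from a receipt."""
    pairs = cooccurrence(recipes)
    candidates = {i for r in recipes for i in r} - set(have)
    def affinity(c):
        return sum(pairs[tuple(sorted((c, h)))] for h in have)
    return max(candidates, key=affinity)

recipes = [["egg", "rice", "onion"], ["egg", "rice"], ["tofu", "onion"]]
suggestion = recommend_ingredient(recipes, have=["rice"])
```

A Word2Vec-based version would replace the co-occurrence counts with cosine similarity between learned ingredient vectors, which also enables the replaceable-ingredient matching the abstract describes.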