• Title/Summary/Keyword: OCR

Behaviour of Dry Sand under $K_o$-Loading/unloading Conditions(I) : Single-Cyclic Test ($K_o$-재하/제하에 의한 건조모래의 거동(I): 단주기 시험)

  • 송무효;남선우
    • Geotechnical Engineering
    • /
    • v.10 no.4
    • /
    • pp.83-102
    • /
    • 1994
  • To estimate the $K_o$ value as a function of the stress history of dry sand, a new type of $K_o$ oedometer apparatus was devised that accurately measures the horizontal earth pressure. Two types of single-cycle $K_o$ loading/unloading models were studied experimentally using the sand at four relative densities. The results are as follows. $K_{on}$, the coefficient of earth pressure at rest for virgin loading, is a function of the angle of internal friction $\phi'$ of the sand and is given by $K_{on} = 1 - 0.914\sin\phi'$. $K_{ou}$, the coefficient of earth pressure at rest for virgin unloading, is a function of $K_{on}$ and the overconsolidation ratio (OCR) and is given by $K_{ou} = K_{on}\,(\mathrm{OCR})^{\alpha}$; the exponent $\alpha$ increases as the relative density increases. $K_{or}$, the coefficient of earth pressure at rest for virgin reloading, decreases hyperbolically as the vertical stress $\sigma_v'$ increases. The stress path during virgin reloading leads to the maximum prestress point, independent of the minimum unloading stress. The gradient of this curve, $m_r$, increases as OCR increases.
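
The two relations reported in this abstract can be evaluated numerically. The following is a minimal sketch, not the authors' code: the function names are hypothetical, the friction angle is assumed to be given in degrees, and the unloading exponent is left as an input because the abstract reports only that it grows with relative density, not its values.

```python
import math


def k0_virgin_loading(phi_deg: float) -> float:
    """Coefficient of earth pressure at rest for virgin loading:
    1 - 0.914 * sin(phi), with phi the friction angle in degrees."""
    return 1.0 - 0.914 * math.sin(math.radians(phi_deg))


def k0_virgin_unloading(phi_deg: float, ocr: float, exponent: float) -> float:
    """Coefficient for virgin unloading: the loading value times OCR**exponent.

    `exponent` is an assumed input; the abstract gives no numeric values for it.
    """
    if ocr < 1.0:
        raise ValueError("OCR during unloading is expected to be >= 1")
    return k0_virgin_loading(phi_deg) * ocr ** exponent


if __name__ == "__main__":
    # Illustrative inputs only: phi = 30 degrees, OCR = 4, exponent = 0.5.
    print(k0_virgin_loading(30.0))
    print(k0_virgin_unloading(30.0, 4.0, 0.5))
```

At OCR = 1 the unloading expression reduces to the virgin-loading value, which is a quick consistency check on the two formulas.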

Streamlined GoogLeNet Algorithm Based on CNN for Korean Character Recognition (한글 인식을 위한 CNN 기반의 간소화된 GoogLeNet 알고리즘 연구)

  • Kim, Yeon-gyu;Cha, Eui-young
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.20 no.9
    • /
    • pp.1657-1665
    • /
    • 2016
  • Deep learning with convolutional neural networks (CNNs) is being studied in many fields and performs very well in image recognition. In this paper, we present a streamlined GoogLeNet CNN architecture capable of learning a large-scale Korean character database. The experimental data is PHD08, a large-scale Korean character database. PHD08 contains 2,187 samples for each of 2,350 Korean characters, for a total of 5,139,450 samples. After training, the streamlined GoogLeNet achieved over 99% test accuracy on PHD08. To ensure objectivity, we also created additional Korean character data in fonts not included in PHD08 and compared the classification performance of the streamlined GoogLeNet with that of other OCR programs. While the other OCR programs achieved classification success rates of 66.95% to 83.16%, the streamlined GoogLeNet achieved 89.14%.

Automatic Generation of Training Character Samples for OCR Systems

  • Le, Ha;Kim, Soo-Hyung;Na, In-Seop;Do, Yen;Park, Sang-Cheol;Jeong, Sun-Hwa
    • International Journal of Contents
    • /
    • v.8 no.3
    • /
    • pp.83-93
    • /
    • 2012
  • In this paper, we propose a novel method that automatically generates real character images to familiarize existing OCR systems with new fonts. First, we generate synthetic character images using a simple degradation model. The synthetic data is used to train an OCR engine, and the trained engine is used to recognize and label real character images segmented from ideal document images. Since the OCR engine cannot accurately recognize all real character images, a substring matching method is employed to fix wrongly labeled characters by comparing two strings: one is the string formed by the recognized characters in an ideal document image, and the other is the ordered string of characters that we intend to train and recognize. Based on our method, we build a system that automatically generates the 2,350 most common Korean characters and 117 alphanumeric characters from new fonts. The ideal document images used in the system are postal envelope images with characters printed in ascending order of their codes. The proposed system achieved a labeling accuracy of 99%. We therefore believe that our system is effective in facilitating the generation of numerous character samples to enhance the recognition rate of existing OCR systems for fonts on which they have never been trained.
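
The label-correction step described above compares the recognized string with the known ordered string. The abstract does not specify the matching algorithm, so the sketch below substitutes Python's standard `difflib.SequenceMatcher` as a stand-in; `correct_labels` is a hypothetical name, and the rule of fixing only equal-length mismatched runs is an assumption made for illustration.

```python
from difflib import SequenceMatcher


def correct_labels(recognized: str, expected: str) -> str:
    """Fix wrongly recognized characters by aligning OCR output with the
    known ordered string (a stand-in for the paper's substring matching).

    Only equal-length mismatched runs are corrected; insertions and deletions
    (likely segmentation errors) are left as the OCR engine produced them.
    """
    matcher = SequenceMatcher(None, recognized, expected, autojunk=False)
    corrected = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "replace" and (i2 - i1) == (j2 - j1):
            # Same number of characters on both sides: trust the expected labels.
            corrected.append(expected[j1:j2])
        else:
            # "equal", "insert", "delete", or uneven "replace": keep OCR output.
            corrected.append(recognized[i1:i2])
    return "".join(corrected)


if __name__ == "__main__":
    # Characters printed in ascending code order; the OCR engine misread one.
    print(correct_labels("ABCXEFG", "ABCDEFG"))
```

Because the envelope images contain characters in a known order, the expected string is available without manual labeling, which is what makes this kind of alignment usable as an automatic correction step.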

Study on Measuring Geometrical Modification of Document Image in Scanning Process (스캐닝 과정에서 발생하는 전자문서의 기하학적 변형감지에 관한 연구)

  • Oh, Dong-Yeol;Oh, Hae-Seok;Rhew, Sung-Yul
    • Journal of the Korea Academia-Industrial cooperation Society
    • /
    • v.10 no.8
    • /
    • pp.1869-1876
    • /
    • 2009
  • A scanner, a type of optical device, is used to convert paper documents into document image files. Scanned document images are assessed to check whether any modification was introduced into the image file during scanning. In this assessment, the user checks the degree of skew, noise, folding, and so on. This paper proposes a method for measuring geometrical modifications of a document image introduced during scanning. We measure the degree of modification in a document image file by image processing and compare the evaluation value of each item, which expresses its degree of modification, with the OCR success ratio for that file. To analyse the correlation between the OCR success ratio and the per-item evaluation values, we apply the Pearson correlation coefficient and calculate a weight for each item in order to score the total evaluation value of the modification degree of an image file. Document images that receive a high rating score from the proposed method also have a high OCR success ratio.
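
The correlation-and-weighting step can be illustrated as follows. This is a sketch under stated assumptions: the abstract does not say how weights are derived from the Pearson coefficients, so normalizing the absolute coefficients to sum to 1 is an assumption, and the function names and item names (`skew`, `noise`) are hypothetical.

```python
import math
from typing import Dict, List, Sequence


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length samples."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two samples of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / math.sqrt(var_x * var_y)


def item_weights(item_scores: Dict[str, List[float]],
                 ocr_success: List[float]) -> Dict[str, float]:
    """Weight each item by |r| against OCR success, normalized to sum to 1.

    The normalization rule is an assumption, not taken from the paper.
    """
    strengths = {name: abs(pearson(scores, ocr_success))
                 for name, scores in item_scores.items()}
    total = sum(strengths.values())
    if total == 0:
        raise ValueError("no item correlates with OCR success")
    return {name: s / total for name, s in strengths.items()}


def total_score(item_values: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted sum of per-item evaluation values for one image."""
    return sum(weights[name] * item_values[name] for name in weights)
```

An item whose evaluation value tracks the OCR success ratio closely receives a larger weight, so the total score is dominated by the modifications that matter most for recognition.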

Recognition of Bill Form using Feature Pyramid Network (FPN(Feature Pyramid Network)을 이용한 고지서 양식 인식)

  • Kim, Dae-Jin;Hwang, Chi-Gon;Yoon, Chang-Pyo
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.25 no.4
    • /
    • pp.523-529
    • /
    • 2021
  • In the era of the Fourth Industrial Revolution, technological changes are being applied in many fields, and automation, digitization, and data management are reaching the field of bills as well. Tens of thousands of bill forms circulate in society, and bill recognition is essential for automation, digitization, and data management. Currently, OCR technology is used for character recognition in order to manage the various bills. Accuracy can be increased by first recognizing the form of the bill and then recognizing the bill itself. In this paper, a logo that can serve as an index for classifying the form of the bill is recognized as an object. Because the logo is small relative to the entire bill, FPN, a deep learning technique, was used for small object detection. As a result, the proposed algorithm reduced resource waste and increased the accuracy of OCR recognition.

Expiration Date Notification System Based on YOLO and OCR algorithms for Visually Impaired Person (YOLO와 OCR 알고리즘에 기반한 시각 장애우를 위한 유통기한 알림 시스템)

  • Kim, Min-Soo;Moon, Mi-Kyung;Han, Chang-Hee
    • The Journal of the Korea institute of electronic communication sciences
    • /
    • v.16 no.6
    • /
    • pp.1329-1338
    • /
    • 2021
  • Apart from Braille, there are few effective ways to help visually impaired people find out the expiration date of a product. In this study, we developed an expiration date notification system based on YOLO and OCR for visually impaired people. With our system, visually impaired users can learn the expiration date of a specific product automatically, quickly, and accurately, without the help of a caregiver. The proposed system works in four steps: (1) identification of a target product by scanning its barcode; (2) segmentation of the image area containing the expiration date using YOLO; (3) classification of the expiration date by OCR; (4) notification of the expiration date by TTS. Our system showed an average classification accuracy of about 86.00% when blindfolded subjects used it in real time. This result suggests that the proposed system can potentially be used by visually impaired people.
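
The hand-off between the OCR step and the notification step can be sketched as below. This is an illustration only: it assumes the OCR text of the cropped date region is already available as a string, assumes a year-month-day layout with `.`, `-`, or `/` separators, and uses hypothetical function names. The barcode, YOLO, and TTS stages are outside this sketch.

```python
import re
from datetime import date
from typing import Optional

# Assumed layout: four-digit year starting with 20, then month and day,
# separated by ".", "-", or "/" (for example "2021.06.30").
_DATE_PATTERN = re.compile(r"(20\d{2})\s*[.\-/]\s*(\d{1,2})\s*[.\-/]\s*(\d{1,2})")


def parse_expiration_date(ocr_text: str) -> Optional[date]:
    """Extract the first plausible expiration date from raw OCR text."""
    match = _DATE_PATTERN.search(ocr_text)
    if match is None:
        return None
    year, month, day = (int(part) for part in match.groups())
    try:
        return date(year, month, day)
    except ValueError:
        # Digits were found but do not form a real calendar date.
        return None


def notification_message(expiry: date, today: date) -> str:
    """Build the sentence that a TTS engine would read aloud."""
    days_left = (expiry - today).days
    if days_left < 0:
        return f"This product expired {-days_left} day(s) ago."
    return f"This product expires in {days_left} day(s)."
```

Returning `None` for unreadable or impossible dates lets the caller ask the user to rescan instead of announcing a wrong date.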

Spam Image Detection Model based on Deep Learning for Improving Spam Filter

  • Seong-Guk Nam;Dong-Gun Lee;Yeong-Seok Seo
    • Journal of Information Processing Systems
    • /
    • v.19 no.3
    • /
    • pp.289-301
    • /
    • 2023
  • With the development and spread of modern technology, anyone can easily communicate through services such as social network services (SNS) on a personal computer (PC) or smartphone. These technologies have brought many benefits, but also harmful effects, one of which is spam. Spam refers to unwanted or rejected information received by unspecified users. Continuous exposure to such information inconveniences service users, and if filtering is not performed correctly, the quality of service deteriorates. Recently, spammers have been creating more malicious spam by distorting images of spam text so that optical character recognition (OCR)-based spam filters cannot easily detect it. Fortunately, the level of transformation of image spam circulated on social media is not yet serious. However, in mail systems, spammers (people who send spam) have applied various modifications to spam images to neutralize OCR, and the same could therefore happen with spam images on social media. Spammers have been shown to interfere with OCR reading through geometric transformations such as image distortion, noise addition, and blurring. Various techniques have been studied to filter image spam, but methods that use obfuscated images to evade image spam identification are also continuously developing. In this paper, we propose a deep learning-based spam image detection model to improve on existing OCR-based spam image detection performance and compensate for its vulnerabilities. The proposed model extracts text features and image features from the image using four sub-models. First, the OCR-based text model extracts text-related features from the input image: whether the image contains spam words, and the word embedding vector. Then, the convolutional neural network-based image model extracts image obfuscation and image feature vectors from the input image. The final spam image classifier then uses the extracted features to determine whether the image is spam. In an evaluation of the F1-score, the proposed model scored about 14 points higher than OCR-based spam image detection.
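
For reference, the F1-score used in the comparison is the harmonic mean of precision and recall. The sketch below shows the standard computation for a binary spam/non-spam labeling; the function names and the convention that label 1 means spam are assumptions, and no data from the paper is reproduced.

```python
from typing import Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int]:
    """Return (true positives, false positives, false negatives), with 1 = spam."""
    if len(y_true) != len(y_pred):
        raise ValueError("label sequences must have the same length")
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, fn


def f1_score(tp: int, fp: int, fn: int) -> float:
    """F1 = 2*TP / (2*TP + FP + FN); defined as 0.0 when there are no positives."""
    denominator = 2 * tp + fp + fn
    return 0.0 if denominator == 0 else 2 * tp / denominator
```

Because F1 ignores true negatives, it is less inflated than plain accuracy when most images in a test set are not spam.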

A Personal Information Security System using Form Recognition and Optical Character Recognition in Electronic Documents (전자문서에서 서식인식과 광학문자인식을 이용한 개인정보 탐지 및 보호 시스템)

  • Baek, Jong-Kyung;Jee, Yoon-Seok;Park, Jae-Pyo
    • Journal of the Korea Academia-Industrial cooperation Society
    • /
    • v.21 no.5
    • /
    • pp.451-457
    • /
    • 2020
  • Form recognition and OCR techniques are widely used to detect and protect personal information in electronic documents. However, because of the poor recognition rate of OCR engines, personal information often goes undetected or false positives occur. Analyzing a large number of electronic documents also takes a long time. In this paper, we propose a method that improves on the existing approach in the speed of image analysis of electronic documents, the character recognition rate of the OCR engine, and the detection rate of personal information. The analysis speed was increased using the form recognition method, while the analysis speed and character recognition rate of the OCR engine were improved by image correction. An algorithm for analyzing personal information in images was proposed to increase the detection rate of personal information. In the experiments, 1,755 form recognition image samples were analyzed in an average time of 0.24 seconds, which was 0.5 seconds faster than the form recognition method of the conventional PAID system, and the image recognition rate was 99%. The proposed method can be used in various fields such as the public sector, telecommunications, finance, tourism, and security as a system for protecting personal information in electronic documents.

A Study on the Characteristics of Stress History of Kwang-Yang Port Clayey Soil Based on the Long-term Consolidation Test (장기압밀시험에 의한 광양항 점성토의 응력이력 특성 연구)

  • Kim, Jin-Young;Ryu, Seung-Seok;Baek, Won-Jin;Shim, Jae-Rok;Oh, Jong-Shin;Kim, Seong-Gon
    • Journal of the Korean Geotechnical Society
    • /
    • v.28 no.6
    • /
    • pp.31-38
    • /
    • 2012
  • In this study, long-term consolidation tests were performed on remolded Kwang-Yang port clayey soil to clarify the effect of stress history and over-consolidation ratio (OCR) on the long-term consolidation characteristics of the soft clayey soil. For over-consolidated clayey soil, once OCR exceeds 1.5 there are no great differences in secondary consolidation settlement or final settlement even if OCR increases from 2.0 to 3.0. It is therefore understood that the OCR value to apply in the field to reduce the secondary consolidation settlement and the final settlement is about 1.5. In addition, an investigation of the relationship between the pre-loading period and the long-term consolidation behavior observed in the tests on the remolded Kwang-Yang port clayey soil showed that the influence on long-term consolidation behavior was not large even when the pre-load was removed after the degree of consolidation had exceeded 70~80%.

Profiling Stress History (OCR, $\sigma'_p$) of Marine Clay Using Piezocone Penetration Test (해성점토지반에서 CPT를 이용한 응력이력(OCR, $\sigma'_p$)의 산정)

  • 이강운;윤길림;채영수
    • Journal of the Korean Geotechnical Society
    • /
    • v.18 no.6
    • /
    • pp.73-81
    • /
    • 2002
  • Various CPT-based prediction models for profiling the stress history of marine clay at the southern part of the Korean peninsula were investigated using both statistical analysis and case history study. Preconsolidation pressures ($\sigma'_p$) and overconsolidation ratios (OCR) estimated from empirical correlations and cone penetration tests were compared with laboratory oedometer test results. According to the oedometer test results, the marine clay was in general overconsolidated at below 10 m depth from the mudline, whereas marine clay at below 10 m depth from the mudline with an overconsolidation ratio of around 0.3 showed variable stresses and unstable states. Preconsolidation pressures were computed by both the empirical method of Chen and Mayne (1996) and the theoretical method of Konrad and Law (1987). The prediction method of Chen and Mayne (1996), which is based on pore water pressure, is estimated to be more reliable than the other prediction methods, and it proved to be the most reliable for estimating the overconsolidation ratio. However, the methods of Mayne & Holtz (1988) and Mayne & Bachus (1988) are recommended as more suitable than the others for predicting the overconsolidation ratio of underconsolidated (OCR<1) clay. For these reasons, rather than relying on existing prediction models, it may be necessary to develop site-specific empirical correlations that consider local characteristics and site conditions, because local stress history differs and soil properties vary.
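
The quantity being profiled here follows the standard definition of the overconsolidation ratio: preconsolidation pressure divided by the current vertical effective stress. The sketch below applies that definition and labels the resulting state; the function names and the tolerance band around OCR = 1 are assumptions for illustration, and none of the CPT-based prediction models named in the abstract is implemented.

```python
def overconsolidation_ratio(sigma_p: float, sigma_v0: float) -> float:
    """OCR = preconsolidation pressure / current vertical effective stress.

    Both stresses must be positive and in the same unit (for example kPa).
    """
    if sigma_p <= 0 or sigma_v0 <= 0:
        raise ValueError("stresses must be positive")
    return sigma_p / sigma_v0


def consolidation_state(ocr: float, tolerance: float = 0.05) -> str:
    """Label the clay state from its OCR.

    `tolerance` is an assumed band around OCR = 1 that absorbs measurement
    scatter; it is not a value taken from the paper.
    """
    if ocr < 1.0 - tolerance:
        return "underconsolidated"
    if ocr > 1.0 + tolerance:
        return "overconsolidated"
    return "normally consolidated"


if __name__ == "__main__":
    # Illustrative values only: sigma_p = 150 kPa, sigma_v0 = 100 kPa.
    ocr = overconsolidation_ratio(150.0, 100.0)
    print(ocr, consolidation_state(ocr))
```

An OCR of around 0.3, as reported for the deeper clay, falls in the underconsolidated range, which is the case where the abstract recommends different prediction methods.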