• 제목/요약/키워드: Text Recognition

검색결과 659건 처리시간 0.031초

모바일 시스템 응용을 위한 실외 한국어 간판 영상에서 텍스트 검출 및 인식 (Text Detection and Recognition in Outdoor Korean Signboards for Mobile System Applications)

  • 박종현;이귀상;김수형;이명훈
    • 전자공학회논문지CI
    • /
    • 제46권2호
    • /
    • pp.44-51
    • /
    • 2009
  • 자연 영상에서의 텍스트 이해는 지난 수년간 매우 활발한 연구 분야로 자리하고 있다. 논문에서 우리는 한국어 간판 영상으로부터 자동으로 텍스트를 인식하는 방법을 제안한다. 제안된 방법은 상호명의 인식을 위한 텍스트 영역의 검출 및 이진화를 포함하고 있다. 먼저 수직, 수평 방향의 에지 히스토그램을 이용하여 텍스트 영역의 정교한 검출을 수행하였다. 두 번째 단계는 검출된 텍스트 영역에 대해서 연결요소 기법을 적용하여 각각의 독립된 한 개의 문자 영역으로 분할되어지고, 마지막으로 최소 거리 분류법에 의해 각각의 글자를 인식한다. 각각의 문자 인식을 위해 모양 기반 통계적 특징을 추출한다. 실험에서 제안된 전체적인 효율성 및 정확성을 분석하였으며, 현재 구현된 모바일 시스템의 실용성을 확인할 수 있었다.

문자열 검출을 위한 슬라브 영역 추정 (Slab Region Localization for Text Extraction using SIFT Features)

  • 최종현;최성후;윤종필;구근휘;김상우
    • 전기학회논문지
    • /
    • 제58권5호
    • /
    • pp.1025-1034
    • /
    • 2009
  • In steel making production line, steel slabs are given a unique identification number. This identification number, Slab management number(SMN), gives information about the use of the slab. Identification of SMN has been done by humans for several years, but this is expensive and not accurate and it has been a heavy burden on the workers. Consequently, to improve efficiency, automatic recognition system is desirable. Generally, a recognition system consists of text localization, text extraction, character segmentation, and character recognition. For exact SMN identification, all the stage of the recognition system must be successful. In particular, the text localization is great important stage and difficult to process. However, because of many text-like patterns in a complex background and high fuzziness between the slab and background, directly extracting text region is difficult to process. If the slab region including SMN can be detected precisely, text localization algorithm will be able to be developed on the more simple method and the processing time of the overall recognition system will be reduced. This paper describes about the slab region localization using SIFT(Scale Invariant Feature Transform) features in the image. First, SIFT algorithm is applied the captured background and slab image, then features of two images are matched by Nearest Neighbor(NN) algorithm. However, correct matching rate can be low when two images are matched. Thus, to remove incorrect match between the features of two images, geometric locations of the matched two feature points are used. Finally, search rectangle method is performed in correct matching features, and then the top boundary and side boundaries of the slab region are determined. For this processes, we can reduce search region for extraction of SMN from the slab image. Most cases, to extract text region, search region is heuristically fixed [1][2]. However, the proposed algorithm is more analytic than other algorithms, because the search region is not fixed and the slab region is searched in the whole image. Experimental results show that the proposed algorithm has a good performance.

모음 검출을 통한 텍스트 독립 화자인식에 관한 연구 (A Study on the Text-Independent Speaker Recognition from the Vowel Extraction)

  • 김에녹;복혁규;김형래
    • 전자공학회논문지B
    • /
    • 제31B권10호
    • /
    • pp.82-91
    • /
    • 1994
  • In this thesis, we perform the experiment of speaker recognition by identifying vowels in the pronounciation of each speaker. In detail, we extract the vowels from the pronounciation of each speaker first. From it, we check the frequency energgy of 29 channels. After changing these into fuzzy values, we employ the fuzzy inference to recognize the speaker by text-dependent and text-independent methods. For this experiment, an algorithm of extracting vowels is developed, and newly introduced parameter is the frequency energy of the 29 channels computed from the extracted vowels. It shows the features of each speakers better than existing parameters. The advanced point of this paramter is to use the reference pattern only without the help of any codebook. As a rewult, test-dependent method showed about 95.5% rate of recognition, and text-independent method showed about 94.2% rate of recognition.

  • PDF

딥러닝 기반 광학 문자 인식 기술 동향 (Recent Trends in Deep Learning-Based Optical Character Recognition)

  • 민기현;이아람;김거식;김정은;강현서;이길행
    • 전자통신동향분석
    • /
    • 제37권5호
    • /
    • pp.22-32
    • /
    • 2022
  • Optical character recognition is a primary technology required in different fields, including digitizing archival documents, industrial automation, automatic driving, video analytics, medicine, and financial institution, among others. It was created in 1928 using pattern matching, but with the advent of artificial intelligence, it has since evolved into a high-performance character recognition technology. Recently, methods for detecting curved text and characters existing in a complicated background are being studied. Additionally, deep learning models are being developed in a way to recognize texts in various orientations and resolutions, perspective distortion, illumination reflection and partially occluded text, complex font characters, and special characters and artistic text among others. This report reviews the recent deep learning-based text detection and recognition methods and their various applications.

인공지능 기법을 이용한 텍스트 인식에 관한 연구 (A Study On The Text Recognition Using Artificial Intelligence Technique)

  • 이행세;최태영;김영길;김정우
    • 대한전자공학회논문지
    • /
    • 제26권11호
    • /
    • pp.1782-1793
    • /
    • 1989
  • Stroke crossing number, syntactic pattern recognition procedure, top down recognition structure, and heuristic approach are studied for the Korean text recognition. We propose new algorithms: 1)Korean vowel seperation using limited scanning method in the Korean characters, 2) extracting strokes using stroke width method, 3) stroke crossing number and its properties, 4) average, standard deviation, and mode of stroke crossing number, and 5) classification and recognition methods of limited chinese character. These are studied with computer simuladtions and experiments.

  • PDF

A Study on the Impact of Speech Data Quality on Speech Recognition Models

  • Yeong-Jin Kim;Hyun-Jong Cha;Ah Reum Kang
    • 한국컴퓨터정보학회논문지
    • /
    • 제29권1호
    • /
    • pp.41-49
    • /
    • 2024
  • 현재 음성인식 기술은 꾸준히 발전하고 다양한 분야에서 널리 사용되고 있다. 본 연구에서는 음성 데이터 품질이 음성인식 모델에 미치는 영향을 알아보기 위해 데이터셋을 전체 데이터셋과 SNR 상위 70%의 데이터셋으로 나눈 후 Seamless M4T와 Google Cloud Speech-to-Text를 이용하여 각 모델의 텍스트 변환 결과를 확인하고 Levenshtein Distance를 사용하여 평가하였다. 실험 결과에서 Seamless M4T는 높은 SNR(신호 대 잡음비)을 가진 데이터를 사용한 모델에서 점수가 13.6으로 전체 데이터셋의 점수인 16.6보다 더 낮게 나왔다. 그러나 Google Cloud Speech-to-Text는 전체 데이터셋에서 8.3으로 높은 SNR을 가진 데이터보다 더 낮은 점수가 나왔다. 이는 새로운 음성인식 모델을 훈련할 때 SNR이 높은 데이터를 사용하는 것이 영향이 있다고 할 수 있으며, Levenshtein Distance 알고리즘이 음성인식 모델을 평가하기 위한 지표 중 하나로 쓰일 수 있음을 나타낸다.

Arabic Text Recognition with Harakat Using Deep Learning

  • Ashwag, Maghraby;Esraa, Samkari
    • International Journal of Computer Science & Network Security
    • /
    • 제23권1호
    • /
    • pp.41-46
    • /
    • 2023
  • Because of the significant role that harakat plays in Arabic text, this paper used deep learning to extract Arabic text with its harakat from an image. Convolutional neural networks and recurrent neural network algorithms were applied to the dataset, which contained 110 images, each representing one word. The results showed the ability to extract some letters with harakat.

Automatic proficiency assessment of Korean speech read aloud by non-natives using bidirectional LSTM-based speech recognition

  • Oh, Yoo Rhee;Park, Kiyoung;Jeon, Hyung-Bae;Park, Jeon Gue
    • ETRI Journal
    • /
    • 제42권5호
    • /
    • pp.761-772
    • /
    • 2020
  • This paper presents an automatic proficiency assessment method for a non-native Korean read utterance using bidirectional long short-term memory (BLSTM)-based acoustic models (AMs) and speech data augmentation techniques. Specifically, the proposed method considers two scenarios, with and without prompted text. The proposed method with the prompted text performs (a) a speech feature extraction step, (b) a forced-alignment step using a native AM and non-native AM, and (c) a linear regression-based proficiency scoring step for the five proficiency scores. Meanwhile, the proposed method without the prompted text additionally performs Korean speech recognition and a subword un-segmentation for the missing text. The experimental results indicate that the proposed method with prompted text improves the performance for all scores when compared to a method employing conventional AMs. In addition, the proposed method without the prompted text has a fluency score performance comparable to that of the method with prompted text.

인식률을 향상한 한글문서 인식 알고리즘 개발 (Development of an image processing algorithm for korean document recognition)

  • 김희식;김영재;이평원
    • 제어로봇시스템학회:학술대회논문집
    • /
    • 제어로봇시스템학회 1997년도 한국자동제어학술회의논문집; 한국전력공사 서울연수원; 17-18 Oct. 1997
    • /
    • pp.1391-1394
    • /
    • 1997
  • This paper proposes a new image processing algorithm to recognize korean documents. It take out the region of text area form input image, then it makes esgmentation of lines, words and characters in the text. A precision segmentation is very important to recognize the input document. The input image has 8-bit gray scaled resolution. Not only the histogram but also brightness dispersion graph are used for segmentation. The result shows a higher accuracy of document recognition.

  • PDF

Text Location and Extraction for Business Cards Using Stroke Width Estimation

  • Zhang, Cheng Dong;Lee, Guee-Sang
    • International Journal of Contents
    • /
    • 제8권1호
    • /
    • pp.30-38
    • /
    • 2012
  • Text extraction and binarization are the important pre-processing steps for text recognition. The performance of text binarization strongly related to the accuracy of recognition stage. In our proposed method, the first stage based on line detection and shape feature analysis applied to locate the position of a business card and detect the shape from the complex environment. In the second stage, several local regions contained the possible text components are separated based on the projection histogram. In each local region, the pixels grouped into several connected components based on the connected component labeling and projection histogram. Then, classify each connect component into text region and reject the non-text region based on the feature information analysis such as size of connected component and stroke width estimation.