Search | Korea Science

Text Detection and Recognition in Outdoor Korean Signboards for Mobile System Applications (모바일 시스템 응용을 위한 실외 한국어 간판 영상에서 텍스트 검출 및 인식)

Park, J.H.;Lee, G.S.;Kim, S.H.;Lee, M.H.;Toan, N.D.
- Journal of the Institute of Electronics Engineers of Korea CI
- /
- v.46 no.2
- /
- pp.44-51
- /
- 2009
Text understand in natural images has become an active research field in the past few decades. In this paper, we present an automatic recognition system in Korean signboards with a complex background. The proposed algorithm includes detection, binarization and extraction of text for the recognition of shop names. First, we utilize an elaborate detection algorithm to detect possible text region based on edge histogram of vertical and horizontal direction. And detected text region is segmented by clustering method. Second, the text is divided into individual characters based on connected components whose center of mass lie below the center line, which are recognized by using a minimum distance classifier. A shape-based statistical feature is adopted, which is adequate for Korean character recognition. The system has been implemented in a mobile phone and is demonstrated to show acceptable performance.
PDF KSCI

Slab Region Localization for Text Extraction using SIFT Features (문자열 검출을 위한 슬라브 영역 추정)

Choi, Jong-Hyun;Choi, Sung-Hoo;Yun, Jong-Pil;Koo, Keun-Hwi;Kim, Sang-Woo
- The Transactions of The Korean Institute of Electrical Engineers
- /
- v.58 no.5
- /
- pp.1025-1034
- /
- 2009
In steel making production line, steel slabs are given a unique identification number. This identification number, Slab management number(SMN), gives information about the use of the slab. Identification of SMN has been done by humans for several years, but this is expensive and not accurate and it has been a heavy burden on the workers. Consequently, to improve efficiency, automatic recognition system is desirable. Generally, a recognition system consists of text localization, text extraction, character segmentation, and character recognition. For exact SMN identification, all the stage of the recognition system must be successful. In particular, the text localization is great important stage and difficult to process. However, because of many text-like patterns in a complex background and high fuzziness between the slab and background, directly extracting text region is difficult to process. If the slab region including SMN can be detected precisely, text localization algorithm will be able to be developed on the more simple method and the processing time of the overall recognition system will be reduced. This paper describes about the slab region localization using SIFT(Scale Invariant Feature Transform) features in the image. First, SIFT algorithm is applied the captured background and slab image, then features of two images are matched by Nearest Neighbor(NN) algorithm. However, correct matching rate can be low when two images are matched. Thus, to remove incorrect match between the features of two images, geometric locations of the matched two feature points are used. Finally, search rectangle method is performed in correct matching features, and then the top boundary and side boundaries of the slab region are determined. For this processes, we can reduce search region for extraction of SMN from the slab image. Most cases, to extract text region, search region is heuristically fixed [1][2]. However, the proposed algorithm is more analytic than other algorithms, because the search region is not fixed and the slab region is searched in the whole image. Experimental results show that the proposed algorithm has a good performance.
PDF KSCI

A Study on the Text-Independent Speaker Recognition from the Vowel Extraction (모음 검출을 통한 텍스트 독립 화자인식에 관한 연구)

김에녹;복혁규;김형래
- Journal of the Korean Institute of Telematics and Electronics B
- /
- v.31B no.10
- /
- pp.82-91
- /
- 1994
In this thesis, we perform the experiment of speaker recognition by identifying vowels in the pronounciation of each speaker. In detail, we extract the vowels from the pronounciation of each speaker first. From it, we check the frequency energgy of 29 channels. After changing these into fuzzy values, we employ the fuzzy inference to recognize the speaker by text-dependent and text-independent methods. For this experiment, an algorithm of extracting vowels is developed, and newly introduced parameter is the frequency energy of the 29 channels computed from the extracted vowels. It shows the features of each speakers better than existing parameters. The advanced point of this paramter is to use the reference pattern only without the help of any codebook. As a rewult, test-dependent method showed about 95.5% rate of recognition, and text-independent method showed about 94.2% rate of recognition.
PDF

Recent Trends in Deep Learning-Based Optical Character Recognition (딥러닝 기반 광학 문자 인식 기술 동향)

Min, G.;Lee, A.;Kim, K.S.;Kim, J.E.;Kang, H.S.;Lee, G.H.
- Electronics and Telecommunications Trends
- /
- v.37 no.5
- /
- pp.22-32
- /
- 2022
Optical character recognition is a primary technology required in different fields, including digitizing archival documents, industrial automation, automatic driving, video analytics, medicine, and financial institution, among others. It was created in 1928 using pattern matching, but with the advent of artificial intelligence, it has since evolved into a high-performance character recognition technology. Recently, methods for detecting curved text and characters existing in a complicated background are being studied. Additionally, deep learning models are being developed in a way to recognize texts in various orientations and resolutions, perspective distortion, illumination reflection and partially occluded text, complex font characters, and special characters and artistic text among others. This report reviews the recent deep learning-based text detection and recognition methods and their various applications.
https://doi.org/10.22648/ETRI.2022.J.370503 인용 PDF

A Study On The Text Recognition Using Artificial Intelligence Technique (인공지능 기법을 이용한 텍스트 인식에 관한 연구)

이행세;최태영;김영길;김정우
- Journal of the Korean Institute of Telematics and Electronics
- /
- v.26 no.11
- /
- pp.1782-1793
- /
- 1989
Stroke crossing number, syntactic pattern recognition procedure, top down recognition structure, and heuristic approach are studied for the Korean text recognition. We propose new algorithms: 1)Korean vowel seperation using limited scanning method in the Korean characters, 2) extracting strokes using stroke width method, 3) stroke crossing number and its properties, 4) average, standard deviation, and mode of stroke crossing number, and 5) classification and recognition methods of limited chinese character. These are studied with computer simuladtions and experiments.
PDF

A Study on the Impact of Speech Data Quality on Speech Recognition Models

Yeong-Jin Kim;Hyun-Jong Cha;Ah Reum Kang
- Journal of the Korea Society of Computer and Information
- /
- v.29 no.1
- /
- pp.41-49
- /
- 2024
Speech recognition technology is continuously advancing and widely used in various fields. In this study, we aimed to investigate the impact of speech data quality on speech recognition models by dividing the dataset into the entire dataset and the top 70% based on Signal-to-Noise Ratio (SNR). Utilizing Seamless M4T and Google Cloud Speech-to-Text, we examined the text transformation results for each model and evaluated them using the Levenshtein Distance. Experimental results revealed that Seamless M4T scored 13.6 in models using data with high SNR, which is lower than the score of 16.6 for the entire dataset. However, Google Cloud Speech-to-Text scored 8.3 on the entire dataset, indicating lower performance than data with high SNR. This suggests that using data with high SNR during the training of a new speech recognition model can have an impact, and Levenshtein Distance can serve as a metric for evaluating speech recognition models.
https://doi.org/10.9708/jksci.2024.29.01.041 인용 PDF HTML

Arabic Text Recognition with Harakat Using Deep Learning

Ashwag, Maghraby;Esraa, Samkari
- International Journal of Computer Science & Network Security
- /
- v.23 no.1
- /
- pp.41-46
- /
- 2023
Because of the significant role that harakat plays in Arabic text, this paper used deep learning to extract Arabic text with its harakat from an image. Convolutional neural networks and recurrent neural network algorithms were applied to the dataset, which contained 110 images, each representing one word. The results showed the ability to extract some letters with harakat.
https://doi.org/10.22937/IJCSNS.2023.23.1.6 인용 PDF

Automatic proficiency assessment of Korean speech read aloud by non-natives using bidirectional LSTM-based speech recognition

Oh, Yoo Rhee;Park, Kiyoung;Jeon, Hyung-Bae;Park, Jeon Gue
- ETRI Journal
- /
- v.42 no.5
- /
- pp.761-772
- /
- 2020
This paper presents an automatic proficiency assessment method for a non-native Korean read utterance using bidirectional long short-term memory (BLSTM)-based acoustic models (AMs) and speech data augmentation techniques. Specifically, the proposed method considers two scenarios, with and without prompted text. The proposed method with the prompted text performs (a) a speech feature extraction step, (b) a forced-alignment step using a native AM and non-native AM, and (c) a linear regression-based proficiency scoring step for the five proficiency scores. Meanwhile, the proposed method without the prompted text additionally performs Korean speech recognition and a subword un-segmentation for the missing text. The experimental results indicate that the proposed method with prompted text improves the performance for all scores when compared to a method employing conventional AMs. In addition, the proposed method without the prompted text has a fluency score performance comparable to that of the method with prompted text.
https://doi.org/10.4218/etrij.2019-0400 인용 PDF KSCI

Development of an image processing algorithm for korean document recognition (인식률을 향상한 한글문서 인식 알고리즘 개발)

김희식;김영재;이평원
- 제어로봇시스템학회:학술대회논문집
- /
- 1997.10a
- /
- pp.1391-1394
- /
- 1997
This paper proposes a new image processing algorithm to recognize korean documents. It take out the region of text area form input image, then it makes esgmentation of lines, words and characters in the text. A precision segmentation is very important to recognize the input document. The input image has 8-bit gray scaled resolution. Not only the histogram but also brightness dispersion graph are used for segmentation. The result shows a higher accuracy of document recognition.
PDF

Text Location and Extraction for Business Cards Using Stroke Width Estimation

Zhang, Cheng Dong;Lee, Guee-Sang
- International Journal of Contents
- /
- v.8 no.1
- /
- pp.30-38
- /
- 2012
Text extraction and binarization are the important pre-processing steps for text recognition. The performance of text binarization strongly related to the accuracy of recognition stage. In our proposed method, the first stage based on line detection and shape feature analysis applied to locate the position of a business card and detect the shape from the complex environment. In the second stage, several local regions contained the possible text components are separated based on the projection histogram. In each local region, the pixels grouped into several connected components based on the connected component labeling and projection histogram. Then, classify each connect component into text region and reject the non-text region based on the feature information analysis such as size of connected component and stroke width estimation.
https://doi.org/10.5392/IJoC.2012.8.1.030 인용 PDF KSCI

Search Result 670, Processing Time 0.028 seconds

이메일무단수집거부

이용약관

제 1 장 총칙

제 2 장 이용계약의 체결

제 3 장 계약 당사자의 의무

제 4 장 서비스의 이용

제 5 장 계약 해지 및 이용 제한

제 6 장 손해배상 및 기타사항

Detail Search

Image Search (β)