• Title/Summary/Keyword: Text-recognition

Speaker Identification Using Augmented PCA in Unknown Environments (부가 주성분분석을 이용한 미지의 환경에서의 화자식별)

  • Yu, Ha-Jin
    • MALSORI
    • /
    • no.54
    • /
    • pp.73-83
    • /
    • 2005
  • The goal of our research is to build a text-independent speaker identification system that can be used in any condition without any additional adaptation process. The performance of speaker recognition systems can be severely degraded under unknown, mismatched microphone and noise conditions. In this paper, we show that PCA (principal component analysis) can improve performance in such conditions. We also propose an augmented PCA process, which augments the original feature vectors with class-discriminative information before the PCA transformation and selects the best direction for each pair of highly confusable speakers. The proposed method reduced the recognition error by a relative 21%.

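As a rough illustration of the PCA step mentioned in the abstract above, the following is a minimal sketch of plain PCA on 2-D points. It does not reproduce the paper's augmented PCA, whose augmentation and pairwise direction selection are not specified here. The function names `pca_2d` and `project` are hypothetical, and real speaker features would have many more dimensions.

```python
import math

def pca_2d(points):
    """Return (mean, unit principal direction, variance along it) for 2-D points."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    # Sample covariance matrix [[a, b], [b, c]].
    a = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    c = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    b = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    # Largest eigenvalue of a symmetric 2x2 matrix in closed form.
    lam = (a + c) / 2 + math.sqrt(((a - c) / 2) ** 2 + b ** 2)
    if abs(b) > 1e-12:
        vx, vy = b, lam - a          # eigenvector for eigenvalue lam
    elif a >= c:
        vx, vy = 1.0, 0.0            # axes already uncorrelated
    else:
        vx, vy = 0.0, 1.0
    norm = math.hypot(vx, vy)
    return (mx, my), (vx / norm, vy / norm), lam

def project(points, mean, direction):
    """Project centered points onto the principal direction (1-D features)."""
    return [(p[0] - mean[0]) * direction[0] + (p[1] - mean[1]) * direction[1]
            for p in points]
```

For points that lie on a single line, the returned direction follows that line and the projected values carry all of the variance.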

A study on the humanistic measure about cultural changes of voice recognition technology (음성인식기술의 문화변동에 대한 인문학적 대응에 관한 연구)

  • Yuk, Hyun-Seung;Cho, Byung-Chul
    • Journal of Digital Convergence
    • /
    • v.13 no.8
    • /
    • pp.21-31
    • /
    • 2015
  • Recent advances in voice recognition technology are leading to a new era of oral culture. Text based on this new oral culture can bring about a cultural revolution. This research takes a humanistic approach that covers both the oral and the textual. Its goal is to propose humanistic measures for these cultural issues, with a view to a complementary relationship between the oral and the textual in the future. First, we discuss the aspects that have driven the change from a text culture to an oral culture. After examining these changes with regard to voice recognition technology, we discuss the possibilities and problems of this cultural change. We discuss expected outcomes, such as the complementarity of speaking and writing, the expansion from private culture to public culture, and the possibility of simultaneity. We also discuss necessities such as a new semiotic approach to the voice and preparation for the expansion of the world of life. Specifically, we explore the need to advance and control Korean culture against the dominance of a global corporation. This study is basic research on the possibilities of new voice recognition technology and the accompanying cultural changes, and it is expected to serve as a basis for more detailed research.

Skewed Angle Detection in Text Images Using Orthogonal Angle View

  • Chin, Seong-Ah;Choo, Moon-Won
    • Proceedings of the IEEK Conference
    • /
    • 2000.07a
    • /
    • pp.62-65
    • /
    • 2000
  • In this paper we propose skewed angle detection methods for images that contain text that is not aligned horizontally. In most images, text areas are aligned along the horizontal axis; however, there are many occasions when the text may be at a skewed angle (denoted by $\theta$, with $0 < \theta \leq \pi$). In the work described, we adapt the Hough transform and the Shadow and Threshold Projection methods to detect the skewed angle of text in an input image using the orthogonal angle view property. The result of this method is a primary text skew angle, which allows us to rotate the original input image into an image with horizontally aligned text. This serves as a document image processing step prior to the recognition stage.

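To make the idea of skew estimation by projection concrete, here is a minimal sketch of a generic projection-profile search. It is not the authors' Hough or Shadow and Threshold Projection method: it simply tries candidate angles and keeps the one at which foreground points collapse into the fewest, most heavily populated rows. The names `estimate_skew` and `synthetic_lines`, the candidate angle range, and the scoring rule (sum of squared bin counts) are all assumptions made for illustration.

```python
import math
from collections import Counter

def estimate_skew(pixels, angles_deg):
    """Return the candidate angle (degrees) whose projection profile is most peaked.

    pixels: iterable of (x, y) foreground coordinates.
    The angle is measured from the x-axis toward the y-axis.
    """
    pixels = list(pixels)
    best_angle, best_score = None, -1
    for deg in angles_deg:
        t = math.radians(deg)
        # Signed distance of each point from a line through the origin at angle t.
        rows = Counter(round(y * math.cos(t) - x * math.sin(t)) for x, y in pixels)
        # Well-aligned text lines concentrate points in few bins, so the
        # sum of squared counts is largest at the true skew angle.
        score = sum(count * count for count in rows.values())
        if score > best_score:
            best_angle, best_score = deg, score
    return best_angle

def synthetic_lines(angle_deg, n_lines=3, length=100, spacing=20):
    """Generate points on parallel 'text lines' tilted by angle_deg (test data only)."""
    t = math.radians(angle_deg)
    return [(s * math.cos(t) - spacing * k * math.sin(t),
             s * math.sin(t) + spacing * k * math.cos(t))
            for k in range(n_lines) for s in range(length)]
```

Calling `estimate_skew(synthetic_lines(10), range(-15, 16))` searches integer angles between -15 and 15 degrees; the image can then be rotated by the negative of the returned angle to level the text.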

Text Extraction in HIS Color Space by Weighting Scheme

  • Le, Thi Khue Van;Lee, Gueesang
    • Smart Media Journal
    • /
    • v.2 no.1
    • /
    • pp.31-36
    • /
    • 2013
  • Robust and efficient text extraction is very important for the accuracy of Optical Character Recognition (OCR) systems. Natural scene images with degradations such as uneven illumination, perspective distortion, complex backgrounds, and multi-color text pose many challenges to computer vision tasks, especially text extraction. In this paper, we propose a method for extracting text from signboard images based on a combination of the mean shift algorithm and a weighting scheme of hue and saturation in HSI color space for the clustering algorithm. The number of clusters is determined automatically by mean shift-based density estimation, in which local clusters are estimated by repeatedly searching for higher-density points in feature vector space. The weighting scheme of hue and saturation is used to formulate a new distance measure in cylindrical coordinates for text extraction. Experimental results on various natural scene images are presented to demonstrate the effectiveness of our approach.

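The abstract above does not give the distance formula, so the following is only a sketch of what a weighted distance in cylindrical HSI coordinates could look like. It treats hue as an angle, saturation as a radius, and intensity as height, and applies a single hypothetical weight `chroma_weight` to the hue and saturation term. The authors' actual weighting scheme may differ.

```python
import math

def hsi_distance(p, q, chroma_weight=1.0):
    """Distance between two HSI colors given as (hue_degrees, saturation, intensity).

    The chromatic part uses the law of cosines in the hue/saturation plane,
    so hue wraps around correctly and is ignored when saturation is zero.
    chroma_weight is an illustrative assumption, not the paper's weighting.
    """
    h1, s1, i1 = p
    h2, s2, i2 = q
    dh = math.radians(h1 - h2)
    chroma_sq = s1 * s1 + s2 * s2 - 2.0 * s1 * s2 * math.cos(dh)
    # Guard against tiny negative values caused by floating-point rounding.
    return math.sqrt(max(0.0, chroma_weight * chroma_sq + (i1 - i2) ** 2))
```

A function like this could be passed to a clustering routine in place of the Euclidean distance; raising `chroma_weight` makes clusters separate more by color than by brightness.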

Robust Recognition of a Player Name in Golf Videos (골프 동영상에서의 강건한 선수명 인식)

  • Jung, Cheol-Kon;Kim, Joong-Kyu
    • Proceedings of the HCI Society of Korea Conference
    • /
    • 2008.02a
    • /
    • pp.659-662
    • /
    • 2008
  • In sports videos, text provides valuable information about the game, such as scores and information about the players. This paper proposes a robust method for recognizing player names in golf videos. In golf, most users want to search for scenes that contain the shots of their favorite players. We use text information in golf videos for robust extraction of player information. Using OCR, we obtain the text information and then recognize the player information from a player name DB. We can then search for scenes of favorite players using this player information. Through experiments on several golf videos, we demonstrate that our method achieves impressive performance with respect to robustness.


Text Region Detection Method Using Table Border Pseudo Label (표의 테두리 유사 라벨을 활용한 문자 영역 검출 방법)

  • Han, Jeong Hoon;Park, Se Jin;Moon, Young Shik
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.24 no.10
    • /
    • pp.1271-1279
    • /
    • 2020
  • Text region detection is a technology that detects text areas in handwritten or printed documents. The detected text areas are digitized through a recognition step, and the result is used in various fields depending on the purpose. However, detection results in small text units are not suitable for industrial use. In addition, table borders in a document cause misdetections, which adversely affect the recognition step. To solve these issues, we propose a method for detecting text regions using the border information of tables. To utilize this border information, the proposed method adjusts the flow of two decoders. Experimentally, we show improved performance using the table border pseudo label based on weakly supervised learning.

Caption Detection Algorithm Using Temporal Information in Video (동영상에서 시간 영역 정보를 이용한 자막 검출 알고리듬)

  • 권철현;신청호;김수연;박상희
    • The Transactions of the Korean Institute of Electrical Engineers D
    • /
    • v.53 no.8
    • /
    • pp.606-610
    • /
    • 2004
  • A novel caption text detection and recognition algorithm using the temporal nature of video is proposed in this paper. A text registration technique is used to locate the temporal and spatial positions of captions in video from the accumulated frame difference information. Experimental results show that the proposed method is effective and robust. Also, a high processing speed is achieved since no time-consuming operation is included.
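The accumulated frame difference mentioned above can be sketched as follows, assuming grayscale frames stored as nested lists of integers. The idea illustrated is that caption pixels stay nearly constant while the scene behind them changes, so their accumulated difference stays small. The function names and the threshold are hypothetical, and the paper's text registration technique is not reproduced here.

```python
def accumulate_frame_difference(frames):
    """Sum absolute differences between consecutive grayscale frames, per pixel."""
    height, width = len(frames[0]), len(frames[0][0])
    acc = [[0] * width for _ in range(height)]
    for prev, cur in zip(frames, frames[1:]):
        for y in range(height):
            for x in range(width):
                acc[y][x] += abs(cur[y][x] - prev[y][x])
    return acc

def static_mask(acc, threshold):
    """Mark pixels whose accumulated change is at most threshold (caption candidates)."""
    return [[value <= threshold for value in row] for row in acc]
```

Static background regions also pass this test, so in practice a mask like this would only be a first filter before further caption verification.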

Text Line Segmentation using AHTC and Watershed Algorithm for Handwritten Document Images

  • Oh, KangHan;Kim, SooHyung;Na, InSeop;Kim, GwangBok
    • International Journal of Contents
    • /
    • v.10 no.3
    • /
    • pp.35-40
    • /
    • 2014
  • Text line segmentation is a critical task in handwritten document recognition. In this paper, we propose a novel text line segmentation method using baseline estimation and watershed. The baseline detection algorithm estimates the baseline using Adaptive Head-Tail Connection (AHTC) on the document. Then, the watershed method segments the line regions using the baseline detection result. Finally, the text lines are separated using the watershed result, and a post-processing algorithm defines the lines more accurately. The scheme successfully segments text lines with 97% accuracy on the handwritten document images in the ICDAR database.

Stroke Width-Based Contrast Feature for Document Image Binarization

  • Van, Le Thi Khue;Lee, Gueesang
    • Journal of Information Processing Systems
    • /
    • v.10 no.1
    • /
    • pp.55-68
    • /
    • 2014
  • Automatic segmentation of foreground text from the background in degraded document images is essential for smooth reading of the document content and for machine recognition tasks. In this paper, we present a novel approach to the binarization of degraded document images. The proposed method uses a new local contrast feature extracted based on the stroke width of text. First, a pre-processing method is carried out for noise removal. Text boundary detection is then performed on the image constructed from the contrast feature. Local estimation then follows to extract text from the background. Finally, a refinement procedure is applied to the binarized image as a post-processing step to improve the quality of the final results. Experiments on extracting text from degraded handwritten and machine-printed document images, with comparisons against some well-known binarization algorithms, demonstrate the effectiveness of the proposed method.

Machine Printed and Handwritten Text Discrimination in Korean Document Images

  • Trieu, Son Tung;Lee, Guee Sang
    • Smart Media Journal
    • /
    • v.5 no.3
    • /
    • pp.30-34
    • /
    • 2016
  • Nowadays, there are many Korean documents whose text often needs to be identified as either printed or handwritten. Early identification methods use structural features, which are simple and easy to apply to text of a specific font, but their performance depends on the font type and characteristics of the text. Recently, the bag-of-words model has been used for identification; it can be invariant to changes in font size, distortions, or modifications to the text. The method based on the bag-of-words model includes three steps: word segmentation using connected component grouping, feature extraction, and finally classification using an SVM (Support Vector Machine). In this paper, a bag-of-words model-based method using SURF (Speeded Up Robust Features) is proposed for the identification of machine-printed and handwritten text in Korean documents. The experiment shows that the proposed method outperforms methods based on structural features.
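The feature extraction step of a bag-of-words pipeline can be sketched as below: each local descriptor is assigned to its nearest codeword, and the word image is represented by the normalized histogram of assignments. This is a generic illustration under stated assumptions. The descriptors here are tiny made-up vectors rather than SURF output, the codebook is assumed to be given (for example, from clustering), and the SVM classification step is omitted. The names `nearest_codeword` and `bow_histogram` are hypothetical.

```python
def nearest_codeword(descriptor, codebook):
    """Index of the codeword closest to descriptor (squared Euclidean distance)."""
    return min(
        range(len(codebook)),
        key=lambda k: sum((a - b) ** 2 for a, b in zip(descriptor, codebook[k])),
    )

def bow_histogram(descriptors, codebook):
    """Normalized histogram of codeword assignments for one word image."""
    counts = [0] * len(codebook)
    for descriptor in descriptors:
        counts[nearest_codeword(descriptor, codebook)] += 1
    total = sum(counts)
    if total == 0:
        return [0.0] * len(codebook)   # no descriptors found in this image
    return [count / total for count in counts]
```

Because the histogram is normalized, word images with different numbers of descriptors yield vectors of the same length and scale, which is what a classifier such as an SVM expects as input.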