• Title/Summary/Keyword: Text Image Segmentation

Search Result 65, Processing Time 0.027 seconds

The Geometric Layout Analysis of the Document Image Using Connected Components Method and Median Filter (연결요소 방법과 메디안 필터를 이용한 문서영상 기하학적 구조분석)

  • Jang, Dae-Geun;Hwang, Chan-Sik
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.27 no.8A
    • /
    • pp.805-813
    • /
    • 2002
  • Document image should be classified into detailed regions as text, picture, table and etc through the geometric layout analysis if paper documents can be converted automatically into electronic documents. However, complexity of the document layout and variety of the size and density of a picture are the reason to make it difficult to analyze the geometric layout of the document images. In this paper, we propose the method which have a better performance of the region segmentation and classifications, and the line extraction in the table region than the commercial softwares and previous methods. The proposed method can segment the document into detailed regions by using connected components method even if its layout is complex. This method also classifies texts and pictures by using separable median filter even. Though their size and density are diverse, In addition, this method extracts the lines from the table adapting one dimensional median filter to the each horizontal and vertical direction, even though lines are deformed or texts attached to them.

Design and Implementation of Automated Detection System of Personal Identification Information for Surgical Video De-Identification (수술 동영상의 비식별화를 위한 개인식별정보 자동 검출 시스템 설계 및 구현)

  • Cho, Youngtak;Ahn, Kiok
    • Convergence Security Journal
    • /
    • v.19 no.5
    • /
    • pp.75-84
    • /
    • 2019
  • Recently, the value of video as an important data of medical information technology is increasing due to the feature of rich clinical information. On the other hand, video is also required to be de-identified as a medical image, but the existing methods are mainly specialized in the stereotyped data and still images, which makes it difficult to apply the existing methods to the video data. In this paper, we propose an automated system to index candidate elements of personal identification information on a frame basis to solve this problem. The proposed system performs indexing process using text and person detection after preprocessing by scene segmentation and color knowledge based method. The generated index information is provided as metadata according to the purpose of use. In order to verify the effectiveness of the proposed system, the indexing speed was measured using prototype implementation and real surgical video. As a result, the work speed was more than twice as fast as the playing time of the input video, and it was confirmed that the decision making was possible through the case of the production of surgical education contents.

Expiration Date Notification System Based on YOLO and OCR algorithms for Visually Impaired Person (YOLO와 OCR 알고리즘에 기반한 시각 장애우를 위한 유통기한 알림 시스템)

  • Kim, Min-Soo;Moon, Mi-Kyung;Han, Chang-Hee
    • The Journal of the Korea institute of electronic communication sciences
    • /
    • v.16 no.6
    • /
    • pp.1329-1338
    • /
    • 2021
  • There are rarely effective methods to help visually impaired people when they want to know the expiration date of products excepted to only Braille. In this study, we developed an expiration date notification system based on YOLO and OCR for visually impaired people. The handicapped people can automatically know the expiration date of a specific product by using our system without the help of a caregiver, fast and accurately. The proposed system is worked by four different steps: (1) identification of a target product by scanning its barcode; (2) segmentation of an image area with the expiration date using YOLO; (3) classification of the expiration date by OCR: (4) notification of the expiration date by TTS. Our system showed an average classification accuracy of about 86.00% when blindfolded subjects used the proposed system in real-time. This result validates that the proposed system can be potentially used for visually impaired people.

Sign Language Dataset Built from S. Korean Government Briefing on COVID-19 (대한민국 정부의 코로나 19 브리핑을 기반으로 구축된 수어 데이터셋 연구)

  • Sim, Hohyun;Sung, Horyeol;Lee, Seungjae;Cho, Hyeonjoong
    • KIPS Transactions on Software and Data Engineering
    • /
    • v.11 no.8
    • /
    • pp.325-330
    • /
    • 2022
  • This paper conducts the collection and experiment of datasets for deep learning research on sign language such as sign language recognition, sign language translation, and sign language segmentation for Korean sign language. There exist difficulties for deep learning research of sign language. First, it is difficult to recognize sign languages since they contain multiple modalities including hand movements, hand directions, and facial expressions. Second, it is the absence of training data to conduct deep learning research. Currently, KETI dataset is the only known dataset for Korean sign language for deep learning. Sign language datasets for deep learning research are classified into two categories: Isolated sign language and Continuous sign language. Although several foreign sign language datasets have been collected over time. they are also insufficient for deep learning research of sign language. Therefore, we attempted to collect a large-scale Korean sign language dataset and evaluate it using a baseline model named TSPNet which has the performance of SOTA in the field of sign language translation. The collected dataset consists of a total of 11,402 image and text. Our experimental result with the baseline model using the dataset shows BLEU-4 score 3.63, which would be used as a basic performance of a baseline model for Korean sign language dataset. We hope that our experience of collecting Korean sign language dataset helps facilitate further research directions on Korean sign language.

RGB Channel Selection Technique for Efficient Image Segmentation (효율적인 이미지 분할을 위한 RGB 채널 선택 기법)

  • 김현종;박영배
    • Journal of KIISE:Software and Applications
    • /
    • v.31 no.10
    • /
    • pp.1332-1344
    • /
    • 2004
  • Upon development of information super-highway and multimedia-related technoiogies in recent years, more efficient technologies to transmit, store and retrieve the multimedia data are required. Among such technologies, firstly, it is common that the semantic-based image retrieval is annotated separately in order to give certain meanings to the image data and the low-level property information that include information about color, texture, and shape Despite the fact that the semantic-based information retrieval has been made by utilizing such vocabulary dictionary as the key words that given, however it brings about a problem that has not yet freed from the limit of the existing keyword-based text information retrieval. The second problem is that it reveals a decreased retrieval performance in the content-based image retrieval system, and is difficult to separate the object from the image that has complex background, and also is difficult to extract an area due to excessive division of those regions. Further, it is difficult to separate the objects from the image that possesses multiple objects in complex scene. To solve the problems, in this paper, I established a content-based retrieval system that can be processed in 5 different steps. The most critical process of those 5 steps is that among RGB images, the one that has the largest and the smallest background are to be extracted. Particularly. I propose the method that extracts the subject as well as the background by using an Image, which has the largest background. Also, to solve the second problem, I propose the method in which multiple objects are separated using RGB channel selection techniques having optimized the excessive division of area by utilizing Watermerge's threshold value with the object separation using the method of RGB channels separation. The tests proved that the methods proposed by me were superior to the existing methods in terms of retrieval performances insomuch as to replace those methods that developed for the purpose of retrieving those complex objects that used to be difficult to retrieve up until now.