• Title/Abstract/Keywords: text recognition

Structure Recognition Method in Various Table Types for Document Processing Automation

  • 이동석;권순각
    • 한국멀티미디어학회논문지 / Vol. 25, No. 5 / pp. 695-702 / 2022
  • In this paper, we propose a table structure recognition method for various table types, aimed at document processing automation. A table whose items are surrounded by ruled lines is analyzed by detecting horizontal and vertical lines to recognize the table structure. For a table whose items are separated by spaces, the table structure is recognized by analyzing the arrangement of row items. After the table structure is recognized, the areas of the table items are input into an OCR engine, and the character recognition result is output to a text file in a structured format such as CSV or JSON. In simulation results, the average accuracy of table item recognition is about 94%.
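
The space-separated case described in this abstract can be sketched roughly as follows. This is a minimal sketch, not the authors' method: it assumes an OCR step has already produced word boxes with hypothetical `x`, `y`, and `text` fields, and the tolerances `y_tol` and `x_tol` are illustrative values chosen for the example.

```python
import csv
import io


def group_rows(boxes, y_tol=5):
    """Group word boxes into rows; boxes within y_tol of a row's first box share it."""
    rows = []
    for box in sorted(boxes, key=lambda b: (b["y"], b["x"])):
        if rows and abs(box["y"] - rows[-1][0]["y"]) <= y_tol:
            rows[-1].append(box)
        else:
            rows.append([box])
    return [sorted(row, key=lambda b: b["x"]) for row in rows]


def column_starts(rows, x_tol=10):
    """Cluster the left edges of all items into column start positions."""
    starts = []
    for x in sorted(b["x"] for row in rows for b in row):
        if not starts or x - starts[-1] > x_tol:
            starts.append(x)
    return starts


def to_grid(rows, starts):
    """Place each item in the column whose start is nearest to its left edge."""
    grid = []
    for row in rows:
        cells = [""] * len(starts)
        for b in row:
            col = min(range(len(starts)), key=lambda i: abs(starts[i] - b["x"]))
            cells[col] = (cells[col] + " " + b["text"]).strip()
        grid.append(cells)
    return grid


def to_csv(grid):
    """Serialize the recognized grid as CSV text."""
    buf = io.StringIO()
    csv.writer(buf, lineterminator="\n").writerows(grid)
    return buf.getvalue()


# Hypothetical OCR output for a three-row table with one empty cell.
boxes = [
    {"x": 10, "y": 10, "text": "Item"}, {"x": 120, "y": 11, "text": "Qty"},
    {"x": 200, "y": 10, "text": "Price"},
    {"x": 12, "y": 40, "text": "Pen"}, {"x": 121, "y": 41, "text": "2"},
    {"x": 202, "y": 40, "text": "1.50"},
    {"x": 11, "y": 70, "text": "Ink"}, {"x": 201, "y": 71, "text": "4.00"},
]
rows = group_rows(boxes)
print(to_csv(to_grid(rows, column_starts(rows))))
```

Aligning items to shared column starts, rather than simply splitting each row on whitespace, is what lets the empty cell in the last row stay in the right column.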

Landscape Drawing as a Text: Practical and Theoretical Approach

  • 이광빈;조정송
    • 한국조경학회지 / Vol. 27, No. 1 / pp. 54-63 / 1999
  • Landscape drawing is the main medium of the landscape design process, much as language is in daily human life. Designers embed many intentions and meaningful words in the design process through landscape drawing. The common purpose of landscape drawing is to represent reality effectively, even though its visual forms and materiality vary. Representation in landscape drawing is metaphorical as well as visual and functional. The current tendency, however, is to use landscape drawing functionally, for visual representation, and to use it straightforwardly rather than metaphorically for the sake of clear communication. This view of landscape drawing results from the difficulty of accepting the symbolic aspect of the drawing. That difficulty keeps the use and interpretation of landscape drawing at a conventional level that follows visible factors. To address this difficulty, this study treats landscape drawing as a text that contains readable objects and symbolic words. It presents layered methods for reading a landscape drawing as a text: situational and contextual reading, iconological reading, and reading the subject of the drawing.

Text-Prompt Speaker Verification using Variable Threshold and Sequential Decision

  • 안성주;강선미;고한석
    • 음성과학 / Vol. 7, No. 4 / pp. 41-47 / 2000
  • This paper concerns an effective text-prompted speaker verification method for improving verification performance. While various speaker verification methods have already been developed, their effectiveness has not yet been formally proven in terms of achieving an acceptable performance level. Traditional methods have also focused primarily on a single prompted utterance for verification. This paper instead proposes a sequential decision method using a variable threshold, designed to handle two utterances for text-prompted speaker verification. Experimental results show that the proposed method outperforms a speaker verification scheme without the sequential decision by up to a factor of three. These results show that the proposed method is highly effective and achieves reliable performance suitable for practical applications.
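
One way to picture a two-utterance sequential decision with a variable threshold is sketched below. The abstract does not give the actual decision rule, so everything here is an assumption made for illustration: the function names, the score scale of 0 to 1, the threshold values, and the linear rule that raises the second-stage threshold when the first score was closer to rejection.

```python
def second_threshold(score1, accept_thr, reject_thr, thr_min, thr_max):
    """Hypothetical variable threshold: stricter when the first score was weaker."""
    frac = (accept_thr - score1) / (accept_thr - reject_thr)
    return thr_min + frac * (thr_max - thr_min)


def verify(score1, second_score_fn,
           accept_thr=0.8, reject_thr=0.4, thr_min=0.5, thr_max=0.7):
    """Decide on the first utterance if it is clear-cut, else prompt a second one.

    score1: verification score of the first prompted utterance (assumed 0..1).
    second_score_fn: callable that prompts a second utterance and returns its
    score; it is only called when the first score is inconclusive.
    """
    if score1 >= accept_thr:
        return True           # confident accept on the first utterance
    if score1 < reject_thr:
        return False          # confident reject on the first utterance
    thr2 = second_threshold(score1, accept_thr, reject_thr, thr_min, thr_max)
    return second_score_fn() >= thr2


# Illustrative scores only; these are not data from the paper.
print(verify(0.9, lambda: 0.0))    # decided on the first utterance
print(verify(0.6, lambda: 0.65))   # inconclusive first score, second utterance used
```

The point of the sequential structure is that the second prompt is requested only for borderline speakers, so clear-cut cases are not slowed down.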

Design of Image Generation System for DCGAN-Based Kids' Book Text

  • Cho, Jaehyeon;Moon, Nammee
    • Journal of Information Processing Systems / Vol. 16, No. 6 / pp. 1437-1446 / 2020
  • For the last few years, smart devices have come to occupy an essential place in children's lives by giving them access to a variety of language activities and books. Various studies are being conducted on using smart devices for education. Our study extracts images and text from kids' books with smart devices and matches the extracted images and text to create new images that are not represented in these books. The proposed system will enable the use of smart devices as educational media for children. A deep convolutional generative adversarial network (DCGAN) is used to generate new images. Training the DCGAN involves three steps. First, 1,164 ImageNet images under 11 titles are learned. Second, Tesseract, an optical character recognition engine, is used to extract images and text from kids' books, and the text is classified using a morpheme analyzer. Third, the classified word class is matched with the latent vector of the image. The trained DCGAN then creates an image associated with the text.

Development of Speech Recognition and Synthetic Application for the Hearing Impairment

  • 이원주;김우린;함혜원;윤상운
    • 한국컴퓨터정보학회: Conference Proceedings / 한국컴퓨터정보학회, Proceedings of the 62nd Summer Conference (2020), Vol. 28, No. 2 / pp. 129-130 / 2020
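  • This paper presents the implementation of an Android application system that supports communication for hearing-impaired people. Using the STT (Speech to Text) API of Google Cloud Platform, the application recognizes speech and outputs the content of a conversation as text. It also outputs text as speech through speech synthesis using TTS (Text to Speech). In addition, a foreground service uses the accelerometer sensor so that the application can be launched by shaking the smartphone two or three times, which improves the usability of the application.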
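
The shake-to-launch idea in this abstract can be illustrated with a small sketch. It is written in Python rather than as Android code, and it is not the authors' implementation: the sample format `(t, ax, ay, az)`, the threshold of 12 m/s², the 0.2 s debounce gap, and the 1.5 s window are all assumptions chosen for the example.

```python
import math

GRAVITY = 9.81  # m/s^2, subtracted so that a phone at rest reads close to zero


def shake_times(samples, threshold=12.0, min_gap=0.2):
    """Return timestamps of distinct shakes.

    samples: iterable of (t_seconds, ax, ay, az) accelerometer readings in m/s^2.
    A shake is a reading whose magnitude differs from gravity by more than
    `threshold`; readings closer than `min_gap` seconds to the previous shake
    are treated as the same shake.
    """
    times = []
    for t, ax, ay, az in samples:
        excess = abs(math.sqrt(ax * ax + ay * ay + az * az) - GRAVITY)
        if excess > threshold and (not times or t - times[-1] >= min_gap):
            times.append(t)
    return times


def should_launch(samples, min_shakes=2, window=1.5):
    """True if at least `min_shakes` shakes fall within `window` seconds."""
    times = shake_times(samples)
    return any(times[i + min_shakes - 1] - times[i] <= window
               for i in range(len(times) - min_shakes + 1))


# Hypothetical readings: two strong shakes 0.4 s apart.
readings = [
    (0.0, 0.0, 0.0, 9.81),
    (0.1, 25.0, 0.0, 9.81),
    (0.15, 24.0, 0.0, 9.81),   # same shake as the reading at 0.1 s (debounced)
    (0.5, -26.0, 3.0, 9.81),
    (0.9, 0.0, 0.0, 9.81),
]
print(should_launch(readings))
```

Requiring several shakes inside a short window is a simple way to avoid launching the application on a single accidental bump.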

Helping People with Visual Disability Using AI

  • Naif Al Otaibi;Tariq S Almurayziq
    • International Journal of Computer Science & Network Security / Vol. 24, No. 1 / pp. 205-208 / 2024
  • Artificial Intelligence (AI) technology has evolved rapidly in recent years and is used in everything from banking to email management to surgery. However, many features of the Internet are hard to use without sight, which excludes people with visual impairments, and AI can benefit people with disabilities. The main purpose of this study is to find ways to help people with visual impairments using AI technology. An application is built for visually impaired users: a special program installed on the mobile phone. For example, when a message arrives, the program notifies the user by voice, reads the sender's name, reads the message, and replies to it if necessary. The program uses a customized algorithm developed in Python to convert written text to voice and read it aloud, and to convert voice to written text when a visually impaired person wants to respond to a message. It then sends the response as a text message. The research should therefore lead to programs for people with visual impairments. The program makes mobile phones easier and more comfortable to use and makes daily life easier for people with visual impairments.

A Comparative Study on Residents' Acceptance of Offshore Wind Farms: Focusing on a Text Network Analysis of Interviews with Local Representatives in Gunsan and Jeju

  • 이상혁;박재필
    • 풍력에너지저널 / Vol. 13, No. 2 / pp. 23-30 / 2022
  • According to the "Offshore Wind Development Plan," large-scale, project-oriented supply expansion is necessary to achieve 12 GW by 2030, but implementation may be delayed by the difficulty of securing residents' acceptance. This study examined residents' acceptance by comparing the perceptions of local representatives in Gunsan and Jeju. To this end, six in-depth interviews were conducted and the entire contents of the interviews were converted to text files. Using text network analysis (Netminer 4.4), the cognitive structure of the local representatives was analyzed and compared. According to the analysis results, Maldo, Myeongdo and Bangchukdo in Gunsan are promoting offshore wind farms in the fishing license areas of the three islands in order to respond to opposition from other fishing village fraternities. In Dumo-ri, Jeju, important discussions and decisions related to offshore wind farms were made in meetings (offshore wind power promotion committee, village assembly).

Comparative Analysis of Speech Recognition Open API Error Rate

  • Kim, Juyoung;Yun, Dai Yeol;Kwon, Oh Seok;Moon, Seok-Jae;Hwang, Chi-gon
    • International journal of advanced smart convergence / Vol. 10, No. 2 / pp. 79-85 / 2021
  • Speech recognition technology is technology in which a computer interprets the language spoken by a person and converts its content into text data. It has recently been combined with artificial intelligence and is used in various products such as smartphones, set-top boxes, and smart TVs. Examples include Google Assistant, Google Home, Samsung's Bixby, Apple's Siri and SK's NUGU. Google and Daum Kakao offer free open APIs for speech recognition. This paper selects three APIs that ordinary users can use for free and compares their recognition rates in three types of experiment. The first measures the recognition rate of "numbers", the second measures the recognition rate of "Ga Na Da Hangul", and the last uses the complete sentences that the author uses most often. All experiments use a real voice as input through a computer microphone. Through the three experiments and their results, we hope that the general public will be able to identify differences in recognition rates among the currently available applications, which should help in selecting APIs suited to specific application purposes.
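
A comparison of this kind needs some way to score each API's transcript against the reference text. The abstract does not say which metric the authors used, so the sketch below assumes a common one, the character error rate based on Levenshtein edit distance. The API names and transcripts are hypothetical placeholders, not results from the paper.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance: minimum insertions, deletions, and substitutions."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        cur = [i]
        for j, h in enumerate(hyp, 1):
            cur.append(min(prev[j] + 1,              # deletion from ref
                           cur[j - 1] + 1,           # insertion into ref
                           prev[j - 1] + (r != h)))  # substitution or match
        prev = cur
    return prev[-1]


def char_error_rate(ref, hyp):
    """Edit distance divided by the reference length."""
    if not ref:
        raise ValueError("reference text must not be empty")
    return edit_distance(ref, hyp) / len(ref)


# Hypothetical transcripts of the spoken digits "12345" from three placeholder APIs.
reference = "12345"
transcripts = {"api_a": "12345", "api_b": "12845", "api_c": "1245"}
for name, hyp in transcripts.items():
    print(name, char_error_rate(reference, hyp))
```

Working at the character level rather than the word level suits the "numbers" and Hangul syllable tests, where each spoken unit is a single character.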