• Title/Summary/Keyword: text-to-speech


Designing and Evaluating an Audiobook Service Model on Android Platform for the Visually-Impaired (안드로이드 플랫폼 기반 시각장애인용 음성도서 서비스 모델 구축 및 평가)

  • Jang, Won-Hong;Oh, Sam-Gyun
    • Journal of the Korean Society for Information Management / v.32 no.2 / pp.221-236 / 2015
  • This paper describes the process and methodology followed in developing the Android-based LG Sangnam Audiobook service and evaluates its usefulness to the public. The methods included a survey of user needs, analysis of usage statistics, and user interviews. The study found that visually impaired users: 1) were greatly interested in and willing to use smartphones if there were no barriers of cost and access; 2) preferred downloads to streaming services; 3) did not mind performance differences between real and TTS (text-to-speech) voices; 4) showed marked differences in book preferences according to age; 5) made about 14,000 downloads in 2014; and 6) indicated bookmarking and moving between pages and tables of contents as the most important functions in using audiobooks.

Deconstructing the Genealogy of Orientalism in Term of a Supplement: "Truths are illusions which we have forgotten are illusions" (『오리엔탈리즘』 계보학의 해체론적 재해석: 진리란 그것이 환상임을 망각하고 있는 착각이다)

  • Choi, Su
    • English & American cultural studies / v.17 no.2 / pp.29-61 / 2017
  • Said's Orientalism criticized European representations of the Middle East by theorizing orientalism as a discourse. In this text, he explored and criticized the colonial forms of knowledge and language that distorted the image of the colonized. The discourse of orientalism draws its justification from a binary system originating with Plato, which Derrida rejects on the grounds that it always privileges one term over the other, that is, colonizer over colonized. Derrida names this traditional heritage of the Western binary system logocentrism, which regards logos (the Greek term for speech or reason) as the central principle of language and philosophy, whereas mythos derives its meaning from the logos on the basis of binary oppositions. Thus, according to logocentrism, the colonized is merely the defined, which receives its meaning from the definers, the colonizers. In this paper, using Derrida's (non)concept of the supplement, which means both to add on as a surplus and to make up for something missing as a mere extra, I propose an alternative interpretation of the critique of colonial representation by raising internal contradictions in the Platonic dichotomy between logos and mythos embedded in orientalism, the discourse of Western colonialism. I attempt to show that logos (colonizer) and mythos (colonized) are inseparable because they exist as supplements to each other. For this purpose, I demonstrate how the colonial binary system both constituted and was constituted in terms of language. Through this paper I reinterpret the colonial rationality of privileging 'logos' over 'mythos' by replacing the colonial binary system with the supplement.

Understanding the semantic change of Hangeul using word embedding (단어 임베딩 기법을 이용한 한글의 의미 변화 파악)

  • Sun, Hyunseok;Lee, Yung-Seop;Lim, Changwon
    • The Korean Journal of Applied Statistics / v.34 no.3 / pp.295-308 / 2021
  • In recent years, as many people post their interests on social media or store documents in digital form thanks to the development of internet and computer technologies, the amount of text data generated has exploded. Accordingly, the demand for technology that creates valuable information from large volumes of document data is also increasing. In this study, we use statistical techniques to investigate how the meanings of Korean words change over time, using public data from presidential speech records and newspaper articles. Using this, we present a strategy that can be utilized in the study of the synchronic change of Hangeul. The purpose of this study is to move beyond the theoretical study of Hangeul language phenomena, which has relied on the intuition of linguists or native speakers, to derive numerical values from public documents that anyone can use, and to explain the phenomenon of changes in the meaning of words.
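
One common way to put a number on such a change, not necessarily the one used in this paper, is to train a separate embedding for each period and compare a word's nearest neighbours across the two. The sketch below assumes this neighbour-overlap approach; the toy vectors, the English placeholder words, and the names `emb_old`, `emb_new`, and `neighbor_shift` are all hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def nearest_neighbors(emb, word, k):
    """Return the set of the k words most similar to `word` in one embedding."""
    scored = [(cosine(emb[word], vec), other)
              for other, vec in emb.items() if other != word]
    scored.sort(reverse=True)
    return {other for _, other in scored[:k]}

def neighbor_shift(emb_old, emb_new, word, k=2):
    """1 - Jaccard overlap of neighbour sets: 0.0 = unchanged, 1.0 = fully changed."""
    old = nearest_neighbors(emb_old, word, k)
    new = nearest_neighbors(emb_new, word, k)
    return 1.0 - len(old & new) / len(old | new)

# Toy two-dimensional embeddings (hypothetical data, not from the paper).
emb_old = {"mouse": [1.0, 0.0], "rat": [0.9, 0.1], "cat": [0.8, 0.3],
           "keyboard": [0.0, 1.0], "screen": [0.1, 0.9]}
emb_new = dict(emb_old, mouse=[0.1, 1.0])  # only "mouse" has moved

print(neighbor_shift(emb_old, emb_new, "mouse"))
```

Comparing neighbour sets avoids having to align the two vector spaces, which is why it is used here; it is coarser than comparing aligned vectors directly.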

Syntactic and Semantic Disambiguation for Interpretation of Numerals in the Information Retrieval (정보 검색을 위한 숫자의 해석에 관한 구문적.의미적 판별 기법)

  • Moon, Yoo-Jin
    • Journal of the Korea Society of Computer and Information / v.14 no.8 / pp.65-71 / 2009
  • Natural language processing is necessary to efficiently filter the tremendous amount of information produced in information retrieval on the World Wide Web. This paper proposes an algorithm for interpreting the meaning of numerals in text. The algorithm uses context-free grammars with the chart parsing technique, interprets affixes attached to the numerals, and is designed to disambiguate their meanings systematically with the support of n-gram based words. It is also designed to use POS (part-of-speech) taggers, to automatically recognize restriction conditions of trigram words, and to gradually disambiguate the meaning of the numerals. An experiment was performed on the proposed numeral interpretation system. The results showed that the frequency-proportional method recognized the numerals with 86.3% accuracy and the condition-proportional method with 82.8% accuracy.
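
The abstract does not spell out the algorithm, so the following is only a rough illustration of the general idea of picking a numeral's sense from the frequency of its trigram context, with an affix rule as a fallback. The sense labels, the counts, and the names `TRIGRAM_SENSES`, `AFFIX_SENSES`, and `interpret_numeral` are hypothetical, and no chart parsing or POS tagging is attempted.

```python
from collections import Counter

# Hypothetical counts: (word before, word after) a numeral -> observed senses.
TRIGRAM_SENSES = {
    ("in", "the"): Counter({"year": 8, "quantity": 1}),
    ("costs", "dollars"): Counter({"money": 5}),
}

# Hypothetical affix rules, used only when the context was never observed.
AFFIX_SENSES = {"%": "percentage", "kg": "weight", "$": "money"}

def interpret_numeral(left, token, right):
    """Guess the sense of a numeral token from its neighbours, then its affix."""
    counts = TRIGRAM_SENSES.get((left, right))
    if counts:
        # Frequency-based choice: the sense seen most often in this context.
        return counts.most_common(1)[0][0]
    for affix, sense in AFFIX_SENSES.items():
        if token.startswith(affix) or token.endswith(affix):
            return sense
    return "unknown"

print(interpret_numeral("in", "1999", "the"))
print(interpret_numeral("weighs", "5kg", "and"))
```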

Research on Generative AI for Korean Multi-Modal Montage App (한국형 멀티모달 몽타주 앱을 위한 생성형 AI 연구)

  • Lim, Jeounghyun;Cha, Kyung-Ae;Koh, Jaepil;Hong, Won-Kee
    • Journal of Service Research and Studies / v.14 no.1 / pp.13-26 / 2024
  • Multi-modal generation is the process of generating results based on a variety of information, such as text, images, and audio. With the rapid development of AI technology, a growing number of multi-modal systems synthesize different types of data to produce results. In this paper, we present an AI system that uses speech and text recognition to describe a person and generate a montage image. While existing montage generation technology is based on the appearance of Westerners, the montage generation system developed in this paper trains a model on Korean facial features. Therefore, it can create more accurate and effective Korean montage images from multi-modal voice and text input specific to Korean. Since the developed montage generation app can be used to produce a draft montage, it can dramatically reduce the manual labor of existing montage production personnel. For this purpose, we used persona-based virtual person montage data provided by the AI-Hub of the National Information Society Agency. AI-Hub is an AI integration platform that aims to provide a one-stop service by building the AI training data necessary for the development of AI technology and services. The image generation system was implemented using VQGAN, a deep learning model used to generate high-resolution images, and the KoDALLE model, a Korean-based image generation model. The trained AI model was confirmed to create a montage image of a face very similar to the one described by voice and text. To verify the practicality of the developed montage generation app, 10 testers used it, and more than 70% responded that they were satisfied. The montage generator can be used in various fields, such as criminal detection, to describe and visualize facial features.

Augmented Reality based Museum Guidance System Selective Viewing (증강현실을 이용한 선택적 가이드 시스템 -관람자의 관심에 따라 박물관 관람을 안내 하는 가이드 시스템)

  • Park, Joon-Suk;Lee, Dong-Hyun;Park, Jun
    • 한국HCI학회:학술대회논문집 / 2008.02a / pp.45-48 / 2008
  • Using these systems, additional information on the paintings and exhibits may be provided in the forms of text, image, speech, and video. However, at museums and exhibitions, many tourists are often interested in exhibits of some particular style, authors, or coteries. The proposed Augmented Reality based guidance system may guide the users to exhibits of their interest for selective viewing. The users may be informed of the location of the next exhibit of interest, as well as additional multimedia information on the exhibits of interest. Such information is shown on the Augmented Reality views of the user's display device. The proposed system is composed of an Ultra-Mobile PC (UMPC), an inertial tracker, and a camera. In the beginning, the user may select his/her preference on the exhibits from the menu, and then the system starts guiding by showing the relative orientation, distance, and visual cue to find the next exhibit. When the user finds and locates the matching visual cue within a matching box on the display screen, the system provides multimedia information on the exhibit. According to the preliminary user test, the proposed system is convenient and useful for navigating through large-scale exhibitions.


A Study On Intelligent Robot Control Based On Voice Recognition For Smart FA (스마트 FA를 위한 음성인식 지능로봇제어에 관한 연구)

  • Sim, H.S.;Kim, M.S.;Choi, M.H.;Bae, H.Y.;Kim, H.J.;Kim, D.B.;Han, S.H.
    • Journal of the Korean Society of Industry Convergence / v.21 no.2 / pp.87-93 / 2018
  • This study proposes a new approach to implementing intelligent robot control based on voice recognition for smart factory automation. Since humans usually communicate with each other by voice, it is very convenient if voice is used to command humanoid robots or other types of robot systems. Much research has been performed on voice recognition systems for this purpose. The Hidden Markov Model is a robust statistical methodology for efficient voice recognition in noisy environments. It has been tested in a wide range of applications. Prediction by Partial Matching, a prediction approach traditionally applied to text compression and coding, is a finite-context statistical modeling technique that can predict the next characters based on the context; it has shown great potential in developing novel solutions to several language modeling problems in speech recognition. The reliability of voice recognition was illustrated by experiments on a humanoid robot with 26 joints, with a view to application in the manufacturing process.
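
To make the finite-context idea concrete, here is a deliberately simplified sketch in the spirit of Prediction by Partial Matching: it counts which character follows each context of up to `max_order` characters and, when predicting, falls back from the longest known context to shorter ones. Full PPM blends orders using escape probabilities, which this sketch omits; the class name `BackoffPredictor` and the training string are hypothetical and unrelated to the paper's system.

```python
from collections import Counter, defaultdict

class BackoffPredictor:
    """Character predictor that backs off from long contexts to short ones."""

    def __init__(self, max_order=2):
        self.max_order = max_order
        # counts[context][next_char]; context has length 0..max_order.
        self.counts = defaultdict(Counter)

    def train(self, text):
        for i, ch in enumerate(text):
            for order in range(self.max_order + 1):
                if i - order >= 0:
                    self.counts[text[i - order:i]][ch] += 1

    def predict(self, history):
        """Return the most frequent next character, or None if untrained."""
        for order in range(min(self.max_order, len(history)), -1, -1):
            context = history[len(history) - order:]
            if context in self.counts:
                return self.counts[context].most_common(1)[0][0]
        return None

model = BackoffPredictor(max_order=2)
model.train("abracadabra")
print(model.predict("ab"))  # the order-2 context "ab" was seen in training
print(model.predict("zb"))  # unseen order-2 context, falls back to "b"
```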

Humanistic Imagination through the Case of Cultural Convergence Contents of Hwang Soon-won 「Sonagi」 (황순원 「소나기」의 문화융합 콘텐츠 사례를 통해 본 인문학적 상상력)

  • Lee, Nae-Kwan
    • Journal of the Korea Convergence Society / v.9 no.10 / pp.199-208 / 2018
  • This paper considers Hwang Soon-won's "Sonagi" from the aspect of literary imagination, examining how it has been reconstructed into cultural fusion content such as HD TV Literature, animation, movie, musical, and CF, and which parts changed from the original text. In the TV Literature Museum "Sonagi", new characters who did not exist in the original were created, namely Seok-yi, the boy's younger brother, and the girl's mother, and the composition was developed more precisely. In the animation, the main character's dialogue is presented as speech, which characteristically reveals the theme of the work more effectively. On the other hand, the heroine in the movie says, "I do not like the ending part of the showers." In the musical "Sonagi", about 2 tons of water was used to give the audience a greater sense of vividness and presence. In this way, the contents of the original work are transformed according to the characteristics of the medium in the various cultural fusion contents based on novels, and thus convey the unique imagination of the director to the audience.

Application of Korean Alphabet Domain-Names for Convenient Information Access in a Ubiquitous Information Network (유비쿼터스 정보네트워크에서의 편리한 정보액세스를 위한 한글 자음모음 도메인명의 응용)

  • Kim, Yung-Bok
    • The KIPS Transactions:PartC / v.12C no.7 s.103 / pp.1067-1074 / 2005
  • The mobile user interface has become important for fast and convenient access to information, especially in the ubiquitous computing environment. Among many new services in the mobile computing environment, a ubiquitous information networking service was studied using Korean alphabet (consonant or vowel) domain names, including Korean single-character domain names. As a user interface for information access, Korean single-character/alphabet domain names are more convenient than long English/Korean URL strings for retrieving and sending information on the wired Internet as well as the mobile Internet. We studied the convenience of Korean alphabet domain names with PCs as well as with mobile phones. We introduce the implementation and application of a ubiquitous information portal, which provides Text to Speech (TTS) functionality and is accessible with Korean single-character/alphabet domain names.

Expiration Date Notification System Based on YOLO and OCR algorithms for Visually Impaired Person (YOLO와 OCR 알고리즘에 기반한 시각 장애우를 위한 유통기한 알림 시스템)

  • Kim, Min-Soo;Moon, Mi-Kyung;Han, Chang-Hee
    • The Journal of the Korea Institute of Electronic Communication Sciences / v.16 no.6 / pp.1329-1338 / 2021
  • Apart from Braille, there are few effective methods to help visually impaired people find out the expiration date of products. In this study, we developed an expiration date notification system based on YOLO and OCR for visually impaired people. With our system, visually impaired people can learn the expiration date of a specific product quickly and accurately, without the help of a caregiver. The proposed system works in four steps: (1) identification of a target product by scanning its barcode; (2) segmentation of an image area with the expiration date using YOLO; (3) classification of the expiration date by OCR; (4) notification of the expiration date by TTS. Our system showed an average classification accuracy of about 86.00% when blindfolded subjects used the proposed system in real time. This result indicates that the proposed system can potentially be used by visually impaired people.
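
The hand-off between steps (3) and (4) above can be sketched as follows: take the raw text an OCR stage might return, extract a date, and build the sentence a TTS engine would speak. This is an illustration under assumptions, not the authors' implementation; the date format (year, month, day with a separator), the message wording, and the names `parse_expiration_date` and `build_announcement` are hypothetical, and the YOLO, OCR, and TTS stages themselves are not included.

```python
import re
from datetime import date

# Assumed format: four-digit year, then month and day, separated by . - / or space.
DATE_PATTERN = re.compile(r"(\d{4})[.\-/ ](\d{1,2})[.\-/ ](\d{1,2})")

def parse_expiration_date(ocr_text):
    """Return the first valid date found in the OCR text, or None."""
    match = DATE_PATTERN.search(ocr_text)
    if not match:
        return None
    year, month, day = (int(group) for group in match.groups())
    try:
        return date(year, month, day)
    except ValueError:  # e.g. month 13 caused by an OCR misread
        return None

def build_announcement(product_name, ocr_text, today):
    """Compose the sentence that would be passed to a TTS engine."""
    expires = parse_expiration_date(ocr_text)
    if expires is None:
        return f"Could not read the expiration date of {product_name}."
    days_left = (expires - today).days
    if days_left < 0:
        return f"{product_name} expired on {expires.isoformat()}."
    return f"{product_name} expires on {expires.isoformat()}, in {days_left} days."

print(build_announcement("Milk", "EXP 2021.12.31", date(2021, 12, 1)))
```

Passing `today` in explicitly, rather than calling `date.today()` inside the function, keeps the message deterministic and easy to test.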