• Title/Summary/Keyword: TTS(Text-to-Speech)


Decision-Tree-Based Markov Model for Phrase Break Prediction

  • Kim, Sang-Hun;Oh, Seung-Shin
    • ETRI Journal / v.29 no.4 / pp.527-529 / 2007
  • In this paper, a decision-tree-based Markov model for phrase break prediction is proposed. The model takes advantage of the decision tree's ability to classify with non-homogeneous features and of temporal break-sequence modeling based on a Markov process. For this experiment, a text corpus tagged with parts of speech and three break strength levels is prepared and evaluated. A complex feature set, textual conditions, and prior knowledge are utilized, and chunking rules are applied to the search results. The proposed model shows an error reduction rate of about 11.6% compared to the conventional classification model.
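
One way to picture how per-boundary classification and Markov sequence modeling can be combined is a Viterbi search over break labels. The following is only a minimal sketch under stated assumptions, not the authors' model: the label names, the function `viterbi`, and all probability tables are hypothetical, and the per-boundary probabilities are assumed to come from some classifier (for example, decision-tree leaf distributions) and to be strictly positive.

```python
import math

# Hypothetical names for three break strength levels (an assumption;
# the abstract does not name the levels).
LABELS = ["no_break", "minor_break", "major_break"]


def viterbi(local_probs, trans_probs, init_probs):
    """Return the most probable label sequence over word boundaries.

    local_probs: list of dicts, one per boundary, mapping label -> P(label | features),
                 e.g. class probabilities from a decision-tree leaf (assumed input).
    trans_probs: dict mapping (previous_label, label) -> P(label | previous_label).
    init_probs:  dict mapping label -> P(label) at the first boundary.
    All probabilities must be strictly positive (e.g. smoothed), since logs are taken.
    """
    if not local_probs:
        return []

    # Log score of the best path ending in each label at the first boundary.
    scores = {
        label: math.log(init_probs[label]) + math.log(local_probs[0][label])
        for label in LABELS
    }
    backpointers = []

    for probs in local_probs[1:]:
        new_scores, pointers = {}, {}
        for cur in LABELS:
            # Best predecessor for `cur` under the first-order Markov assumption.
            best_prev = max(
                LABELS,
                key=lambda prev: scores[prev] + math.log(trans_probs[(prev, cur)]),
            )
            new_scores[cur] = (
                scores[best_prev]
                + math.log(trans_probs[(best_prev, cur)])
                + math.log(probs[cur])
            )
            pointers[cur] = best_prev
        scores = new_scores
        backpointers.append(pointers)

    # Trace the best path back from the highest-scoring final label.
    path = [max(LABELS, key=lambda label: scores[label])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path
```

The design point of such a sketch is that a transition table can overrule a locally preferred label, for instance when two strong breaks in a row are rare, which per-boundary classification alone cannot express.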


Design and Implementation of a Navigation System for Visually Impaired Persons (시각장애인을 위한 네비게이션 시스템 설계 및 구현)

  • Jang, Su-Min;Hwang, Dong-Gyo;Kang, Soo;Kim, Eun-Ju;Park, Jun-Ho;Jang, Ki-Hun;Yoo, Jae-Soo
    • The Journal of the Korea Contents Association / v.12 no.1 / pp.38-47 / 2012
  • In order to extend the activity range of visually impaired persons, we design and implement a navigation system that supports road information services and points of interest. The proposed navigation system consists of route creation modules and storage modules for visually impaired persons. In particular, the main interfaces of the navigation system are implemented using a TTS (Text-to-Speech) program for audio output and a braille module for tactile output. We also use the Google Maps APIs, which provide up-to-date map information for the navigation system.

Development of technology to improve information accessibility of information vulnerable class using crawling & clipping

  • Jeong, Seong-Bae;Kim, Kyung-Shin
    • Journal of the Korea Society of Computer and Information / v.23 no.2 / pp.99-107 / 2018
  • This study began with the public-interest goal of improving access to information for groups who are vulnerable because of visual difficulties, such as the elderly and the visually impaired. Server resources are minimized, and most of the system is implemented on the user's smartphone. In addition, using crawling and clipping, we implement a method that gathers the necessary information by collecting only pattern information from the various sites that hold the data the user needs, without the user having to visit those sites and without keeping the data on the server. In particular, we applied a TTS (Text-To-Speech) service delivered as a smartphone app and sought to develop a unified, customized information collection service based on a voice-based information collection method.

Hand-Gesture Dialing System for Safe Driving (안전성 확보를 위한 손동작 전화 다이얼링 시스템)

  • Jang, Won-Ang;Kim, Jun-Ho;Lee, Do Hoon;Kim, Min-Jung
    • Journal of the Korea Academia-Industrial cooperation Society / v.13 no.10 / pp.4801-4806 / 2012
  • Despite the improved convenience of advanced vehicles, problems of driving safety remain to be solved. Most traffic accidents result from careless driving, and operating the interface of a complicated multimedia device is a direct cause. With growing interest in smart automobiles, various approaches to safe driving have been studied. Current in-vehicle multimedia interfaces lack safety because a momentary shift of gaze reduces the driver's perception and ability to operate the vehicle. In this paper, we propose a dialing system for safe driving that controls dialing and dictionary search by hand gesture. The proposed system improves user convenience and safety in automobile operation by using intuitive gestures and TTS (Text to Speech).

Voice Recognition Speech Correction Application Using Big Data Analysis (빅데이터 분석을 활용한 음성 인식 스피치 교정 애플리케이션)

  • Kim, Han-Kyeol;Kim, Do-Woo;Lim, Sae-Myung;Hong, Du-Pyo
    • Proceedings of the Korea Information Processing Society Conference / 2019.10a / pp.533-535 / 2019
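  • Competition for jobs has been intensifying day by day as the youth unemployment rate rises. A growing number of companies are increasing the weight of interviews in the hiring process. In addition, large companies have introduced AI interviews to ensure the objectivity of interviews. The introduction of such interviews has increased the cost burden that job seekers bear in preparing for interviews. Recently, speech recognition and natural language processing have been actively developed in the AI field. This paper uses the speech technologies STT (Speech To Text) and TTS (Text To Speech) to convert recorded interview speech into text and the sentences of interview questions into speech. In addition, using natural language processing and the KNU sentiment lexicon, the application performs morphological analysis of the interview sentences and visualizes information on positive and negative words.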

Text-to-speech with linear spectrogram prediction for quality and speed improvement (음질 및 속도 향상을 위한 선형 스펙트로그램 활용 Text-to-speech)

  • Yoon, Hyebin
    • Phonetics and Speech Sciences / v.13 no.3 / pp.71-78 / 2021
  • Most neural-network-based speech synthesis models utilize neural vocoders to convert mel-scaled spectrograms into high-quality, human-like voices. However, neural vocoders combined with mel-scaled spectrogram prediction models demand considerable computer memory and time during the training phase and are subject to slow inference speeds in environments without a GPU. This problem does not arise in linear spectrogram prediction models, as they do not use neural vocoders, but these models suffer from low voice quality. As a solution, this paper proposes a Tacotron 2 and Transformer-based linear spectrogram prediction model that produces high-quality speech and does not use neural vocoders. Experiments suggest that this model can serve as the foundation of a high-quality text-to-speech model with fast inference speed.

A Study on the Voice Conversion with HMM-based Korean Speech Synthesis (HMM 기반의 한국어 음성합성에서 음색변환에 관한 연구)

  • Kim, Il-Hwan;Bae, Keun-Sung
    • MALSORI / v.68 / pp.65-74 / 2008
  • Statistical parametric speech synthesis based on hidden Markov models (HMMs) has grown in popularity over the last few years because, compared with corpus-based unit-concatenation text-to-speech (TTS) systems, it requires less memory and lower computational complexity and is well suited to embedded systems. It also has the advantage that the voice characteristics of the synthetic speech can be modified easily by transforming the HMM parameters appropriately. In this paper, we present experimental results of voice characteristics conversion using an HMM-based Korean speech synthesis system. The results show that voice characteristics can be converted using a few sentences uttered by a target speaker. Synthetic speech generated from models adapted with only ten sentences was very close to that from speaker-dependent models trained using 646 sentences.


Individual with mild autistic disorder Augmentative and alternative communication Training Program (경증 자폐성 장애인을 위한 보완·대체의사소통 지원프로그램)

  • Yoo, Sung-Ryeong;Park, Jeonghwa;Park, Suhyun
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2013.10a / pp.507-509 / 2013
  • This paper presents an Android-based augmentative and alternative communication (AAC) support program for individuals with mild autistic disorder. AAC is a communication system that helps people with disabilities who have difficulty with verbal and nonverbal communication. We introduce the communication styles associated with autistic disorder, a method for measuring which words people with language disabilities select and how often, and a basic method for training communication in people with autism. We developed an AAC system that uses a language representation method to encourage people with language disabilities to learn communication effectively. We used TTS technology so that users can express their thoughts by voice; hearing that voice themselves also helped accelerate their communication learning. In addition, a 'Painting function' lets users convey their intentions more broadly and efficiently. We also built a mechanism into the 'Painting function' that collects usage-frequency and learning-level data from users; with these data, we can analyze the proportion of conscious and unconscious communication in autism cases in order to help them.


Using of The Korean Language Voice Synthesis For E-Mail Manager System (한국어 음성 합성을 이용한 이메일 매니저)

  • Jo, Gyu-Sang;Lee, Young-Hoon;Lee, Byeong-Ryeol;Seo, Dae-Young
    • Annual Conference on Human and Language Technology / 2009.10a / pp.266-270 / 2009
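  • As the growth of IT-related industries broadens the user base, demand for IT use among people with disabilities is increasing. This paper discusses techniques for developing an email manager that lets visually impaired people use e-mail, the most basic application in the IT field, without inconvenience. TTS (Text-To-Speech: converting written text into speech and reading it aloud) and a voice keyboard (announcing each typed character by voice) allow visually impaired users to use e-mail without inconvenience, and the system's TTS algorithm was implemented in Java with reference to the Standard Korean Pronunciation rules.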


Implementation of Formant Speech Analysis/Synthesis System (포만트 분석/합성 시스템 구현)

  • Lee, Joon-Woo;Son, Ill-Kwon;Bae, Keuo-Sung
    • Speech Sciences / v.1 / pp.295-314 / 1997
  • In this study, we implement a flexible formant analysis and synthesis system. In the analysis part, a two-channel approach (i.e., speech and EGG signals) is investigated for accurate estimation of formant information. The EGG signal is used to extract the exact pitch information needed for pitch-synchronous LPC analysis and closed-phase LPC analysis. In the synthesis part, the Klatt formant synthesizer is modified so that the user can change synthesis parameters arbitrarily. Experimental results demonstrate the superiority of the two-channel analysis method over the one-channel (speech signal only) method in analysis as well as in synthesis. The implemented system is expected to be very helpful for studying the effects of synthesis parameters on the quality of synthetic speech and for developing a Korean text-to-speech (TTS) system based on formant synthesis.
