• Title/Summary/Keyword: embedded TTS

The implementation of database for high quality Embedded Text-to-speech system (고품질 내장형 음성합성 시스템을 위한 음성합성 DB구현)

  • Kwon, Oh-Il
    • Journal of the Institute of Electronics Engineers of Korea SP / v.42 no.4 s.304 / pp.103-110 / 2005
  • The speech database is one of the most important parts of a text-to-speech (TTS) system. In particular, an embedded TTS system needs a smaller database than a server TTS system, so compression and statistical reduction of the database are very important factors in an embedded TTS system. However, this compression and statistical reduction always cause a loss of quality in the synthesized speech. In this paper, we propose a method of constructing a database for a high-quality embedded TTS system and verify the quality of the synthesized speech with a MOS (Mean Opinion Score) test.
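
The MOS test mentioned in this abstract can be illustrated with a minimal sketch. It assumes the usual 1-to-5 absolute category rating scale and a normal-approximation confidence interval. The function name and the listener ratings are hypothetical and are not taken from the paper.

```python
from math import sqrt
from statistics import mean, stdev


def mean_opinion_score(ratings):
    """Return (MOS, 95% confidence half-width) for 1-5 listener ratings."""
    if not ratings:
        raise ValueError("at least one rating is required")
    if any(r < 1 or r > 5 for r in ratings):
        raise ValueError("ratings must lie in the range 1..5")
    mos = mean(ratings)
    # Normal approximation (z = 1.96); needs at least two ratings.
    half_width = 1.96 * stdev(ratings) / sqrt(len(ratings)) if len(ratings) > 1 else 0.0
    return mos, half_width


# Hypothetical ratings for speech synthesized from a full and a reduced database.
full_db_ratings = [5, 4, 4, 5, 4, 4, 3, 5]
reduced_db_ratings = [4, 4, 3, 4, 4, 3, 3, 4]

full_mos, full_ci = mean_opinion_score(full_db_ratings)
reduced_mos, reduced_ci = mean_opinion_score(reduced_db_ratings)
print(f"full DB:    MOS {full_mos:.2f} +/- {full_ci:.2f}")
print(f"reduced DB: MOS {reduced_mos:.2f} +/- {reduced_ci:.2f}")
```

Comparing the two scores, together with their intervals, is one way to quantify how much quality a reduced database gives up.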

Implementation of information access embedded system using two-dimensional bar code and TTS (2D 바코드와 TTS를 활용한 정보접근 임베디드 시스템 구현)

  • Lee, Jae-Kyun;Kim, Si-Woo;Lee, Chae-Wook;Lee, Dong-In
    • IEMEK Journal of Embedded Systems and Applications / v.1 no.2 / pp.31-36 / 2006
  • Because two-dimensional bar codes can collect data and information quickly, they are used and recognized as a useful tool in many industrial application fields. However, the information capacity of two-dimensional bar codes is still limited. Recently, the two-dimensional AD bar code (analog-digital code), which broadens the application range and overcomes the capacity limitation, has been developed. In this paper, we implement an effective system that transforms text information into voice using the two-dimensional AD bar code and TTS (Text To Speech). The voice information can be delivered to blind people by capturing the AD bar code printed on paper or in books.

Implementation of Wideband Waveform Interpolation Coder for TTS DB Compression (TTS DB 압축을 위한 광대역 파형보간 부호기 구현)

  • Yang, Hee-Sik;Hahn, Min-Soo
    • MALSORI / v.55 / pp.143-158 / 2005
  • An adequate compression algorithm is essential to achieving a high-quality embedded TTS system. In this paper, we propose a waveform interpolation coder for TTS corpus compression after investigating many speech coders. Unlike speech coders in communication systems, compression rate and quality are more important factors in TTS DB compression than other performance criteria. We therefore select the waveform interpolation algorithm because it provides good speech quality at a high compression rate, at the cost of complexity. The implemented coder has a bit rate of 6 kbps with a quality degradation of 0.47. This performance indicates that waveform interpolation is adequate for TTS DB compression, subject to some further study.
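
To put the 6 kbps figure in perspective, the following sketch works out the compression ratio against uncompressed PCM. The 16 kHz sampling rate and 16-bit sample width are assumptions (a common wideband format); the abstract does not state the source format. The function names are illustrative only.

```python
def pcm_bit_rate_kbps(sample_rate_hz, bits_per_sample):
    """Bit rate of mono uncompressed PCM in kbit/s."""
    return sample_rate_hz * bits_per_sample / 1000


def compression_ratio(source_kbps, coded_kbps):
    """How many times smaller the coded stream is than the source."""
    return source_kbps / coded_kbps


def stored_megabytes(duration_s, kbps):
    """Storage needed for duration_s seconds of audio at kbps kbit/s (1 MB = 10^6 bytes)."""
    return duration_s * kbps * 1000 / 8 / 1e6


# Assumed source format: 16 kHz, 16-bit mono PCM (not stated in the abstract).
source_kbps = pcm_bit_rate_kbps(16_000, 16)
coded_kbps = 6  # bit rate reported in the abstract

print(f"source bit rate: {source_kbps:.0f} kbps")
print(f"compression ratio: {compression_ratio(source_kbps, coded_kbps):.1f}:1")
print(f"one hour of PCM: {stored_megabytes(3600, source_kbps):.1f} MB")
print(f"one hour at 6 kbps: {stored_megabytes(3600, coded_kbps):.1f} MB")
```

Under these assumptions the source runs at 256 kbps, so a 6 kbps coder shrinks the corpus by a factor of roughly 42.7.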

Development of a TTS based Book Reader for the Blind (시각장애인용 독서 스탠드 개발)

  • Kim, Dae-Yoo;Kim, Ho-Sung;Kim, Ji-Sang;Kim, Soo-Cheol;Hwang, Kwang-Il
    • Proceedings of the Korea Information Processing Society Conference / 2011.11a / pp.422-424 / 2011
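  • Visually impaired people can read books through Braille books or audiobooks. However, the number of Braille books and audiobooks is limited, and producing them takes considerable time. As a result, the basic right of visually impaired people to read is being infringed. To solve this problem, we developed a reading stand for the visually impaired by applying image processing, OCR, and TTS techniques. To improve the character recognition rate, the proposed system corrects the distorted image and then additionally applies a fragment-blocking step, which raises the character recognition rate to 93% and improves practicality. The developed system is expected to help secure the reading rights of visually impaired people when installed in libraries, bookstores, and similar places.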

Implementation of Information Access Embedded System for the Blind People (시각 장애인을 위한 정보접근 임베디드 시스템의 구현)

  • Kim, Si-Woo;Lee, Jae-Kyun;Lee, Chae-Wook
    • The Journal of Korean Institute of Communications and Information Sciences / v.33 no.2C / pp.167-172 / 2008
  • Since a 2-dimensional (2D) bar code can retrieve data and information quickly, it is widely used and recognized as a useful tool for many industrial applications. However, the information capacity of the 2D bar code is still limited. Recently the analog-digital code (AD code), which has the largest storage capacity yet contained in a code, has been developed; by overcoming the limitation on data capacity, it expands the bar code's application range. In this paper, we present the AD code and implement an effective embedded system that transforms text information into voice using the 2D AD code and Text To Speech (TTS). This voice information can be conveyed to blind people as well as the elderly by capturing the AD code on paper or in books.

Design and Implementation of an Application Service Supporting Dual Mode (듀얼모드지원 응용 서비스 설계 및 구현)

  • Kim, Do-Hyung;Yun, Min-Hong;Kim, Sun-Ja;Lee, Cheol-Hoon
    • Proceedings of the Korean Information Science Society Conference / 2006.10d / pp.411-414 / 2006
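  • This paper describes the design and implementation of Mobile Teller, an application service based on embedded Linux. Mobile Teller simultaneously uses the CDMA network for voice communication and the WiBro network for data communication. When a sender enters text on a dual-mode terminal that supports CDMA and WiBro, the text is delivered over the WiBro network to a TTS server located on the Internet. The TTS server converts the received text into speech and transmits the speech data to the dual-mode terminal. Finally, the dual-mode terminal delivers the converted speech to the receiver over the CDMA network. Mobile Teller enables users to make voice calls even when the surroundings are noisy or the sender has a speech impairment.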

Speech Interactive Agent on Car Navigation System Using Embedded ASR/DSR/TTS

  • Lee, Heung-Kyu;Kwon, Oh-Il;Ko, Han-Seok
    • Speech Sciences / v.11 no.2 / pp.181-192 / 2004
  • This paper presents an efficient speech interactive agent that provides smooth car navigation and telematics services by employing embedded automatic speech recognition (ASR), distributed speech recognition (DSR), and text-to-speech (TTS) modules, all while enabling safe driving. A speech interactive agent is essentially a conversational tool that provides command and control functions to drivers, such as navigation tasks, audio/video manipulation, and e-commerce services, through natural voice/response interactions between user and interface. While the benefits of automatic speech recognition and speech synthesis are well known, the available hardware resources are often limited and the internal communication protocols are too complex to achieve real-time responses. As a result, performance degradation always exists in the embedded hardware system. To implement a speech interactive agent that accommodates user commands in real time, we propose to optimize the hardware-dependent architectural code for speed. In particular, we propose a composite solution that combines memory reconfiguration, efficient arithmetic operation conversion, and an effective out-of-vocabulary rejection algorithm, all made suitable for system operation under limited resources.

Design and Implementation of Embedded Linux-based Mobile Teller which supports CDMA and WiBro networks (듀얼모드 통신 지원 임베디드 리눅스 기반의 모바일 이야기꾼 설계 및 구현)

  • Kim, Do-Hyung;Yun, Min-Hong;Lee, Kyung-Hee;Lee, Cheol-Hoon
    • The KIPS Transactions:PartD / v.15D no.1 / pp.131-138 / 2008
  • This paper describes the implementation of Mobile Teller, the first application service based on embedded Linux that uses the WiBro network for data communication and the CDMA network for voice communication. With the recent appearance of WiBro service, dual-mode terminals that support two heterogeneous networks have become available. However, few applications have been developed that use these networks effectively to provide better service to users. In Mobile Teller, when a sender types text on a dual-mode terminal, the text is transmitted over the WiBro network to a TTS server located on the Internet. The TTS server then converts the text into speech and transmits the voice data to the dual-mode terminal. Finally, the dual-mode terminal sends the voice to the receiver over the CDMA network. Mobile Teller thus makes voice communication possible in a noisy environment or when a user has difficulty speaking.

Grapheme-to-Phoneme Conversion of Arabic Numeral Expressions for Embedded TTS Systems (임베디드 TTS 시스템을 위한 아라비안 숫자의 문자 변환)

  • Jung, Young-Im;Yoon, Ae-Sun;Kwon, Hyuk-Chul
    • Proceedings of the Korean Information Science Society Conference / 2005.07b / pp.442-444 / 2005
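  • This paper proposes a lightweight Arabic-numeral reading system for embedded systems that effectively resolves the ambiguity of Arabic numerals and accurately transcribes the pronunciation of numeral expressions into text. To this end, we classified seven ways of reading numerals (Readings of Arabic Numerals, RAN) and, to establish the transcription rules, analyzed (1) contextual features, (2) pattern features, and (3) heuristic information according to the meaning of the numeral expressions. To optimize the numeral transcription system for deployment on embedded systems, we (1) separated the morphological analysis module, (2) compressed the dictionary, and (3) removed personal and place names, which greatly reduced memory usage and processing time without a serious loss of accuracy. The lightweight mini-TAN achieves an accuracy of 96.9% to 98.3% and also handles numeral reading more accurately than existing commercial TTS systems.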
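
As a rough illustration of how a contextual feature can disambiguate a numeral's reading, the sketch below picks between native-Korean and Sino-Korean readings from the counter word that follows a single digit. It is a toy rule set written for this example, not the paper's RAN classification or its rules. The counter list, table names, and function name are all assumptions.

```python
import re

# Readings of the digits 1-9 (toy tables for illustration only).
SINO = {"1": "일", "2": "이", "3": "삼", "4": "사", "5": "오",
        "6": "육", "7": "칠", "8": "팔", "9": "구"}
NATIVE = {"1": "한", "2": "두", "3": "세", "4": "네", "5": "다섯",
          "6": "여섯", "7": "일곱", "8": "여덟", "9": "아홉"}

# Counters that take native-Korean numerals; the others below take Sino-Korean.
NATIVE_COUNTERS = {"시", "개", "명", "살", "마리"}

# A single digit (not part of a longer number) followed by a known counter.
PATTERN = re.compile(r"(?<!\d)([1-9])\s*(시|분|개|명|살|마리|원|년|층)")


def read_numerals(text):
    """Replace single-digit numeral expressions with their spoken form."""
    def spell(match):
        digit, counter = match.group(1), match.group(2)
        table = NATIVE if counter in NATIVE_COUNTERS else SINO
        return f"{table[digit]} {counter}"
    return PATTERN.sub(spell, text)


print(read_numerals("3시 5분"))
print(read_numerals("사과 2개"))
```

Multi-digit numbers and irregular forms are left untouched here. A real system would need many more rules, which is where the pattern features and heuristic information mentioned in the abstract come in.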

Decision-Tree-Based Markov Model for Phrase Break Prediction

  • Kim, Sang-Hun;Oh, Seung-Shin
    • ETRI Journal / v.29 no.4 / pp.527-529 / 2007
  • In this paper, a decision-tree-based Markov model for phrase break prediction is proposed. The model takes advantage of the decision tree's ability to classify on non-homogeneous features and of temporal break sequence modeling based on the Markov process. For this experiment, a text corpus tagged with parts of speech and three break strength levels is prepared and evaluated. A complex feature set, textual conditions, and prior knowledge are utilized, and chunking rules are applied to the search results. The proposed model shows an error reduction rate of about 11.6% compared to the conventional classification model.
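
One common way to combine a per-juncture classifier with a Markov model of the break sequence is Viterbi decoding, sketched below. This is an assumption about the general technique, not the paper's exact formulation. The decision tree is replaced by hard-coded posterior probabilities, and the label names and all probability values are hypothetical.

```python
import math

# Three break strength levels; the names are assumed for illustration.
LABELS = ["none", "minor", "major"]


def viterbi(posteriors, transitions, initial):
    """Return the most probable break sequence.

    posteriors:  list of dicts, label -> P(label | features) at each juncture,
                 standing in for decision-tree outputs.
    transitions: dict prev -> dict cur -> P(cur | prev).
    initial:     dict label -> P(label) at the first juncture.
    All probabilities must be strictly positive (log domain is used).
    """
    score = {l: math.log(initial[l]) + math.log(posteriors[0][l]) for l in LABELS}
    backpointers = []
    for post in posteriors[1:]:
        new_score, pointers = {}, {}
        for cur in LABELS:
            # Best previous label for reaching `cur` at this juncture.
            prev = max(LABELS, key=lambda p: score[p] + math.log(transitions[p][cur]))
            new_score[cur] = (score[prev] + math.log(transitions[prev][cur])
                              + math.log(post[cur]))
            pointers[cur] = prev
        score = new_score
        backpointers.append(pointers)
    # Trace back from the best final label.
    path = [max(LABELS, key=lambda l: score[l])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    return path[::-1]


# Hypothetical values: the classifier alone prefers "major" at both junctures,
# but two consecutive major breaks are made unlikely by the transition model.
demo_posteriors = [{"none": 0.35, "minor": 0.05, "major": 0.6}] * 2
demo_transitions = {
    "none":  {"none": 0.4, "minor": 0.2, "major": 0.4},
    "minor": {"none": 0.5, "minor": 0.1, "major": 0.4},
    "major": {"none": 0.8, "minor": 0.15, "major": 0.05},
}
demo_initial = {l: 1 / 3 for l in LABELS}

print(viterbi(demo_posteriors, demo_transitions, demo_initial))
```

With uniform transition probabilities the decoder reduces to picking the classifier's best label at each juncture. The sequence model only changes the result when the transitions favor some label pairs over others.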
