• Title/Summary/Keyword: TTS system


Design and Implementation of an Application Service Supporting Dual Mode (듀얼모드지원 응용 서비스 설계 및 구현)

  • Kim, Do-Hyung;Yun, Min-Hong;Kim, Sun-Ja;Lee, Cheol-Hoon
    • Proceedings of the Korean Information Science Society Conference
    • /
    • 2006.10d
    • /
    • pp.411-414
    • /
    • 2006
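  • This paper describes the design and implementation of Mobile Storyteller, an application service based on embedded Linux. Mobile Storyteller uses a CDMA network for voice communication and a WiBro network for data communication at the same time. When the sender enters text on a dual-mode terminal that supports CDMA and WiBro, the text is delivered over the WiBro network to a TTS server located on the Internet. The TTS server converts the received text into speech and sends the speech data back to the dual-mode terminal. Finally, the dual-mode terminal delivers the converted speech to the recipient over the CDMA network. Mobile Storyteller lets users make voice calls even when the surroundings are noisy or the sender has a speech impairment.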


A new approach technique on Speech-to-Speech Translation (신호의 복원된 위상 공간을 이용한 오디오 상황 인지)

  • Le, Thanh Hien;Lee, Sung-young;Lee, Young-Koo
    • Proceedings of the Korea Information Processing Society Conference
    • /
    • 2009.11a
    • /
    • pp.239-240
    • /
    • 2009
  • We live in a flat world in which globalization fosters communication, travel, and trade among more than 150 countries and thousands of languages. To surmount the barriers among these languages, translation is required; Speech-to-Speech translation automates the process. Thanks to recent advances in Automatic Speech Recognition (ASR), Machine Translation (MT), and Text-to-Speech (TTS), one can now use a system to translate speech in a source language into speech in a target language, and vice versa, in an affordable manner. In this three-phase process, the source speech is first transcribed into text (or a set of texts) in the source language (ASR), and the source text is then translated into the target text (MT). Finally, the target speech is synthesized from the target text (TTS).
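
The three-phase process described in this abstract can be sketched as a simple function composition. This is only an illustrative sketch: the stage signatures (`Recognizer`, `Translator`, `Synthesizer`) and the toy stand-ins are assumptions introduced here, not the authors' system, and real ASR, MT, and TTS engines would replace the stubs.

```python
from typing import Callable

# Hypothetical stage signatures; none of these names come from the paper.
Recognizer = Callable[[bytes], str]    # ASR: source speech -> source text
Translator = Callable[[str], str]      # MT:  source text   -> target text
Synthesizer = Callable[[str], bytes]   # TTS: target text   -> target speech


def speech_to_speech(audio: bytes, asr: Recognizer, mt: Translator,
                     tts: Synthesizer) -> bytes:
    """Chain the three phases in order: ASR, then MT, then TTS."""
    source_text = asr(audio)
    target_text = mt(source_text)
    return tts(target_text)


# Toy stand-ins so the sketch runs without any speech library.
def toy_asr(audio: bytes) -> str:
    # Pretend the audio bytes already contain the transcript.
    return audio.decode("utf-8")


def toy_mt(text: str) -> str:
    # Made-up two-word lexicon; unknown words pass through unchanged.
    lexicon = {"hello": "annyeong", "world": "segye"}
    return " ".join(lexicon.get(word, word) for word in text.split())


def toy_tts(text: str) -> bytes:
    # Pretend the encoded text is the synthesized waveform.
    return text.encode("utf-8")


result = speech_to_speech(b"hello world", toy_asr, toy_mt, toy_tts)
```

Keeping each phase behind a plain callable means any one stage can be swapped without touching the other two, which mirrors the modular ASR/MT/TTS split the abstract describes.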

Currency Recognition System for Blind People (시각장애인을 위한 화폐 인식 시스템)

  • Dong-Jun Yoo;Sung-Jun Kim;Jun-Yeong Lee;Hyeon-Su Kang;Jun-Ho Son;Se-Jin Oh
    • Proceedings of the Korean Society of Computer Information Conference
    • /
    • 2024.01a
    • /
    • pp.257-258
    • /
    • 2024
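  • At present, visually impaired people who use cash have no way to check the denomination of a banknote, so they face inconvenience and are often at risk of financial fraud. The Bank of Korea issues banknotes with braille markings to prevent such incidents, but 91% of visually impaired people cannot identify them and still experience considerable inconvenience. In this paper, we developed a system that recognizes currency using deep learning and announces the value of the banknote by sound using TTS technology. For banknote recognition, we collected the data ourselves and used a weights file trained with the YOLOv5 algorithm. With this system, visually impaired people can use cash more safely and avoid financial problems.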
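
As a rough illustration of the step between recognition and speech output, the sketch below turns a list of detector results into the sentence that a TTS engine would read aloud. The `Detection` type, the label format (denomination as a digit string), the confidence threshold, and the wording of the sentence are all assumptions made for this example; the abstract does not specify them, and no YOLOv5 or TTS library is called here.

```python
from typing import NamedTuple, Optional, Sequence


class Detection(NamedTuple):
    label: str         # hypothetical class name: denomination in won, e.g. "10000"
    confidence: float  # detector score in [0, 1]


def announcement(detections: Sequence[Detection],
                 min_confidence: float = 0.5) -> Optional[str]:
    """Return the sentence to speak, or None if no detection is confident enough."""
    confident = [d for d in detections if d.confidence >= min_confidence]
    if not confident:
        return None
    # Announce only the single most confident banknote.
    best = max(confident, key=lambda d: d.confidence)
    return f"This is a {int(best.label):,} won banknote."


spoken = announcement([Detection("1000", 0.40), Detection("10000", 0.90)])
```

Returning `None` below the threshold lets the caller stay silent rather than announce a doubtful value, which seems the safer default for this use case.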


Speech Interactive Agent on Car Navigation System Using Embedded ASR/DSR/TTS

  • Lee, Heung-Kyu;Kwon, Oh-Il;Ko, Han-Seok
    • Speech Sciences
    • /
    • v.11 no.2
    • /
    • pp.181-192
    • /
    • 2004
  • This paper presents an efficient speech interactive agent rendering smooth car navigation and Telematics services, by employing embedded automatic speech recognition (ASR), distributed speech recognition (DSR) and text-to-speech (TTS) modules, all while enabling safe driving. A speech interactive agent is essentially a conversational tool providing command and control functions to drivers, such as enabling navigation tasks, audio/video manipulation, and E-commerce services through natural voice/response interactions between user and interface. While the benefits of automatic speech recognition and speech synthesis have become well known, the hardware resources involved are often limited and internal communication protocols are too complex to achieve real-time responses. As a result, performance degradation always exists in the embedded H/W system. To implement a speech interactive agent that accommodates the demands of user commands in real time, we propose to optimize the hardware-dependent architectural codes for speed-up. In particular, we propose a composite solution through memory reconfiguration and efficient arithmetic operation conversion, as well as an effective out-of-vocabulary rejection algorithm, all made suitable for system operation under limited resources.


A Study On Male-To-Female Voice Conversion (남녀 음성 변환 기술연구)

  • Choi Jung-Kyu;Kim Jae-Min;Han Min-Su
    • Proceedings of the Acoustical Society of Korea Conference
    • /
    • spring
    • /
    • pp.115-118
    • /
    • 2000
  • Voice conversion technology is essential for TTS systems because the construction of a speech database takes much effort. In this paper, male-to-female voice conversion technology in a Korean LPC TTS system has been studied. In general, the parameters for voice color conversion are categorized into acoustic and prosodic parameters. This paper adopts LSF (Line Spectral Frequency) as the acoustic parameter, and pitch period and duration as the prosodic parameters. In this paper, the pitch period is halved, the duration is shortened by 25%, and the LSFs are shifted linearly for the voice conversion. The synthesized speech is then post-filtered by a bandpass filter. The proposed algorithm is simpler than other algorithms, for example, VQ and Neural Net based methods, and it does not even need to estimate formant information. The MOS (Mean Opinion Score) test shows 2.25 for naturalness and 3.2 for female closeness. In conclusion, by using the proposed algorithm, a male-to-female voice conversion system can be simply implemented with relatively successful results.
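
The parameter changes named in this abstract can be written out as a small worked example. This is a sketch under stated assumptions, not the authors' implementation: the `Frame` container, the interpretation of "shifted linearly" as adding a constant offset to every LSF, the default `lsf_shift` value, and the clipping to the range 0 to pi are all hypothetical. The bandpass post-filter is omitted.

```python
import math
from dataclasses import dataclass
from typing import List


@dataclass
class Frame:
    pitch_period_ms: float  # prosodic parameter
    duration_ms: float      # prosodic parameter
    lsf: List[float]        # acoustic parameter: LSFs in radians, ascending


def male_to_female(frame: Frame, lsf_shift: float = 0.1) -> Frame:
    """Apply the three modifications listed in the abstract to one frame."""
    # Assumed reading of "shifted linearly": add a constant offset, kept in [0, pi].
    shifted = [min(max(w + lsf_shift, 0.0), math.pi) for w in frame.lsf]
    return Frame(
        pitch_period_ms=frame.pitch_period_ms * 0.5,  # pitch period halved
        duration_ms=frame.duration_ms * 0.75,         # duration shortened by 25%
        lsf=shifted,
    )


converted = male_to_female(Frame(8.0, 100.0, [0.3, 1.0, 2.0]), lsf_shift=0.1)
```

Halving the pitch period doubles the fundamental frequency: the 8 ms period in the example corresponds to 125 Hz, and the converted 4 ms period corresponds to 250 Hz.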


Release Characteristics to Vitamin $B_{2}$ of Chitosan Ointments In vitro (In vitro에서 키토산 연고의 비타민 $B_{2}$ 방출 특성)

  • Oh, Se-Young;Hwang, Sung-Kwy;Hwang, Yong-Hyun
    • Journal of the Korean Applied Science and Technology
    • /
    • v.17 no.1
    • /
    • pp.43-48
    • /
    • 2000
  • Drug delivery systems (DDS) are applied in various fields, such as medicine, cosmetics, agriculture and daily necessities. Among these application fields, DDS is often used as a method of delivering drugs through the epidermis. We investigated the characteristics of a transdermal therapeutic system (TTS) and its skin permeability when DDS is applied. Chitosan was selected as the material of the TTS. We investigated the permeation of chitosan ointment containing drug through rat skin using a horizontal membrane cell model. Permeation properties of the materials were investigated in vitro for a water-soluble drug, riboflavin. We used glycerin, PEG 600 and oleic acid as enhancers. Since the dermis has a higher water content (hydration) than the stratum corneum, the skin permeation rate at steady state was highly influenced when glycerin was used with the water-soluble drug. The permeation rate of ointment containing both enhancer and drug was found to be faster than that of ointment containing the water-soluble drug only. These results showed that the skin permeation rate of the drug across the composite was mainly dependent on the properties of the ointment base and the drug. Proper selection of polymeric materials that resemble and enhance the properties of the delivered drug was found to be important in controlling the skin permeation rate.

AP, IP Prediction For Corpus-based Korean Text-To-Speech (코퍼스 방식 음성합성에서의 개선된 운율구 경계 예측)

  • Kwon, O-Hil;Hong, Mun-Ki;Kang, Sun-Mee;Shin, Ji-Young
    • Speech Sciences
    • /
    • v.9 no.3
    • /
    • pp.25-34
    • /
    • 2002
  • One of the most important factors in the performance of a Korean text-to-speech system is the prediction of accentual phrase (AP) and intonational phrase (IP) boundaries. Previous prediction methods reach only 75-85%, which is not adequate for practical and commercial systems, so more accurate prediction is needed. In this study, we propose a simple and more accurate method for predicting AP and IP boundaries.


Context-adaptive Smoothing for Speech Synthesis (음성 합성기를 위한 문맥 적응 스무딩 필터의 구현)

  • 이기승;김정수;이재원
    • The Journal of the Acoustical Society of Korea
    • /
    • v.21 no.3
    • /
    • pp.285-292
    • /
    • 2002
  • One of the problems that should be solved in Text-To-Speech (TTS) is discontinuities at unit-joining points. To cope with this problem, a smoothing method using a low-pass filter is employed in this paper. In the proposed smoothing method, a filter coefficient that controls the amount of smoothing is determined according to the context information to be synthesized. This method efficiently reduces both discontinuities at unit-joining points and artifacts caused by undesired smoothing. The amount of smoothing is determined from the discontinuities around unit-joining points in the current synthesized speech and the discontinuities predicted from context. The discontinuity predictor is implemented by a CART that has context feature variables. To evaluate the performance of the proposed method, a corpus-based concatenative TTS was used as a baseline system. More than 60% of listeners judged that the quality of the speech synthesized with the proposed smoothing is superior to that of non-smoothed synthesized speech in both naturalness and intelligibility.
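
The idea of letting a coefficient control how much a low-pass filter smooths a join can be sketched as follows. The abstract does not give the filter form or the rule that maps discontinuities to a coefficient, so both are assumptions here: a first-order recursive low-pass filter, and a hypothetical rule that smooths only the part of the observed discontinuity that exceeds the one predicted from context. The CART predictor itself is not modeled; its output is passed in as a number.

```python
from typing import List


def smoothing_coefficient(observed: float, predicted: float) -> float:
    """Hypothetical rule: 0 means no smoothing, values near 1 mean heavy smoothing."""
    if observed <= 0.0:
        return 0.0
    # Only the discontinuity beyond what context predicts is treated as an artifact.
    excess = max(observed - predicted, 0.0)
    return min(excess / observed, 1.0)


def smooth(track: List[float], alpha: float) -> List[float]:
    """First-order low-pass filter: y[n] = alpha * y[n-1] + (1 - alpha) * x[n]."""
    if not track:
        return []
    out = [track[0]]
    for x in track[1:]:
        out.append(alpha * out[-1] + (1.0 - alpha) * x)
    return out


# A parameter track with a jump of 4.0 at the unit-joining point.
track = [0.0, 0.0, 4.0, 4.0]
untouched = smooth(track, smoothing_coefficient(observed=4.0, predicted=4.0))
softened = smooth(track, 0.5)
```

When the context predicts a jump as large as the observed one, the coefficient is 0 and the track passes through unchanged, which reflects the abstract's point about avoiding artifacts from undesired smoothing.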

Implementation of Yoga Posture Training Application Using Google ML Kit (Google ML Kit를 이용한 요가 자세 훈련 애플리케이션 구현)

  • Kim, Hyoung Min;Yoon, Jong Hyeon;Park, Su Hyun;Yu, Yun Seop
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference
    • /
    • 2022.10a
    • /
    • pp.178-180
    • /
    • 2022
  • This paper introduces an application that lets users train yoga postures based on landmarks of yoga instructors' postures obtained from the Google Firebase ML Kit. Using the ML Kit, the user's posture is classified and landmarks corresponding to each joint are obtained. The accuracy reference value for a yoga posture is set through the angles formed by the joints of the obtained landmarks. The reference landmarks for the postures of professional yoga instructors are compared with the landmarks for the user's pose obtained through the ML Kit. According to the accuracy reference value, information on incorrect and correct motion is provided to the user through Text-to-Speech (TTS). Users are managed with Firebase, and the paper also describes a system that displays the amount of exercise through a counter and timer when the user performs an exercise that meets the accuracy reference value.
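
The comparison of joint angles against a reference can be illustrated with a short geometric sketch. It does not call the ML Kit; landmarks are assumed to arrive as 2D `(x, y)` points, and the 15-degree tolerance and the feedback wording are hypothetical values chosen for the example, not taken from the paper.

```python
import math
from typing import Tuple

Point = Tuple[float, float]


def joint_angle(a: Point, b: Point, c: Point) -> float:
    """Angle in degrees at joint b, formed by the segments b->a and b->c."""
    raw = math.degrees(
        math.atan2(c[1] - b[1], c[0] - b[0]) - math.atan2(a[1] - b[1], a[0] - b[0])
    )
    angle = abs(raw)
    # Fold reflex angles so the result is always in [0, 180].
    return 360.0 - angle if angle > 180.0 else angle


def feedback(user_angle: float, reference_angle: float,
             tolerance_deg: float = 15.0) -> str:
    """Text that a TTS engine could speak; the wording is made up for this sketch."""
    if abs(user_angle - reference_angle) <= tolerance_deg:
        return "Correct posture."
    return "Please adjust your posture."


# Shoulder, elbow, wrist landmarks forming a right angle at the elbow.
elbow = joint_angle((1.0, 0.0), (0.0, 0.0), (0.0, 1.0))
message = feedback(elbow, reference_angle=90.0)
```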
