• Title/Summary/Keyword: SpeechWeb

A Study on the Sound Effect for Improving Customer's Speech Recognition in the TTS-based Shop Music Broadcasting Service (TTS를 이용한 매장음원방송에서 고객의 인지도 향상을 위한 음향효과 연구)

  • Kang, Sun-Mee;Kim, Hyun-Deuc;Chang, Moon-Soo
    • Phonetics and Speech Sciences / v.1 no.4 / pp.105-109 / 2009
  • This paper describes a method for delivering clear voice announcements using TTS (Text-To-Speech) technology in a shop music broadcasting service. Offering a high-quality TTS sound service for each shop is expensive. According to a report on architectural acoustics, room acoustic indexes such as reverberation time and early decay time are closely connected with the subjective perception of acoustics. Based on this result, applying a sound effect to speech files produced by TTS should help customers recognize voice announcements better. A listening comprehension test showed improvement on almost all parameters when a reverb effect was applied to the TTS sound.
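The abstract does not say which reverb was applied to the TTS output. As a minimal sketch only, assuming the simplest possible reverb (a single feedback comb filter), the idea of adding decaying echoes to a speech signal can be illustrated as follows. The function name `comb_reverb` and the parameters `delay` and `decay` are hypothetical and are not taken from the paper.

```python
def comb_reverb(samples, delay, decay):
    """Feedback comb filter: y[n] = x[n] + decay * y[n - delay].

    samples: list of float audio samples (assumed mono).
    delay:   echo spacing in samples (assumed > 0).
    decay:   echo gain, 0 <= decay < 1 so the echoes die out.
    """
    out = []
    for n, x in enumerate(samples):
        # Add the delayed, attenuated output once enough samples exist.
        echo = decay * out[n - delay] if n >= delay else 0.0
        out.append(x + echo)
    return out


# A unit impulse shows the decaying echo train the filter produces.
impulse = [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
echoes = comb_reverb(impulse, delay=2, decay=0.5)
```

A larger `decay` gives a longer tail, loosely analogous to a longer reverberation time; a realistic reverb would combine several such filters.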

'Hanmal' Korean Language Diphone Database for Speech Synthesis

  • Chung, Hyun-Song
    • Speech Sciences / v.12 no.1 / pp.55-63 / 2005
  • This paper introduces 'Hanmal', a Korean-language diphone database for speech synthesis that has been publicly available on the MBROLA web site since 1999 but has never been properly published in a journal. The diphone database is compatible with the MBROLA programme for high-quality multilingual speech synthesis. The paper discusses the usefulness of the database and describes its phonetic and phonological structure, showing the process of creating the text corpus. A machine-readable Korean SAMPA convention for the control data input to the MBROLA application is also suggested. Diphone concatenation and prosody manipulation are performed using the MBR-PSOLA algorithm. A set of segment duration models can be applied to the diphone synthesis of Korean.

Integrating Pronunciation into a Classroom and on the Web Courseware

  • Kim, He-Kyung
    • MALSORI / no.41 / pp.49-59 / 2001
  • The aim of this paper is to suggest possible methods of integrating pronunciation teaching into a typical communicative classroom and, consequently, into English learning web courseware. It is hoped that this paper will prompt teachers to understand the current role of pronunciation in communicative English programs, and that web technology can help students improve their pronunciation, which in turn will develop their speaking and listening skills. The paper also suggests the need for a database of visualized communicative expressions.

VoiceXML Dialog System Based on RSS for Contents Syndication (콘텐츠 배급을 위한 RSS 기반의 VoiceXML 다이얼로그 시스템)

  • Kwon, Hyeong-Joon;Kim, Jung-Hyun;Lee, Hyon-Gu;Hong, Kwang-Seok
    • The KIPS Transactions:PartB / v.14B no.1 s.111 / pp.51-58 / 2007
  • This paper proposes a prototype dialog system that combines VXML (VoiceXML), the W3C standard XML format for specifying interactive voice dialogues between a human and a computer, with RSS (RDF Site Summary or Really Simple Syndication), a representative semantic web technology for syndicating and subscribing to updated web content. The merits of the proposed system are as follows: 1) It is a new method that recognizes spoken input over wired and wireless telephone networks and delivers content to the user via STT (Speech-to-Text) and TTS (Text-to-Speech), instead of the traditional web-only method. 2) It retains the advantages of RSS: subscribed, updated content is converted to VXML without modifying the existing way the RSS service is provided. 3) For users, it reduces time and space restrictions when searching for content provided via RSS, because it uses wired and wireless telephone networks rather than an internet environment. 4) For information providers, no special component is needed to syndicate the newest content using speech recognition and synthesis technology. We implemented a news service system using VXML and RSS to evaluate the performance of the proposed system. In the experiments, we measured the response time and the speech recognition rate when subscribing to and searching real content, and confirmed that the proposed system can deliver content provided through RSS feeds.
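The abstract does not give the conversion rules the authors used. The following is only a rough sketch of the general idea, assuming an RSS 2.0 feed whose item titles and descriptions are read out as VoiceXML prompts. The function name `rss_to_vxml`, the form id `headlines`, and the sample feed are illustrative assumptions, not details from the paper.

```python
import xml.etree.ElementTree as ET

VXML_NS = "http://www.w3.org/2001/vxml"

# Hypothetical sample feed used only for illustration.
SAMPLE_RSS = """<rss version="2.0">
  <channel>
    <title>Example News</title>
    <item><title>First headline</title><description>First summary.</description></item>
    <item><title>Second headline</title><description>Second summary.</description></item>
  </channel>
</rss>"""


def rss_to_vxml(rss_text: str) -> str:
    """Turn each RSS item into one VoiceXML <prompt> spoken by TTS."""
    channel = ET.fromstring(rss_text).find("channel")
    vxml = ET.Element("vxml", {"version": "2.1", "xmlns": VXML_NS})
    form = ET.SubElement(vxml, "form", {"id": "headlines"})
    block = ET.SubElement(form, "block")
    for item in channel.findall("item"):
        title = item.findtext("title", default="")
        summary = item.findtext("description", default="")
        prompt = ET.SubElement(block, "prompt")
        prompt.text = f"{title}. {summary}"
    return ET.tostring(vxml, encoding="unicode")


vxml_document = rss_to_vxml(SAMPLE_RSS)
```

Because the feed itself is left untouched, a converter of this kind can sit in front of an existing RSS service, which matches merit 2) above. Speech recognition of the caller's request (for example via a VoiceXML `<field>` with a grammar) is omitted here.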

Design of a Mirror for Fragrance Recommendation based on Personal Emotion Analysis (개인의 감성 분석 기반 향 추천 미러 설계)

  • Hyeonji Kim;Yoosoo Oh
    • Journal of Korea Society of Industrial Information Systems / v.28 no.4 / pp.11-19 / 2023
  • This paper proposes a smart mirror system that recommends fragrances based on user emotion analysis. It combines natural language processing techniques, namely text vectorization (CountVectorizer and TF-IDF), with machine learning classification models (DecisionTree, SVM, RandomForest, SGD Classifier) to build emotion classifiers and compares the results. After the comparison, the paper builds a personal emotion-based fragrance recommendation mirror model on the best-performing emotion classifier, an SVM with a word embedding pipeline. The proposed system implements a personalized fragrance recommendation mirror based on emotion analysis and provides web services using the Flask web framework. It uses the Google Speech Cloud API to recognize users' voices and speech-to-text (STT) to convert speech into text data. The proposed system also provides users with information about weather, humidity, location, quotes, time, and schedule management.

Voice Message System Supporting Massive Outbound Call (대량의 발신 호를 지원하는 음성 메시지 시스템)

  • Kim Jeonggon
    • MALSORI / no.49 / pp.77-94 / 2004
  • This paper proposes a new voice message system that supports massive outbound calls. The basic idea is to pre-process all text-to-speech conversion and the mixing of text and attached music files, and to store the pre-processed results in a cache server connected to the IVR. The new system is optimized for massive outbound calls by distributing the web server load, which is caused by server-side scripts that access the database and generate dynamic VoiceXML documents, across a client module and a server module of the web server. The proposed system was test-deployed at a domestic voice message application service provider, where it was shown to reduce the response latency problem of the test-bed voice message system.

COMPUTER AND INTERNET RESOURCES FOR PRONUNCIATION AND PHONETICS TEACHING

  • Makarova, Veronika
    • Proceedings of the KSPS conference / 2000.07a / pp.338-349 / 2000
  • Pronunciation teaching is once again coming into the foreground of ELT. Japan is, however, lagging far behind many countries in the development of pronunciation curricula and in the actual speech performance of Japanese learners of English. The reasons for this can be found in the prevalence of communicative methodologies unfavorable to pronunciation teaching, in the lack of trained professionals, and in the large numbers of students in Japanese foreign language classes. This paper offers a way to promote foreign language pronunciation teaching in Japan and other countries by employing computer and internet facilities. The paper outlines the major directions for using modern speech technologies in pronunciation classes, such as EVF (electronic visual feedback) training at segmental and prosodic levels, and automated error detection, testing, grading and fluency assessment. The author discusses the applicability of some specific software packages (CSLU, SUGIspeech, Multispeech, Wavesurfer, etc.) to the needs of pronunciation teaching. Finally, the author discusses the globalization of pronunciation education via internet resources, such as computer corpora and web pages related to speech and pronunciation training.

Implementation of Speech Recognition System Using JAVA Applet

  • Park, Seungho;Park, Kwangkook;Kim, Kyungnam;Kim, Jingyoung;Kim, Kijung
    • Proceedings of the IEEK Conference / 2000.07a / pp.257-259 / 2000
  • In this paper, word-unit recognition is performed to implement a speech recognition system over the web, using a Java applet and a continuous-distribution HMM. The system is designed on a client/server model. The client computer processes speech with the applet and then transmits feature parameters to the server computer through the Internet. The speech recognition system on the server computes the result with the forward algorithm and returns it to the client computer, where it is displayed as text.
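The server-side scoring step can be sketched as follows. This is a simplified illustration, not the authors' implementation: it assumes discrete observation symbols instead of the continuous distributions named in the abstract, and the names `forward_likelihood`, `recognize`, and the toy word models are hypothetical.

```python
def forward_likelihood(obs, start_p, trans_p, emit_p):
    """Forward algorithm: P(obs | model) for a discrete-observation HMM.

    start_p[s]    : probability of starting in state s
    trans_p[p][s] : probability of moving from state p to state s
    emit_p[s][o]  : probability that state s emits symbol o
    """
    n_states = len(start_p)
    # Initialization: alpha_1(s) = pi_s * b_s(o_1)
    alpha = [start_p[s] * emit_p[s][obs[0]] for s in range(n_states)]
    # Induction: alpha_t(s) = (sum_p alpha_{t-1}(p) * a_ps) * b_s(o_t)
    for o in obs[1:]:
        alpha = [
            sum(alpha[p] * trans_p[p][s] for p in range(n_states)) * emit_p[s][o]
            for s in range(n_states)
        ]
    # Termination: sum over all final states
    return sum(alpha)


def recognize(obs, word_models):
    """Return the word whose HMM gives the observation the highest likelihood."""
    return max(word_models, key=lambda w: forward_likelihood(obs, *word_models[w]))


# Toy single-state word models: (start_p, trans_p, emit_p)
WORD_MODELS = {
    "yes": ([1.0], [[1.0]], [[0.9, 0.1]]),
    "no": ([1.0], [[1.0]], [[0.1, 0.9]]),
}
best_word = recognize([0, 0, 1], WORD_MODELS)
```

In a word-unit recognizer of this kind, one model is kept per vocabulary word and the client only needs to send the observation sequence (here symbol indices; in the paper, feature parameters). For long utterances a practical implementation would work in the log domain or rescale `alpha` to avoid underflow.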

A Study on the Intelligent Personal Assistant Development Method Base on the Open Source (오픈소스기반의 지능형 개인 도움시스템(IPA) 개발방법 연구)

  • Kim, Kil-hyun;Kim, Young-kil
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2016.10a / pp.89-92 / 2016
  • Recent services such as Siri recognize and respond to spoken words on smartphones and in web services. To handle such voice input intelligently, a system needs to search big data in the cloud and to parse the given context accurately. This paper proposes a method for developing an intelligent personal assistant based on open source, using ASR (Automatic Speech Recognition), QAS (Question Answering System) and TTS (Text To Speech).

Analysis on Vowel and Consonant Sounds of Patients' Speech with Velopharyngeal Insufficiency (VPI) and Simulated Speech (구개인두부전증 환자와 모의 음성의 모음과 자음 분석)

  • Sung, Mee Young;Kim, Heejin;Kwon, Tack-Kyun;Sung, Myung-Whun;Kim, Wooil
    • Journal of the Korea Institute of Information and Communication Engineering / v.18 no.7 / pp.1740-1748 / 2014
  • This paper focuses on a listening test and an acoustic analysis of the speech of patients with velopharyngeal insufficiency (VPI) and of speech simulated by normal speakers. In this research, a set consisting of 50 words, vowels and single syllables is defined for constructing the speech database. A web-based listening evaluation system is developed to provide a convenient, automated evaluation procedure. The analysis results show that the trends of incorrect recognition for VPI speech and for simulated speech are similar. This similarity is also confirmed by comparing the formant locations of vowels and the spectra of consonant sounds. These results show that the simulation method is effective at generating speech signals similar to the speech of actual VPI patients. The simulated speech data is expected to be useful in future work such as acoustic model adaptation.