• Title/Summary/Keyword: automatic voice system


Speech Interactive Agent on Car Navigation System Using Embedded ASR/DSR/TTS

  • Lee, Heung-Kyu;Kwon, Oh-Il;Ko, Han-Seok
    • Speech Sciences / v.11 no.2 / pp.181-192 / 2004
  • This paper presents an efficient speech interactive agent rendering smooth car navigation and Telematics services, by employing embedded automatic speech recognition (ASR), distributed speech recognition (DSR), and text-to-speech (TTS) modules, all while enabling safe driving. A speech interactive agent is essentially a conversational tool providing command and control functions to drivers, such as enabling navigation tasks, audio/video manipulation, and E-commerce services through natural voice/response interactions between user and interface. While the benefits of automatic speech recognition and speech synthesis have become well known, the hardware resources involved are often limited and the internal communication protocols are complex, making real-time responses hard to achieve. As a result, performance degradation always exists in the embedded H/W system. To implement a speech interactive agent that accommodates user commands in real time, we propose to optimize the hardware-dependent architectural codes for speed-up. In particular, we propose a composite solution through memory reconfiguration and efficient arithmetic operation conversion, as well as an effective out-of-vocabulary rejection algorithm, all made suitable for system operation under limited resources.
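A minimal sketch (not the paper's actual code) of the "efficient arithmetic operation conversion" idea the abstract mentions: replacing floating-point multiplies with Q15 fixed-point integer operations, a common speed-up on embedded targets without an FPU.

```python
Q = 15  # fractional bits in the Q15 format

def to_q15(x: float) -> int:
    """Quantize a float in [-1, 1) to a saturated 16-bit Q15 integer."""
    return max(-32768, min(32767, int(round(x * (1 << Q)))))

def q15_mul(a: int, b: int) -> int:
    """Fixed-point multiply: 16x16 -> 32-bit product, shifted back to Q15."""
    return (a * b) >> Q

def from_q15(x: int) -> float:
    """Convert a Q15 integer back to a float for inspection."""
    return x / (1 << Q)

# 0.5 * 0.25 computed entirely in integer arithmetic
print(from_q15(q15_mul(to_q15(0.5), to_q15(0.25))))  # 0.125
```

On a fixed-point DSP, the shift-and-multiply replaces a floating-point unit entirely; the trade-off is the bounded [-1, 1) range and quantization error.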


A Train Ticket Reservation Aid System Using Automated Call Routing Technology Based on Speech Recognition (음성인식을 이용한 자동 호 분류 철도 예약 시스템)

  • Shim Yu-Jin;Kim Jae-In;Koo Myung-Wan
    • MALSORI / no.52 / pp.161-169 / 2004
  • This paper describes an automated call routing system for train ticket reservation based on speech recognition. We focus on the task of automatically routing telephone calls based on the user's fluently spoken responses instead of touch-tone menus in an interactive voice response system. A vector-based call routing algorithm is investigated, and a mapping table for key terms is suggested. The Korail database collected by KT is used for the call routing experiment. We evaluate call-classification experiments on transcribed text from the Korail database. With small training data, an average call routing error reduction rate of 14% is observed when the mapping table is used.
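The general technique the paper investigates can be sketched as follows: each routing destination is represented by a term vector, and an utterance is routed to the destination with the highest cosine similarity. Routes and terms below are invented toy examples, not the Korail data.

```python
import math
from collections import Counter

# toy destination vocabularies (illustrative only)
ROUTES = {
    "reservation": "reserve book ticket seat",
    "schedule":    "time depart arrive schedule",
    "fare":        "price fare cost refund",
}

def vec(text):
    """Bag-of-words term vector for a text."""
    return Counter(text.split())

def cosine(u, v):
    dot = sum(u[w] * v[w] for w in u)
    nu = math.sqrt(sum(c * c for c in u.values()))
    nv = math.sqrt(sum(c * c for c in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

route_vecs = {r: vec(t) for r, t in ROUTES.items()}

def route_call(utterance):
    """Route to the destination whose term vector is closest to the utterance."""
    return max(route_vecs, key=lambda r: cosine(vec(utterance), route_vecs[r]))

print(route_call("i want to book a ticket"))  # reservation
```

A mapping table of key terms, as the paper suggests, would normalize synonymous user words onto these route terms before the similarity is computed.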


WWW Based Instruction Systems for English Learning: GAIA

  • Park, Phan-Woo
    • Journal of The Korean Association of Information Education / v.3 no.2 / pp.113-119 / 2000
  • I studied a distance education model for English learning on the Internet. Basic WWW files, which contain the courseware, are constructed with HTML, and the functions required for learning are implemented with Java. Students and educators can access a preferred unit, composed of the appropriate text, voice, and image data, by using a WWW browser at any time. The education system supports automatic generation of English problems for practicing reading and writing, making good use of the courseware data or the various English text resources located on the Internet. Our system has functions to manage and control the flow of distance learning and to offer interaction between students and the system in a distributed environment. Educators can manage students' learning and can immediately be aware of who is attending and who is quitting a lesson in the virtual space. Also, students and educators in different places can communicate and discuss a topic through the server. I implemented these functions, which are required in a client/server environment for distance education, with Java. The URL for this system is "http://park.taegu-e.ac.kr" under the name GAIA.


Automatic Control Faucet based on Voice recognition using AI (AI를 이용한 음성인식 기반 자동제어 수전)

  • Roh, Jae-Hee;Baek, Jee-Yoon;Hong, Ji-Hyeon;Lee, Young-Seop
    • Proceedings of the Korea Information Processing Society Conference / 2019.10a / pp.1011-1013 / 2019
  • With the Fourth Industrial Revolution, smart-home research has recently become active, and the concept of the smart home has changed as technology has advanced. We propose an AI-based voice-controlled faucet built on a 'voice' interface using the Google Assistant API [1], the intelligent virtual assistant provided by Google. Furthermore, the system shows actual water usage to the people of South Korea, which the OECD classifies as a country under severe water stress, and raises awareness of excessive water consumption.

Automated Speech Analysis Applied to Sasang Constitution Classification (음성을 이용한 사상체질 분류 알고리즘)

  • Kang, Jae-Hwan;Yoo, Jong-Hyang;Lee, Hae-Jung;Kim, Jong-Yeol
    • Phonetics and Speech Sciences / v.1 no.3 / pp.155-163 / 2009
  • This paper introduces an automatic voice classification system for diagnosing an individual's constitution based on Sasang Constitutional Medicine (SCM) in Traditional Korean Medicine (TKM). To develop this algorithm, we used the voices of 473 speakers and extracted a total of 144 speech features from speech data consisting of five sustained vowels and one sentence. The classification system, based on a rule-based algorithm derived from a nonparametric statistical method, produces binary negative decisions. In conclusion, 55.7% of the speech data were diagnosed by this system, of which 72.8% were correct negative decisions.
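A toy sketch of the "binary negative decision" idea: each rule rejects the constitutions whose typical feature range the speaker falls outside of, and whatever survives is the diagnosis candidate set. The feature names and thresholds below are invented for illustration, not the paper's actual 144 features or rules.

```python
CONSTITUTIONS = {"Taeeumin", "Soeumin", "Soyangin"}

# (feature name, threshold, constitutions rejected when the value exceeds it)
RULES = [
    ("pitch_mean_hz", 180.0, {"Taeeumin"}),
    ("energy_db",      70.0, {"Soeumin"}),
]

def classify(features: dict):
    """Apply negative rules; return the constitutions that survive rejection."""
    candidates = set(CONSTITUTIONS)
    for name, threshold, rejected in RULES:
        if features.get(name, 0.0) > threshold:
            candidates -= rejected
    return candidates

# high pitch rules out Taeeumin; low energy leaves Soeumin in play
print(classify({"pitch_mean_hz": 200.0, "energy_db": 60.0}))
```

Because the rules only ever remove candidates, the system can abstain (return several candidates) rather than force a positive decision, which matches the 55.7% coverage reported.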


The Character Recognition System of Mobile Camera Based Image (모바일 이미지 기반의 문자인식 시스템)

  • Park, Young-Hyun;Lee, Hyung-Jin;Baek, Joong-Hwan
    • Journal of the Korea Academia-Industrial cooperation Society / v.11 no.5 / pp.1677-1684 / 2010
  • Recently, due to the development of mobile phones and the spread of smartphones, much content has been developed. In particular, since small cameras are equipped in mobile devices, there is interest in image-based content development, and it has become an important part of their practical use. Among such applications, a character recognition system can be widely used in blind-person guidance systems, automatic robot navigation systems, automatic video retrieval and indexing systems, and automatic text translation systems. Therefore, this paper proposes a system that extracts text regions from natural images captured by a smartphone camera. The individual characters are recognized, and the result is output as voice. Text regions are extracted using the Adaboost algorithm, and individual characters are recognized using an error back-propagation neural network.
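The error back-propagation training used for the recognizer can be sketched with a toy 2-2-1 network learning a stand-in task (AND); the paper of course trains on real character features produced by the Adaboost text detector, not this toy data.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# stand-in data: "feature vector -> class" reduced to the AND function
data = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]

# small asymmetric initial weights (deterministic for reproducibility)
w1 = [[0.2, -0.3], [0.4, 0.1]]   # hidden-layer weights
b1 = [0.0, 0.0]
w2 = [0.3, -0.2]                 # output-layer weights
b2 = 0.0
lr = 0.5

def forward(x0, x1):
    h = [sigmoid(w1[j][0] * x0 + w1[j][1] * x1 + b1[j]) for j in range(2)]
    y = sigmoid(w2[0] * h[0] + w2[1] * h[1] + b2)
    return h, y

for _ in range(5000):
    for (x0, x1), t in data:
        h, y = forward(x0, x1)
        dy = (y - t) * y * (1 - y)          # output error term
        for j in range(2):                  # propagate the error back one layer
            dh = dy * w2[j] * h[j] * (1 - h[j])
            w2[j] -= lr * dy * h[j]
            w1[j][0] -= lr * dh * x0
            w1[j][1] -= lr * dh * x1
            b1[j] -= lr * dh
        b2 -= lr * dy

print([round(forward(x0, x1)[1]) for (x0, x1), _ in data])  # [0, 0, 1] pattern: expect [0, 0, 0, 1]
```

The same loop scales to pixel-feature inputs and one output unit per character class; only the layer sizes change.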

Analysis of Delay Characteristics in Advanced Intelligent Network-Intelligent Peripheral (AIN IP) (차세대 지능망 지능형 정보제공 시스템의 지연 특성 분석)

  • 이일우;최고봉
    • The Journal of Korean Institute of Communications and Information Sciences / v.25 no.8A / pp.1124-1133 / 2000
  • The Advanced Intelligent Network Intelligent Peripheral (AIN IP) is one of the AIN elements, which consist of the Service Control Point (SCP), Service Switching Point (SSP), and IP, for AIN services such as announcement play, digit collection, voice recognition/synthesis, and voice prompt and receipt. This paper, featuring the ISUP/INAP protocols, describes the procedures for setting up and releasing bearer channels between the SSP/SCP and the IP in order to deliver specialized resources through those channels, and it describes the structure and procedures of AIN services such as Automatic Collect Call (ACC), Universal Personal Telecommunication (UPT), and televoting (VOT). In this environment, the delay characteristics of the IP system are investigated as a performance analysis for policy establishment.


Development of Mobile Station in the CDMA Mobile System

  • Kim, Sun-Young;Uh, Yoon;Kweon, Hye-Yeoun;Lee, Hyuck-Jae
    • ETRI Journal / v.19 no.3 / pp.202-227 / 1997
  • This paper describes the development of a CDMA mobile station to support non-speech mobile office services such as data, fax, and short message service in addition to voice. We developed some important functions of layer 2 and layer 3. To provide the non-speech services, we developed a terminal adapter and user interface software. A description of the development process, the software architecture, and the external interfaces required to provide such services is given. A description of a TTA-62 message analysis tool, mobile station monitoring software, and an automatic test system developed for integration tests and performance measurements is also given.


Knowledge Transfer Using User-Generated Data within Real-Time Cloud Services

  • Zhang, Jing;Pan, Jianhan;Cai, Zhicheng;Li, Min;Cui, Lin
    • KSII Transactions on Internet and Information Systems (TIIS) / v.14 no.1 / pp.77-92 / 2020
  • When automatic speech recognition (ASR) is provided as a cloud service, it is easy to collect voice and application-domain data from users. Harnessing these data facilitates the provision of more personalized services. In this paper, we demonstrate a transfer learning-based knowledge service built with the user-generated data collected through our novel system that delivers a personalized ASR service. First, we discuss the motivation, challenges, and prospects of building such a knowledge-based, service-oriented system. Second, we present a Quadruple Transfer Learning (QTL) method that can learn a classification model from a source domain and transfer it to a target domain. Third, we give an overview of the architecture of our system, which collects voice data from mobile users, labels the data via crowdsourcing, uses the collected user-generated data to train different machine learning models, and delivers personalized real-time cloud services. Finally, we use the e-book data collected from our system to train classification models and apply them in the smart-TV domain; the experimental results show that our QTL method is effective in two classification tasks, confirming that the knowledge transfer provides a value-added service for upper-layer mobile applications in different domains.
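The paper's QTL method is not spelled out in the abstract, so here is only a hedged sketch of the general source-to-target transfer idea it describes: a nearest-centroid text classifier whose source-domain (e-book) centroids are adapted toward a few labelled target-domain (smart-TV) examples. All data and labels below are invented.

```python
import math
from collections import Counter, defaultdict

def vec(text):
    return Counter(text.split())

def centroid(texts):
    """Average the bag-of-words vectors of a list of texts."""
    acc = defaultdict(float)
    for t in texts:
        for w, c in vec(t).items():
            acc[w] += c / len(texts)
    return acc

def cosine(u, v):
    dot = sum(u[w] * v.get(w, 0.0) for w in u)
    nu = math.sqrt(sum(c * c for c in u.values()))
    nv = math.sqrt(sum(c * c for c in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

# source domain: e-book descriptions (toy data)
source = {
    "romance": ["love story heart", "romance love novel"],
    "scifi":   ["space robot future", "alien space ship"],
}
# a few labelled target-domain (smart-TV programme) examples
target = {
    "romance": ["drama love series"],
    "scifi":   ["space documentary series"],
}

alpha = 0.5  # how far each centroid is pulled toward the target domain
model = {}
for label in source:
    c, t = centroid(source[label]), centroid(target[label])
    model[label] = {w: (1 - alpha) * c.get(w, 0.0) + alpha * t.get(w, 0.0)
                    for w in set(c) | set(t)}

def classify(text):
    return max(model, key=lambda lb: cosine(vec(text), model[lb]))

print(classify("a love drama series"))  # romance
```

The interpolation weight alpha plays the role any transfer method must: deciding how much the abundant source data should dominate the scarce target data.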

Development of Automatic Lip-sync MAYA Plug-in for 3D Characters (3D 캐릭터에서의 자동 립싱크 MAYA 플러그인 개발)

  • Lee, Sang-Woo;Shin, Sung-Wook;Chung, Sung-Taek
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.18 no.3 / pp.127-134 / 2018
  • In this paper, we developed an Auto Lip-Sync Maya plug-in that extracts Korean phonemes from voice data and Korean-based text information and produces high-quality 3D lip-sync animation from the separated phonemes. In the developed system, the phoneme separation classifies the 8 vowels and 13 consonants used in Korean, referring to the 49 phonemes provided by the Microsoft Speech API engine (SAPI). In addition, although vowels and consonants are pronounced with a variety of mouth shapes, the same viseme can be applied to several of them. Based on this, we developed the Auto Lip-Sync Maya plug-in in Python so that lip-sync animation can be generated automatically at once.
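The phoneme-to-viseme grouping step described above can be sketched as a simple lookup: several phonemes share one mouth shape, so the animation track reuses a small set of key shapes. The groupings and shape names below are illustrative, not the plug-in's actual SAPI tables.

```python
# Illustrative mapping: many phonemes -> few visemes (mouth shapes)
PHONEME_TO_VISEME = {
    # bilabials share a closed-lips shape
    "b": "lips_closed", "p": "lips_closed", "m": "lips_closed",
    # open vowels share a wide-open jaw shape
    "a": "jaw_open", "e": "jaw_open",
    # rounded vowels share a rounded-lips shape
    "o": "lips_round", "u": "lips_round",
}

def to_viseme_track(phonemes):
    """Map a phoneme sequence to the viseme keys the animator drives."""
    return [PHONEME_TO_VISEME.get(p, "rest") for p in phonemes]

print(to_viseme_track(["m", "a", "m", "o"]))
# ['lips_closed', 'jaw_open', 'lips_closed', 'lips_round']
```

In the plug-in setting, each viseme key would then be set on the character's blend shapes at the phoneme's timestamp, which is what makes the animation generation automatic.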