• Title/Summary/Keyword: Speech recognition platform

Development of Speech Recognition and Synthetic Application for the Hearing Impairment (청각장애인을 위한 음성 인식 및 합성 애플리케이션 개발)

  • Lee, Won-Ju; Kim, Woo-Lin; Ham, Hye-Won; Yun, Sang-Un
    • Proceedings of the Korean Society of Computer Information Conference / 2020.07a / pp.129-130 / 2020
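  • This paper presents the implementation of an Android application system that supports communication for hearing-impaired people. Using the STT (Speech to Text) API of Google Cloud Platform, the application recognizes speech and outputs the content of a conversation as text. It also converts text to speech through TTS (Text to Speech) synthesis. In addition, a foreground service uses the accelerometer sensor so that the application can be launched by shaking the smartphone two or three times, which improves the usability of the application.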
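
The shake-to-launch idea in this abstract can be sketched as follows. This is a minimal illustration in Python, not the authors' Android code: the abstract gives no thresholds or timing, so `threshold`, `min_gap`, `required`, and the function names `count_shakes` and `should_launch` are all assumptions made for the example. It counts a shake each time the acceleration magnitude, with gravity removed, rises above a threshold.

```python
import math

GRAVITY = 9.81  # m/s^2, magnitude reported by an accelerometer at rest


def count_shakes(samples, threshold=8.0, min_gap=0.25):
    """Count shakes in a list of (time_s, x, y, z) accelerometer samples.

    A shake is counted when the gravity-removed magnitude rises above
    `threshold` and at least `min_gap` seconds have passed since the
    previous counted shake. Both values are hypothetical defaults.
    """
    shakes = 0
    last_time = None
    above = False  # True while the signal stays above the threshold
    for t, x, y, z in samples:
        excess = abs(math.sqrt(x * x + y * y + z * z) - GRAVITY)
        if excess > threshold:
            # Count only the rising edge, and debounce with min_gap.
            if not above and (last_time is None or t - last_time >= min_gap):
                shakes += 1
                last_time = t
            above = True
        else:
            above = False
    return shakes


def should_launch(samples, required=2):
    """Return True if the recent samples contain at least `required` shakes."""
    return count_shakes(samples) >= required
```

On a device, the samples would come from the platform's sensor callbacks and the caller would keep only a short recent window of readings before calling `should_launch`.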

Implementation of the Speech Emotion Recognition System in the ARM Platform (ARM 플랫폼 기반의 음성 감성인식 시스템 구현)

  • Oh, Sang-Heon; Park, Kyu-Sik
    • Journal of Korea Multimedia Society / v.10 no.11 / pp.1530-1537 / 2007
  • In this paper, we implemented a speech emotion recognition system that distinguishes human emotional states from speech captured by a single microphone and classifies them into four categories: neutrality, happiness, sadness, and anger. In general, speech recorded with a microphone contains background noise due to the speaker's environment and the microphone characteristics, which can seriously degrade system performance. To minimize the effect of this noise and improve system performance, an MA (Moving Average) filter with a relatively simple structure and low computational complexity was adopted. An SFS (Sequential Forward Selection) feature optimization method was then implemented to further improve and stabilize system performance. For speech emotion classification, an SVM pattern classifier is used. The experimental results indicate an emotion classification performance of around 65% in the computer simulation and 62% on the ARM platform.
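
The two preprocessing steps named in this abstract, moving-average filtering and sequential forward selection, can be sketched in plain Python. This is an illustration under stated assumptions, not the authors' implementation: the abstract does not give the window length, the feature set, or the selection criterion, so `window`, `score`, and `k` below are hypothetical parameters.

```python
def moving_average(samples, window=5):
    """Causal moving-average filter.

    Each output is the mean of the most recent `window` inputs; the first
    few outputs average over the samples seen so far.
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    out = []
    running = 0.0
    for i, x in enumerate(samples):
        running += x
        if i >= window:
            running -= samples[i - window]  # drop the sample leaving the window
        out.append(running / min(i + 1, window))
    return out


def sequential_forward_selection(n_features, score, k):
    """Greedy SFS: repeatedly add the feature that maximizes `score`.

    `score` is a caller-supplied function that takes a list of feature
    indices and returns a number (for example, cross-validated accuracy).
    """
    selected = []
    remaining = list(range(n_features))
    while remaining and len(selected) < k:
        best = max(remaining, key=lambda f: score(selected + [f]))
        selected.append(best)
        remaining.remove(best)
    return selected
```

In a full pipeline the `score` function would train and evaluate the classifier (an SVM in the paper) on the candidate feature subset; here it is left abstract so that the selection loop itself stays visible.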

VR-simulated Sailor Training Platform for Emergency (긴급상황에 대한 가상현실 선원 훈련 플랫폼)

  • Park, Chur-Woong; Jung, Jinki; Yang, Hyun-Seung
    • Proceedings of the Korean Institute of Navigation and Port Research Conference / 2015.10a / pp.175-178 / 2015
  • This paper presents a VR-simulated sailor training platform for emergencies, intended to prevent the human errors that cause 60 to 80% of domestic and international marine accidents. Using virtual reality technology, the proposed platform provides an interaction method for building proficiency in emergency procedures and a crowd control method for controlling crowd agents in a virtual ship environment. The interaction method uses speech recognition and gesture recognition to enhance the immersiveness and efficiency of the training. The crowd control method provides natural simulations of crowd agents by applying a behavior model that reflects human social behavior. To examine the efficiency of the proposed platform, a prototype whose virtual training scenario describes the outbreak of fire on a ship was implemented as a standalone system.

A Study On the ASP Module in Conversational Automatic Speech Recognition Flight Information System (대화형 음성 인식 항공정보 시스템에서의 ASP 모듈에 관한 연구)

  • 윤재석; 장준식
    • Journal of the Korea Institute of Information and Communication Engineering / v.6 no.4 / pp.595-603 / 2002
  • This research shows how a computer can recognize and understand spoken natural language and symbolize it using VoiceXML and Grammar Specific Language in developing a telephone-based conversational automatic speech recognition flight information system. So that users hear correct information, the ASP module was revised, and its effectiveness was tested on the voice portal airplane information system platform.

Speech Recognition based Smart Home System using 5W1H Programming Model (5W1H 프로그래밍 모델을 기반으로 한 음성인식 스마트 홈 시스템)

  • Baek, Yeong-Tae; Lee, Se-Hoon; Kim, Ji-Seong; Sin, Bo-Bae
    • Proceedings of the Korean Society of Computer Information Conference / 2017.01a / pp.43-44 / 2017
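  • This paper addresses a problem with commercial speech recognition devices: when such a device communicates with other embedded modules and acts as the central processing server of a smart home, it offers only a limited set of modules and services, and functions the manufacturer has not developed are unavailable. To solve this, we propose a platform on which users can develop modules with the functions they want through simple steps and freely add voice recognition commands. The concept of the proposed platform is not tied to a specific OS and is designed so that it can be provided on a variety of systems. The experimental platform was built on Windows, but the same concept can be applied to build it on other systems.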

Two-way Interactive Algorithms Based on Speech and Motion Recognition with Generative AI Technology (생성형 AI 기술을 적용한 음성 및 모션 인식 기반 양방향 대화형 알고리즘)

  • Dae-Sung Jang; Jong-Chan Kim
    • The Journal of the Korea Institute of Electronic Communication Sciences / v.19 no.2 / pp.397-402 / 2024
  • Speech recognition and motion recognition technologies are used in various smart devices, but only in the form of simple command recognition and for simple functions. Beyond simple handling of recognition data, devices need to execute specialized commands based on data learned in various fields. Research is under way on a system platform that uses Generative AI, currently the subject of worldwide competition, to provide optimal data to users and to interact through voice recognition and motion recognition. The main technical processes in this study were designed around voice and motion recognition functions, the application of AI technology, and two-way communication. In this paper, two-way communication between a device and a user is achieved with various input methods through voice recognition and motion recognition combined with AI technology.

The Interactive Voice Services based on VoiceXML (VoiceXML 기반 음성인식시스템을 이용한 서비스 개발)

  • Kim Hak-Gyoon; Kim Eun-Hyang; Kim Jae-In; Koo Myoung-Wan
    • MALSORI / no.43 / pp.113-125 / 2002
  • Because of the need to search Web information via wired or wireless telephones, the VoiceXML Forum was established to develop and promote the Voice eXtensible Markup Language (VoiceXML). VoiceXML simplifies the creation of personalized interactive voice response services on the Web, and allows voice and phone access to information on Web sites and in call center databases. It can also use Web-based technologies such as CGI (Common Gateway Interface) scripts. In this paper, we developed a VoiceXML-based voice portal service platform called TeleGateway. It integrates voice services with data services using Automatic Speech Recognition (ASR) and Text-To-Speech (TTS) engines. We also demonstrate various services built on the voice portal.

A Study on the Development of Language Education Service Platform for Teaching Assistance Robots (교사도우미 로봇을 활용한 어학교육 서비스 플랫폼 구축방안 연구)

  • Yoo, Gab-Sang; Choi, Jong-Chon
    • Journal of Digital Convergence / v.14 no.8 / pp.223-232 / 2016
  • This study focuses on a new teaching assistance robot platform and the cloud-based education service model that supports it on the server side. On the client side, the teacher assistant robot is intended for use in elementary school classrooms as the front end of the language education service platform. Emerging IoT technology will be adopted to provide a comfortable classroom environment and various media interfaces. An extensive review of precedents and case studies was conducted to identify the basic requirements of the proposed service platform. Embedded systems and technologies for image recognition, speech recognition, autonomous movement, display, touch screen, IR sensor, GPS, and temperature-humidity sensor were extensively investigated to complete the service. The key findings of this paper are an optimized service platform with a cloud server system and the potential for a smart classroom with an intelligent robot that adopts IoT and BIM technology.

Telecommunication Services Based On Spoken Language Information Technology - In view of services provided by KT - (음성정보기술을 이용한 통신서비스 - KT 서비스를 중심으로 -)

  • Koo, Myoung-Wan; Kim, Jae-In; Jeong, Yeong-Jun; Kim, Mun-Sik; Kim, Won-U; Kim, Hak-Hun; Park, Seong-Jun; Ryu, Chang-Seon; Kim, Hui-Gyeong
    • Proceedings of the KSPS conference / 2004.05a / pp.125-130 / 2004
  • In this paper, we explain telecommunication services based on spoken language information technology. There are three kinds of services. The first is based on Advanced Intelligent Network (AIN) services, for which we built an Intelligent Peripheral (IP) with speech recognition, speech synthesis, and a VoiceXML interpreter. The second is based on KT-HUVOIS, a proprietary speech platform based on VoiceXML. The third is based on a VoiceXML interpreter. We explain in detail the various services that run on these platforms.

Development of Interactive Hologram Education System based on Speech Recognition - Live Map (음성인식 기반 대화형 홀로그램 교육 시스템의 개발 및 평가에 관한 연구 - 라이브맵(Live Map))

  • Kwon, Chongsan; Lee, Dong-Heon; Moon, Mikyeong
    • Journal of Industrial Convergence / v.17 no.4 / pp.69-75 / 2019
  • In this study, we developed a world map learning system for elementary education that uses Google Cloud Platform STT, Dialogflow, and fan holograms to recognize learners' voices and to show and explain three-dimensional images of the matching results as holograms. Based on the experiments and interviews, the system is expected to improve learning by stimulating students' interest and immersion, and to be useful for collaborative learning and for the education of students with disabilities.