• Title/Summary/Keyword: Voice command

Implementation of a Refusable Human-Robot Interaction Task with Humanoid Robot by Connecting Soar and ROS (Soar (State Operator and Result)와 ROS 연계를 통해 거절가능 HRI 태스크의 휴머노이드로봇 구현)

  • Dang, Chien Van;Tran, Tin Trung;Pham, Trung Xuan;Gil, Ki-Jong;Shin, Yong-Bin;Kim, Jong-Wook
    • The Journal of Korea Robotics Society / v.12 no.1 / pp.55-64 / 2017
  • This paper proposes a combination of a cognitive agent architecture named Soar (State, Operator, and Result) and ROS (Robot Operating System), which can serve as a basic framework for a robot agent to interact and cope with its environment more intelligently and appropriately. The proposed Soar-ROS human-robot interaction (HRI) agent, implemented on a humanoid robot, understands a set of human commands through voice recognition and chooses how to react to each command according to the symbol detected by image recognition. The robotic agent is allowed to refuse an inappropriate command such as "go" after it has seen the symbol 'X', which indicates that an abnormal or immoral situation has occurred. This simple but meaningful HRI task was tested successfully on the proposed Soar-ROS platform with a small humanoid robot, which suggests that the present hybrid platform can be extended to an artificial moral agent.
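
The refusal behaviour described in this abstract can be illustrated with a minimal Python sketch. It is not the paper's Soar rule set or ROS code: the class name, the command set, and the return labels are hypothetical, and only the symbol 'X' and the command "go" come from the abstract.

```python
from dataclasses import dataclass

ABNORMAL_SYMBOL = "X"  # symbol named in the abstract


@dataclass
class RefusableAgent:
    # Hypothetical command vocabulary; the abstract only names "go".
    known_commands: frozenset = frozenset({"go", "stop"})
    # Commands the agent may refuse once an abnormal situation is seen.
    blocked_commands: frozenset = frozenset({"go"})
    abnormal_seen: bool = False

    def observe_symbol(self, symbol: str) -> None:
        """Record the result of image recognition."""
        if symbol == ABNORMAL_SYMBOL:
            self.abnormal_seen = True

    def handle_command(self, command: str) -> str:
        """Decide how to react to a recognized voice command."""
        if command not in self.known_commands:
            return "unknown"
        if self.abnormal_seen and command in self.blocked_commands:
            return "refuse"
        return "execute"
```

In this sketch the agent executes "go" until `observe_symbol("X")` has been called, after which the same command is refused while "stop" is still executed.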

Voice Interactions with A. I. Agent : Analysis of Domestic and Overseas IT Companies (A.I.에이전트와의 보이스 인터랙션 : 국내외 IT회사 사례연구)

  • Lee, Seo-Young
    • Journal of Korea Entertainment Industry Association / v.15 no.4 / pp.15-29 / 2021
  • Many countries and companies are developing artificial intelligence as the core technology of the Fourth Industrial Revolution. Global IT companies such as Apple, Microsoft, Amazon, Google, and Samsung have all released their own AI assistant hardware products, hoping to increase customer loyalty and capture market share. Competition within the industry for AI agents is intense, and the AI assistant products that command the largest market share and customer loyalty have a higher chance of becoming the industry standard. This study analyzed the current status of major overseas and domestic IT companies in the field of artificial intelligence and suggested future strategic directions for voice UI technology development and user satisfaction. In terms of B2B technology, it recommends that IT companies use cloud computing to store big data, innovative artificial intelligence technologies, and natural language technologies. Offering voice recognition technologies on the cloud enables smaller companies to use them at considerably lower cost. Companies may also consider using GPT-3 (Generative Pre-trained Transformer 3), which the study describes as open-source artificial intelligence language processing software that can generate very natural, human-like interactions and high levels of user satisfaction. Usefulness and usability need to be increased to enhance user satisfaction. This study has practical and theoretical implications for industry and academia.

Real-Time Implementation of Acoustic Echo Canceller Using TMS320C6711 DSK

  • Heo, Won-Chul;Bae, Keun-Sung
    • Speech Sciences / v.15 no.1 / pp.75-83 / 2008
  • The interior of an automobile is a very noisy environment, with both stationary cruising noise and reverberated music or speech from the audio system. For robust speech recognition in a car, the driver's voice command must be extracted cleanly by removing this background noise. Because the music and speech signals fed to the car audio system are available, their reverberated versions can be removed with an acoustic echo canceller. In this paper, we implement an acoustic echo canceller with a robust double-talk detection algorithm on the TMS320C6711 DSK. We first developed the echo canceller on a PC to verify its echo cancellation performance and then implemented it on the TMS320C6711 DSK. With an 8 kHz sampling rate and 256 filter taps, the implemented system needed only 0.035 ms to process one speech sample and achieved an ERLE of 20.73 dB.
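
As a rough illustration of the quantities in this abstract, the following Python sketch runs an adaptive echo canceller on synthetic data, computes the echo return loss enhancement (ERLE), and checks the real-time budget. The abstract does not name the adaptation algorithm, so the normalized LMS (NLMS) update used here is an assumption. The double-talk detector is omitted, and the 8-tap filter, the echo path, and the noise level are hypothetical values chosen to keep the demo small.

```python
import math
import random


def nlms_echo_cancel(far_end, mic, taps=8, mu=0.5, eps=1e-8):
    """Return the residual after subtracting the estimated echo (NLMS, assumed)."""
    w = [0.0] * taps    # adaptive filter weights
    buf = [0.0] * taps  # most recent far-end samples, newest first
    residual = []
    for x, d in zip(far_end, mic):
        buf = [x] + buf[:-1]
        y = sum(wi * bi for wi, bi in zip(w, buf))  # echo estimate
        e = d - y                                   # residual sent onward
        norm = sum(b * b for b in buf) + eps
        step = mu * e / norm
        w = [wi + step * bi for wi, bi in zip(w, buf)]
        residual.append(e)
    return residual


def erle_db(mic, residual):
    """ERLE = 10 log10(microphone power / residual power)."""
    p_mic = sum(d * d for d in mic)
    p_res = sum(e * e for e in residual)
    return 10.0 * math.log10(p_mic / p_res)


def realtime_load(sample_rate_hz, processing_ms_per_sample):
    """Fraction of one sample period spent on processing."""
    period_ms = 1000.0 / sample_rate_hz
    return processing_ms_per_sample / period_ms


def demo(seed=0, n=3000):
    """Synthetic test: white-noise far end through a hypothetical echo path."""
    rng = random.Random(seed)
    echo_path = [0.5, -0.3, 0.2, 0.1]  # hypothetical cabin response
    far = [rng.gauss(0, 1) for _ in range(n)]
    mic = []
    for i in range(n):
        echo = sum(h * far[i - k] for k, h in enumerate(echo_path) if i - k >= 0)
        mic.append(echo + rng.gauss(0, 0.01))  # small background noise
    res = nlms_echo_cancel(far, mic)
    return erle_db(mic[-1000:], res[-1000:])   # ERLE after convergence
```

At 8 kHz one sample period is 0.125 ms, so `realtime_load(8000, 0.035)` gives about 0.28: the reported 0.035 ms per sample uses roughly 28% of the available time.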

Autonomous Aero-Robot and Disaster Response

  • Inoue, Koichi;Nakanishi, Hiroaki
    • Proceedings of the Korean Institute of Industrial Safety Conference / 2003.10a / pp.3-16 / 2003
  • This paper first points out the little-known fact that Japan is a leading country in the production and use of industrial unmanned helicopters, a kind of UAV. It then describes in some detail the voice command system and the autonomous flight control system, which uses a variety of control algorithms including neural network, robust, and adaptive control. Both systems were developed in collaboration between Kyoto University and Yamaha Motor Co. and funded by the Ministry of Education and Science of Japan. Both already-proven and promising future applications of autonomous unmanned helicopters are presented.

The design of Speech Recognizer to Implement the Voice Command on the PDA (PDA 상에서 음성명령어를 구현하기 위한 음성인식기의 설계)

  • Kwak Sang-Hun;Kim Cheol;Choi Seung-Ho
    • Proceedings of the Acoustical Society of Korea Conference / autumn / pp.37-40 / 2001
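  • In this paper, we designed a speech recognizer in a Windows CE 3.0 environment to control commands by voice on a PDA. In the preprocessing stage, 26th-order feature parameters were extracted, and training was carried out with HTK. We designed a triphone-based variable-vocabulary speech recognizer, and the PDA application was written in Embedded Visual C++ to control 22 voice commands. As a result, a recognition rate of 92% was obtained on the PDA, which shows that speech recognition is feasible in a mobile environment as well.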
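
To make the term "triphone-based" concrete, the sketch below expands a phone sequence into word-internal context-dependent labels written in the `left-phone+right` notation that HTK uses. The function name and the example phone sequence are hypothetical, and the abstract does not say whether the recognizer used word-internal or cross-word triphones.

```python
def to_triphones(phones):
    """Expand a phone sequence into word-internal triphone labels.

    Each phone is written as left-phone+right; the first and last phones
    of the word become biphones because one context is missing.
    """
    labels = []
    last = len(phones) - 1
    for i, phone in enumerate(phones):
        label = phone
        if i > 0:
            label = f"{phones[i - 1]}-{label}"  # add left context
        if i < last:
            label = f"{label}+{phones[i + 1]}"  # add right context
        labels.append(label)
    return labels
```

For the hypothetical sequence `["k", "a", "t"]` this yields `["k+a", "k-a+t", "a-t"]`. Because new words can be assembled from shared triphone models, this kind of modelling is what makes a variable vocabulary practical.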

A Study on the Voice-Controlled Wheelchair using Spatio-Temporal Pattern Recognition Neural Network (Spatio-Temporal Pattern Recognition Neural Network를 이용한 전동 휠체어의 음성 제어에 관한 연구)

  • Baek, S.W.;Kim, S.B.;Kwon, J.W.;Lee, E.H.;Hong, S.H.
    • Proceedings of the KOSOMBE Conference / v.1993 no.05 / pp.90-93 / 1993
  • In this study, Korean speech was recognized using a spatio-temporal pattern recognition neural network. The vocabulary consisted of the spoken digits zero to nine and basic commands for a motorized wheelchair developed in the authors' own lab. Rabiner and Sambur's speech detection method was used to determine the endpoints of speech, and speech parameters were extracted with 16th-order LPC. The recognition rate was over 90%.
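
Endpoint detection can be sketched as follows. This is a deliberately simplified, energy-only version and not the full Rabiner and Sambur procedure, which also uses two energy thresholds and zero-crossing rates to refine the boundaries. The frame length, the number of leading noise frames, and the threshold factor are assumptions.

```python
import math


def frame_energies(signal, frame_len):
    """Short-time energy of consecutive non-overlapping frames."""
    return [
        sum(s * s for s in signal[i:i + frame_len])
        for i in range(0, len(signal) - frame_len + 1, frame_len)
    ]


def detect_endpoints(signal, frame_len=80, noise_frames=5, factor=4.0):
    """Return (start, end) sample indices of the active region, or None.

    The threshold is estimated from the first `noise_frames` frames,
    which are assumed to contain background noise only.
    """
    energies = frame_energies(signal, frame_len)
    noise = energies[:noise_frames]
    threshold = factor * (sum(noise) / len(noise)) + 1e-12
    active = [i for i, e in enumerate(energies) if e > threshold]
    if not active:
        return None
    return active[0] * frame_len, (active[-1] + 1) * frame_len


def make_demo_signal():
    """Low-level noise, then a sine burst standing in for speech, then noise."""
    quiet = [0.001 * (-1) ** i for i in range(800)]
    burst = [math.sin(2 * math.pi * i / 20) for i in range(800)]
    return quiet + burst + quiet
```

On the synthetic signal from `make_demo_signal()` the detector reports the burst between samples 800 and 1600. The detected segment would then be passed on to LPC analysis.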

Cursor Moving by Voice Command using DTW method (DTW방식을 이용한 음성 명령에 의한 커서 조작)

  • 추명경;손영선
    • Journal of the Korean Institute of Intelligent Systems / v.11 no.1 / pp.82-87 / 2001
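  • In this paper, we implemented an interface that takes voice commands instead of mouse input and moves the cursor on a Windows screen through fuzzy inference. Because the input utterances are generally short, we used the DTW method, which is strong at isolated-word recognition, to recognize them. One drawback of DTW is that when a command of similar utterance length is input, it is recognized as whichever reference pattern has the smallest error value. For example, when the utterance "move a lot" is input, it may be recognized as "a lot to the right", which has a similar utterance length. To resolve such errors, a threshold was obtained by fuzzy inference from each pattern's DTW error distance and the utterance length of the reference pattern, and this threshold was used to decide whether to accept the input as a command. In ambiguous cases, the system asks the user and decides whether to accept the input according to the response.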
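
The accept, ask, or reject decision described in this abstract can be sketched in Python as below. The DTW recurrence is the standard one, but the decision rule is a stand-in: the paper derives its threshold by fuzzy inference, whereas this sketch simply normalizes the best DTW distance by the reference length and compares it with two fixed cut-offs. The cut-off values, the function names, and the one-dimensional feature sequences are all hypothetical.

```python
def dtw_distance(a, b):
    """Dynamic time warping distance between two 1-D feature sequences."""
    inf = float("inf")
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # insertion
                                 cost[i][j - 1],      # deletion
                                 cost[i - 1][j - 1])  # match
    return cost[n][m]


def classify(utterance, templates, accept_per_frame=0.1, ask_per_frame=0.3):
    """Pick the nearest template, then decide whether to trust the match.

    Fixed per-frame cut-offs replace the paper's fuzzy-inferred threshold.
    """
    best_name, best_dist = min(
        ((name, dtw_distance(utterance, ref)) for name, ref in templates.items()),
        key=lambda item: item[1],
    )
    per_frame = best_dist / len(templates[best_name])
    if per_frame <= accept_per_frame:
        return best_name, "accept"
    if per_frame <= ask_per_frame:
        return best_name, "ask_user"  # ambiguous: query the user
    return best_name, "reject"
```

Without the final check, plain DTW would always return the nearest template, even for an utterance that matches nothing well. That is the failure mode the abstract describes.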

Design of Application Control System Using Google Home (구글 홈을 활용한 응용프로그램 제어 시스템의 설계)

  • Kim, Dong-Hyun;Kim, Hwi-Min
    • Proceedings of the Korean Society of Computer Information Conference / 2019.07a / pp.135-136 / 2019
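  • In general, working on documents on a computer requires sight to view the screen and hands to operate the keyboard and mouse. Most people with visual or hand impairments therefore have difficulty operating a computer. Assistive information and communication devices for people with disabilities are expensive, and although there are programs that subsidize their distribution, it is difficult to be selected for them. This paper proposes a system that uses Google Home to control various applications, such as text editors, Word, Excel, and Hangul, by voice. In the proposed system, Google Assistant forwards intents designed in Dialogflow through a webhook, and the server accesses the computer to control the applications.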

Edge Computing-Based Voice Command Smart Home Control System (에지 컴퓨팅 기반 음성 명령 스마트홈 제어 시스템 구축)

  • Kim, So-Chul;Yoon, Seo-Jeong;Ko, Hyungyu
    • Proceedings of the Korea Information Processing Society Conference / 2022.11a / pp.764-766 / 2022
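  • This system lets users efficiently control IoT devices by voice from a smartphone, whether inside or outside the home. According to the recognized speech, it controls IoT devices, for example starting home appliances or adjusting lighting. The user's speech is converted into a JSON-format command, and through edge computing, low-spec devices use the idle resources of high-spec devices while the IoT devices are controlled according to the command. This architecture provides a system that operates computing resources efficiently without exposing IoT device data externally.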
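
A minimal Python sketch of the two steps in this abstract follows: turning recognized text into a JSON command, and choosing which edge node should handle it. The abstract does not give the JSON schema or the scheduling rule, so the field names, the keyword tables, and the "most idle CPU wins" policy are all assumptions.

```python
import json

# Hypothetical keyword tables mapping spoken words to devices and actions.
DEVICES = {"light": "light", "lamp": "light", "fan": "fan"}
ACTIONS = {"on": "turn_on", "off": "turn_off"}


def text_to_command(text):
    """Convert recognized speech text into a JSON command string, or None."""
    words = text.lower().split()
    device = next((DEVICES[w] for w in words if w in DEVICES), None)
    action = next((ACTIONS[w] for w in words if w in ACTIONS), None)
    if device is None or action is None:
        return None
    return json.dumps({"device": device, "action": action}, sort_keys=True)


def pick_edge_node(idle_cpu_by_node):
    """Choose the node with the largest idle CPU share (assumed policy)."""
    return max(idle_cpu_by_node, key=idle_cpu_by_node.get)
```

In this sketch a low-spec device would send the JSON string to the node returned by `pick_edge_node`, so the command and device data stay on local hardware.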

A Real-Time Embedded Speech Recognition System

  • Nam, Sang-Yep;Lee, Chun-Woo;Lee, Sang-Won;Park, In-Jung
    • Proceedings of the IEEK Conference / 2002.07a / pp.690-693 / 2002
  • With the growth of the communications business, the embedded market is developing rapidly both domestically and overseas. Embedded systems can be used in various ways, such as in wired and wireless communication equipment or information products. There is a great deal of development work applying speech recognition to embedded systems, for instance PDA, PCS, CDMA-2000, and IMT-2000 devices. This study implements a speech recognition engine and DB with minimal memory for use in a real-time embedded system. The speech recognizer was adapted to the embedded system as follows. First, the DC component is removed from the input voice, high frequencies are compensated by pre-emphasis with a coefficient of 0.97, and the data are divided into frames of 256 samples by an overlapping shift method. Linear predictive coefficients are obtained from these frames with the Levinson-Durbin algorithm, and feature vectors are then obtained by cepstrum transformation. For HMM training, the Baum-Welch re-estimation algorithm was used to train each word, and recognition results were obtained by evaluating the likelihood of each word. The speech data consist of 40 speech commands and 10 digits, taken from embedded-system menu control commands spoken by 15 male and 15 female speakers. Because ARM CPUs are often adopted in embedded systems, the speech recognition engine was ported to an ARM core evaluation board. After several tests with five proposed recognition parameter sets, the recognition test was run with parameter sets 1 and 3, which gave good recognition rates for commands and digits. The recognition engine achieved a recognition rate of 95%, the speech command recognizer 96%, and the digit recognizer 94%.
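
The front-end steps listed in this abstract (DC removal, pre-emphasis with 0.97, 256-sample overlapping frames, Levinson-Durbin, cepstrum conversion) can be sketched in plain Python. The abstract does not state the frame shift, the LPC order, or whether a window function is applied, so the 128-sample shift is an assumption, the order is left as a parameter, and windowing is omitted.

```python
def remove_dc(x):
    """Subtract the mean so the signal has no DC component."""
    mean = sum(x) / len(x)
    return [v - mean for v in x]


def pre_emphasis(x, coeff=0.97):
    """High-frequency boost: y[n] = x[n] - coeff * x[n-1]."""
    return [x[0]] + [x[n] - coeff * x[n - 1] for n in range(1, len(x))]


def split_frames(x, size=256, shift=128):
    """Overlapping frames; the 128-sample shift is an assumption."""
    return [x[i:i + size] for i in range(0, len(x) - size + 1, shift)]


def autocorr(frame, order):
    """Autocorrelation values r[0..order] of one frame."""
    return [
        sum(frame[n] * frame[n - k] for n in range(k, len(frame)))
        for k in range(order + 1)
    ]


def levinson_durbin(r, order):
    """Solve for predictor coefficients a[1..p] and the prediction error."""
    a = [0.0] * (order + 1)
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / err  # reflection coefficient
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        err *= 1.0 - k * k
    return a[1:], err


def lpc_to_cepstrum(a, n_ceps):
    """Convert predictor coefficients to LPC cepstral coefficients c[1..n]."""
    p = len(a)
    c = [0.0] * (n_ceps + 1)
    for n in range(1, n_ceps + 1):
        acc = a[n - 1] if n <= p else 0.0
        for k in range(1, n):
            if n - k <= p:
                acc += (k / n) * c[k] * a[n - k - 1]
        c[n] = acc
    return c[1:]
```

Each frame would go through `autocorr`, `levinson_durbin`, and `lpc_to_cepstrum` in turn. The resulting cepstral vectors are the features on which the word HMMs are trained.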
