• Title/Summary/Keyword: Voice Recognition Technology


Realization of Aircraft Takeoff Systems Based on Voice Instructions (음성지시 기반 항공기 이륙 시스템의 구현)

  • Yang, Chung-Il;Jun, Byung-Kyu;Lim, Sang-Seok
    • Journal of Advanced Navigation Technology, v.12 no.6, pp.559-566, 2008
  • In this paper, we propose a voice-instruction-based takeoff system for aircraft, including unmanned aerial vehicles (UAVs). The system consists of voice recognition (VR), flight-state checking, and instruction (command) execution. Employing VR technology, the proposed takeoff system can provide simplified and more reliable takeoff procedures to pilots. By virtue of the VR-based system, it is expected that human errors during the takeoff phase can be reduced and navigation safety further improved.

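The three-stage structure this abstract describes (voice recognition, flight-state checking, and command execution) can be sketched as a simple pipeline. This is a minimal illustration; the command names and state checks below are assumptions for the sketch, not the authors' implementation:

```python
# Minimal sketch of a voice-instruction takeoff pipeline:
# recognize -> check flight state -> execute. The command set
# and state fields are illustrative assumptions.

COMMANDS = {"request takeoff", "abort takeoff"}

def recognize(utterance):
    """Stand-in for the voice recognition (VR) stage."""
    text = utterance.strip().lower()
    return text if text in COMMANDS else None

def flight_state_ok(state):
    """Stand-in for the flight-state checking stage."""
    return bool(state.get("runway_clear") and state.get("engines_nominal"))

def execute(utterance, state):
    """Stand-in for the instruction (command) execution stage."""
    cmd = recognize(utterance)
    if cmd is None:
        return "rejected: unrecognized instruction"
    if cmd == "request takeoff" and not flight_state_ok(state):
        return "rejected: flight state not ready"
    return "executed: " + cmd
```

The safety benefit claimed in the abstract corresponds to the middle stage: a recognized command is only executed after the flight state passes its checks.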

Interactive content development of voice pattern recognition (음성패턴인식 인터랙티브 콘텐츠 개발)

  • Na, Jong-Won
    • Journal of Advanced Navigation Technology, v.16 no.5, pp.864-870, 2012
  • This paper analyzes problems common to existing language-learning content and proposes voice pattern recognition technology to solve them. The first problem is the posture of online learning: games and lessons open in separate web pages, and students' concentration drops. The second is that, in speaking practice, there is no way to confirm whether the learner has actually read the material aloud. The third is the gap between the mechanical evaluation produced by a learning management system and the teacher's own assessment of each student's progress. The final, and largest, constraint is that the above problems must be solved while keeping the existing content intact. Against this background, a dedicated speaking-practice program was built in which voice pattern recognition is used both in the recognition step of the learning process and for the learning itself: the learner's utterance is captured as an audio file and transferred to a designated location on a server, or inserted into an SQL server, so that the new features can be added to any content or program that has already been created without modifying its components. This paper contributes a more interactive teaching method that encourages active participation in class.
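The storage step the abstract describes (a learner's utterance captured as an audio file and inserted into an SQL server) can be sketched as follows. sqlite3 stands in for the SQL server here, and the table and column names are assumptions:

```python
# Sketch of the utterance-storage step: the recorded audio is
# inserted into an SQL database so existing content need not
# change. sqlite3 is a stand-in for the paper's SQL server;
# schema names are illustrative assumptions.
import sqlite3

def store_utterance(db_path, learner_id, lesson_id, audio_bytes):
    con = sqlite3.connect(db_path)
    con.execute(
        """CREATE TABLE IF NOT EXISTS utterances (
               id INTEGER PRIMARY KEY AUTOINCREMENT,
               learner_id TEXT, lesson_id TEXT, audio BLOB)"""
    )
    con.execute(
        "INSERT INTO utterances (learner_id, lesson_id, audio) VALUES (?, ?, ?)",
        (learner_id, lesson_id, audio_bytes),
    )
    con.commit()
    count = con.execute("SELECT COUNT(*) FROM utterances").fetchone()[0]
    con.close()
    return count
```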

Performance Evaluation of Real-time Voice Traffic over IEEE 802.15.4 Beacon-enabled Mode (IEEE 802.15.4 비컨 가용 방식에 의한 실시간 음성 트래픽 성능 평가)

  • Hur, Yun-Kang;Kim, You-Jin;Huh, Jae-Doo
    • IEMEK Journal of Embedded Systems and Applications, v.2 no.1, pp.43-52, 2007
  • The IEEE 802.15.4 specification, which defines the low-rate wireless personal area network (LR-WPAN), has applications in home and building automation, remote control and sensing, intelligent management, environmental monitoring, and so on. Recently, it has been considered as an alternative technology for multimedia services such as automation via voice recognition, wireless headsets, and wireless cameras for surveillance. To evaluate the capability of carrying voice traffic on an IEEE 802.15.4 LR-WPAN, we considered two scenarios: voice traffic only, and coexistence of voice and sensing traffic. For both cases we examined delay and packet loss rate with and without acknowledgement, and for various beacon periods obtained by varying the beacon order and superframe order values. In an LR-WPAN with voice devices only, a total of five voice devices could be supported; in the coexistence case, one voice device was able to coexist with about 60 sensor devices.

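A back-of-envelope bandwidth bound makes the scale of these results plausible. The 250 kbit/s PHY rate (2.4 GHz band) is from the IEEE 802.15.4 standard, but the codec bitrate and overhead factor below are assumed values for illustration; the paper's limit of five devices comes from measured delay and packet loss under CSMA-CA, not from this raw-bandwidth bound:

```python
# Rough upper bound on simultaneous voice streams over an
# IEEE 802.15.4 LR-WPAN. PHY_RATE is standard; the codec rate
# and overhead factor passed in are assumptions.

PHY_RATE = 250_000  # bit/s, IEEE 802.15.4 at 2.4 GHz

def max_voice_devices(codec_rate, overhead_factor):
    """overhead_factor inflates each stream to account for
    headers, ACKs, CSMA-CA backoff, and beacon overhead."""
    per_stream = codec_rate * overhead_factor
    return int(PHY_RATE // per_stream)
```

With an assumed 8 kbit/s codec and a 6x overhead factor the bound is five streams, in line with the paper's measurement, but that agreement depends entirely on the assumed parameters.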

Design and Implementation of VoiceXML VUI Browser (VoiceXML VUI Browser 설계/구현)

  • 장민석;예상후
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference, 2002.11a, pp.788-791, 2002
  • The present Web environment is composed of HTML (Hypertext Mark-up Language), so users obtain web information mainly in a GUI (Graphical User Interface) environment, clicking a mouse to follow hyperlinked information. Compared with an environment in which the human voice is used to obtain information, however, this is inconvenient. Using VoiceXML, a language derived from XML for supplying information over the telephone on the basis of today's mature voice recognition/synthesis technology, this paper presents the results of designing and implementing a VoiceXML Web Browser that realizes this technology.

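A minimal VoiceXML document of the kind such a browser must interpret looks like the following. The form, field, and grammar names are illustrative assumptions, not content from the paper:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Minimal illustrative VoiceXML form: the browser speaks the
     prompt via text-to-speech, then fills the field from the
     caller's recognized speech. Names are assumptions. -->
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="greeting">
    <field name="city">
      <prompt>Which city's weather would you like?</prompt>
      <grammar src="cities.grxml" type="application/srgs+xml"/>
      <filled>
        <prompt>Getting the weather for <value expr="city"/>.</prompt>
      </filled>
    </field>
  </form>
</vxml>
```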

Robust Speech Recognition Algorithm of Voice Activated Powered Wheelchair for Severely Disabled Person (중증 장애우용 음성구동 휠체어를 위한 강인한 음성인식 알고리즘)

  • Suk, Soo-Young;Chung, Hyun-Yeol
    • The Journal of the Acoustical Society of Korea, v.26 no.6, pp.250-258, 2007
  • Current speech recognition technology has achieved high performance with the development of hardware devices; however, it is still insufficient for applications where high reliability is required, such as voice control of powered wheelchairs for disabled persons. For a system that aims to operate a powered wheelchair safely by voice in a real environment, non-voice sounds such as the user's coughing, breathing, and spark-like mechanical noise must be rejected, and the system must recognize speech commands affected by disability, with their characteristic pronunciation speed and frequency. In this paper, we propose a non-voice rejection method that performs voice/non-voice classification in preprocessing using both YIN-based fundamental frequency (F0) extraction and a reliability measure. We adopted a multi-template dictionary and acoustic-model-based speaker adaptation to cope with the pronunciation variation of inarticulately uttered speech. In recognition tests conducted with data collected in a real environment, the proposed YIN-based fundamental frequency extraction showed a recall-precision rate of 95.1%, better than the 62% of a cepstrum-based method. A recognition test with the new system, applying the multi-template dictionary and MAP adaptation, also showed much higher accuracy of 99.5%, compared to 78.6% for the baseline system.
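The YIN-based F0 extraction used here for voice/non-voice classification can be sketched for a single frame as follows. This is a minimal version under assumed parameter values; the paper's reliability measure and rejection logic are not reproduced:

```python
# Single-frame YIN F0 estimate: difference function, cumulative
# mean normalized difference, then an absolute threshold. The
# fmin/fmax/threshold defaults are assumed values.
import math

def yin_f0(frame, sr, fmin=80.0, fmax=400.0, threshold=0.15):
    """Return an F0 estimate in Hz, or None when no candidate
    lag dips below the threshold (treat as non-voice)."""
    tau_min = int(sr / fmax)
    tau_max = int(sr / fmin)
    n = len(frame) - tau_max
    # Step 1: difference function d(tau).
    d = [0.0] * (tau_max + 1)
    for tau in range(1, tau_max + 1):
        d[tau] = sum((frame[t] - frame[t + tau]) ** 2 for t in range(n))
    # Step 2: cumulative mean normalized difference d'(tau).
    cmnd = [1.0] * (tau_max + 1)
    running = 0.0
    for tau in range(1, tau_max + 1):
        running += d[tau]
        cmnd[tau] = d[tau] * tau / running if running > 0 else 1.0
    # Step 3: first dip below the threshold; follow it down.
    for tau in range(tau_min, tau_max + 1):
        if cmnd[tau] < threshold:
            while tau + 1 <= tau_max and cmnd[tau + 1] < cmnd[tau]:
                tau += 1
            return sr / tau
    return None
```

A frame with no periodic dip (silence, a cough-like burst) returns None, which is the hook for the voice/non-voice decision.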

Design And Implementation of a Speech Recognition Interview Model based-on Opinion Mining Algorithm (오피니언 마이닝 알고리즘 기반 음성인식 인터뷰 모델의 설계 및 구현)

  • Kim, Kyu-Ho;Kim, Hee-Min;Lee, Ki-Young;Lim, Myung-Jae;Kim, Jeong-Lae
    • The Journal of the Institute of Internet, Broadcasting and Communication, v.12 no.1, pp.225-230, 2012
  • Opinion mining applies existing data-mining technology to text uploaded to the web, such as blog posts and product comments, to extract the author's opinion: it judges not the subject of the text but the emotion (polarity) expressed toward that subject. In this paper, we propose judging emotion from non-text voice data by applying a published opinion-mining algorithm to text obtained through a speech recognition API. The system associates subjects obtained through the Google Voice Recognition API with the sunwihwa algorithm, determines polarity through an improved design, and on this basis designs and implements the speech recognition interview model.
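The polarity-judgment step that opinion mining performs on recognized text can be sketched with a tiny lexicon. The lexicon and tie-breaking rule below are illustrative assumptions, not the paper's sunwihwa or polarity algorithm:

```python
# Minimal lexicon-based polarity judgment over recognized text.
# The word lists are assumed toy data for illustration.

POSITIVE = {"good", "great", "excellent", "love", "like"}
NEGATIVE = {"bad", "poor", "terrible", "hate", "dislike"}

def polarity(text):
    """Return 'positive', 'negative', or 'neutral' for an
    utterance, by counting lexicon hits in each direction."""
    words = text.lower().split()
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"
```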

Emotion Recognition Implementation with Multimodalities of Face, Voice and EEG

  • Udurume, Miracle;Caliwag, Angela;Lim, Wansu;Kim, Gwigon
    • Journal of information and communication convergence engineering, v.20 no.3, pp.174-180, 2022
  • Emotion recognition is an essential component of complete interaction between human and machine. The difficulty of emotion recognition stems from the different forms in which emotion is expressed, such as visual, sound, and physiological signals. Recent advancements in the field show that combined modalities, such as visual, voice, and electroencephalography signals, lead to better results than single modalities used separately. Previous studies have explored the use of multiple modalities for accurate prediction of emotion; however, the number of studies on real-time implementation is limited because of the difficulty of running multiple modalities of emotion recognition simultaneously. In this study, we propose an emotion recognition system for real-time implementation. Our model is built around a multithreading block that runs each modality in a separate thread for continuous synchronization. First, we achieved emotion recognition for each modality separately before enabling the multithreaded system. To verify the correctness of the results, we compared the accuracy of unimodal and multimodal emotion recognition in real time. The experimental results showed real-time user emotion recognition by the proposed model, and the effectiveness of multimodality was observed: the multimodal model obtained an accuracy of 80.1%, compared to unimodal accuracies of 70.9%, 54.3%, and 63.1%.
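The multithreading block the abstract describes (one thread per modality, synchronized before fusion) can be sketched as follows. The per-modality models are stubbed out as callables, and fusion by majority vote is an assumption, not necessarily the paper's method:

```python
# One thread per modality (face, voice, EEG); results are
# collected through a shared queue and fused by majority vote.
# Predictors are stand-ins for the real per-modality models.
import threading
import queue
from collections import Counter

def run_modalities(predictors):
    """predictors: dict mapping modality name to a zero-argument
    callable returning an emotion label."""
    results = queue.Queue()

    def worker(name, predict):
        results.put((name, predict()))

    threads = [threading.Thread(target=worker, args=(name, fn))
               for name, fn in predictors.items()]
    for t in threads:
        t.start()
    for t in threads:
        t.join()  # synchronize: wait until every modality reports
    preds = {}
    while not results.empty():
        name, label = results.get()
        preds[name] = label
    fused = Counter(preds.values()).most_common(1)[0][0]
    return preds, fused
```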

A Basic Performance Evaluation of the Speech Recognition APP of Standard Language and Dialect using Google, Naver, and Daum KAKAO APIs (구글, 네이버, 다음 카카오 API 활용앱의 표준어 및 방언 음성인식 기초 성능평가)

  • Roh, Hee-Kyung;Lee, Kang-Hee
    • Asia-pacific Journal of Multimedia Services Convergent with Art, Humanities, and Sociology, v.7 no.12, pp.819-829, 2017
  • In this paper, we describe the current state of speech recognition technology, first identifying the basic speech recognition technologies and algorithms, and then explaining the code flow of the APIs needed for speech recognition. We use the application programming interfaces (APIs) of Google, Naver, and Daum KaKao, which operate the most famous search engines among speech recognition API providers, to create a voice recognition app in the Android Studio tool. We then perform speech recognition experiments on standard language and dialects spoken by people of different genders, ages, and regions, and organize the recognition rates into tables. Experiments were conducted in the Gyeongsang-do, Chungcheong-do, and Jeolla-do provinces, where dialects are strongest, and comparative experiments were also conducted with standard language. Based on the resulting sentences, accuracy is checked with respect to word spacing, final consonants, postpositions, and word errors, and the count of each error type is recorded. From the results, we aim to introduce the advantages of each API according to its speech recognition rate and to establish a basic framework for the most efficient use.
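Scoring of this kind (counting errors between a reference transcript and an API's recognized sentence) is commonly expressed as word error rate. A minimal sketch follows; the paper's Korean-specific error categories (spacing, final consonants, postpositions) would need language-specific checks not shown here:

```python
# Word error rate between a reference transcript and a
# recognized hypothesis, via word-level edit distance.

def wer(reference, hypothesis):
    """(substitutions + deletions + insertions) / reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    # Standard dynamic-programming edit distance over words.
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # substitution
    return d[len(ref)][len(hyp)] / len(ref)
```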

A Study on Cockpit Voice Command System for Fighter Aircraft (전투기용 음성명령 시스템에 대한 연구)

  • Kim, Seongwoo;Seo, Mingi;Oh, Yunghwan;Kim, Bonggyu
    • Journal of the Korean Society for Aeronautical & Space Sciences, v.41 no.12, pp.1011-1017, 2013
  • The human voice is the most natural means of communication, and the need for speech recognition technology is gradually increasing to ease the human-machine interface. As digital technology develops, the functions of avionics equipment grow more varied and complex, so the workload of fighter pilots increases: they must operate complicated avionics equipment in addition to concentrating on attack functions. Accordingly, if speech recognition technology is applied in the aircraft cockpit for operating the avionics equipment, pilots can devote their time and effort to the mission of the fighter aircraft. In this paper, a cockpit voice command system applicable to fighter aircraft has been developed, and the function and performance of the system have been verified.

Implementation of the Timbre-based Emotion Recognition Algorithm for a Healthcare Robot Application (헬스케어 로봇으로의 응용을 위한 음색기반의 감정인식 알고리즘 구현)

  • Kong, Jung-Shik;Kwon, Oh-Sang;Lee, Eung-Hyuk
    • Journal of IKEEE, v.13 no.4, pp.43-46, 2009
  • This paper deals with recognizing emotion from people's voices by finding feature vectors. Voice signals carry not only a speaker's own information but also the speaker's emotions and fatigue, so much research is in progress on recognizing emotion from voice. In this paper, we analyze the Selectable Mode Vocoder (SMV), one of the 3GPP2 standard codecs. From the analysis, we propose voice features for recognizing emotion, and then an emotion recognition algorithm based on a Gaussian mixture model (GMM) that uses the suggested feature vectors. We verify the performance of the algorithm while varying the number of mixture components.

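The GMM-based classification step can be sketched in its simplest form: one Gaussian per emotion over a scalar voice feature, picking the class with the highest likelihood. The parameters below are assumed illustrative values, not trained models, and the paper fits real mixtures (with varying component counts) to SMV-derived features:

```python
# Degenerate one-component "GMM" per emotion over a scalar
# feature. Means and variances are assumed values for the
# sketch; a real system fits them from labeled speech.
import math

MODELS = {
    "neutral": (120.0, 400.0),  # (mean, variance), illustrative
    "angry":   (220.0, 900.0),
}

def log_gauss(x, mean, var):
    """Log density of a univariate Gaussian at x."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)

def classify(x):
    """Pick the emotion whose model gives x the highest likelihood."""
    return max(MODELS, key=lambda m: log_gauss(x, *MODELS[m]))
```

With more components per class, the same argmax-over-likelihoods structure holds; only the per-class density becomes a weighted sum of Gaussians.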