• Title/Summary/Keyword: Voice recognition system


Development of Integrated Public Address System for Intelligent Building (지능형 빌딩을 위한 디지털 통합 전관 방송 시스템 개발)

  • Kim, Jung-Sook;Song, Chee-Won
    • Journal of the Korean Institute of Intelligent Systems / v.21 no.2 / pp.212-217 / 2011
  • In this paper, we develop an intelligent, miniaturized, integrated digital public address system that provides context awareness of various events occurring in future intelligent buildings. It recognizes both voices and sounds, such as fire bells and disaster bells that signal evacuation in emergency situations, and it senses information sent from various sensors, for example the indoor temperature, humidity, and environmental status of the building. The system can also broadcast information to individual locations over a network with IDs, according to the context derived from the sensed information. We are also developing a miniaturized integrated digital public address system with facilities such as external input, microphone, CD, MP3, and radio. Building the integrated digital public address system around an operational MICOM makes it possible to control the facilities of the digital devices centrally. The operational MICOM is composed of three layers: a control layer, a processing layer, and a user interface layer.

Design and Implementation of Vehicle Control Network Using WiFi Network System (WiFi 네트워크 시스템을 활용한 차량 관제용 네트워크의 설계 및 구현)

  • Yu, Hwan-Shin
    • Journal of the Korea Academia-Industrial cooperation Society / v.20 no.3 / pp.632-637 / 2019
  • Research on autonomous driving has recently become very active, with a trend toward assisting safe driving and improving driver convenience. Autonomous vehicles need to combine artificial intelligence, image recognition, and Internet communication between objects. Because mobile telecommunication networks have processing limitations, such a system can instead be implemented and scaled easily on a readily expandable Wi-Fi network. We propose a wireless design method for constructing such a vehicle control network, including an AP arrangement and a software configuration method that minimize data transmission and reception loss at the mobile terminal. The proposed network design can dramatically increase the communication performance of a moving vehicle. We also verify the packet structure for GPS, video, voice, and data communication usable in the vehicle through experiments with various moving terminal devices. This wireless design technology can be extended to various general-purpose wireless networks such as 2.4 GHz, 5 GHz, and 10 GHz Wi-Fi. It also makes it possible to link a wireless intelligent road network with autonomous driving.

NUI/NUX of the Virtual Monitor Concept using the Concentration Indicator and the User's Physical Features (사용자의 신체적 특징과 뇌파 집중 지수를 이용한 가상 모니터 개념의 NUI/NUX)

  • Jeon, Chang-hyun;Ahn, So-young;Shin, Dong-il;Shin, Dong-kyoo
    • Journal of Internet Computing and Services / v.16 no.6 / pp.11-21 / 2015
  • With growing interest in Human-Computer Interaction (HCI), research on HCI has been actively conducted, as has research on the Natural User Interface/Natural User eXperience (NUI/NUX), which uses the user's gestures and voice. NUI/NUX requires recognition algorithms such as gesture recognition or voice recognition. However, these algorithms have the weakness that they are complex to implement and need a lot of training time, because they must go through steps including preprocessing, normalization, and feature extraction. Recently, Microsoft launched Kinect as an NUI/NUX development tool, which has attracted attention, and studies using Kinect have been conducted. In a previous study, the authors implemented a highly intuitive hand-mouse interface using the physical features of a user. However, it had weaknesses such as unnatural mouse movement and low accuracy of mouse functions. In this study, we design and implement a hand-mouse interface that introduces a new concept called the 'virtual monitor', extracting the user's physical features through Kinect in real time. The virtual monitor is a virtual space that can be controlled by the hand mouse. Coordinates on the virtual monitor can be accurately mapped onto coordinates on the real monitor. The hand-mouse interface based on the virtual monitor concept maintains the outstanding intuitiveness that was the strength of the previous study and enhances the accuracy of mouse functions. Further, we increased the accuracy of the interface by recognizing the user's unnecessary actions using a concentration indicator derived from electroencephalogram (EEG) data. To evaluate the intuitiveness and accuracy of the interface, we tested it with 50 people ranging in age from their teens to their 50s. In the intuitiveness experiment, 84% of subjects learned how to use it within 1 minute. In the accuracy experiment, the accuracy of the mouse functions was 80.4% for drag, 80% for click, and 76.7% for double-click. The intuitiveness and accuracy of the proposed hand-mouse interface were confirmed through experiment, and it is expected to serve as a good example of an interface for controlling a system by hand in the future.
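
The abstract above states that a coordinate on the virtual monitor is mapped onto a coordinate on the real monitor but does not say how. One plausible reading is a linear rescaling, sketched below. This is an illustrative assumption, not the authors' method: the `Rect` type, the `map_to_screen` function, the top-left origin, and the 1920x1080 resolution are all hypothetical.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class Rect:
    """Hypothetical virtual monitor: a rectangle in sensor space, origin at top-left."""
    left: float
    top: float
    width: float
    height: float


def map_to_screen(hand_x: float, hand_y: float, virtual: Rect,
                  screen_w: int, screen_h: int) -> Tuple[int, int]:
    """Linearly map a hand position inside the virtual monitor to a screen pixel."""
    if virtual.width <= 0 or virtual.height <= 0:
        raise ValueError("virtual monitor must have positive width and height")
    # Normalize the hand position to [0, 1] within the virtual monitor.
    u = (hand_x - virtual.left) / virtual.width
    v = (hand_y - virtual.top) / virtual.height
    # Clamp so that a hand outside the virtual monitor sticks to the screen edge.
    u = min(max(u, 0.0), 1.0)
    v = min(max(v, 0.0), 1.0)
    # Scale to pixel indices (0 .. size - 1).
    return round(u * (screen_w - 1)), round(v * (screen_h - 1))


# Assumed example: a 2.0 x 1.0 virtual monitor mapped to a 1920x1080 screen.
virtual_monitor = Rect(left=0.0, top=0.0, width=2.0, height=1.0)
top_left = map_to_screen(0.0, 0.0, virtual_monitor, 1920, 1080)
bottom_right = map_to_screen(2.0, 1.0, virtual_monitor, 1920, 1080)
outside = map_to_screen(5.0, -1.0, virtual_monitor, 1920, 1080)
print(top_left, bottom_right, outside)
```

In such a scheme the size and position of the virtual monitor would presumably be derived from the user's physical features, so that the same arm movement covers the whole screen regardless of body size.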

Augmented Reality Logo System Based on Android platform (안드로이드 기반 로고를 이용한 증강현실 시스템)

  • Jung, Eun-Young;Jeong, Un-Kuk;Lim, Sun-Jin;Moon, Chang-Bae;Kim, Byeong-Man
    • The KIPS Transactions:PartB / v.18B no.4 / pp.181-192 / 2011
  • With smartphones and mobile internet, the mobile phone is no longer just a voice communication tool. It has become a complete entertainment device on which we can play games and access services through a variety of applications over the Web. As smartphones become more popular, their usage has also increased, which has raised the advertising industry's interest in mobile advertisement, but mobile advertising is bound to be limited by the size of the screen. In this paper, we propose an augmented reality logo system based on the Android platform to maximize the effect of logo advertisement. After developing the software and installing it on a real smartphone, we analyze its performance in various ways. The results show that real-world application is possible, but the system is not yet sufficient for real-time service because of low hardware performance.

Digital Mirror System with Machine Learning and Microservices (머신 러닝과 Microservice 기반 디지털 미러 시스템)

  • Song, Myeong Ho;Kim, Soo Dong
    • KIPS Transactions on Software and Data Engineering / v.9 no.9 / pp.267-280 / 2020
  • A mirror is a physical reflective surface, typically glass coated with a metal amalgam, designed to reflect an image clearly. Mirrors are available everywhere at any time and have become an essential tool for observing our faces and appearances. With the advent of modern software technology, we are motivated to enhance the reflection capability of mirrors with the convenience and intelligence of real-time processing, microservices, and machine learning. In this paper, we present the development of a Digital Mirror System that provides the real-time reflection functionality of a mirror along with additional convenience and intelligence, including personal information retrieval, public information retrieval, apparent age detection, and emotion detection. Moreover, it provides a multi-modal user interface that is touch-based, voice-based, and gesture-based. We present our design and discuss how it can be implemented with current technology to deliver real-time mirror reflection while providing useful information and machine learning intelligence.

A study on combination of loss functions for effective mask-based speech enhancement in noisy environments (잡음 환경에 효과적인 마스크 기반 음성 향상을 위한 손실함수 조합에 관한 연구)

  • Jung, Jaehee;Kim, Wooil
    • The Journal of the Acoustical Society of Korea / v.40 no.3 / pp.234-240 / 2021
  • In this paper, mask-based speech enhancement is improved for effective speech recognition in noisy environments. In mask-based speech enhancement, the enhanced spectrum is obtained by multiplying the noisy speech spectrum by the mask. The VoiceFilter (VF) model is used for mask estimation, and the Spectrogram Inpainting (SI) technique is used to remove residual noise from the enhanced spectrum. We propose a combined loss to further improve speech enhancement: to remove the residual noise in the speech effectively, the positive part of the triplet loss is used together with the component loss. For the experiments, the TIMIT database is reconstructed using NOISEX92 noise and background music samples under various Signal-to-Noise Ratio (SNR) conditions. Source-to-Distortion Ratio (SDR), Perceptual Evaluation of Speech Quality (PESQ), and Short-Time Objective Intelligibility (STOI) are used as performance metrics. When the VF was trained with the mean squared error and the SI model was trained with the combined loss, SDR, PESQ, and STOI improved by 0.5, 0.06, and 0.002, respectively, compared to the system trained only with the mean squared error.
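
The masking step described in this abstract, multiplying the noisy speech spectrum by an estimated mask, can be sketched as an element-wise product. This is a minimal illustration only, not the VoiceFilter or Spectrogram Inpainting models: the function name `apply_mask`, the toy magnitude values, and the assumption that mask values lie in [0, 1] are hypothetical.

```python
from typing import List

# Rows are time frames, columns are frequency bins (magnitude values).
Spectrogram = List[List[float]]


def apply_mask(noisy: Spectrogram, mask: Spectrogram) -> Spectrogram:
    """Return the element-wise product of a noisy spectrogram and a mask."""
    if len(noisy) != len(mask):
        raise ValueError("noisy and mask must have the same number of frames")
    enhanced = []
    for noisy_frame, mask_frame in zip(noisy, mask):
        if len(noisy_frame) != len(mask_frame):
            raise ValueError("each frame must have the same number of bins")
        # A mask value near 1 keeps the bin; a value near 0 suppresses it.
        enhanced.append([n * m for n, m in zip(noisy_frame, mask_frame)])
    return enhanced


# Toy example with made-up values: 2 frames x 3 frequency bins.
noisy_spec = [[4.0, 2.0, 1.0], [3.0, 5.0, 2.0]]
est_mask = [[1.0, 0.5, 0.0], [0.5, 1.0, 0.25]]
enhanced_spec = apply_mask(noisy_spec, est_mask)
print(enhanced_spec)  # [[4.0, 1.0, 0.0], [1.5, 5.0, 0.5]]
```

In a real system the mask would come from a trained estimator and the product would typically be computed with an array library rather than nested lists.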

Age classification of emergency callers based on behavioral speech utterance characteristics (발화행태 특징을 활용한 응급상황 신고자 연령분류)

  • Son, Guiyoung;Kwon, Soonil;Baik, Sungwook
    • The Journal of Korean Institute of Next Generation Computing / v.13 no.6 / pp.96-105 / 2017
  • In this paper, we investigate speaker age classification by analyzing voice calls to an emergency center. We classified adult and elderly callers from the call center calls using behavioral speech utterance features and an SVM (Support Vector Machine), a machine learning classifier. Through analysis of the emergency center call data, we selected two behavioral speech utterance features: silent pause and turn-taking latency. First, the criteria for age classification were selected through analysis based on these features, and statistical analysis showed them to be significant (p < 0.05). We then analyzed 200 samples (adult: 100, elderly: 100) using 5-fold cross-validation with the SVM classifier. As a result, we achieved 70% accuracy using the two features, which is higher than the accuracy obtained with a single feature. These results suggest behavioral speech utterances as a new method for age classification; in future work, classification will combine acoustic information (MFCC) with the new behavioral speech utterance features on real voice data. Furthermore, this work will contribute to the development of emergency situation judgment systems related to age classification.

The Effect of AI Agent's Multi Modal Interaction on the Driver Experience in the Semi-autonomous Driving Context : With a Focus on the Existence of Visual Character (반자율주행 맥락에서 AI 에이전트의 멀티모달 인터랙션이 운전자 경험에 미치는 효과 : 시각적 캐릭터 유무를 중심으로)

  • Suh, Min-soo;Hong, Seung-Hye;Lee, Jeong-Myeong
    • The Journal of the Korea Contents Association / v.18 no.8 / pp.92-101 / 2018
  • As interactive AI speakers become popular, voice recognition is regarded as an important vehicle-driver interaction method for autonomous driving situations. The purpose of this study is to confirm whether multimodal interaction, in which an on-screen AI character delivers feedback in both auditory and visual modes, is more effective for optimizing user experience than the auditory mode alone. Participants performed music selection and adjustment tasks through the AI speaker while driving, and we measured information and system quality, presence, perceived usefulness and ease of use, and continuance intention. The analysis showed no multimodal effect of the visual character on most user experience factors, nor on continuance intention. Rather, the auditory single mode was found to be more effective than the multimodal mode for the information quality factor. In the semi-autonomous driving stage, which requires the driver's cognitive effort, multimodal interaction is not more effective than single-mode interaction in optimizing user experience.

Design and Implementation of Smart Device Application for Instructional Analysis (스마트 디바이스 기반 수업분석 프로그램 설계 및 구현 -한국어 특성 반영과 교사활용도 증진을 위한 UI설계를 적용하여-)

  • Kang, Doo Bong;Jeong, Ju Hun;Kim, Young Hwan
    • The Journal of Korean Association of Computer Education / v.18 no.4 / pp.31-40 / 2015
  • The objective of this study is to develop and implement a smart-device-based instructional analysis application to enhance the efficiency of classroom teaching. The main design features of the application are as follows. First, the User Interface (UI) has been simplified to give teachers a clear and easy-to-understand way to use the application. Second, characteristics of the Korean language, such as sentence structure, were considered. Third, multi-aspect analysis is possible through three analysis types: Flanders' interaction analysis, Tuckman's analysis, and McGraw's concentration of instruction analysis. This study produced a practical instructional analysis application, and this user-oriented application can help teachers improve the quality of classroom teaching. The study can also be a starting point for further research on design principles of instructional analysis, especially with recent technologies and theories such as voice-recognition systems, edutainment-applied instruction, and experiential learning.

A Human-Robot Interaction Entertainment Pet Robot (HRI 엔터테인먼트 애완 로봇)

  • Lee, Heejin
    • Journal of the Korean Institute of Intelligent Systems / v.24 no.2 / pp.179-185 / 2014
  • This paper describes a quadruped walking pet robot for human-robot interaction, a robot controller using a smartphone application, and a smart home control system using sensor information provided by the robot. The robot has 20 degrees of freedom and is equipped with various sensors, such as a Kinect sensor, an infrared sensor, a 3-axis motion sensor, a temperature/humidity sensor, and a gas sensor, as well as a graphic LCD module. We propose algorithms for robot entertainment: a walking algorithm for the robot, a motion and voice recognition algorithm using the Kinect sensor, an emotional expression algorithm, a smartphone application algorithm for remote control of the robot, and a smart home control algorithm for controlling home appliances. The experiments show that the proposed algorithms, applied to the pet robot, smartphone, and computer, operate well.