• Title/Summary/Keyword: Voice recognition system

Analysis of Feature Extraction Methods for Distinguishing the Speech of Cleft Palate Patients (구개열 환자 발음 판별을 위한 특징 추출 방법 분석)

  • Kim, Sung Min;Kim, Wooil;Kwon, Tack-Kyun;Sung, Myung-Whun;Sung, Mee Young
    • Journal of KIISE / v.42 no.11 / pp.1372-1379 / 2015
  • This paper analyzes feature extraction methods for distinguishing the speech of patients with cleft palates from that of people with normal palates. The research is a basic study toward a software system for the automatic recognition and restoration of disordered speech, with the aim of improving the welfare of people with speech disabilities. Monosyllable voice data were collected for three groups: normal speech, cleft palate speech, and simulated cleft palate speech. The data consist of 14 basic Korean consonants, 5 complex consonants, and 7 vowels. Feature extraction is performed using three well-known methods: LPC, MFCC, and PLP. Pattern recognition is carried out using a GMM as the acoustic model. From our experiments, we concluded that MFCC is generally the most effective method for identifying speech distortions. These results may contribute to the automatic detection and correction of the distorted speech of cleft palate patients, along with the development of a tool for identifying levels of speech distortion.
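
The classification step named in this abstract (per-frame features scored against one GMM per speech group) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the diagonal-covariance assumption, and the 2-D toy models are all hypothetical, and real MFCC vectors have more dimensions and would come from an actual feature extractor.

```python
import math

def diag_gaussian_logpdf(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at feature vector x."""
    return -0.5 * sum(
        math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v
        for xi, m, v in zip(x, mean, var)
    )

def gmm_log_likelihood(frames, gmm):
    """Sum of per-frame log-likelihoods under a GMM.

    gmm is a list of (weight, mean, variance) components.
    """
    total = 0.0
    for x in frames:
        logs = [math.log(w) + diag_gaussian_logpdf(x, m, v) for w, m, v in gmm]
        peak = max(logs)  # log-sum-exp trick for numerical stability
        total += peak + math.log(sum(math.exp(l - peak) for l in logs))
    return total

def classify(frames, models):
    """Return the label whose GMM gives the highest total log-likelihood."""
    return max(models, key=lambda label: gmm_log_likelihood(frames, models[label]))

# Hypothetical toy models (assumption): 2-D features, hand-picked parameters.
TOY_MODELS = {
    "normal": [(0.5, [0.0, 0.0], [1.0, 1.0]), (0.5, [1.0, 1.0], [1.0, 1.0])],
    "cleft_palate": [(1.0, [4.0, 4.0], [1.0, 1.0])],
}
```

For example, `classify([[0.2, 0.1], [0.9, 1.2]], TOY_MODELS)` picks the model whose components lie closest to those frames. In practice the GMM parameters would be trained (for instance with EM) on features extracted from each group's recordings.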

Utilizing Korean Ending Boundary Tones for Accurately Recognizing Emotions in Utterances (발화 내 감정의 정밀한 인식을 위한 한국어 문미억양의 활용)

  • Jang In-Chang;Lee Tae-Seung;Park Mikyoung;Kim Tae-Soo;Jang Dong-Sik
    • The Journal of Korean Institute of Communications and Information Sciences / v.30 no.6C / pp.505-511 / 2005
  • Autonomous machines that interact with humans should be able to perceive states of emotion and attitude through implicit messages in order to obtain voluntary cooperation from their clients. Voice is the easiest and most natural way for humans to exchange messages. Automatic systems capable of understanding states of emotion and attitude have used features based on the pitch and energy of uttered sentences. The performance of existing emotion recognition systems can be further improved with the support of the linguistic knowledge that a specific tonal section of a sentence is related to states of emotion and attitude. In this paper, we attempt to improve the emotion recognition rate by incorporating such linguistic knowledge of Korean ending boundary tones into an automatic system implemented using pitch-related features and multilayer perceptrons. An experiment on a Korean emotional speech database confirms an improvement of 4%.
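
One way to picture "pitch-related features" focused on the ending boundary tone is to summarize the final stretch of an utterance's pitch contour separately from the whole. The sketch below is an assumption-laden illustration: the abstract does not say how the tonal section is delimited, so the 20% tail, the feature names, and the convention that 0 marks an unvoiced frame are all hypothetical. The resulting values would be inputs to a classifier such as a multilayer perceptron, which is not shown.

```python
def boundary_tone_features(f0_hz, tail_fraction=0.2):
    """Summarize the pitch contour, with extra detail on its final section.

    f0_hz: per-frame pitch estimates in Hz; 0 marks unvoiced frames (assumed).
    tail_fraction: share of voiced frames treated as the ending boundary
                   tone (an assumption, not taken from the paper).
    """
    voiced = [p for p in f0_hz if p > 0]
    if len(voiced) < 2:
        raise ValueError("need at least two voiced frames")

    tail = voiced[-max(2, round(len(voiced) * tail_fraction)):]
    n = len(tail)

    # Least-squares slope of pitch against frame index over the tail.
    mean_x = (n - 1) / 2.0
    mean_y = sum(tail) / n
    num = sum((i - mean_x) * (p - mean_y) for i, p in enumerate(tail))
    den = sum((i - mean_x) ** 2 for i in range(n))

    return {
        "utterance_mean": sum(voiced) / len(voiced),
        "tail_mean": mean_y,
        "tail_range": max(tail) - min(tail),
        "tail_slope": num / den,  # Hz per frame; positive means rising tone
    }
```

A rising final contour yields a positive `tail_slope` and a falling one a negative value, which is the kind of cue the ending boundary tone is meant to contribute.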

Development of Data Fusion Human Identification System Based on Finger-Vein Pattern-Matching Method and photoplethysmography Identification

  • Ko, Kuk Won;Lee, Jiyeon;Moon, Hongsuk;Lee, Sangjoon
    • International Journal of Internet, Broadcasting and Communication / v.7 no.2 / pp.149-154 / 2015
  • Biometric authentication techniques that use body characteristics such as the fingerprint, face, iris, voice, finger vein, and photoplethysmography have become increasingly important in personal security, including door access control, financial security, electronic passports, and mobile devices. Finger-vein images are now used for human identification; however, they are difficult to recognize because they are captured under varying conditions, such as different temperatures and illumination, and with noise in the acquisition camera. The human photoplethysmography signal is also an important signal for human identification. In this paper, to increase the recognition rate, we develop a camera-based identification method that combines the finger-vein image and the photoplethysmography signal. We use a compact CMOS camera with a penetrating infrared LED light source to acquire both the finger-vein images and the photoplethysmography signal. In addition, we propose a simple pattern-matching method to reduce the calculation time in embedded environments. The experimental results show that our simple system achieves good speed and accuracy for personal identification compared with using finger-vein images alone.
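
The abstract does not state how the finger-vein and photoplethysmography (PPG) results are combined. A common option, shown here purely as an assumption, is score-level fusion: each modality produces a match score in [0, 1], and a weighted sum is compared against an acceptance threshold. The weight, the threshold, and all names below are hypothetical.

```python
def fuse_scores(vein_score, ppg_score, vein_weight=0.6):
    """Weighted-sum fusion of two match scores, each assumed to lie in [0, 1]."""
    if not 0.0 <= vein_weight <= 1.0:
        raise ValueError("vein_weight must be between 0 and 1")
    return vein_weight * vein_score + (1.0 - vein_weight) * ppg_score

def identify(candidates, threshold=0.7):
    """Return the enrolled ID with the best fused score, or None if no
    candidate reaches the acceptance threshold.

    candidates maps an ID to a (vein_score, ppg_score) pair.
    """
    best_id, best_score = None, threshold
    for candidate_id, (vein_score, ppg_score) in candidates.items():
        score = fuse_scores(vein_score, ppg_score)
        if score >= best_score:
            best_id, best_score = candidate_id, score
    return best_id
```

With this rule, a candidate with a strong vein match but a poor PPG match can still be rejected, which is the intended benefit of adding a second modality.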

An Architecture for Mobile Instruction: Application to Mathematics Education through the Web

  • Kim, Steven H.;Kwon, Oh-Nam;Kim, Eun-Jung
    • Research in Mathematical Education / v.4 no.1 / pp.45-55 / 2000
  • The rapid proliferation of wireless networks provides a ubiquitous channel for delivering instructional materials at the convenience of the user. By delivering content through portable devices linked to the Internet, the full spectrum of multimedia capabilities is available for engaging the user's interest. This capability encompasses not only text but also images, video, speech generation, and voice recognition. Moreover, the incorporation of machine learning capabilities at the source provides the ability to tailor the material to the general level of expertise of the user as well as the immediate needs of the moment: for instance, a request for information regarding a particular city might be covered by a leisurely presentation if solicited from the home, but more tersely if the user happens to be driving a car. This paper presents a system architecture to support mobile instruction in conjunction with knowledge-based tutoring capabilities. For concreteness, the general concepts are examined in the context of a system for mathematics education on the Web.

Improving the Performance of a Speech Recognition System in a Vehicle by Distinguishing Male/Female Voice (성별 구별방법에 의한 자동차 내 음성 인식 성능 향상)

  • Yang, Jin-Woo;Kim, Sun-Hyeop
    • Journal of KIISE:Software and Applications / v.27 no.12 / pp.1174-1182 / 2000
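  • This paper proposes a system that allows always-on voice input and output without any auxiliary switch operation, in order to secure both driver safety and convenience in a moving vehicle. To obtain noise-robust threshold values, the reference energy and zero-crossing rate are updated every 1.5 seconds, and endpoint detection is performed automatically and accurately in real time in two stages (primary and secondary) using a band-pass filter. In addition, male and female speakers are distinguished by pitch detection to select the model, and to use the model best suited to the vehicle's speed, separate male and female models are provided for three speed ranges: idle-40 km, 40-80 km, and 80-100 km. The speech feature vector is 13th-order PLP, and the recognition algorithm is OSDP (One-Stage Dynamic Programming). Experiments were conducted by speed range on Seoul city streets and the inner circular road. Speaker-independent recognition achieved 96.8% for men and 95.1% for women at 40-80 km, and 91.6% for men and 90.6% for women at 80-100 km. Speaker-dependent recognition achieved 98% for men and 96% for women at 40-80 km, and 96% for men and 94% for women at 80-100 km. These high recognition rates demonstrate the effectiveness of the system.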
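
Two of the steps in this abstract lend themselves to a short sketch: refreshing the energy and zero-crossing-rate thresholds from recent background noise, and choosing an acoustic model from the speaker's pitch and the vehicle's speed. The code below is illustrative only. The margin factors, the 165 Hz male/female pitch split, the model naming scheme, and the reading of the speed ranges as km/h are assumptions; the two-stage band-pass filtering, PLP features, and OSDP recognizer are not shown.

```python
def frame_energy(frame):
    """Mean squared amplitude of one frame of samples."""
    return sum(s * s for s in frame) / len(frame)

def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ (needs >= 2 samples)."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)

def update_thresholds(noise_frames, energy_margin=3.0, zcr_margin=1.5):
    """Recompute detection thresholds from recent noise-only frames.

    Intended to be called periodically (the abstract uses 1.5 s); the
    margin factors are hypothetical.
    """
    n = len(noise_frames)
    ref_energy = sum(frame_energy(f) for f in noise_frames) / n
    ref_zcr = sum(zero_crossing_rate(f) for f in noise_frames) / n
    return ref_energy * energy_margin, ref_zcr * zcr_margin

def select_model(pitch_hz, speed_kmh, pitch_split_hz=165.0):
    """Pick a model name from pitch (gender) and speed range.

    The pitch split is an assumed value; speeds of 80 and above all map
    to the highest range described in the abstract.
    """
    gender = "male" if pitch_hz < pitch_split_hz else "female"
    if speed_kmh < 40:
        speed_range = "idle-40"
    elif speed_kmh < 80:
        speed_range = "40-80"
    else:
        speed_range = "80-100"
    return f"{gender}_{speed_range}"
```

A frame would then be treated as speech when its energy or zero-crossing rate exceeds the current thresholds, and the selected model name would index into the six gender-by-speed models.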

Study of Korean Symptom Expression in 119 Emergency Calls (119 구급 신고 전화의 한국어 증상 표현 연구)

  • Jang, Yoonhee;Kang, Kyunghee;Jang, Kyungho;Kim, Kyeonghae
    • Fire Science and Engineering / v.30 no.4 / pp.135-140 / 2016
  • To help emergency medical dispatchers quickly and accurately identify the situation in an emergency call and determine the appropriate response, and to support the application of automatic voice recognition processing to the Korean emergency medical dispatch system, emergency call records were analyzed. Furthermore, a list of Korean symptom expressions was produced, and the characteristics of the symptoms as they appear in the actual wording of the call records were identified. This list of expressions and its characteristics will be useful for training emergency medical dispatchers.

A Study on Finger Language Translation System using Machine Learning and Leap Motion (머신러닝과 립 모션을 활용한 지화 번역 시스템 구현에 관한 연구)

  • Son, Da Eun;Go, Hyeong Min;Shin, Haeng yong
    • Annual Conference of KIPS / 2019.10a / pp.552-554 / 2019
  • People with hearing and speech impairments communicate using sign language because communicating by voice is difficult for them. However, sign language is limited to communication with people who know it, because not everyone uses sign language. In this paper, a finger language translation system is proposed and implemented as a means for people with and without disabilities to communicate without difficulty. The proposed algorithm recognizes finger language data using Leap Motion and learns from the data using machine learning to increase the recognition rate. Simulation results show the performance improvement.

AIoT-based High-risk Industrial Safety Management System of Artificial Intelligence (AIoT 기반 고위험 산업안전관리시스템 인공지능 연구)

  • Yeo, Seong-koo;Park, Dea-woo
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2022.05a / pp.168-170 / 2022
  • The government enacted and promulgated the 'Severe Accident Punishment Act' in January 2021 and is enforcing the law for workplaces with 50 or more full-time workers. However, the number of industrial accidents in 2021 increased by 10.7% compared to the same period of the previous year, and safety accidents due to chemical gas leaks and explosions also occur frequently. Therefore, comprehensive safety measures are urgently needed at high-risk industrial sites. This study applies BLE Mesh networking technology to industrial sites with poor communication environments. The complex-sensor AIoT device recognizes a dangerous situation from gas sensing values, voice, and motion values, and transmits it to the server. The server monitors the risk situation in real time by analyzing and judging the information transmitted by the AIoT device with the LSTM and CNN artificial intelligence algorithms. Through the development of AIoT devices capable of gas sensing and voice and motion recognition, together with AI-applied safety management systems, this study will contribute to the expansion of the social safety net as its application expands.

A Study on Interactive Talking Companion Doll Robot System Using Big Data for the Elderly Living Alone (빅데이터를 이용한 독거노인 돌봄 AI 대화형 말동무 아가야(AGAYA) 로봇 시스템에 관한 연구)

  • Song, Moon-Sun
    • The Journal of the Korea Contents Association / v.22 no.5 / pp.305-318 / 2022
  • We focused on the care effectiveness of interactive AI robots and developed an AI toy robot called 'Agaya' to contribute to more personalized, human-centered care. First, by applying P-TTS technology, users can maximize intimacy by selecting the voice of a person they want to hear. Second, users can heal in their own way through functions for storing and recalling good memories. Third, the robot seeks to provide better personalized services through five sensory roles corresponding to eyes, nose, mouth, ears, and hands. Fourth, we attempted to develop technologies such as warmth maintenance, aroma, sterilization, fine dust removal, and a convenient charging method. These capabilities will expand the effective use of interactive robots by elderly people and contribute to building a positive image of the elderly as people who can plan their remaining years productively and independently.

Implementation of Public Address System Using Anchor Technology

  • Seungwon Lee;Soonchul Kwon;Seunghyun Lee
    • International journal of advanced smart convergence / v.12 no.3 / pp.1-12 / 2023
  • A public address (PA) system installed in a building delivers alerts, announcements, and instructions in an emergency or disaster situation. With the development of information and communication technology, PA products with various functions have been introduced to the market. Recently launched PA systems may be connected through a single network to enable efficient management and operation, or may use voice recognition technology to deliver information quickly in an emergency. In addition, systems are being launched that can locate a user inside a building using a location-based service and guide the user to a safe area or respond in the event of an emergency. However, the new PA systems currently on the market only add functions to the existing PA system configuration to make operation more convenient; they do not change the complex PA system configuration to reduce facility, maintenance, and management costs. In this paper, we propose a novel PA system configuration for buildings using audio networks and control hierarchy over peer-to-peer (Anchor) technology based on audio over IP (AoIP), which simplifies the complex PA system configuration and enables convenient operation and management. As a result of the study, the emergency signal processing algorithm enables fire broadcasting when a fire signal is detected in the Anchor system. In addition, the control device of the PA system was replaced with software to reduce the equipment installation cost, and the PA system configuration was simplified. In the future, PA systems using Anchor technology are expected to become the standard for PA facilities.