Search | Korea Science

Speech Recognition based Message Transmission System for the Hearing Impaired Persons (청각장애인을 위한 음성인식 기반 메시지 전송 시스템)

Kim, Sung-jin;Cho, Kyoung-woo;Oh, Chang-heon
- Journal of the Korea Institute of Information and Communication Engineering / v.22 no.12 / pp.1604-1610 / 2018
Speech recognition services serve as an auxiliary means of communication for hearing-impaired persons by converting a speaker's voice into text and displaying it. However, in open environments such as classrooms and conference rooms, it is difficult to provide a speech recognition service to many hearing-impaired persons at once, so a method is needed that delivers the service efficiently according to the surrounding environment. In this paper, we propose a system that recognizes the speaker's voice and transmits the converted text as messages to many hearing-impaired persons. The proposed system uses the MQTT protocol to deliver messages to many users simultaneously. To assess the service delay of the proposed system, the end-to-end delay was measured at each QoS level of the MQTT protocol. The measurements show that the difference in delay between QoS level 2, the most reliable level, and QoS level 0 is 111 ms, confirming that the QoS setting does not greatly affect recognition of the conversation.
https://doi.org/10.6109/jkiice.2018.22.12.1604

Speaker Adaptation for Voice Dialing (음성 다이얼링을 위한 화자적응)

;Chin-Hui Lee
- The Journal of the Acoustical Society of Korea / v.21 no.5 / pp.455-461 / 2002
This paper presents a method that improves the performance of a personal voice dialing system in which speaker-independent phoneme HMMs are used. Since a voice dialing system based on speaker-independent phoneme HMMs uses only the phone transcription of the input sentence, the storage space can be reduced greatly. However, such a system performs worse than one that uses speaker-dependent models, because of the phone recognition errors generated when speaker-independent models are used. To solve this problem, a new method is presented that jointly estimates the transformation vectors for speaker adaptation and the transcriptions from training utterances. The biases and transcriptions are estimated iteratively from each user's training data with a maximum likelihood approach to stochastic matching using speaker-independent phone models. Experimental results show that the proposed method is superior to the conventional method, which uses transcriptions only.

VoiceXML Dialog System Based on RSS for Contents Syndication (콘텐츠 배급을 위한 RSS 기반의 VoiceXML 다이얼로그 시스템)

Kwon, Hyeong-Joon;Kim, Jung-Hyun;Lee, Hyon-Gu;Hong, Kwang-Seok
- The KIPS Transactions:PartB / v.14B no.1 s.111 / pp.51-58 / 2007
This paper proposes a prototype dialog system that combines VXML (VoiceXML), the W3C's standard XML format for specifying interactive voice dialogues between a human and a computer, with RSS (RDF Site Summary or Really Simple Syndication), a representative semantic web technology for syndicating and subscribing to updated web content. The merits of the proposed system are as follows: 1) It is a new method that recognizes spoken input over wired and wireless telephone networks and then provides content to the user via STT (Speech-to-Text) and TTS (Text-to-Speech), instead of the traditional web-only method. 2) It retains the advantages of RSS, because subscribed, updated content is converted to VXML without modifying the traditional method of providing the RSS service. 3) For users, it reduces the time and space restrictions on searching RSS-provided content, because it uses wired and wireless telephone networks rather than an internet environment. 4) For information providers, it requires no special component to syndicate the newest content using speech recognition and synthesis technology. We implemented a news service system using VXML and RSS to evaluate the performance of the proposed system. In the experiments, we evaluated the response time and the speech recognition rate for subscribing to and searching current content, and confirmed that the proposed system can provide content delivered through an RSS feed.
https://doi.org/10.3745/KIPSTB.2007.14-B.1.051

A Study on the Speech Recognition For the Voice Dialing System (Voice Dialing System을 위한 음성인식)

이성권
- Proceedings of the Acoustical Society of Korea Conference / 1998.06e / pp.365-368 / 1998
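This study concerns continuous speech recognition for a Voice Dialing System using phoneme-unit CHMMs (Continuous Hidden Markov Models). To place telephone calls by voice in a laboratory environment, recognition of nationwide region names and connected digits was performed. Using the ETRI 445 data, the initial models were built with ML (Maximum Likelihood) estimation, and maximum a posteriori estimation was used for adaptation. To perform dialing by voice, a context-free grammar was used so that dialing could be carried out with conversational sentences, albeit in a limited form. As a result, for five speakers, the system achieved a recognition rate of 96% on 4-digit connected strings and about 91% on 7-digit connected strings. When voice dialing was performed with full sentences, the recognition rate for the words and digits within the sentences was about 80%.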

Design and Implementation of Voice-based Interactive Service KIOSK (음성기반 대화형 서비스 키오스크 설계 및 구현)

Kim, Sang-woo;Choi, Dae-june;Song, Yun-Mi;Moon, Il-Young
- Journal of Practical Engineering Education / v.14 no.1 / pp.99-108 / 2022
As the demand for kiosks increases, more users report discomfort in using them. Accordingly, we build a kiosk that enables easy menu selection and ordering through a voice-based interactive service, and provide it in web form. The kiosk implements voice functions based on the Annyang API and the SpeechSynthesis API, and understands the user's intention through Dialogflow. We also discuss how to implement this process based on a REST API. In addition, a recommendation system based on collaborative filtering is applied to improve the low consumer accessibility of existing kiosks, and, to prevent droplet-borne infection during use of the voice recognition service, the kiosk checks whether the user is wearing a mask before the service is used.
https://doi.org/10.14702/JPEE.2022.099

Design for Access Control System based on Voice Recognition for Infectious Disease Prevention (전염성 확산 차단을 위한 음성인식 기반의 출입통제시스템 설계)

Mun, Hyung-Jin;Han, Kun-Hee
- Journal of the Korea Convergence Society / v.11 no.7 / pp.19-24 / 2020
The WHO declared COVID-19 a global pandemic on March 11. Nevertheless, there are situations in which people must enter buildings for face-to-face education or seminars as part of economic and social activities. Because the first check for COVID-19 infection is body temperature measurement, main entrances are blocked so that body temperature can be measured at close range. However, since checking each person directly is cumbersome, a thermal camera is installed at the building entrance, and body temperature is measured indirectly with the infrared camera to control access. Middle and high schools, universities, and lifelong education centers need a system that can interoperate with attendance checks, automatically recognize whether a mask is being worn, and authenticate students. We propose a system that confirms whether a mask is worn using a camera embedded in a smart mirror, authenticates a user who wants to enter the building through voice recognition, and determines whether to admit them. The proposed system can also check attendance if it is linked with near-field temperature measurement and the attendance check app on the student's smartphone.
https://doi.org/10.15207/JKCS.2020.11.7.019

Establishment of the Korean Standard Vocal Sound into Character Conversion Rule (한국어 음가를 한글 표기로 변환하는 표준규칙 제정)

이계영;임재걸
- Journal of the Institute of Electronics Engineers of Korea CI / v.41 no.2 / pp.51-64 / 2004
The purpose of this paper is to establish the Standard Korean Vocal Sound into Character Conversion Rule (Standard VSCC Rule) by applying in reverse the Korean Standard Pronunciation Rule, which regulates how written Hangeul sentences are read. The Standard VSCC Rule plays a crucial role in Korean speech recognition. The general method of speech recognition is to find, among the standard voice patterns, the one most similar to the input voice pattern. Each standard voice pattern is an average of several sample voice patterns. If the unit of the standard voice pattern is a word, the number of standard voice pattern entries will exceed a few million (taking inflection and postpositional particles into account). That many entries require a huge database and an impractically large number of comparisons to find the most similar pattern. Therefore, the unit of the standard voice pattern should be a syllable. In this case, we have to resolve the difference between Korean vocal sounds and the written characters. Converting a sequence of Korean vocal sounds into a sequence of characters requires our Standard VSCC Rule. Using our Standard VSCC Rule, we have implemented a system that converts Korean vocal sounds into Hangeul characters. The Korean Standard Pronunciation Rule consists of 30 items. To show the soundness and completeness of our Standard VSCC Rule, we have tested the conversion system with various data sets reflecting all 30 items. The test results are presented in this paper.

An Implementation of Speech DB Gathering System Using VoiceXML (VoiceXML을 이용한 음성 DB 수집 시스템 구현)

Kim Dong-Hyun;Roh Yong-Wan;Hong Kwang-Seok
- Journal of Internet Computing and Services / v.6 no.1 / pp.39-50 / 2005
A speech DB is a basic requirement for research in phonetics, speech recognition, speech synthesis, and related fields. The quantity and quality of the speech DB determine the efficiency of the system being developed; therefore, the speech DB is an extremely important factor. Recently, with the development of various telephone service technologies such as voice portals, the need to collect telephone speech DBs has grown. Existing IVR-based telephone speech DB collection systems were written in C/C++ or with proprietary development tools. As a result, it is difficult to reuse resources across application services, and development demands considerable labor and time. VoiceXML, by contrast, is a tag-based language built on XML, with an easy and simple grammar. It can therefore be written with little effort, which reduces labor and time. VoiceXML also has advantages for gathering varied telephone speech DBs, because the DB contents can be changed. In this paper, we introduce a telephone speech DB gathering system, which is the most important factor in developing speech information processing technology.

A Study of the Pattern Kernels for a Lip Print Recognition

Paik, Kyoung-Seok;Chung, Chin-Hyun
- 제어로봇시스템학회:학술대회논문집 / 1998.10a / pp.64-69 / 1998
This paper presents lip print recognition using pattern kernels for personal identification. Lip print recognition is less developed than recognition based on other physical attributes such as a fingerprint, a voice pattern, a retinal blood vessel pattern, or a face. A new method is proposed to recognize a lip print by the pattern kernels. The pattern kernels are a function composed of several local lip print pattern masks. This function converts the information in a lip print into digital data. Recognition in the multi-resolution system is more reliable than recognition in the single-resolution system. The results show that the proposed algorithm can be realized efficiently with the multi-resolution architecture.

Design and Implementation of an Emotion Recognition System using Physiological Signal (생체신호를 이용한 감정인지시스템의 설계 및 구현)

O, Ji-Soo;Kang, Jeong-Jin;Lim, Myung-Jae;Lee, Ki-Young
- The Journal of the Institute of Internet, Broadcasting and Communication / v.10 no.1 / pp.57-62 / 2010
Recently, communication technologies based on sight, sound, and touch have been developed in the mobile market. However, human beings use all five senses (visual, auditory, gustatory, olfactory, and tactile) to communicate. Therefore, this paper presents a technology that enables individuals to become aware of other people's emotions through a machine. The machine perceives the tone of voice, body temperature, pulse, and other biometric signals to recognize the emotion the sending individual is experiencing. Once the emotion is recognized, a scent is emitted to the receiving individual. A system that coordinates the emission of scent according to emotional changes is proposed.

Search Result 334, Processing Time 0.025 seconds
