Search | Korea Science

A Comparative Study of Voice Activity Detection Algorithms in Adverse Environments (잡음 환경에서의 음성 검출 알고리즘 비교 연구)

Yang Kyong-Chul;Yook Dong-Suk
- Proceedings of the KSPS conference / 2006.05a / pp.45-48 / 2006
As speech recognition systems are used in many emerging applications, robust performance under extremely noisy conditions becomes more important. Voice activity detection (VAD) has been regarded as one of the important factors for robust speech recognition. In this paper, we investigate conventional VAD algorithms and analyze the strengths and weaknesses of each algorithm.

An Interactive Voice Web Browser Usable as a Multimodal Interface in Information Devices by Using VoiceXML

Jang, Min-Seok
- Journal of the Korean Institute of Intelligent Systems / v.14 no.6 / pp.771-775 / 2004
The present Web environment is mostly composed of HTML (Hypertext Markup Language), so users obtain web information mainly in a GUI (Graphical User Interface) environment, clicking the mouse to follow hyperlinked information. However, working in this environment is very inconvenient compared with an easily accessed one in which the human voice is used to obtain information. To address this inconvenience, VoiceXML, which is derived from XML, can supply information over the telephone on the basis of today's mature voice recognition and synthesis technology. This paper presents research results on a VoiceXML VUI (Voice User Interface) browser designed and implemented to realize this technology, and on the VoiceXML dialog designed for the browser's efficient use.
https://doi.org/10.5391/JKIIS.2004.14.6.771

Implementation of speech interface for windows 95 (Windows95 환경에서의 음성 인터페이스 구현)

한영원;배건성
- Journal of the Korean Institute of Telematics and Electronics S / v.34S no.5 / pp.86-93 / 1997
With the recent development of speech recognition technology and multimedia computer systems, more potential applications of voice will become a reality. In this paper, we implement a speech interface in the Windows 95 environment for practical use of multimedia computers with voice. The speech interface is made up of three modules: a speech input and detection module, a speech recognition module, and an application module. The speech input and detection module uses the low-level audio services of the Win32 API to input speech data in real time. The recognition module processes the incoming speech data and then recognizes the spoken command. The dynamic time warping (DTW) pattern matching method is used for speech recognition. The application module executes the voice command on the PC. Each module of the speech interface is designed and examined in the Windows 95 environment. The implemented speech interface and experimental results are explained and discussed.

The University Guidance System using the Alexa (Alexa를 이용한 대학안내 시스템)

Kim, Tae Jin;Kim, Dong Hyun
- Journal of the Korea Institute of Information and Communication Engineering / v.21 no.11 / pp.2061-2066 / 2017
Voice recognition technology recognizes a user's voice and executes the command. Recently, voice recognition has been evolving into artificial intelligence (AI) voice recognition by adding natural language processing. AI voice recognition is used to control IoT devices or to provide information such as the news or the weather. University information, one of the fields served by information providers, is mainly presented on the web. However, since too much information is presented on the web, it is difficult for a user to efficiently find the specific information the user wants. In this paper, we design and implement a university guidance system that recognizes a user's voice query and provides the result by voice. To do this, we classify the university data and design a lambda function to provide the data.
https://doi.org/10.6109/jkiice.2017.21.11.2061

Development of a Voice User Interface for Web Browser using VoiceXML (VoiceXML을 이용한 VUI 지원 웹브라우저 개발)

Yea SangHoo;Jang MinSeok
- Journal of KIISE: Computing Practices and Letters / v.11 no.2 / pp.101-111 / 2005
Present web information is mainly described in HTML, which users access through input devices such as the mouse and keyboard. Thus the existing GUI environment has not supported the most natural human means of acquiring information, that is, voice. To solve this problem, several vendors are developing voice user interfaces. However, these products are deficient in man-machine interactivity and in their accommodation of the existing web environment. This paper presents a web browser supporting a VUI (Voice User Interface), built on increasingly mature speech recognition technology and VoiceXML, a markup language derived from XML. It provides users with both interfaces, VUI as well as GUI. In addition, XML Island technology is applied to the browser so that VoiceXML fragments are nested in HTML documents to accommodate the existing web environment. For better interactivity, dialogue scenarios for menu, bulletin, and search engine are also suggested.

An AI Technology-based Intelligent Senior Assistant Voice Recognition System (AI 기술 기반 지능형 시니어 도우미 음성인식 시스템)

Hong, Phil-Doo
- Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2019.05a / pp.355-357 / 2019
Now that we are entering an aging society, the user interfaces of new devices and IoT technology are very inconvenient for the senior generation. To improve this, we propose an AI technology-based intelligent senior assistant voice recognition system. This system implements a cloud platform-based API to accumulate data for machine learning processing, provides content for the diagnosis and prevention of dementia, and provides chatbot content for the senior generation. We hope that our system will increase the accessibility and convenience of IoT devices and new technology devices for the senior generation.

A study on the humanistic measure about cultural changes of voice recognition technology (음성인식기술의 문화변동에 대한 인문학적 대응에 관한 연구)

Yuk, Hyun-Seung;Cho, Byung-Chul
- Journal of Digital Convergence / v.13 no.8 / pp.21-31 / 2015
Recently, advancements in voice recognition technology have led to a new era of oral culture. Text based on new oral cultures can bring about a cultural revolution. This research is rooted in a humanistic approach that encompasses both oral and text. Its goal is to identify humanistic measures for these cultural issues, such as a complementary relationship between oral and text in the future. First, we discuss the aspects that have resulted in the change from a text culture to an oral culture. After examining these changes with regard to voice recognition technology, we discuss the possibilities and problems of this cultural change. We discuss expected outcomes, such as the complementarity of speaking and writing, the expansion from private culture to public culture, and the possibility of simultaneous concurrency. We also discuss necessities such as a new semiotic approach to the voice and preparation for the expansion of the world of life. Specifically, we explore the necessity for the advancement and control of Korean culture against the dominance of a global corporation. This study undertakes basic research on the possibilities of the new voice recognition technology and the accompanying cultural changes; the results are expected to be put to effective use and to lead to more detailed research.
https://doi.org/10.14400/JDC.2015.13.8.21

The Interactive Voice Services based on VoiceXML (VoiceXML 기반 음성인식시스템을 이용한 서비스 개발)

Kim Hak-Gyoon;Kim Eun-Hyang;Kim Jae-In;Koo Myoung-Wan
- MALSORI / no.43 / pp.113-125 / 2002
As there is a need to search Web information via wired or wireless telephones, the VoiceXML Forum was established to develop and promote the Voice eXtensible Markup Language (VoiceXML). VoiceXML simplifies the creation of personalized interactive voice response services on the Web, and allows voice and phone access to information on Web sites and in call center databases. It can also utilize Web-based technologies such as CGI (Common Gateway Interface) scripts. In this paper, we develop a voice portal service platform based on VoiceXML, called TeleGateway. It enables integration of voice services with data services using Automatic Speech Recognition (ASR) and Text-To-Speech (TTS) engines. We also demonstrate various services on the voice portal.

An Ultrasonic Wave Encoder and Decoder for Indoor Positioning of Mobile Marketing System

Kim, Young-Mo;Jang, Se-Young;Park, Byeong-Chan;Bang, Kyung-Sik;Kim, Seok-Yoon
- Journal of the Korea Society of Computer and Information / v.24 no.7 / pp.93-100 / 2019
In this paper, we propose an intelligent marketing service system that can provide custom advertisements and events to both businesses and customers by identifying location and content using ultrasonic signals and feature information in voice signals. We also develop the encoding and decoding algorithms for the ultrasonic signals used in this system and analyze the performance evaluation results. With the development of the hyper-connected society, online marketing has become active and is growing in size. Existing store marketing applications have the disadvantage that customers must use the corresponding application to find events or promotional materials offered by the headquarters or stores whenever they visit. To solve these problems, there are attempts to create intelligent marketing tools using GPS technology and voice recognition technology. However, this approach faces difficulties in technology development because of issues with location accuracy and with the speed of comparison and retrieval in voice recognition technology, and the marketing services for customer relations are also greatly simplified.
https://doi.org/10.9708/jksci.2019.24.07.093

A Voice-Activated Dialing System with Distributed Speech Recognition in WiFi Environments (무선랜 환경에서의 분산 음성 인식을 이용한 음성 다이얼링 시스템)

Park Sung-Joon;Koo Myoung-Wan
- MALSORI / no.56 / pp.135-145 / 2005
In this paper, a WiFi phone system with distributed speech recognition is implemented. The WiFi phone with voice-activated dialing and its functions are explained. Features of the input speech are extracted and sent to the interactive voice response (IVR) server using the Real-time Transport Protocol (RTP). Feature extraction is based on the European Telecommunications Standards Institute (ETSI) standard front-end, but is modified to reduce the processing time. The time for front-end processing on a WiFi phone is compared with that on a PC.

