• Title/Summary/Keyword: Speech-to-text services

A Study on the Intelligent Personal Assistant Development Method Base on the Open Source (오픈소스기반의 지능형 개인 도움시스템(IPA) 개발방법 연구)

  • Kim, Kil-hyun;Kim, Young-kil
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference
    • /
    • 2016.10a
    • /
    • pp.89-92
    • /
    • 2016
  • Recent assistants such as Siri offer services that recognize and respond to spoken words on smartphones or through web services. To handle such voice input intelligently, a system needs to search big data in the cloud and to parse the given context accurately. This paper proposes a development method for an intelligent personal assistant based on open source, combining ASR (Automatic Speech Recognition), QAS (Question Answering System), and TTS (Text To Speech).

Implementation of Music Broadcasting Service System in the Shopping Center Using Text-To-Speech Technology (TTS를 이용한 매장 음악 방송 서비스 시스템 구현)

  • Chang, Moon-Soo;Kang, Sun-Mee
    • Speech Sciences
    • /
    • v.14 no.4
    • /
    • pp.169-178
    • /
    • 2007
  • This thesis describes the development of a service system for small shops that supports not only music broadcasting but also the editing and generation of voice announcements using TTS (Text-To-Speech) technology. The system is web-based so that it can be accessed easily whenever and wherever it is needed. It controls sound through a Silverlight media player built on ASP.NET 2.0, without any additional application software. Use of the Ajax control allows for multiple users to get the maximum load when needed. TTS runs on the server side, so the service does not depend on the user's computer. Given the convenience and usefulness of the system, the business sector can provide better service to many shops. Additional functions such as statistical analysis should further help shop management provide desirable services.

A study on the Institutionalization of Speech-to-text Services for the Deaf People (난청인을 위한 문자통역서비스 제도화 연구)

  • Chun, Dong-Il;Seo, Jeong-Min
    • Journal of Digital Convergence
    • /
    • v.15 no.4
    • /
    • pp.53-63
    • /
    • 2017
  • The purpose of this study is to look at the way that speech-to-text (STT) services are used at present, and to explore measures to institutionalize such services for ease of communication for the hearing impaired. The results of this study show the following: 1) 17.8% of those surveyed had experience of using STT services, with younger individuals showing a higher rate of use; and 2) In terms of organizations providing STT services, social welfare organizations ranked first, followed by civic groups (18.3%) and public organizations (18.3%). The following institutional measures are needed for STT services. First, STT services should be actively promoted as one of the reasonable conveniences defined in the 'Act on the Prohibition of Discrimination Against Disabled Persons, Remedy Against Infringement of Their Rights, etc.' Second, STT services should be additionally listed as one of the clauses of the 'Act on Welfare of Persons with Disabilities'. In particular, establishing a communication system for those with hearing impairments should serve as a catalyst for integration with sign language interpretation and welfare services. If STT services for face-to-face contacts can be improved or further enhanced using ICT, it will not only open the way for a new influx of disabled workers to join vocational rehabilitation, but also help to improve quality of life for the hearing impaired.

The Interactive Voice Services based on VoiceXML (VoiceXML 기반 음성인식시스템을 이용한 서비스 개발)

  • Kim Hak-Gyoon;Kim Eun-Hyang;Kim Jae-In;Koo Myoung-Wan
    • MALSORI
    • /
    • no.43
    • /
    • pp.113-125
    • /
    • 2002
  • To meet the need to search Web information over wired or wireless telephones, the VoiceXML Forum was established to develop and promote the Voice eXtensible Markup Language (VoiceXML). VoiceXML simplifies the creation of personalized interactive voice response services on the Web, and allows voice and phone access to information on Web sites and in call center databases. It can also make use of Web-based technologies such as CGI (Common Gateway Interface) scripts. In this paper, we develop a VoiceXML-based voice portal service platform called TeleGateway. It integrates voice services with data services using Automatic Speech Recognition (ASR) and Text-To-Speech (TTS) engines. We also demonstrate various services built on the voice portal.

Corpus-based evaluation of French text normalization (코퍼스 기반 프랑스어 텍스트 정규화 평가)

  • Kim, Sunhee
    • Phonetics and Speech Sciences
    • /
    • v.10 no.3
    • /
    • pp.31-39
    • /
    • 2018
  • This paper aims to present a taxonomy of non-standard words (NSW) for developing a French text normalization system and to propose a method for evaluating this system based on a corpus. The proposed taxonomy of French NSWs consists of 13 categories, including 2 types of letter-based categories and 9 types of number-based categories. In order to evaluate the text normalization system, a representative test set including NSWs from various text domains, such as news, literature, non-fiction, social-networking services (SNSs), and transcriptions, is constructed, and an evaluation equation is proposed reflecting the distribution of the NSW categories of the target domain to which the system is applied. The error rate of the test set is 1.64%, while the error rate of the whole corpus is 2.08%, reflecting the NSW distribution in the corpus. The results show that the literature and SNS domains are assessed as having higher error rates compared to the test set.
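The abstract does not reproduce the evaluation equation itself, so the following is only a minimal sketch of one plausible form: per-category error rates weighted by each category's share of NSWs in the target domain. The function name, the category labels, and all figures are hypothetical and are not taken from the paper.

```python
def weighted_error_rate(category_error, category_share):
    """Combine per-category error rates using the target domain's NSW distribution.

    category_error: mapping category -> error rate in [0, 1]
    category_share: mapping category -> share of NSWs in the target domain (must sum to 1)
    """
    total_share = sum(category_share.values())
    if abs(total_share - 1.0) > 1e-9:
        raise ValueError("category shares must sum to 1")
    # Each category contributes its error rate in proportion to how often it occurs.
    return sum(share * category_error[cat] for cat, share in category_share.items())


# Hypothetical figures for illustration only.
errors = {"letter": 0.01, "number": 0.03}
shares = {"letter": 0.5, "number": 0.5}
rate = weighted_error_rate(errors, shares)
```

Under this assumed form, the same per-category error rates yield different overall rates for domains with different NSW distributions, which would be consistent with the test set and the whole corpus reporting different figures.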

Improvement of Shop Music Broadcasting Services Using Music Lists and User Experience (방송목록과 사용자 경험 정보를 이용한 매장 음원 방송 서비스의 개선)

  • Kang, Sun-Mee;Kim, Hyun-Deuc;Chang, Moon-Soo
    • Speech Sciences
    • /
    • v.15 no.4
    • /
    • pp.121-130
    • /
    • 2008
  • This paper proposes improvements to, and a system design for, shop music broadcasting services provided over the Internet. Comparing shop music broadcasting services with personal music broadcasting services, we propose a form of shop music broadcasting that customers prefer: the user can adjust the broadcast music lists supplied by a specialist according to the current circumstances of the shop. The paper presents a complete system that makes such a service possible and verifies its efficiency through experiments.

Constructing Ontology based on Korean Parts of Speech and Applying to Vehicle Services (한국어 품사 기반 온톨로지 구축 방법 및 차량 서비스 적용 방안)

  • Cha, Si-Ho;Ryu, Minwoo
    • Journal of Korea Society of Digital Industry and Information Management
    • /
    • v.17 no.4
    • /
    • pp.103-108
    • /
    • 2021
  • Knowledge graph is a technology that improves search results by using semantic information based on various resources. Therefore, due to these advantages, the knowledge graph is being defined as one of the core research technologies to provide AI-based services recently. However, in the case of the knowledge graph, since the form of knowledge collected from various service domains is defined as plain text, it is very important to be able to analyze the text and understand its meaning. Recently, various lexical dictionaries have been proposed together with the knowledge graph, but since most lexical dictionaries are defined in a language other than Korean, there is a problem in that the corresponding language dictionary cannot be used when providing a Korean knowledge service. To solve this problem, this paper proposes an ontology based on the parts of speech of Korean. The proposed ontology uses 9 parts of speech in Korean to enable the interpretation of words and their semantic meaning through a semantic connection between word class and word class. We also studied various scenarios to apply the proposed ontology to vehicle services.

Implementation of Interface to Support Mobile Accessibility Using Speech I/O APIs (음성 입출력 API를 이용한 모바일 접근성 지원 인터페이스 구현)

  • Oh, Seungchur;Yun, Young-Sun
    • KIPS Transactions on Software and Data Engineering
    • /
    • v.2 no.1
    • /
    • pp.71-80
    • /
    • 2013
  • Due to the increased use of mobile devices, there is a lot of discussion on mobile accessibility. Mobile accessibility means that everyone, including people with disabilities and the elderly, can easily use the functions of mobile devices. In this paper, we present and implement a mobile interface that uses speech I/O APIs to improve accessibility. The proposed interfaces are implemented on the Android platform and use the speech recognition and text-to-speech APIs provided as built-in services. In addition, to facilitate internet access for visually impaired or blind people, we also implement a web browsing application (web reader).

Speech Interactive Agent on Car Navigation System Using Embedded ASR/DSR/TTS

  • Lee, Heung-Kyu;Kwon, Oh-Il;Ko, Han-Seok
    • Speech Sciences
    • /
    • v.11 no.2
    • /
    • pp.181-192
    • /
    • 2004
  • This paper presents an efficient speech interactive agent rendering smooth car navigation and Telematics services, by employing embedded automatic speech recognition (ASR), distributed speech recognition (DSR) and text-to-speech (TTS) modules, all while enabling safe driving. A speech interactive agent is essentially a conversational tool providing command and control functions to drivers, such as enabling navigation tasks, audio/video manipulation, and E-commerce services through natural voice/response interactions between user and interface. While the benefits of automatic speech recognition and speech synthesis have become well known, the hardware resources involved are often limited and internal communication protocols are too complex to achieve real-time responses. As a result, performance degradation always exists in the embedded H/W system. To implement a speech interactive agent that accommodates the demands of user commands in real time, we propose to optimize the hardware-dependent architectural code for speed-up. In particular, we propose a composite solution through memory reconfiguration and efficient arithmetic operation conversion, as well as an effective out-of-vocabulary rejection algorithm, all made suitable for system operation under limited resources.

Design of a Mirror for Fragrance Recommendation based on Personal Emotion Analysis (개인의 감성 분석 기반 향 추천 미러 설계)

  • Hyeonji Kim;Yoosoo Oh
    • Journal of Korea Society of Industrial Information Systems
    • /
    • v.28 no.4
    • /
    • pp.11-19
    • /
    • 2023
  • The paper proposes a smart mirror system that recommends fragrances based on user emotion analysis. It combines natural language processing techniques, namely text vectorization (CountVectorizer and TF-IDF), with machine learning classification models (DecisionTree, SVM, RandomForest, SGD Classifier) to build candidate models and compares the results. Based on this comparison, the paper constructs a personal emotion-based fragrance recommendation mirror model around the best-performing emotion classifier, an SVM combined with a word embedding pipeline. The proposed system implements a personalized fragrance recommendation mirror based on emotion analysis and provides web services using the Flask web framework. It uses the Google Speech Cloud API to recognize users' voices and speech-to-text (STT) to convert their speech into text data. The proposed system also provides users with information about weather, humidity, location, quotes, time, and schedule management.
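
The two vectorization schemes named in the abstract can be illustrated with a minimal standard-library sketch. It covers only the feature side (raw token counts versus TF-IDF weights), not the classifiers, and it is not the authors' implementation. The function names and sample utterances are hypothetical, and the weighting uses the unsmoothed textbook form idf = ln(N / df), which library implementations may vary.

```python
import math
from collections import Counter


def count_vectors(docs):
    """Bag-of-words counts: one Counter of token frequencies per document."""
    return [Counter(doc.lower().split()) for doc in docs]


def tfidf_vectors(docs):
    """Weight each raw count by idf = ln(N / df), the unsmoothed textbook form."""
    counts = count_vectors(docs)
    n_docs = len(docs)
    # Iterating a Counter yields each distinct token once, so this counts documents, not occurrences.
    doc_freq = Counter(token for c in counts for token in c)
    return [
        {token: tf * math.log(n_docs / doc_freq[token]) for token, tf in c.items()}
        for c in counts
    ]


# Hypothetical utterances, as they might look after speech-to-text conversion.
docs = ["i feel happy today", "i feel tired today", "happy happy day"]
counts = count_vectors(docs)
weights = tfidf_vectors(docs)
```

Tokens that occur in most documents (here "i" or "today") receive lower TF-IDF weights than tokens concentrated in a single document, which is the usual motivation for preferring TF-IDF over raw counts as classifier input.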