Search | Korea Science

Singing Voice Synthesis Using HMM Based TTS and MusicXML (HMM 기반 TTS와 MusicXML을 이용한 노래음 합성)

Khan, Najeeb Ullah;Lee, Jung-Chul
- Journal of the Korea Society of Computer and Information / v.20 no.5 / pp.53-63 / 2015
Singing voice synthesis is the generation of a song by a computer from its lyrics and musical notes. Hidden Markov models (HMMs) have proven to be the models of choice for text-to-speech synthesis. HMMs have also been used in singing voice synthesis research; however, training HMMs for singing voice synthesis requires a huge database. In addition, commercially available singing voice synthesis systems use piano roll notation and would need to adopt easy-to-read standard music notation to be suitable for singing-learning applications. To overcome these problems, we use a speech database to train context-dependent HMMs for singing voice synthesis. Pitch and duration control methods are devised to modify the parameters of the HMMs trained on speech so that they can serve as synthesis units for the singing voice. This work describes a singing voice synthesis system that uses a MusicXML-based music score editor as the front-end interface for entering the notes and lyrics to be synthesized, and an HMM-based text-to-speech synthesis system as the back-end synthesizer. A perceptual test shows the feasibility of the proposed system.
https://doi.org/10.9708/jksci.2015.20.5.053

VoiceXML

정석영;강선미;정태의
- Korea Information Processing Society Review / v.8 no.3 / pp.17-26 / 2001

Telecommunication Services Based On Spoken Language Information Technology - In view of services provided by KT - (음성정보기술을 이용한 통신서비스 - KT 서비스를 중심으로 -)

Koo, Myoung-Wan;Kim, Jae-In;Jeong, Yeong-Jun;Kim, Mun-Sik;Kim, Won-U;Kim, Hak-Hun;Park, Seong-Jun;Ryu, Chang-Seon;Kim, Hui-Gyeong
- Proceedings of the KSPS conference / 2004.05a / pp.125-130 / 2004
In this paper, we explain telecommunication services based on spoken language information technology. There are three kinds of services. The first is based on the Advanced Intelligent Network (AIN), for which we built an Intelligent Peripheral (IP) with speech recognition, speech synthesis, and a VoiceXML interpreter. The second is based on KT-HUVOIS, a proprietary speech platform based on VoiceXML. The third is based on a VoiceXML interpreter. We explain in detail the various services built on these platforms.

HomeN manager system based on multimodal context-aware middleware (멀티모달 상황인지 미들웨어 기반의 홈앤(HomeN) 매니저 시스템)

Ahn, Se-Yeol;Park, Sung-Chan;Park, Seong-Soo;Koo, Myung-Wan;Jeong, Yeong-Joon;Kim, Myung-Sook
- Proceedings of the KSPS conference / 2006.11a / pp.120-123 / 2006
Personalized user interfaces for mobile devices are expected to serve different devices with a wide variety of capabilities and interaction modalities. In this paper, we implemented a multimodal context-aware middleware incorporating XML-based languages such as XHTML and VoiceXML. SCXML uses parallel states to invoke both XHTML and VoiceXML content, to gather composite multimodal inputs, and to synchronize modalities through man-machine I/O. We developed a home networking service named "HomeN" based on our middleware framework. It demonstrates that users can maintain multimodal scenarios in a clear, concise, and consistent manner across various user interactions.

A Study On the ASP Module in Conversational Automatic Speech Recognition Flight Information System (대화형 음성 인식 항공정보 시스템에서의 ASP 모듈에 관한 연구)

윤재석;장준식
- Journal of the Korea Institute of Information and Communication Engineering / v.6 no.4 / pp.595-603 / 2002
This research shows how a computer can recognize and understand spoken natural language, and how that language can be symbolized using VoiceXML and Grammar Specific Language, in the development of a telephone-based conversational automatic speech recognition flight information system. So that users hear correct information, the ASP module was revised, and its effectiveness was tested on the voice portal flight information system platform.

Home Appliance Control through Speech Recognition User Interface (음성 인식 사용자 인터페이스를 통한 가전기기 제어 기법)

Song, Wook;Jang, Hyun-Su;Eom, Young-Ik
- Proceedings of the Korea Information Processing Society Conference / 2006.11a / pp.265-268 / 2006
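As ubiquitous computing environments expand, there is a growing need to move beyond the conventional approach of relying mainly on the keyboard and mouse as the user interface and to adopt more user-centered multimodal user interfaces. XHTML+Voice is a new service paradigm that can provide both voice and visual output. Unlike existing systems that provide only voice information or only visual information, it embeds VoiceXML within XHTML and can therefore exploit the strengths of both languages. Building on these strengths of VoiceXML, this paper proposes a scenario in which the interfaces of the various home appliances that make up a smart home are prepared in advance as templates and controlled through a mobile device, and experiments with a method for implementing it.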

Stocks information Implementation System based on the SAPI at CTI module (SAPI 기반 CTI 모듈을 이용한 주식정보 시스템 구현)

오세일;고진한;박원배
- Proceedings of the Korean Information Science Society Conference / 2001.04a / pp.439-441 / 2001
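A voice portal service is a service in which a user requests information by voice over the telephone and receives the desired information by voice. In the system implemented in this paper, when the user asks for stock information by voice, a VoiceXML server looks up the requested stock and reports the result back by voice. The system requires an SMS (Short Message Service) server module that performs authentication, a CTI (Computer Telephony Integration) module that provides the interface between the PSTN and the database server, a VoiceXML module between the CTI server and the WWW (World Wide Web), and a searching module for retrieving information. We designed and implemented the CTI module based on speech recognition technology. In addition, by adopting SMS authentication based on random one-time passwords, we aimed to provide a more stable service.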

A VoiceXML Learning Assessment System Based on Wired and Wireless Telephone Voice (유무선 전화 음성 기반 VoiceXML 학습 평가 시스템)

이인숙;홍기형
- Korea Multimedia Society / v.5 no.4 / pp.59-70 / 2001

A Study of Speech Recognition Web Services Environment for Voice Browser (Voice Browser를 위한 음성 인식 웹서비스 환경에 관한 연구)

Hong, In-Suk;Kim, Yoon-Joong
- Proceedings of the Korea Information Processing Society Conference / 2009.04a / pp.142-145 / 2009
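Standardization of voice interfaces separates the voice dialog, speech recognition/synthesis, and access networks such as the telephone network from one another. This guarantees that each component of a voice information system can be developed independently and allows such systems to be built without an understanding of every component, which has contributed greatly to the spread of voice information technology. The W3C is currently standardizing the Voice Browser, and its Voice Browser WG has proposed the SIF (Speech Interface Framework) for Voice Browsers. In the proposed SIF, running speech recognition in the Voice Browser can consume considerable resources and impose a heavy load. To solve this problem, this paper proposes a new form of SIF that adds a speech recognition web service to the existing SIF. We built a speech recognition web service environment in which recognition is performed on a remote system and the result is made available to the Voice Browser. We also implemented a converter that translates grammars in the XML-SRGS format into grammars in the EBNF format used by the speech recognizer.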
https://doi.org/10.3745/PKIPS.y2009m04a.142
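
The grammar conversion mentioned in the abstract above can be illustrated with a minimal sketch. It is not the authors' converter: the abstract does not give the recognizer's EBNF dialect, so the output notation here (`<name> ::= ... ;`, `|` for alternatives, `[...]` for optional parts) is an assumption, and only a small subset of SRGS XML (`rule`, `one-of`, `item` with `repeat="0-1"`, `ruleref`) is handled. The function name `srgs_to_ebnf` and the sample grammar are hypothetical.

```python
import xml.etree.ElementTree as ET


def _local(tag):
    """Strip the XML namespace from an element tag."""
    return tag.split("}", 1)[-1]


def _quote(text):
    """Turn free text into a sequence of quoted terminal tokens."""
    return " ".join(f'"{word}"' for word in text.split())


def _expand(elem):
    """Build an EBNF-like expression from the content of one SRGS element."""
    parts = []
    if elem.text and elem.text.strip():
        parts.append(_quote(elem.text))
    for child in elem:
        name = _local(child.tag)
        if name == "one-of":
            # Each <item> inside <one-of> is one alternative.
            alts = [_expand(i) for i in child if _local(i.tag) == "item"]
            parts.append("(" + " | ".join(alts) + ")")
        elif name == "item":
            inner = _expand(child)
            if child.get("repeat") == "0-1":  # optional item
                inner = "[" + inner + "]"
            parts.append(inner)
        elif name == "ruleref":
            # Local rule references look like uri="#rulename".
            parts.append("<" + child.get("uri", "").lstrip("#") + ">")
        if child.tail and child.tail.strip():
            parts.append(_quote(child.tail))
    return " ".join(parts)


def srgs_to_ebnf(xml_text):
    """Convert a small SRGS XML grammar to EBNF-like text, one rule per line."""
    root = ET.fromstring(xml_text)
    lines = []
    for rule in root:
        if _local(rule.tag) == "rule":
            lines.append(f"<{rule.get('id')}> ::= {_expand(rule)} ;")
    return "\n".join(lines)


# Hypothetical sample grammar, used only to exercise the sketch.
EXAMPLE = """<grammar xmlns="http://www.w3.org/2001/06/grammar" version="1.0" root="command">
  <rule id="command">
    <item repeat="0-1">please</item>
    <ruleref uri="#action"/> the <ruleref uri="#device"/>
  </rule>
  <rule id="action"><one-of><item>turn on</item><item>turn off</item></one-of></rule>
  <rule id="device"><one-of><item>light</item><item>fan</item></one-of></rule>
</grammar>"""

if __name__ == "__main__":
    print(srgs_to_ebnf(EXAMPLE))
```

A real converter would also need to cover weights, other repeat ranges, tags, and external rule references, which this sketch ignores.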

Design and Implementation of CTI System for Hearing-Impaired People in Mobile Environment (모바일 환경에서 청각장애인을 위한 CTI시스템 설계 및 구현)

Yang, Seung-Su;Park, Seok-Cheon
- The Journal of the Institute of Internet, Broadcasting and Communication / v.13 no.6 / pp.47-54 / 2013
In this paper, we analyze the technical elements of CTI systems and identify the requirements of an IP-based CTI system in order to design the proposed system. On this basis, we design a mobile-phone-based CTI system that makes CTI call center services available to hearing-impaired people. To implement the designed system, we implement in Java the VoiceXML scenario data analysis modules that the server provides, and we integrate the implemented modules into a mobile-phone-based CTI system for hearing-impaired people. Finally, we create test scenarios that use the mobile CTI system and run repeated tests of each function based on them. The results confirm that the time needed to obtain information was reduced by about 20 seconds on average compared with the conventional voice-based system.
https://doi.org/10.7236/JIIBC.2013.13.6.47

Search Result 101, Processing Time 0.026 seconds
