Integrated Search | Korea Science

장문수;정경채;강선미
- 음성과학 (Speech Sciences), Vol. 13, No. 4, pp. 187-200, 2006
Speech synthesis technology is widely used, and its applications are expanding to areas such as automatic response services and learning systems for people with disabilities. However, the sound quality of speech synthesizers has not yet reached a level that satisfies users. Existing synthesizers generate rhythm only from spacing information, such as spaces and commas, or from a few punctuation marks, such as question marks and exclamation marks, so they struggle to produce natural human rhythm even when built on a large speech database. One way to address this problem is to select rhythm after processing the language using higher-level information. This paper proposes a method for generating rhythm-control tags by analyzing the meaning of a sentence together with speech situation information. We use Systemic Functional Grammar (SFG) [4], which analyzes the meaning of a sentence using speech situation information such as the preceding sentence, the situation of the conversation, and the relationships among the participants. In this study, we generate Semantic Speech Control Tags (SSCT) from the results of SFG's meaning analysis and speech waveform analysis.

Lee, Heung-Kyu;Kwon, Oh-Il;Ko, Han-Seok
- 음성과학 (Speech Sciences), Vol. 11, No. 2, pp. 181-192, 2004
This paper presents an efficient speech interactive agent that provides smooth car navigation and Telematics services by employing embedded automatic speech recognition (ASR), distributed speech recognition (DSR), and text-to-speech (TTS) modules, all while enabling safe driving. A speech interactive agent is essentially a conversational tool that provides command and control functions to drivers, such as navigation tasks, audio/video manipulation, and e-commerce services, through natural voice/response interactions between the user and the interface. While the benefits of automatic speech recognition and speech synthesis are well known, the hardware resources involved are often limited and the internal communication protocols are too complex to achieve real-time responses. As a result, performance degradation always exists in the embedded H/W system. To implement a speech interactive agent that meets the demands of user commands in real time, we propose optimizing the hardware-dependent architectural code for speed. In particular, we propose a composite solution that combines memory reconfiguration and efficient arithmetic operation conversion with an effective out-of-vocabulary rejection algorithm, all suited to system operation under limited resources.

최예린
- 한국콘텐츠학회논문지 (The Journal of the Korea Contents Association), Vol. 10, No. 6, pp. 373-377, 2010
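For English, vowel formant analysis has long been carried out extensively, both qualitatively and quantitatively. Korean vowels, however, have not yet been adequately analyzed from an acoustic-phonetic standpoint. As part of an effort to secure sufficient quantitative acoustic-phonetic data on Korean vowels, this study aimed to obtain quantitative acoustic data on Korean vowels from normal male speakers in their 20s and 30s. A total of 31 male speakers of standard Korean in their 20s and 30s produced the five basic vowels /아, 에(애), 이, 오, 우/ three times each; the productions were recorded in Cool edit, and F1, F2, F3, and F4 of each vowel were obtained with a MATLAB acoustic analysis program. For both F1 and F2, the vowel formants in this study tended to be lower overall than those reported in previous studies, but the overall pattern was very similar. Further research on Korean vowel data by age group and speech material is considered necessary.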
https://doi.org/10.5392/JKCA.2010.10.6.373
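
The abstract above does not state how the formants were computed in MATLAB. As a rough illustration only, one common approach is linear predictive coding (LPC): fit an all-pole model to the signal and read formant frequencies off the angles of the model's poles. The sketch below assumes that approach. The function names (`lpc_coefficients`, `estimate_formants`, `synth_resonances`) and the 700 Hz / 1200 Hz test resonances are hypothetical and are not taken from the paper.

```python
import numpy as np

def lpc_coefficients(signal, order):
    """Autocorrelation-method LPC via the Levinson-Durbin recursion.

    Returns [1, a1, ..., ap] for the predictor polynomial
    A(z) = 1 + a1*z^-1 + ... + ap*z^-p.
    """
    n = len(signal)
    # Autocorrelation at lags 0..order.
    r = np.correlate(signal, signal, mode="full")[n - 1 : n + order]
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + np.dot(a[1:i], r[i - 1 : 0 : -1])
        k = -acc / err                      # reflection coefficient
        prev = a.copy()
        for j in range(1, i):
            a[j] = prev[j] + k * prev[i - j]
        a[i] = k
        err *= 1.0 - k * k
    return a

def estimate_formants(signal, fs, order):
    """Return resonance frequencies (Hz), sorted ascending, from LPC poles.

    Assumption: real recordings would normally be pre-emphasized and
    windowed frame by frame first; that step is omitted here.
    """
    a = lpc_coefficients(np.asarray(signal, dtype=float), order)
    poles = np.roots(a)
    poles = poles[np.imag(poles) > 0]       # one pole per conjugate pair
    freqs = np.angle(poles) * fs / (2.0 * np.pi)
    return sorted(freqs.tolist())

def synth_resonances(freqs_hz, fs, bandwidth_hz=100.0, n=2000):
    """Impulse response of an all-pole filter with the given resonances
    (hypothetical test signal, not speech data)."""
    radius = np.exp(-np.pi * bandwidth_hz / fs)
    poles = []
    for f in freqs_hz:
        p = radius * np.exp(2j * np.pi * f / fs)
        poles.extend([p, np.conj(p)])
    a = np.real(np.poly(poles))
    h = np.zeros(n)
    for i in range(n):
        x = 1.0 if i == 0 else 0.0
        past = sum(a[k] * h[i - k] for k in range(1, len(a)) if i - k >= 0)
        h[i] = x - past
    return h

if __name__ == "__main__":
    fs = 8000
    test_signal = synth_resonances([700.0, 1200.0], fs)
    print(estimate_formants(test_signal, fs, order=4))
```

With a model order of twice the number of expected resonances, the estimated frequencies on this synthetic signal should land close to the values used to build it. On real vowel recordings, results depend heavily on the sampling rate, the model order, and the choice of analysis frame.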