• Title/Summary/Keyword: Speech Synthesizer

An Efficient Korean Morpheme Analyzer and Synthesizer using Dictionary Information and Chart Data Structure (사전 정보와 차트 자료 구조를 이용한 효율적인 형태소 분석기 및 합성기(KoMAS))

  • 김정해;이상조
    • Journal of the Korean Institute of Telematics and Electronics B / v.31B no.3 / pp.123-131 / 1994
  • This paper describes the analysis and synthesis of the morphemes that make up Korean word phrases. For analysis, we propose adding a "morph" feature for morpheme properties to the lexicon and using a chart data structure. The analyzer suppresses the generation of unnecessary morphemes and extracts every possible morpheme unit in a word phrase, using heuristic information to minimize lexicon lookups. For synthesis, the system combines every analyzed morpheme in a word phrase, using the speech and union information available to the program. The synthesis of analyzed morphemes is designed to support syntactic analysis, the next step in natural language processing. The system generates a word phrase by unifying the syntactic and semantic features of the analyzed morphemes in the lexicon, and it is implemented in C on a personal computer.
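The chart-based extraction described in this abstract can be illustrated with a minimal Python sketch. It is not the KoMAS implementation (which the abstract says is written in C): the romanized toy lexicon, the function names `build_chart` and `analyses`, and the "skip unreachable start positions" heuristic are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical toy lexicon: surface form -> "morph" feature (a category label).
LEXICON = {
    "hak": "noun",
    "hakgyo": "noun",
    "gyo": "noun",
    "e": "particle",
    "seo": "particle",
    "eseo": "particle",
}

def build_chart(phrase, lexicon):
    """Return a chart mapping start index -> list of (end, form, morph) edges."""
    chart = defaultdict(list)
    reachable = {0}  # positions where some analysis could continue
    for start in range(len(phrase)):
        if start not in reachable:
            continue  # assumed heuristic: no lexicon lookups from dead positions
        for end in range(start + 1, len(phrase) + 1):
            form = phrase[start:end]
            if form in lexicon:
                chart[start].append((end, form, lexicon[form]))
                reachable.add(end)
    return chart

def analyses(chart, length, start=0):
    """Enumerate every morpheme sequence that spans the whole phrase."""
    if start == length:
        return [[]]
    results = []
    for end, form, morph in chart.get(start, []):
        for rest in analyses(chart, length, end):
            results.append([(form, morph)] + rest)
    return results

phrase = "hakgyoeseo"
chart = build_chart(phrase, LEXICON)
all_analyses = analyses(chart, len(phrase))
for analysis in all_analyses:
    print(analysis)
```

Because each edge is stored once in the chart, overlapping analyses share their common sub-spans instead of repeating lexicon lookups.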

An Implementation of integrated CAD system of IC design (IC 설계용 집적형 캐드 시스템의 구현)

  • 공진흥;김성중;김재협
    • Journal of the Korean Institute of Telematics and Electronics A / v.30A no.1 / pp.73-85 / 1993
  • This paper presents the design and implementation of a CAD (Computer-Aided Design) system with tools and design environments for IC (Integrated Circuit) design. The CAD system can be installed easily at sites with limited resources, since most of its CAD tools and design environments are in the public domain and the hardware platform is Unix and X Window-based PC-386 machines and workstations. To improve the flexibility of the CAD system, objects are defined in the context of tools and environments, and object tables are programmed to describe the integration of CAD tools and design environments. During execution, tool-objects handle inter-tool communication and use a round-robin mechanism to control the execution of CAD tools incrementally. An LPC (Linear Predictive Coding) speech synthesizer IC was designed with the system to identify its bugs and possible improvements.

The Development of Speech Synthesizer In Korean TTS System (한국어 문어변환 시스템 내에서의 음성 합성기 개발)

  • 강찬희;진용옥
    • The Journal of the Acoustical Society of Korea / v.12 no.2 / pp.14-27 / 1993
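  • This paper reports the results of applying a time-domain synthesis method to the speech synthesizer in a Korean text-to-speech (TTS) system. The method uses as its synthesis unit a single pitch-period waveform of about 6 to 9 ms, extracted from roughly every 40 ms of speech waveform. In tests, four types of Korean syllables could be synthesized, prosodic elements such as duration and stress were easy to control, and the simple synthesis algorithm allowed real-time processing. Synthesizing sentence-level speech, however, will require further study of the various pitch patterns within sentences and of how to control them efficiently. The synthesized speech was evaluated by comparing the original and synthesized waveforms in the time domain, comparing the similarity of their spectral envelopes in the frequency domain, and running listening tests on the synthesized speech.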

Text-to-Speech Synthesizer with the Process of Minimizing Concatenation Distortion (접합 왜곡의 최소화 과정이 포함된 음성합성기)

  • 박훈재;김상훈;정재호
    • The Journal of the Acoustical Society of Korea / v.17 no.4 / pp.38-44 / 1998
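  • Phoneme boundary segmentation with a speech recognition system is used to build large speech synthesis databases easily. However, when synthesized speech is generated directly from the automatic segmentation results, phoneme boundary errors cause considerable concatenation distortion. To solve this problem, this study seeks a suitable concatenation point that takes boundary errors into account when units are joined. Here, a suitable concatenation point is one at which spectral discontinuity is minimized. The proposed method was evaluated with MOS (Mean Opinion Score) tests on the synthesized speech and by comparing spectrogram shapes. The method has two steps: first, selecting a reference pattern and two test patterns; second, finding a suitable concatenation point between the preceding and following test patterns. To compare spectrograms between patterns, the study uses cepstrum parameters and the DTW (Dynamic Time Warping) algorithm as a pattern classifier. Listening tests showed that speech synthesized with the proposed algorithm had better quality than speech synthesized from the automatically segmented units used as-is.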
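As a rough illustration of how DTW over cepstral frames could rank candidate concatenation points, here is a minimal Python sketch. The abstract does not give the paper's exact procedure, so the scoring rule (join the two test patterns at each candidate cut and compare the result with the reference pattern) and the names `dtw_distance` and `best_join` are assumptions.

```python
import math

def dtw_distance(a, b):
    """Accumulated DTW cost between two sequences of feature vectors."""
    n, m = len(a), len(b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])  # Euclidean frame distance
            cost[i][j] = d + min(cost[i - 1][j],      # frame of a repeated
                                 cost[i][j - 1],      # frame of b repeated
                                 cost[i - 1][j - 1])  # one-to-one step
    return cost[n][m]

def best_join(reference, front, back, candidates):
    """Pick the (i, j) cut whose joined pattern front[:i] + back[j:] is
    closest to the reference pattern under DTW (assumed scoring rule)."""
    return min(candidates,
               key=lambda c: dtw_distance(reference, front[:c[0]] + back[c[1]:]))

# Toy one-dimensional "cepstral" frames; real frames would be vectors.
reference = [[0.0], [1.0], [2.0], [3.0]]
front = [[0.0], [1.0], [9.0]]   # last frame spoiled by a boundary error
back = [[9.0], [2.0], [3.0]]    # first frame spoiled by a boundary error
cut = best_join(reference, front, back, [(3, 0), (2, 1), (1, 2)])
print(cut)
```

In this toy case the selected cut drops the two spoiled boundary frames, which is the behavior the abstract attributes to choosing the join point with the least spectral discontinuity.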

A Study on the Korean Consonants Synthesis using Switched-Capaciter Filter (Switched Capacitor Filter를 이용한 한국어자음합성에 관한 연구)

  • 이영훈;이대영
    • The Journal of Korean Institute of Communications and Information Sciences / v.9 no.1 / pp.30-38 / 1984
  • In this paper, we design a programmable second-order switched-capacitor filter whose center frequency varies linearly with the clock frequency and whose peak gain and selectivity are controlled digitally through a capacitor array. A speech synthesizer system was built with this filter and used to synthesize Korean consonants. The results suggest that most Korean speech sounds could be synthesized in real time with this filter.

Fundamental Acoustic Investigation of Korean Male 5 Monophthongs (한국 남성의 단모음 [아, 에, 이, 오, 우]에 대한 음향음성학적 기반연구)

  • Choi, Yae-Lin
    • The Journal of the Korea Contents Association / v.10 no.6 / pp.373-377 / 2010
  • Numerous quantitative and qualitative studies of English vowels have been published, but few studies have been based on acoustic analysis of Korean vowels. The purpose of this study is to obtain sufficient quantitative data on the acoustic characteristics of Korean vowels produced by males in their 20s and 30s. A total of 31 males in their 20s and 30s produced the five basic vowels /a, e, i, o, u/, repeating each three times in the standard Korean dialect. The productions were recorded with 'Cool edit', and F1, F2, F3, and F4 were extracted with a MATLAB acoustic analysis program. The overall formant patterns were similar to those in previous studies, except that the F1 and F2 values of the vowels produced in this study were generally lower than those in previous studies. Future studies should collect vowel data that account for other factors, such as age and other speech materials.

Singing Voice Synthesis Using HMM Based TTS and MusicXML (HMM 기반 TTS와 MusicXML을 이용한 노래음 합성)

  • Khan, Najeeb Ullah;Lee, Jung-Chul
    • Journal of the Korea Society of Computer and Information / v.20 no.5 / pp.53-63 / 2015
  • Singing voice synthesis is the computer generation of a song from its lyrics and musical notes. Hidden Markov models (HMMs) have proved to be the models of choice for text-to-speech synthesis. HMMs have also been used in singing voice synthesis research; however, training HMMs for singing voice synthesis requires a huge database. In addition, commercially available singing voice synthesis systems use piano-roll notation and would need to adopt easy-to-read standard music notation to be suitable for singing-learning applications. To overcome these problems, we train context-dependent HMMs on a speech database and use them for singing voice synthesis. Pitch and duration control methods are devised to modify the parameters of the speech-trained HMMs so that they can serve as synthesis units for the singing voice. This work describes a singing voice synthesis system that uses a MusicXML-based music score editor as the front-end interface for entering the notes and lyrics to be synthesized, and an HMM-based text-to-speech synthesis system as the back-end synthesizer. A perceptual test shows the feasibility of the proposed system.
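To make the front-end step concrete, the following Python sketch reads notes from a MusicXML fragment and converts each into a target fundamental frequency and duration, the two quantities the abstract says are controlled. This is only an illustration under assumptions: the function name `note_targets`, the 120 bpm tempo, and the sample fragment are hypothetical, and the paper's own pitch and duration control methods for HMM parameters are not reproduced here.

```python
import xml.etree.ElementTree as ET

# Semitone offset of each note step above C within one octave.
SEMITONES = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11}

def note_targets(note, divisions, tempo_bpm):
    """Return (lyric, f0 in Hz, duration in seconds) for one <note> element."""
    pitch = note.find("pitch")
    step = pitch.findtext("step")
    alter = float(pitch.findtext("alter", default="0"))  # sharps/flats
    octave = int(pitch.findtext("octave"))               # octave 4 starts at middle C
    midi = 12 * (octave + 1) + SEMITONES[step] + alter
    f0 = 440.0 * 2 ** ((midi - 69) / 12)                 # equal temperament, A4 = 440 Hz
    # <duration> counts divisions; `divisions` is the number per quarter note.
    seconds = int(note.findtext("duration")) / divisions * 60.0 / tempo_bpm
    lyric = note.findtext("lyric/text", default="")
    return lyric, f0, seconds

# Hypothetical two-note fragment (not taken from the paper).
FRAGMENT = """
<measure>
  <note><pitch><step>A</step><octave>4</octave></pitch>
        <duration>2</duration><lyric><text>la</text></lyric></note>
  <note><pitch><step>C</step><octave>5</octave></pitch>
        <duration>1</duration><lyric><text>la</text></lyric></note>
</measure>
"""

measure = ET.fromstring(FRAGMENT)
targets = [note_targets(n, divisions=2, tempo_bpm=120) for n in measure.findall("note")]
for lyric, f0, seconds in targets:
    print(lyric, round(f0, 2), seconds)
```

A back-end synthesizer would then need to bend the speech-trained model's pitch contour and state durations toward these per-note targets.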

A Study on the Phoneme Based Analysis of Korean Initial Plosives Using Statistical Method and Perception Tests (통계적 방법과 인지실험을 통한 한국어 초성파열음의 음소단위 분석에 관한 연구)

  • Jo Cheol-Woo;Lee Woo-Sun;Lee Cyu-Ho;Kim Jong-Ahn;Lim Gwang-Il;Lee Tae-Won
    • The Journal of the Acoustical Society of Korea / v.8 no.5 / pp.78-85 / 1989
  • This paper describes statistical methods and perception tests for extracting the parameters used in the synthesis-by-rule of Korean plosives. A formant synthesizer is chosen to synthesize the phonemes. The speech material for the analysis consists of 72 CV monosyllables from a single male speaker. The analysis focuses mainly on the variation of parameters in the time and frequency domains; perception tests are then run to estimate the effects of variations in the formant transitions.

Robust Backward Adaptive Pitch Prediction for Tree Coding (트리 코팅에서 전송에러에 강한 역방향 적응 피치 예측)

  • 이인성
    • The Journal of Korean Institute of Communications and Information Sciences / v.19 no.8 / pp.1587-1594 / 1994
  • The pitch predictor is one of the most important parts of a robust tree coder. The pitch predictor uses hybrid backward pitch adaptation, a combination of block adaptation and recursive adaptation. To improve error performance and track changes in the pitch period of the input speech, we propose smoothing the input of the pitch predictor. The three-tap smoother can have fixed coefficients or variable coefficients that depend on the estimated autocorrelation function of the pitch synthesizer output. Including a variable smoother makes it possible to track pitch period changes within a block and reduces the effect of channel errors.
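The idea of smoothing the pitch predictor's input with three taps can be sketched as follows. This is a simplified illustration, not the paper's coder: the fixed coefficients (0.25, 0.5, 0.25), the single gain `beta`, and the function name `smoothed_pitch_prediction` are assumptions, and the block and recursive adaptation of the lag and gain is omitted.

```python
def smoothed_pitch_prediction(history, lag, beta, taps=(0.25, 0.5, 0.25)):
    """Predict the next sample from past output samples around one pitch lag.

    history: previously reconstructed samples (backward adaptive, so only
             data that the decoder also has is used)
    lag:     estimated pitch period in samples
    beta:    pitch predictor gain
    taps:    assumed fixed three-tap smoother coefficients
    """
    n = len(history)
    if lag < 2 or n < lag + 1:
        raise ValueError("need lag >= 2 and at least lag + 1 past samples")
    # Samples at delays lag + 1, lag, and lag - 1 from the predicted sample.
    window = (history[n - lag - 1], history[n - lag], history[n - lag + 1])
    smoothed = sum(c * s for c, s in zip(taps, window))
    return beta * smoothed

history = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
prediction = smoothed_pitch_prediction(history, lag=4, beta=0.5)
print(prediction)
```

Spreading the prediction over three neighboring delays means a one-sample error in the lag estimate still leaves weight on the true pitch delay, which is one plausible reading of why smoothing helps track pitch changes.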

Cancelation of Baseline Wandering of Electroglottograph Signal using Empirical Mode Decomposition (경험적 모드 재구성 방법을 이용한 성문파형 신호의 기계선 변동 제거)

  • Jang, Seung-Jin;Kim, Hyo-Min;Park, Young-Cheol;Choi, Hong-Shik;Yoon, Young-Ro
    • Proceedings of the KIEE Conference / 2007.10a / pp.475-476 / 2007
  • Electroglottography (EGG) is a technique that registers laryngeal behavior indirectly by measuring the change in electrical impedance across the throat during speech. However, the EGG waveform is affected by the laryngeal muscles that move the vocal cords, which results in baseline wander. The baseline wander must be reduced because the EGG waveform is used as the input signal to a nonlinear speech synthesizer in the next chapter. In the vocal cords, abduction and adduction of the glottis are controlled mainly by the posterior cricoarytenoid (abductor) and interarytenoid (adductor) muscles, respectively. Empirical Mode Decomposition was adopted to cancel baseline wander in the EGG waveform, and it showed better performance than a 500th-order high-pass filter.
