Search | Korea Science

Speech Synthesis using Diphone Clustering and Improved Spectral Smoothing (다이폰 군집화와 개선된 스펙트럼 완만화에 의한 음성합성)

Jang, Hyo-Jong;Kim, Kwan-Jung;Kim, Gye-Young;Choi, Hyung-Il
- The KIPS Transactions:PartB / v.10B no.6 / pp.665-672 / 2003
This paper describes a speech synthesis technique that concatenates unit phonemes. A major problem with this approach is the discontinuity that arises at the junctions between unit phonemes, especially when the units were recorded by different speakers. To solve this problem, the paper uses clustered diphones and proposes a spectral smoothing technique that not only uses formant trajectories and the distribution characteristics of the spectrum but also reflects human acoustic characteristics. Specifically, the proposed technique clusters unit phonemes using the spectral distribution characteristics at the junction between unit phonemes, decides the amount and extent of smoothing by considering human acoustic characteristics at the junction, and then performs spectral smoothing using weights calculated along the time axis at the border of two diphones. The proposed technique removes the discontinuity and minimizes the distortion that spectral smoothing can cause. For performance evaluation, we tested five hundred diphones extracted from twenty sentences recorded by five speakers and present the experimental results.
https://doi.org/10.3745/KIPSTB.2003.10B.6.665
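
The abstract above mentions smoothing with weights calculated along the time axis at the border of two diphones. A minimal sketch of that general idea follows; it is not the authors' algorithm. The function name `smooth_join`, the fixed `span` parameter, and the linear weight schedule are all assumptions made for illustration, whereas the paper chooses the amount and extent of smoothing from perceptual considerations.

```python
from typing import List, Tuple

Frame = List[float]  # one spectral frame, e.g. a vector of spectral coefficients


def smooth_join(left: List[Frame], right: List[Frame],
                span: int) -> Tuple[List[Frame], List[Frame]]:
    """Spread the spectral gap at a diphone boundary over `span` frames per side.

    Hypothetical helper: the weights decay linearly with distance from the
    boundary, so the two boundary frames meet at their midpoint.
    """
    new_left = [frame[:] for frame in left]
    new_right = [frame[:] for frame in right]
    span = min(span, len(left), len(right))
    if span == 0:
        return new_left, new_right
    # Per-coefficient gap between the last left frame and the first right frame.
    gap = [r - l for l, r in zip(left[-1], right[0])]
    for k in range(span):
        # Weight is 0.5 at the boundary frame and shrinks linearly with k.
        w = 0.5 * (span - k) / span
        li = len(left) - 1 - k
        new_left[li] = [v + w * g for v, g in zip(left[li], gap)]
        new_right[k] = [v - w * g for v, g in zip(right[k], gap)]
    return new_left, new_right
```

With `span` set to 0 the frames are returned unchanged, which makes it easy to compare smoothed and unsmoothed joins.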

A Study on Implementation of Emotional Speech Synthesis System using Variable Prosody Model (가변 운율 모델링을 이용한 고음질 감정 음성합성기 구현에 관한 연구)

Min, So-Yeon;Na, Deok-Su
- Journal of the Korea Academia-Industrial cooperation Society / v.14 no.8 / pp.3992-3998 / 2013
This paper presents a method for adding an emotional speech corpus to a high-quality, large-corpus-based speech synthesizer in order to generate varied synthesized speech. We built the emotional speech corpus in a form usable by a waveform-concatenation speech synthesizer and implemented a synthesizer that can generate varied synthesized speech through the same synthesis-unit selection process as a normal speech synthesizer. A markup language is used for emotional input text. Emotional speech is generated when the input text matches the emotional speech corpus over the length of an intonation phrase; otherwise, normal speech is generated. The break indices (BIs) of emotional speech are more irregular than those of normal speech, which makes it difficult to use the BIs generated by the synthesizer as they are. To solve this problem, we applied Variable Break [3] modeling. A Japanese speech synthesizer was used for the experiments. As a result, we obtained natural emotional synthesized speech using the break prediction module for normal speech synthesis.
https://doi.org/10.5762/KAIS.2013.14.8.3992

A Study on the Artificial Neural Networks for the Sentence-level Prosody Generation (문장단위 운율발생용 인공신경망에 관한 연구)

신동엽;민경중;강찬구;임운천
- Proceedings of the Acoustical Society of Korea Conference / autumn / pp.53-56 / 2000
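The text-to-speech synthesizer of an unlimited-vocabulary speech synthesis system uses various methods to improve the naturalness of synthesized speech, one of which is to implement accurately the prosodic rules inherent in natural speech. The prosodic rules needed for synthesis are either implemented from linguistic information or extracted from prosodic information obtained by analyzing natural speech. If the rules obtained in this way fail to reflect all the prosodic rules present in natural speech, or are implemented incorrectly, the naturalness of the synthesized speech degrades. With this in mind, we proposed a method that trains an artificial neural network on the prosodic information of natural speech so that it can generate sentence-level prosody. Prosody has three elements: pitch, duration, and amplitude variation. The neural network was designed so that, when a sentence is input, it learns the pitch variation and amplitude variation over the duration of each phoneme. To train the network, we had a speaker utter a set of isolated words and a set of phonetically balanced sentences, recorded and analyzed the utterances, and built a database from the resulting prosodic information. For each phoneme in a sentence, we obtain the duration, pitch variation, and amplitude variation, and use curve fitting to obtain the polynomial coefficients and initial value of each variation curve, from which the prosody database is built. Part of this prosody database was used to train the neural network and the remainder to evaluate its performance; the results indicate that continuing to expand the prosody database would yield more natural prosody.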
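
The curve-fitting step described in the abstract, which reduces each phoneme's pitch or amplitude curve to polynomial coefficients, can be illustrated with an ordinary least-squares fit. This is a sketch under assumptions: the abstract does not state the polynomial degree or the fitting procedure, and `fit_polynomial` and `evaluate` are hypothetical names.

```python
from typing import List, Sequence


def fit_polynomial(t: Sequence[float], y: Sequence[float], degree: int) -> List[float]:
    """Least-squares fit; returns c such that y ~ c[0] + c[1]*t + c[2]*t**2 + ..."""
    n = degree + 1
    # Build the normal equations (A^T A) c = A^T y for the Vandermonde matrix A.
    ata = [[sum(ti ** (i + j) for ti in t) for j in range(n)] for i in range(n)]
    aty = [sum((ti ** i) * yi for ti, yi in zip(t, y)) for i in range(n)]
    m = [row + [b] for row, b in zip(ata, aty)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution.
    coeffs = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * coeffs[j] for j in range(i + 1, n))
        coeffs[i] = (m[i][n] - s) / m[i][i]
    return coeffs


def evaluate(coeffs: Sequence[float], t: float) -> float:
    """Evaluate the fitted curve at normalized time t."""
    return sum(c * t ** i for i, c in enumerate(coeffs))
```

Here `t` would be time normalized to the phoneme's duration and `y` the measured pitch or amplitude samples; the returned coefficients are the kind of compact per-phoneme description that could be stored in a prosody database.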

VCV Chain Analysis for Korean Speech Synthesis (한국어 음성 합성을 위한 VCV연쇄음 분석에 관한 연구)

Kim, Sung-Joo;Oh, Yung-Hwan
- Annual Conference on Human and Language Technology / 1992.10a / pp.173-184 / 1992
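This paper examines general speech synthesis systems and rule-based synthesis that uses vowel-consonant-vowel (VCV) chains as units. To survey the types of VCV chains needed for Korean speech synthesis, the frequency of each chain, and usage examples, we analyzed a vocabulary list of about 110,000 words and about 36,000 lines of Korean text, and we report the results here. The study confirmed that about 2,500 types of VCV chains are needed for Korean speech synthesis.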
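
A rough idea of how VCV chains might be counted from a word list is sketched below. It is an illustration, not the authors' procedure: it works on spelling only, using the standard Unicode arithmetic for precomposed Hangul syllables, and ignores the pronunciation changes a real analysis would need to apply. The function names are hypothetical, and each chain is keyed by jamo indices (vowel, final consonant, next initial consonant, next vowel).

```python
from collections import Counter
from typing import Iterable, Tuple

HANGUL_BASE = 0xAC00  # first precomposed Hangul syllable
HANGUL_LAST = 0xD7A3  # last precomposed Hangul syllable


def is_hangul(ch: str) -> bool:
    return HANGUL_BASE <= ord(ch) <= HANGUL_LAST


def decompose(syllable: str) -> Tuple[int, int, int]:
    """Return (initial, medial, final) jamo indices of one Hangul syllable."""
    index = ord(syllable) - HANGUL_BASE
    return index // 588, (index % 588) // 28, index % 28


def count_vcv_chains(words: Iterable[str]) -> Counter:
    """Count orthographic VCV chains across adjacent syllables in each word."""
    counts: Counter = Counter()
    for word in words:
        for a, b in zip(word, word[1:]):
            if not (is_hangul(a) and is_hangul(b)):
                continue
            _, v1, final = decompose(a)
            initial, v2, _ = decompose(b)
            counts[(v1, final, initial, v2)] += 1
    return counts
```

Calling `len(count_vcv_chains(words))` gives the number of distinct chain types in a word list, which is the kind of inventory size the abstract reports.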

Improvement of Synthetic Speech Quality using a New Spectral Smoothing Technique (새로운 스펙트럼 완만화에 의한 합성 음질 개선)

장효종;최형일
- Journal of KIISE:Software and Applications / v.30 no.11 / pp.1037-1043 / 2003
This paper describes a speech synthesis technique that uses the diphone as the unit phoneme. Speech synthesis is basically accomplished by concatenating unit phonemes, and its major problem is discontinuity at the junctions between unit phonemes. To solve this problem, the paper proposes a new spectral smoothing technique that reflects not only formant trajectories but also the distribution characteristics of the spectrum and human acoustic characteristics. Specifically, the proposed technique decides the amount and extent of smoothing by considering human acoustic characteristics at the junction between unit phonemes, and then performs spectral smoothing using weights calculated along the time axis at the border of two diphones. The proposed technique reduces the discontinuity and minimizes the distortion caused by spectral smoothing. For performance evaluation, we tested five hundred diphones extracted from twenty sentences, using ETRI Voice DB samples and individually self-recorded samples.

Estimation of Rent Change Rates in Small-Sample Statistical Units (소표본 통계단위에서의 집세 변동률 추정)

Park, Won-Ran
- Proceedings of the Korean Statistical Society Conference / 2003.05a / pp.63-68 / 2003
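Rent accounts for a large share of urban household expenditure, and changes in rent strongly affect urban household life, so rent is recognized as important statistical data. Because rental contracts usually run for two years, rent changes occur infrequently, which makes it very difficult to compile a general rent index from such small-sample statistical units. Enlarging the sample for small-sample groups is also difficult, so we introduced small area estimation to address the problem of small-sample groups for which simple sample expansion is impractical. We applied this method of estimating rent change rates in small-sample statistical units to rent changes in six cities in the Gyeonggi-do region and examined the results.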
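
The abstract does not say which small area estimator was used. As a generic illustration of the idea only, the sketch below shows a simple composite (shrinkage) estimator that pulls each area's direct estimate toward the pooled estimate, more strongly when the area's sample is small. The function name, the tuning constant `k`, and the weight formula `n / (n + k)` are assumptions, not taken from the paper.

```python
from typing import Dict, Tuple


def composite_rates(direct: Dict[str, Tuple[float, int]],
                    k: float = 10.0) -> Dict[str, float]:
    """Blend each area's direct rent change rate with the pooled rate.

    `direct` maps an area name to (direct estimate, sample size).
    `k` is a hypothetical constant controlling how strongly small samples
    are shrunk toward the pooled estimate.
    """
    total_n = sum(n for _, n in direct.values())
    # Sample-size-weighted average over all areas.
    pooled = sum(rate * n for rate, n in direct.values()) / total_n
    estimates = {}
    for area, (rate, n) in direct.items():
        w = n / (n + k)  # weight on the area's own data grows with n
        estimates[area] = w * rate + (1 - w) * pooled
    return estimates
```

An area with few observed rent changes ends up close to the pooled rate, while a well-sampled area keeps most of its own direct estimate.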

Synthesis of Dendrimer with PEG Core by Click Chemistry (클릭 화학에 의한 PEG 핵을 갖는 덴드리머의 합성)

Han, Seung-Choul;Jin, Sung-Ho;Lee, Jae-Wook
- Polymer(Korea) / v.36 no.3 / pp.295-301 / 2012
Efficient stitching methods for the synthesis of Fréchet-type dendrimers with linear PEG units at the core were elaborated. The synthetic strategy involved an inexpensive 1,3-dipolar cycloaddition reaction between an alkyne and an azide in the presence of Cu(I) species, which is known as the best example of click chemistry. The linear core building blocks, two diazido-PEG units, were chosen to serve as the azide functionalities for dendrimer growth via click reactions with the alkyne-dendrons. These two building blocks were employed together with the alkyne-functionalized Fréchet-type dendrons in a convergent strategy to synthesize two kinds of Fréchet-type dendrimers with different linear core units.
https://doi.org/10.7317/pk.2012.36.3.295

A Performance Improvement Method using Variable Break in Corpus Based Japanese Text-to-Speech System (가변 Break를 이용한 코퍼스 기반 일본어 음성 합성기의 성능 향상 방법)

Na, Deok-Su;Min, So-Yeon;Lee, Jong-Seok;Bae, Myung-Jin
- The Journal of the Acoustical Society of Korea / v.28 no.2 / pp.155-163 / 2009
In text-to-speech systems, the conversion of text into prosodic parameters necessarily consists of three steps: the placement of prosodic boundaries, the determination of segmental durations, and the specification of fundamental frequency contours. Prosodic boundaries, as the most important and basic parameter, affect the estimation of durations and fundamental frequency. Break prediction is an important step in text-to-speech systems because break indices (BIs) strongly influence how correctly prosodic phrase boundaries are represented. However, accurate prediction is difficult because BIs are often chosen according to the meaning of a sentence or the reading style of the speaker. In Japanese, predicting an accentual phrase boundary (APB) and a major phrase boundary (MPB) is particularly difficult. This paper therefore presents a method to compensate for prediction errors between an APB and an MPB. First, we define a subtle BI, for which it is difficult to decide clearly between an APB and an MPB, as a variable break (VB), and an explicit BI as a fixed break (FB). The VB is chosen using a classification and regression tree, and multiple prosodic targets for pitch and duration are then generated. Finally, unit selection is conducted using the multiple prosodic targets. In the MOS test, the original speech scored 4.99, while the proposed method scored 4.25 and the conventional method scored 4.01. The experimental results show that the proposed method improves the naturalness of synthesized speech.
https://doi.org/10.7776/ASK.2009.28.2.155
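
The variable-break idea in the abstract above, where an ambiguous break yields multiple prosodic targets, can be pictured as expanding one predicted break sequence into every APB/MPB reading. The sketch below shows only that expansion. The labels `"VB"`, `"APB"`, and `"MPB"` follow the abstract, while the function name and the list-of-labels representation are assumptions, and the CART classifier and unit-selection cost are not modeled.

```python
from itertools import product
from typing import List, Tuple


def expand_breaks(breaks: List[str]) -> List[Tuple[str, ...]]:
    """Expand each variable break (VB) into both an APB and an MPB reading.

    Fixed breaks are kept as they are, so a sequence with n variable breaks
    produces 2**n candidate break sequences.
    """
    options = [("APB", "MPB") if b == "VB" else (b,) for b in breaks]
    return list(product(*options))


# One ambiguous break gives two candidate targets for unit selection.
candidates = expand_breaks(["APB", "VB", "MPB"])
```

In a full system, each candidate sequence would drive its own pitch and duration targets, and unit selection would keep the candidate with the best overall match.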

Evaluations of Shear performance and Compressive strength of Light-weight hybrid panel (경량합성벽체의 전단성능 및 압축내력 평가)

Lee, Dong Hyuck;Lee, Sang Sup;Bae, Kyu Woong;Moon, Tae Sup
- Journal of Korean Society of Steel Construction / v.17 no.1 s.74 / pp.33-43 / 2005
This paper presents test results and evaluations of the energy dissipation capacity and compressive performance of light-weight hybrid panels. A total of 26 full-scale specimens of light-weight hybrid panels were tested. The parameters include the presence of light-weight foamed mortar, the specific gravity of the light-weight foamed mortar (0.6, 0.8, 1.0, 1.2), the finishing materials (light-weight foamed mortar, OSB [Oriented Strand Board], gypsum board), the shape of bracing (x, ~), and the size of panels (1P: 900 mm × 2,400 mm; 2P: 1,800 mm × 2,400 mm). The results of the cyclic tests differ somewhat from those of the monotonic tests because of the different specific gravity of the light-weight foamed mortar. The compressive tests showed that light-weight foamed mortar increases the ultimate strength by 2 to 2.5 times and the initial stiffness by 2 to 3 times.

Synthesis and Production of D-p-hydroxyphenylglycine: Bioconversion Technology (D-p-hydroxyphenylglycine의 합성 및 생산 - 생물 전환 기술)

김학성
- The Microorganisms and Industry / v.20 no.1 / pp.42-45 / 1994
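Amino acids have been widely used as food and feed additives, as pharmaceuticals, and as building blocks in chemical synthesis. Among them, optically pure single-enantiomer α-amino acids in the D- or L-form are becoming increasingly important industrially for pharmaceutical synthesis. Of these, D-form amino acids are used as intermediates in the synthesis of β-lactam antibiotics, peptide hormones, insecticides, and similar products. In particular, D-p-hydroxyphenylglycine (hereafter D-HPG) is widely used worldwide as a precursor of semisynthetic penicillin and cephalosporin antibiotics such as amoxicillin, cefadroxil, cefatrizine, cefaparole, and cefoperazone.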
