Search | Korea Science

A DB Pruning Method in a Large Corpus-Based TTS with Multiple Candidate Speech Segments (대용량 복수후보 TTS 방식에서 합성용 DB의 감량 방법)

Lee, Jung-Chul;Kang, Tae-Ho
- The Journal of the Acoustical Society of Korea / v.28 no.6 / pp.572-577 / 2009
Large corpus-based concatenative Text-to-Speech (TTS) systems can generate natural synthetic speech without additional signal processing. To prune redundant speech segments from a large speech segment DB, one can use the decision-tree-based triphone clustering algorithm widely used in speech recognition. However, conventional methods have problems in representing the acoustic transitional characteristics of phones and in applying context questions with hierarchical priority. In this paper, we propose a new clustering algorithm to downsize the speech DB. First, three 13th-order MFCC vectors, taken from the first, medial, and final frames of a phone, are combined into a 39-dimensional vector that represents the transitional characteristics of the phone. Then, three hierarchically grouped question sets are used to construct the triphone trees. For the performance test, we used the DTW algorithm to calculate the acoustic similarity between the target triphone and the triphone returned by the tree search. Experimental results show that the proposed method can reduce the size of the speech DB by 23% and select better phones with higher acoustic similarity. The proposed method can therefore be applied to build a small-sized TTS system.
https://doi.org/10.7776/ASK.2009.28.6.572
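
The 39-dimensional phone vector and the DTW similarity check described in this abstract can be sketched roughly as follows. This is only an illustration under stated assumptions: the abstract does not say how the medial frame is chosen, which local distance DTW uses, or which step pattern is applied, so the middle-index frame, the Euclidean local cost, and the standard three-way step below are assumptions, and the function names `transition_vector` and `dtw_distance` are hypothetical.

```python
import math


def transition_vector(frames):
    """Concatenate the first, medial, and final 13-dim MFCC frames of a phone.

    `frames` is a list of 13-element sequences (one per analysis frame).
    Picking the middle index as the "medial" frame is an assumption.
    """
    if not frames:
        raise ValueError("phone has no frames")
    first = frames[0]
    medial = frames[len(frames) // 2]
    final = frames[-1]
    return list(first) + list(medial) + list(final)


def dtw_distance(a, b):
    """Accumulated DTW cost between two frame sequences.

    Uses a Euclidean local cost and the standard (i-1, j), (i, j-1),
    (i-1, j-1) step pattern; both are assumptions, not the paper's setup.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = math.dist(a[i - 1], b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1]
            )
    return cost[n][m]
```

A lower `dtw_distance` between the target triphone and the triphone returned by the tree search would indicate higher acoustic similarity in this sketch.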

A DB for facial expression and its user-interface (얼굴표정 DB 및 사용자 인터페이스 개발)

한재현;문수종;김진관;김영아;홍상욱;심연숙;반세범;변혜란;오경자
- Proceedings of the Korean Society for Emotion and Sensibility Conference / 1999.11a / pp.373-378 / 1999
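We built a large-scale facial expression DB to provide basic data for research on faces and facial expressions and to serve as a guideline for designing actual expressions. The DB stores about 1,500 images of natural and varied facial expressions from 24 actors, collected by several methods. For each collected expression, the DB includes internal-state ratings given by many raters, taking into account both the categorical model and the dimensional model of internal states, and it records the rating agreement rate for each photograph as a reference for users of the data. So that the data can be used in expression recognition and synthesis systems, we stored the coordinates of 39 vertices based on MPEG-4 FAP, measured when each expression was fitted to a standard Korean face shape model, together with contextual information about how the expression was elicited. We also developed a user interface so that people using the DB can easily search all of its images with the limited information they have.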

Performance Analysis of Steel-FRP Composite Safety Barrier by Vehicle Crash Simulation (충돌 시뮬레이션을 활용한 강재-FRP 합성 방호울타리의 성능평가)

Lee, Min-Chul;Kwon, Ki-Young;Kim, Seung-Eock
- Journal of the Korean Society for Advanced Composite Structures / v.2 no.4 / pp.11-18 / 2011
In this study, the performance of a steel-FRP composite bridge safety barrier was evaluated through vehicle crash simulation. Surface veil, DB, and roving fibers were used for the FRP. The MAT58 material model provided by the LS-DYNA software was used to model the FRP material. The spot weld option was used to model the contact between the steel and the FRP beam. The structural strength performance, the passenger protection performance, and the vehicle behavior after the crash were evaluated according to the vehicle crash manual. As a result, the steel-FRP composite safety barrier satisfied the required performance.
https://doi.org/10.11004/kosacs.2011.2.4.011

SKU-Net: Improved U-Net using Selective Kernel Convolution for Retinal Vessel Segmentation

Hwang, Dong-Hwan;Moon, Gwi-Seong;Kim, Yoon
- Journal of the Korea Society of Computer and Information / v.26 no.4 / pp.29-37 / 2021
In this paper, we propose a deep learning-based retinal vessel segmentation model for handling multi-scale information in fundus images. We integrate selective kernel convolution into a U-Net-based convolutional neural network. The proposed model extracts and segments feature information for retinal blood vessels of various shapes and sizes, which is important for diagnosing eye-related diseases from fundus images. The proposed model consists of standard convolutions and selective kernel convolutions. While a standard convolutional layer extracts information through kernels of a single size, selective kernel convolution extracts information from branches with various kernel sizes and combines them by adaptively adjusting them through split-attention. To evaluate the proposed model, we used the DRIVE and CHASE DB1 datasets; the model achieved F1 scores of 82.91% and 81.71% on the two datasets, respectively, confirming that it is effective in segmenting retinal blood vessels.
https://doi.org/10.9708/jksci.2021.26.04.029
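
For readers unfamiliar with the F1 score reported in this abstract, a minimal sketch of how it is commonly computed for a binary vessel mask follows. The abstract does not describe the evaluation code, so treating each pixel as a binary label and flattening the mask into a list are assumptions, and `f1_score` is a hypothetical helper name.

```python
def f1_score(pred, truth):
    """F1 score for two equally long sequences of binary pixel labels.

    `pred` and `truth` hold 1 (vessel) or 0 (background) per pixel.
    Returns 0.0 when there are no true positives.
    """
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of pixels")
    tp = sum(1 for p, t in zip(pred, truth) if p and t)
    fp = sum(1 for p, t in zip(pred, truth) if p and not t)
    fn = sum(1 for p, t in zip(pred, truth) if not p and t)
    if tp == 0:
        return 0.0
    # Equivalent to the harmonic mean of precision and recall.
    return 2 * tp / (2 * tp + fp + fn)
```

With one true positive, one false positive, and one false negative, precision and recall are both 0.5, so the F1 score is also 0.5.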

Speech Analysis Tools for Text-to-Speech Synthesizer (무제한 음성합성기를 위한음성 분석 장치)

김재인
- Proceedings of the Acoustical Society of Korea Conference / 1995.06a / pp.115-118 / 1995
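This paper discusses the development of a speech analysis tool that is essential for implementing an unlimited-vocabulary speech synthesizer. The tool runs on a PC with a signal processing board. In addition to A/D and D/A conversion of speech and spectrogram display, it can insert pitch pulse positions aligned with the glottal closure instant, which makes it easy to build the speech database needed by a linear-prediction-based unlimited-vocabulary synthesizer. It was also designed so that building a speech DB for speech recognition, or an ARS of the kind currently in use, takes little effort and time.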

Experimental Study on the Composite Beam Strengthened by External Post Tensioning (외부 후긴장으로 보강한 합성보에 대한 실험적 연구)

Kim, Gi Bong;Chung, Young Soo;Choi, Hyok Chu;Kang, Bo Soon
- Journal of Korean Society of Steel Construction / v.10 no.4 s.37 / pp.701-708 / 1998
The application of additional post-tensioning tendons has recently been widely considered and requires very little interference with the existing structure. Several bridges on national roadways designed as second class were strengthened by external post-tensioning to enhance capacity, but much of that strengthening work was verified neither analytically nor experimentally. This paper examines experimentally the behavior of simple composite steel-concrete beams designed as second class (DB-18) when they are strengthened with external post-tensioning tendons. Test results show that external post-tensioning increases the ultimate load, and that the magnitude of the tendon force increases the yield load but not the ultimate load.

Effects of Opuntia ficus-indica on Lipid Metabolism in the db/db Mouse (노팔 복합물이 II형 당뇨생쥐에서 지질대사에 미치는 효과)

Yoon, Jin A
- Journal of the Korean Society of Food Science and Nutrition / v.42 no.6 / pp.861-868 / 2013
This study investigated the effects of Opuntia ficus-indica and other natural resources (OF) in db/db and C57 mice. Plasma triglycerides, cholesterol, alanine aminotransferase (ALT) activity, aspartate aminotransferase (AST) activity, fecal bile acid excretion, the histopathological appearance of the liver, and cholesterol-related mRNA expression were determined. Mice (12 db/db mice and 12 C57 mice) were assigned to diabetic-control (db-C), diabetic-OF treatment (db-OF), normal-control (C57-C), and normal-OF treatment (C57-OF) groups. Animals in the control groups were fed an AIN-76 recommended diet, and animals in the OF groups were fed an experimental diet containing 5% OF for 4 weeks. Concentrations of total plasma cholesterol, triglyceride, low density lipoprotein (LDL)-cholesterol, and very low density lipoprotein (VLDL)-cholesterol decreased with the administration of OF. In contrast, high density lipoprotein (HDL)-cholesterol levels were minimally affected by the experimental diet. Plasma AST and ALT showed lower activities in the db-OF group, and the fecal excretion of bile acid was reduced in the db-OF group. Histopathological analysis of the liver showed that fatty liver conditions improved more in the db-OF group than in the db-C group. Low-density lipoprotein receptor (LDL-R) and cholesterol 7α-hydroxylase (CYP7A1) mRNA expression also increased in the db-OF group. However, 3-hydroxy-3-methylglutaryl-coenzyme A reductase (HMG-CoA-R) mRNA expression was lower in the db-OF group. These results provide experimental evidence that OF feeding improves lipid metabolism in db/db mice.
https://doi.org/10.3746/jkfn.2013.42.6.861

Study on the Non-uniform synthesis unit selection and F0 modeling for concatenative speech synthesis system (연결형 합성시스템을 위한 비정형 합성단위 추출 및 F0 모델링에 관한 검토)

김영일
- Proceedings of the Acoustical Society of Korea Conference / 1998.08a / pp.93-98 / 1998
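Our ultimate goal is to develop a Korean synthesis system that uses non-uniform synthesis unit selection and concatenation to synthesize natural Korean speech. To reach this goal, we report the research direction our team is considering, the structure of the system, and the results obtained so far on that basis. The system under consideration generates a target pattern from the input sentence, selects synthesis units of arbitrary length that approximate this pattern from a large speech DB, and concatenates them. This paper reports the results of our study on extracting non-uniform synthesis units in a way that minimizes speech distortion, and describes the automatic F0 generation algorithm whose performance our team is currently evaluating.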

Improvement of Naturalness for a HMM-based Korean TTS using the prosodic boundary information (운율경계정보를 이용한 HMM기반 한국어 TTS 자연성 향상 연구)

Lim, Gi-Jeong;Lee, Jung-Chul
- Journal of the Korea Society of Computer and Information / v.17 no.9 / pp.75-84 / 2012
HMM-based Text-to-Speech systems generally use context-dependent triphone units from a large corpus speech DB to enhance the synthetic speech. To downsize a large corpus speech DB, acoustically similar triphone units are clustered with a decision tree that uses context-dependent information. Context-dependent information includes the phoneme sequence as well as prosodic information, because the naturalness of synthetic speech depends heavily on prosody such as pauses, intonation patterns, and segmental duration. However, if the prosodic information is complicated, many context-dependent phonemes have no examples in the training data, and clustering yields smoothed features that generate unnatural synthetic speech. In this paper, instead of complicated prosodic information, we propose three simple prosodic boundary types and decision tree questions that use rising tone, falling tone, and monotonic tone to improve naturalness. Experimental results show that the proposed method can improve the naturalness of an HMM-based Korean TTS and obtain a high MOS in the perception test.
https://doi.org/10.9708/jksci/2012.17.9.075

The Error Pattern Analysis of the HMM-Based Automatic Phoneme Segmentation (HMM기반 자동음소분할기의 음소분할 오류 유형 분석)

Kim Min-Je;Lee Jung-Chul;Kim Jong-Jin
- The Journal of the Acoustical Society of Korea / v.25 no.5 / pp.213-221 / 2006
Phone segmentation of the speech waveform is especially important for concatenative text-to-speech synthesis, which uses segmented corpora to construct synthesis units, because the quality of synthesized speech depends critically on the accuracy of the segmentation. Initially, phone segmentation was performed manually, but this requires a huge effort and a long time. HMM-based approaches adopted from automatic speech recognition are most widely used for automatic segmentation in speech synthesis, providing a consistent and accurate phone labeling scheme. Even though the HMM-based approach has been successful, it may locate a phone boundary at a different position than expected. In this paper, we categorized adjacent phoneme pairs and analyzed the mismatches between hand-labeled transcriptions and HMM-based labels. We then described the dominant error patterns that must be improved for speech synthesis. For the experiment, the hand-labeled standard Korean speech DB from ETRI was used as the reference DB. A time difference larger than 20 ms between the hand-labeled phoneme boundary and the auto-aligned boundary is treated as an automatic segmentation error. Our experimental results for the female speaker revealed that plosive-vowel, affricate-vowel, and vowel-liquid pairs showed high accuracies of 99%, 99.5%, and 99%, respectively. However, stop-nasal, stop-liquid, and nasal-liquid pairs showed very low accuracies of 45%, 50%, and 55%. Results for the male speaker showed a similar tendency.
https://doi.org/10.7776/ASK.2006.25.5.213
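
The 20 ms error criterion in this abstract can be illustrated with a short sketch. It assumes that hand-labeled and auto-aligned boundaries are already paired one-to-one and given in milliseconds; the abstract does not describe the pairing procedure, and `boundary_accuracy` is a hypothetical helper name.

```python
def boundary_accuracy(hand_ms, auto_ms, tolerance_ms=20.0):
    """Fraction of boundaries whose auto-aligned time is within tolerance.

    A boundary counts as an error only when the absolute time difference
    is larger than `tolerance_ms`, matching the criterion in the abstract.
    """
    if len(hand_ms) != len(auto_ms):
        raise ValueError("boundary lists must be paired one-to-one")
    if not hand_ms:
        raise ValueError("no boundaries to evaluate")
    correct = sum(
        1 for h, a in zip(hand_ms, auto_ms) if abs(h - a) <= tolerance_ms
    )
    return correct / len(hand_ms)
```

Computing this value separately for each category of adjacent phoneme pair (for example plosive-vowel or stop-nasal) would give per-category accuracies of the kind the abstract reports.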

Search Result 87, Processing Time 0.027 seconds
