Search | Korea Science

Pruning Methodology for Reducing the Size of Speech DB for Corpus-based TTS Systems (코퍼스 기반 음성합성기의 데이터베이스 축소 방법)

최승호;엄기완;강상기;김진영
- The Journal of the Acoustical Society of Korea / v.22 no.8 / pp.703-710 / 2003
Corpus-based text-to-speech (CB-TTS) systems have recently been studied actively worldwide because of their human-like synthesized speech quality. However, their large speech database (DB) severely restricts their application. In this paper we propose and evaluate three DB reduction algorithms designed to address this drawback. The first method is based on K-means clustering and selects k representatives among multiple instances. The second method keeps only those unit instances that are selected during synthesis when a domain-restricted text is used as input to the synthesizer. The third method is a hybrid of the first two and uses a large text as input to the system: after the given sentences are synthesized, the unit instances that were used and their occurrence information are extracted, and a modified K-means clustering that also takes the occurrence information of the selected unit instances into account is then applied. Finally, we compare the three pruning methods by evaluating synthesized speech quality at similar DB reduction rates. Based on perceptual listening tests, we conclude that the last method performs best among the three algorithms. Moreover, the results show that the last method can reduce the DB size without loss of speech quality.
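The abstract above does not say how the modified K-means uses the occurrence information. One plausible reading, sketched below purely as an assumption, is to weight each unit instance by how often it was selected during synthesis when updating centroids, and then keep the instance closest to each centroid. The function name, the feature vectors, and the counts are all hypothetical.

```python
import random


def weighted_kmeans_prune(features, counts, k, iters=20, seed=0):
    """Pick k representative instances of one unit type (illustrative sketch).

    features: list of equal-length feature vectors, one per instance (assumed).
    counts:   how often each instance was selected during synthesis (assumed).
    Returns the sorted indices of the instances to keep.
    """
    n = len(features)
    if k >= n:
        return list(range(n))
    dim = len(features[0])
    rng = random.Random(seed)
    # Start from k distinct instances as initial centroids.
    centroids = [list(features[i]) for i in rng.sample(range(n), k)]

    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    def assign():
        # Index of the nearest centroid for every instance.
        return [min(range(k), key=lambda c: dist2(f, centroids[c])) for f in features]

    for _ in range(iters):
        labels = assign()
        for c in range(k):
            members = [i for i in range(n) if labels[i] == c]
            total = sum(counts[i] for i in members)
            if total == 0:
                continue  # empty cluster or zero weight: keep the old centroid
            # Occurrence-weighted mean of the cluster members.
            centroids[c] = [
                sum(counts[i] * features[i][d] for i in members) / total
                for d in range(dim)
            ]

    labels = assign()
    keep = set()
    for c in range(k):
        members = [i for i in range(n) if labels[i] == c]
        if members:
            # Keep the real instance nearest to the weighted centroid.
            keep.add(min(members, key=lambda i: dist2(features[i], centroids[c])))
    return sorted(keep)
```

Because the centroid is pulled toward frequently used instances, the retained representative tends to be one the synthesizer actually relies on, which is the intuition behind combining usage statistics with clustering.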

A Study on WAF reduction and SST file size on RocksDB (RocksDB에서 SST 파일에 따른 WAF 감소에 관한 연구)

Cho, Minsoo;Choi, Wongi;Park, Sang Hyun
- Proceedings of the Korea Information Processing Society Conference / 2017.11a / pp.709-712 / 2017
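RocksDB is an embedded key-value storage engine developed by Facebook on the basis of LevelDB. RocksDB uses a log-structured merge-tree (LSM-tree) as its underlying structure and stores data level by level. When continued data insertion causes a level to exceed its size limit, its data is merged with the SST files of the lower level and pushed down to that level. This process incurs additional writes to the device. In this paper, we analyze how the size of the SST files in RocksDB's on-disk area affects device write amplification. We measured and compared device write amplification and performance as the SST file size varied. The experimental results confirm that smaller SST files reduce write amplification but also lower the device's write and read performance. Consequently, to reduce write amplification while maximizing performance, the trade-off between the two must be identified and analyzed to find the optimal SST file size for the system.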
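The write amplification factor (WAF) discussed in this abstract is conventionally the ratio of bytes physically written to the device to bytes the application asked to write. A minimal sketch of that ratio follows; the function name and the byte counters are hypothetical and are not measurements from the paper.

```python
def write_amplification(device_bytes_written, user_bytes_written):
    """Return WAF = bytes written to the device / bytes written by the user."""
    if user_bytes_written <= 0:
        raise ValueError("user_bytes_written must be positive")
    return device_bytes_written / user_bytes_written


GIB = 2 ** 30
# Hypothetical counters, for illustration only (not results from the paper).
waf = write_amplification(device_bytes_written=35 * GIB, user_bytes_written=10 * GIB)
print(waf)  # 3.5
```

Computing this ratio for each SST file size setting, alongside throughput, is one way to make the trade-off described in the abstract visible.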
https://doi.org/10.3745/PKIPS.y2017m11a.709

Construction of the Personal 3D Characters for Virtual Clothing Coordination (가상 의복 코디네이션을 위한 개인 3D캐릭터의 구성)

최창석;김효숙
- Journal of the Korean Society of Clothing and Textiles / v.27 no.9_10 / pp.1015-1025 / 2003
This paper proposes a method for constructing virtual characters that reflect personal body types for clothing coordination. First, the method produces 38 kinds of Korean 3D body models, taking sex, age, and body type into account, and builds a model DB. We select from the DB the model closest to the person's body size and deform it according to that body size. The method deforms the model linearly for 12 height items, 6 width items, 5 depth items, and 13 circumference items, and constructs a personal character fitted to the person's body size. Preprocessing for model deformation consists of grouping the model by body part and establishing feature points. Linear deformation of each group makes it easy to construct virtual personal characters. The method has two advantages: 1. A large reduction in the manpower, cost, and time needed to build the DB of 3D body models, since the preprocessing lets us make effective use of various body models with different geometrical structures. 2. Suitability for Web-based clothing coordination, since the body deformation method is simple and very fast.
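The abstract does not give the deformation formulas. As a rough illustration of what per-group linear deformation could look like, the sketch below scales the vertices of one body-part group about an anchor point by the ratio of target to model measurements. The axis mapping (x = width, y = height, z = depth), the anchor, and all names are assumptions, and circumference items are left out.

```python
def deform_group(vertices, anchor, model_size, target_size):
    """Linearly scale one body-part group about an anchor point (sketch).

    vertices:    list of (x, y, z) tuples for the group.
    anchor:      (x, y, z) point that stays fixed, e.g. a feature point.
    model_size:  (width, height, depth) measured on the selected DB model.
    target_size: (width, height, depth) measured on the person.
    """
    # One scale factor per axis: target measurement / model measurement.
    scale = tuple(t / m for t, m in zip(target_size, model_size))
    return [
        tuple(a + s * (v - a) for v, a, s in zip(vertex, anchor, scale))
        for vertex in vertices
    ]


# Hypothetical measurements, for illustration only.
print(deform_group([(4.0, 8.0, 2.0)], (0.0, 0.0, 0.0), (40.0, 160.0, 20.0), (50.0, 200.0, 20.0)))
```

Since each vertex needs only one multiply and one add per axis, a scheme of this kind would be cheap enough for the Web-based use the abstract mentions.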

UA Tree-based Reduction of Speech DB in a Large Corpus-based Korean TTS (대용량 한국어 TTS의 결정트리기반 음성 DB 감축 방안)

Lee, Jung-Chul
- Journal of the Korea Society of Computer and Information / v.15 no.7 / pp.91-98 / 2010
Large corpus-based concatenative text-to-speech (TTS) systems can generate natural synthetic speech without additional signal processing. Because improving the naturalness, personality, speaking style, and emotion of synthetic speech requires a larger speech DB, it is necessary to prune redundant speech segments from a large speech segment DB. In this paper, we propose a new method, based on a clustering algorithm, for constructing a downsized segmental speech DB for a Korean TTS system. For the performance test, synthetic speech was generated using a Korean TTS system consisting of a language processing module, prosody processing module, segment selection module, speech concatenation module, and segmental speech DB. An MOS test was then conducted on a set of synthetic speech generated with four different segmental speech DBs, which we constructed by combining the CM1 (or CM2) tree clustering method with the full DB (or reduced DB). Experimental results show that the proposed method reduces the size of the speech DB by 23% while still achieving a high MOS in the perception test. The proposed method can therefore be applied to build a small-sized TTS system.
https://doi.org/10.9708/jksci.2010.15.7.091

The implementation of database for high quality Embedded Text-to-speech system (고품질 내장형 음성합성 시스템을 위한 음성합성 DB구현)

Kwon, Oh-Il
- Journal of the Institute of Electronics Engineers of Korea SP / v.42 no.4 s.304 / pp.103-110 / 2005
The speech database is one of the most important parts of a text-to-speech (TTS) system. In particular, an embedded TTS system needs a smaller database than a server TTS system, so compression and statistical reduction of the database are very important factors in an embedded TTS system. However, such compression and statistical reduction always cause some loss of quality in the synthesized speech. In this paper, we propose a method of constructing a database for a high-quality embedded TTS system and verify the quality of the synthesized speech with a MOS (Mean Opinion Score) test.
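A MOS is simply the mean of listener ratings, usually on a 1 to 5 scale. The sketch below computes it together with a normal-approximation 95% confidence half-width. The ratings are hypothetical, and the abstract does not say whether the authors reported confidence intervals.

```python
from math import sqrt
from statistics import mean, stdev


def mos_with_ci(ratings, z=1.96):
    """Return (MOS, half-width of an approximate 95% confidence interval)."""
    m = mean(ratings)
    # Normal approximation: z * sample standard deviation / sqrt(n).
    half_width = z * stdev(ratings) / sqrt(len(ratings))
    return m, half_width


# Hypothetical ratings from five listeners, for illustration only.
score, half_width = mos_with_ci([4, 5, 3, 4, 4])
print(score, round(half_width, 2))
```

Reporting the interval alongside the mean helps show whether a quality difference between a full and a reduced database is larger than listener variability.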

Implementation of HMM Based Speech Recognizer with Medium Vocabulary Size Using TMS320C6201 DSP (TMS320C6201 DSP를 이용한 HMM 기반의 음성인식기 구현)

Jung, Sung-Yun;Son, Jong-Mok;Bae, Keun-Sung
- The Journal of the Acoustical Society of Korea / v.25 no.1E / pp.20-24 / 2006
In this paper, we focus on the real-time implementation of a medium-vocabulary speech recognition system intended for use in a mobile phone. First, we developed a PC-based variable-vocabulary word recognizer whose program memory and total acoustic model size are as small as possible. To reduce the memory size of the acoustic models, linear discriminant analysis and phonetic tied mixtures were applied in feature selection and HMM training, respectively. In addition, a state-based Gaussian selection method with real-time cepstral normalization was used to reduce the computational load and make recognition robust. We then verified real-time operation of the implemented recognition system on the TMS320C6201 EVM board. The implemented system uses about 610 kbytes of memory, including both program memory and data memory. The recognition rate was 95.86% for the ETRI 445DB, and 96.4%, 97.92%, and 87.04% for three kinds of name databases collected through mobile phones.
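The abstract does not describe its real-time cepstral normalization in detail. A common way to do cepstral mean normalization without waiting for the whole utterance is to track a running mean with exponential smoothing and subtract it from each frame. The sketch below illustrates that general idea only; the class name and the smoothing factor are assumptions, not the authors' settings.

```python
class RunningCMN:
    """Frame-by-frame cepstral mean normalization with an exponential running mean."""

    def __init__(self, dim, alpha=0.995):
        self.mean = [0.0] * dim  # running estimate of the cepstral mean
        self.alpha = alpha       # closer to 1.0 means slower adaptation

    def process(self, frame):
        # Update the running mean, then subtract it from the current frame.
        self.mean = [
            self.alpha * m + (1.0 - self.alpha) * x
            for m, x in zip(self.mean, frame)
        ]
        return [x - m for x, m in zip(frame, self.mean)]


cmn = RunningCMN(dim=2, alpha=0.9)
for _ in range(200):
    normalized = cmn.process([2.0, -1.0])  # a constant channel offset
print(normalized)  # values very close to zero once the mean has adapted
```

A constant offset in the cepstral domain, such as one introduced by a fixed channel, is driven toward zero as the running mean adapts, which is why this kind of normalization can help robustness on mobile-phone speech.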

The Study of Overtopping Wave Energy Converter Control and Monitoring System

Oh, Jin-Seok
- Journal of Advanced Marine Engineering and Technology / v.33 no.7 / pp.1012-1016 / 2009
This paper describes a control and monitoring system for an OWEC (Overtopping Wave Energy Converter) that exhibits power stabilization in the overtopping wave energy converter system. Overtopping waves generate varying water pressure, and this pressure rotates the turbine. In this way, the overtopping wave energy converter converts wave energy into electricity. A small-scale overtopping wave energy converter was developed to simulate the control and monitoring system, which can control power generation and monitor the system condition. The results show that the developed control and monitoring system reduces the fluctuation of the overtopping wave energy system. In addition, the DB (database) of test results contributes to research and development for OWEC.
https://doi.org/10.5916/jkosme.2009.33.7.1012

