• Title/Summary/Keyword: 음합성 (sound synthesis)


Realization of Digital Music Synthesizer Using a Frequency Modulation (FM 방식을 이용한 디지탈 악기음 합성기의 구현)

  • 주세철;김진범;김기두
    • Journal of the Korean Institute of Telematics and Electronics B / v.32B no.7 / pp.1025-1035 / 1995
  • In this paper, we realize a real-time digital FM synthesizer based on a genetic algorithm using a general-purpose digital signal processor. In particular, we synthesize diverse musical sounds using a synthesis model consisting of a single modulator and multiple carriers. We also present a genetic-algorithm-based technique that determines the optimal parameters for reconstructing a sound through FM synthesis, after analyzing the spectrum of PCM data of a reference musical sound using the FFT. Using the suggested parameter extraction algorithm, we extract the parameters of several instruments and then synthesize digital FM sounds. To verify both the parameter extraction algorithm and the real-time digital music synthesizer, we first evaluate the result subjectively by listening to the sound directly. Second, to evaluate the synthesized sound objectively, we compare it with the original in the time domain and the frequency domain.
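
The single-modulator, multiple-carrier model mentioned in this abstract can be illustrated with a minimal sketch. It assumes sinusoidal operators, a constant modulation index, and no amplitude envelopes; the function names and all parameter values are hypothetical and are not taken from the paper, which extracts its parameters with a genetic algorithm.

```python
import math


def fm_sample(t, f_mod, carriers, index):
    """Return one sample of a single-modulator, multi-carrier FM voice.

    carriers is a list of (amplitude, frequency_hz) pairs. Every carrier
    is phase-modulated by the same sinusoidal modulator (assumed model).
    """
    mod = index * math.sin(2.0 * math.pi * f_mod * t)
    return sum(a * math.sin(2.0 * math.pi * fc * t + mod) for a, fc in carriers)


def fm_tone(duration_s, sample_rate, f_mod, carriers, index):
    """Render a list of samples for the given duration."""
    n = round(duration_s * sample_rate)
    return [fm_sample(i / sample_rate, f_mod, carriers, index) for i in range(n)]


# Hypothetical parameters chosen only for illustration.
tone = fm_tone(0.01, 8000, 220.0, [(0.6, 220.0), (0.4, 440.0)], index=2.0)
```

With the modulation index set to zero, the output reduces to a plain sum of the carrier sinusoids, which is a quick sanity check on the model.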


On Speech Input with Microphone Array using the variable coefficient Pre-emphasis (가변계수 프리엠퍼시스를 이용한 마이크어레이 음성입력에 관한 연구)

  • Jo Wangrae;Bae Myungjin
    • Proceedings of the Acoustical Society of Korea Conference / autumn / pp.65-68 / 2001
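  • Speech recognizers are increasingly adopting multichannel speech input. When a recognizer is used in this way, a speech input method that detects speech automatically distorts the higher-order formant components of the original speech whenever the spoken signal arrives together with reflected speech and noise, depending on the distance between the speaker and the microphone. To address this problem, this paper proposes a more accurate speech input method that uses pre-emphasis to emphasize the characteristics of the high-frequency region. The proposed method yields a synthesized input signal similar to the original speech and achieves a higher SNR than the conventional time-domain method.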
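
For readers unfamiliar with pre-emphasis, the sketch below shows the standard first-order form, y[n] = x[n] - a·x[n-1], which boosts high frequencies relative to low ones. It uses a fixed coefficient; the abstract does not say how the paper varies the coefficient, so that part is not modelled, and the function name and default value are assumptions.

```python
def pre_emphasis(signal, coeff=0.95):
    """Apply first-order pre-emphasis: y[n] = x[n] - coeff * x[n-1].

    The first sample is passed through unchanged because it has no
    predecessor. coeff is a fixed, assumed value in this sketch.
    """
    if not signal:
        return []
    out = [signal[0]]
    for prev, cur in zip(signal, signal[1:]):
        out.append(cur - coeff * prev)
    return out


# A constant (low-frequency) input is attenuated after the first sample,
# while an alternating (high-frequency) input is amplified.
flat = pre_emphasis([1.0, 1.0, 1.0], coeff=0.5)
alternating = pre_emphasis([1.0, -1.0, 1.0], coeff=0.5)
```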


Pitch Modification based on a Voice Source Model (음원 모델에 기초한 합성음의 피치 조절)

  • Choi, Yong-Jin;Yeo, Su-Jin;Kim, Jin-Young;Sung, Koeng-Mo
    • Speech Sciences / v.3 / pp.132-147 / 1998
  • Previously developed methods for pitch modification have not been based on a voice source model. As a result, the synthesized speech often sounds unnatural even though it may be highly intelligible. The purpose of this paper is to analyze how a voice source signal changes with the pitch period and to establish a pitch-modification rule based on the result of this analysis. Using the excitation waveform, we examine how the intervals of the closing phase, closed phase, and open phase change as the pitch increases. Compared with previous methods, which operate directly on the speech signal, the pitch modification method based on a voice source model shows high intelligibility and naturalness. This study may benefit applications such as speaker identification and voice color conversion, and the proposed method is expected to provide high-quality synthetic speech.


Correlation Analysis of PESQ and MOS Evaluation for HMM-based Synthetic Korean Speech (HMM 기반의 한국어 합성음에 대한 PESQ 및 MOS 평가의 상관도 분석)

  • Lin, Cang-Song;Bae, Keun-Sung
    • Phonetics and Speech Sciences / v.2 no.1 / pp.71-75 / 2010
  • PESQ is an objective speech quality measure that is known to correlate highly with subjective speech quality measures such as MOS. To examine whether it could be useful as an objective quality measure for synthetic speech, we carried out subjective evaluation tests with MOS and DMOS and an objective evaluation test with PESQ on HMM-based Korean synthetic speech signals, and analyzed the correlation between them. Experimental results show that PESQ has correlations of 0.87 with MOS and 0.92 with DMOS. This suggests that PESQ holds considerable promise for evaluating the quality of synthetic Korean speech.
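
The abstract does not state which correlation statistic was used. Assuming the Pearson coefficient, which is a common choice for comparing objective and subjective scores, the computation can be sketched as follows. The score lists are made-up placeholders, not data from the paper.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Placeholder scores for five hypothetical utterances (not real results).
pesq_scores = [2.1, 2.6, 3.0, 3.3, 3.9]
mos_scores = [2.4, 2.5, 3.2, 3.1, 4.0]
r = pearson(pesq_scores, mos_scores)
```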


Structure-borne Noise Analysis of Marine Diesel Engine Considering Receptance of Hull Structure at Mounting Point (선체 마운트 지지점에서의 리셉턴스를 고려한 선박용 디젤 엔진의 고체전달음 해석)

  • Jang, Seong-Gil;Jeong, Weui-Bong;Hong, Chin-Suk;Bae, Soo-Ryong
    • Transactions of the Korean Society for Noise and Vibration Engineering / v.21 no.2 / pp.120-128 / 2011
  • This paper presents an efficient method for analyzing the noise and vibration of marine diesel engines mounted on a flexible hull structure. In general, the analysis model must include the hull structure, which leads to a large computational effort. To reduce this effort, this paper proposes a transfer synthesis that uses the receptance at the mounting points. The procedure is verified by comparing its results with those from the full-model calculation. Finally, the effects of the flexible hull structure on the acoustic power radiated from the engine block are investigated. The effect of the hull is found to be significant when the receptance of the hull structure is similar to or greater than that of the mount or the engine block.

An Implementation of Interactive 3D Audio Broadcasting Terminal (대화형 3차원 오디오 방송단말 구현)

  • Park Gi Yoon;Lee Taejin;Kang Kyeongok
    • Proceedings of the Korean Society of Broadcast Engineers Conference / 2004.11a / pp.211-214 / 2004
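  • This paper presents an example implementation of an interactive audio broadcasting terminal that can reconstruct and deliver a 3D audio scene according to user input. The properties of an audio scene, represented hierarchically according to the MPEG-4 AudioBIFS specification, are updated according to user input, and the audio data are resynthesized in 3D space with reference to the given properties. The module that updates the properties supports the MPEG-4 Audio profile; user interfaces for each AudioBIFS node type are defined in advance and stored on the terminal, and the interactive broadcasting service is implemented using them. The ability to play back 3D audio data enriches the feedback to user input, which maximizes the effect of interactive broadcasting and plays an important role in enhancing realism. As component technologies, we introduce the 3D audio techniques used to implement the position, directivity, shape, and reverberation characteristics of a sound image. We also introduce interactive ensemble and chorus programs as examples of services that use the interactive 3D audio broadcasting terminal.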


Implementation of Continuous Utterance Using Buffer Rearrangement for Articulatory Synthesizer (조음 음성 합성기에서 버퍼 재정렬을 이용한 연속음 구현)

  • Lee, Hui-Sung;Chung, Myung-Jin
    • Proceedings of the KIEE Conference / 2002.07d / pp.2454-2456 / 2002
  • Since articulatory synthesis models the human vocal organs as precisely as possible, it is potentially the most desirable method for producing various words and languages. This paper proposes a new type of articulatory synthesizer using the Mermelstein vocal tract model and the Kelly-Lochbaum digital filter. Previous studies have assumed that the length of the vocal tract, or the number of its cross sections, does not vary during an utterance. However, continuous utterance cannot be easily implemented under this assumption. In this paper, the limitation is overcome by "Buffer Rearrangement" for a dynamic vocal tract.
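
In a Kelly-Lochbaum model, the vocal tract is treated as a chain of short tube sections, and each junction scatters the travelling waves according to a reflection coefficient set by the neighbouring cross-sectional areas. The sketch below computes those coefficients for a fixed number of sections. The sign convention, the function name, and the area values are assumptions, and the paper's "Buffer Rearrangement" step is not described in the abstract and is not reproduced here.

```python
def reflection_coefficients(areas):
    """Junction reflection coefficients for a chain of tube sections.

    Assumed convention: r_i = (A[i+1] - A[i]) / (A[i+1] + A[i]).
    Some texts use the opposite sign, depending on whether pressure or
    volume-velocity waves are modelled.
    """
    if any(a <= 0 for a in areas):
        raise ValueError("cross-sectional areas must be positive")
    return [(nxt - cur) / (nxt + cur) for cur, nxt in zip(areas, areas[1:])]


# Hypothetical area function in cm^2, chosen only for illustration.
areas = [2.6, 1.8, 0.9, 1.4, 3.2, 4.0]
coeffs = reflection_coefficients(areas)
```

A chain of N sections has N-1 junctions, so changing the number of sections during an utterance changes the number of coefficients, which is the situation the paper's dynamic vocal tract has to handle.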


A Study on Objective Quality Assessment of Synthesized Speech by Rule (규칙 합성음의 객관적 품질평가에 관한 연구)

  • 홍진우
    • Proceedings of the Acoustical Society of Korea Conference / 1991.06a / pp.67-72 / 1991
  • This paper evaluates the quality of speech synthesized by rule using the LPC CD as an objective measure and then compares the result with a subjective analysis. By evaluating the quality of speech synthesized by rule objectively, we have tried to resolve the problems that arise when the evaluation is done subjectively (expansion of evaluation time or scale, and variability in the analysis results). Also, by comparing intelligibility, the index for the subjective quality evaluation of speech synthesized by rule, with the evaluation results obtained using MOS and the objective evaluation, we have shown the validity of the objective analysis and thus provide a guide that would be useful for R&D and marketing of synthesis-by-rule methods.


Evaluation of Synthetic Voice which is Agreeable to the Ear Using Sensibility Ergonomics Method (감성 평가를 이용한 듣기 좋은 음성 합성음에 대한 연구)

  • Park, Yong-Kuk;Kim, Jae-Kuk;Jeon, Yong-Woong;Cho, Am
    • Journal of the Ergonomics Society of Korea / v.21 no.1 / pp.51-65 / 2002
  • As information is increasingly delivered through multimedia, synthetic voice is used not only in CTI (Computer Telephony Integration) and information services for the blind, but also in applications on the internet. However, properties of synthetic voice such as speech rate, pitch, and timbre are adjusted to the providers' preference rather than the customers' preference. To take customers' preference into account, this study proposes four subjective factors of voice, obtained by evaluating voices with the sensibility ergonomics method. The relationship between a synthetic voice that is agreeable to the ear and emotional images was formulated as a fuzzy model. On this basis, this study proposes the speech rate and pitch of a synthetic voice that is agreeable to the ear.

Speech Synthesis Algorithm Using Mixed Phase Information for TTS Systems (혼합 위상 정보를 이용한 TTS 합성음 생성 알고리즘)

  • Kwon, Chul-Hong;Lee, Min-Kyu
    • Speech Sciences / v.8 no.4 / pp.35-43 / 2001
  • New speech synthesis algorithms capable of flexible prosody (especially F0) modification are desired for high-quality TTS systems. TD-PSOLA is the most popular synthesis algorithm, and it delivers very high quality when F0 modification is limited. However, the quality degradation caused by pitch epoch detection errors becomes severe as the F0 modification factor grows. The vocoder framework, on the other hand, is very flexible in F0 manipulation, but its synthesized speech is far from natural human speech and suffers from buzziness. To remedy the buzzy quality of the vocoder and produce more natural synthetic speech, we propose a mixed phase vocoder.
