• Title/Summary/Keyword: Source speaker

Voice Color Conversion Based on the Formants and Spectrum Tilt Modification (포먼트 이동과 스펙트럼 기울기의 변환을 이용한 음색 변환)

  • Son Song-Young;Hahn Min-Soo
    • MALSORI / no.45 / pp.63-77 / 2003
  • The purpose of voice color conversion is to change the speaker identity perceived from the speech signal. In this paper, we propose a new voice color conversion algorithm based on formant shifting and spectrum-tilt modification in the frequency domain. The basic idea is to move the source speaker's formant positions to those of the target speaker through interpolation and decimation, and to modify the spectrum tilt using the spectral envelopes of both speakers. The LPC spectrum is used to estimate the formant positions and the spectrum tilt. Because it modifies the speech signal directly in the frequency domain, the algorithm converts speaker identity rather successfully while maintaining good speech quality.

Bilingual Voice Conversion Using Frequency Warping on Formant Space (포만트 공간에서의 주파수 변환을 이용한 이중 언어 음성 변환 연구)

  • Chae, Yi-Geun;Yun, Young-Sun;Jung, Jin Man;Eun, Seongbae
    • Phonetics and Speech Sciences / v.6 no.4 / pp.133-139 / 2014
  • This paper describes several approaches for transforming one speaker's individuality into another's using frequency warping between bilingual formant frequencies in different language environments. The proposed methods are simple, intuitive voice conversion algorithms that do not use training data between the different languages. They find a warping function from the source speaker's frequencies to the target speaker's frequencies in formant space, which comprises four representative monophthongs for each language. The warping functions can be represented by piecewise linear equations or by an inverse matrix. The features used are pure frequency components, including magnitudes, phases, and line spectral frequencies (LSF). The experiments show that the LSF-based voice conversion methods perform better than the other methods.
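
The piecewise linear warping mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `build_warp` and `warp_magnitude`, the choice to anchor the map at 0 Hz and the Nyquist frequency, and the example formant values are all assumptions made for illustration.

```python
import numpy as np

def build_warp(src_formants, tgt_formants, nyquist):
    """Return a piecewise-linear map from source to target frequency (Hz).

    Both formant lists are assumed to be in ascending order and paired
    one to one (F1 to F1, F2 to F2, and so on).
    """
    # Anchor the map at 0 Hz and at Nyquist so it covers the whole band.
    x = np.concatenate(([0.0], src_formants, [nyquist]))
    y = np.concatenate(([0.0], tgt_formants, [nyquist]))
    return lambda f: np.interp(f, x, y)

def warp_magnitude(mag, warp, nyquist):
    """Resample a magnitude spectrum so source peaks move to warped positions."""
    freqs = np.linspace(0.0, nyquist, len(mag))
    warped = warp(freqs)  # where each source bin should end up
    # Interpolate the moved bins back onto the uniform frequency grid.
    return np.interp(freqs, warped, mag)

# Hypothetical formant values (Hz) for a source and a target speaker.
warp = build_warp([500.0, 1500.0, 2500.0], [600.0, 1700.0, 2800.0], 8000.0)
```

With these example values, a spectral peak at 500 Hz in the source spectrum moves to 600 Hz after `warp_magnitude` is applied.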

Noise Attenuation Effect According to the Direction of Canceling Speaker in Duct-acoustic System (덕트-음향 시스템에서 소거용스피커 방향에 따른 소음감소효과)

  • Lee, Hyung-Seok;Lee, Eung-Suk
    • Journal of the Korean Society for Precision Engineering / v.26 no.7 / pp.51-57 / 2009
  • In this paper, we study how the direction of the canceling speaker affects the attenuation of automobile exhaust noise in a duct-acoustic ANC system. Exhaust noise was recorded from a diesel engine at 800 rpm, 3500 rpm, and 5000 rpm. Acrylic ducts built for the experiment allow the canceling speaker to be set at $30^{\circ}$, $90^{\circ}$, or $150^{\circ}$ relative to the primary noise flow. A DSP board was used to control the ANC system. The system uses a Filtered-x LMS algorithm modified to compensate for a property of the DSP input signal and for the secondary-path effect. The experiments showed that the direction of the canceling speaker influences the noise reduction: the $150^{\circ}$ duct attenuated noise better than the $90^{\circ}$ or $30^{\circ}$ duct.
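
For readers unfamiliar with Filtered-x LMS, the following is a minimal single-channel sketch of the textbook algorithm. It does not include the modification described in the abstract. The function name `fxlms`, the tap count, the step size, and the assumption that an estimate `s_hat` of the secondary path is available as FIR coefficients are all illustrative choices.

```python
import numpy as np

def fxlms(x, d, s_path, s_hat, n_taps=16, mu=0.01):
    """Textbook single-channel FxLMS; returns the residual at the error mic.

    x      : reference signal (noise picked up upstream)
    d      : primary noise as it arrives at the error microphone
    s_path : true secondary path (speaker to error mic), FIR coefficients
    s_hat  : estimate of the secondary path used to filter the reference
    """
    w = np.zeros(n_taps)            # adaptive control filter
    x_buf = np.zeros(n_taps)        # recent reference samples
    fx_buf = np.zeros(n_taps)       # recent filtered-reference samples
    xs_buf = np.zeros(len(s_hat))   # reference history for the path estimate
    y_buf = np.zeros(len(s_path))   # anti-noise history for the true path
    e = np.zeros(len(x))
    for n in range(len(x)):
        x_buf = np.roll(x_buf, 1)
        x_buf[0] = x[n]
        y = w @ x_buf                   # anti-noise sent to the canceling speaker
        y_buf = np.roll(y_buf, 1)
        y_buf[0] = y
        e[n] = d[n] - s_path @ y_buf    # residual after the secondary path
        xs_buf = np.roll(xs_buf, 1)
        xs_buf[0] = x[n]
        fx_buf = np.roll(fx_buf, 1)
        fx_buf[0] = s_hat @ xs_buf      # reference filtered by the path estimate
        w += mu * e[n] * fx_buf         # LMS update using the filtered reference
    return e
```

In a simulation, `d` can be produced by convolving `x` with a hypothetical primary-path impulse response. The residual `e` should shrink over time as long as `s_hat` is close to `s_path` and `mu` is small enough for stability.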

The Thronging of Shoals of Squid to Audible underwater Sound (가청 수중음에 대한 오징어 어군의 위집)

  • 서두옥
    • Journal of the Korean Society of Fisheries and Ocean Technology / v.31 no.3 / pp.220-227 / 1995
  • An underwater speaker was designed and used as a sound source for attracting shoals of squid during squid angling operations. The frequency characteristics of the speaker were analyzed experimentally, and the thronging response of squid shoals, which may be a key parameter for a new sound-based catching method, was characterized in the audible frequency range. The field experiment was carried out on the coast of Cheju Island. The results are summarized as follows: 1. The amplitude response of the speaker peaks at a frequency of 500 Hz. 2. No output waveform distortion was measured in the frequency range of 250 to 600 Hz. 3. The underwater noise of squid shoals gathered by fish lamps at night had a center frequency of 300 to 400 Hz. 4. The shoals of squid show a thronging response when the manufactured underwater speaker transmits an intermittent audible sound of 300 to 400 Hz at a water depth of 10 m.

Pitch Modification based on a Voice Source Model (음원 모델에 기초한 합성음의 피치 조절)

  • Choi, Yong-Jin;Yeo, Su-Jin;Kim, Jin-Young;Sung, Koeng-Mo
    • Speech Sciences / v.3 / pp.132-147 / 1998
  • Previously developed pitch modification methods have not been based on a voice source model. As a result, the synthesized speech often sounds unnatural even though it may be highly intelligible. The purpose of this paper is to analyze how the voice source signal changes with the pitch period and to establish a pitch-modification rule based on this analysis. Using the excitation waveform, we examine how the durations of the closing phase, closed phase, and open phase change as the pitch increases. Compared with previous methods, which operate directly on the speech signal, the pitch modification method based on a voice source model shows high intelligibility and naturalness. This study may also benefit applications such as speaker identification and voice color conversion. The proposed method is therefore expected to provide high-quality synthetic speech.

Performance Improvement of Speaker Recognition by MCE-based Score Combination of Multiple Feature Parameters (MCE기반의 다중 특징 파라미터 스코어의 결합을 통한 화자인식 성능 향상)

  • Kang, Ji Hoon;Kim, Bo Ram;Kim, Kyu Young;Lee, Sang Hoon
    • Journal of the Korea Academia-Industrial cooperation Society / v.21 no.6 / pp.679-686 / 2020
  • In this thesis, an enhanced feature extraction method for vocal source signals and a score combination method using MCE-based weight estimation over the scores of multiple feature vectors are proposed to improve the performance of speaker recognition systems. The proposed feature vector is composed of perceptual linear predictive cepstral coefficients, skewness, and kurtosis extracted from glottal flow signals that are lowpass filtered to eliminate the flat spectrum region, which carries no meaningful information. The proposed feature was used to improve a conventional speaker recognition system based on Gaussian mixture models and on mel-frequency cepstral coefficients and perceptual linear predictive cepstral coefficients extracted from the speech signals. In addition, to increase the reliability of the estimated scores, the scores obtained with the conventional vocal tract features and with the proposed feature are fused by the MCE-based score combination method to find the optimal speaker, instead of estimating the weight from the probability distribution of the conventional scores. The experimental results showed that the proposed feature vectors contain valid information for recognizing the speaker. In addition, when speaker recognition was performed by combining the MCE-based multiple feature parameter scores, the recognition system outperformed the conventional one, particularly with a small number of Gaussian mixtures.

Detection of Speaker Position for Robot Using HRTF (머리전달함수를 이용한 로봇의 화자 위치 추정)

  • Hwang, Sung-Mook;Park, Youn-Sik;Park, Young-Jin
    • Proceedings of the Korean Society for Noise and Vibration Engineering Conference / 2005.11a / pp.637-640 / 2005
  • We propose a sound source localization method that uses the head-related transfer function (HRTF) of a given platform. HRTFs contain not only time-delay information but also the phase and magnitude distortions caused by diffraction and scattering around the shading object. A set of HRTFs for a given platform therefore provides a substantial amount of information about the location of the source. In this study, we introduce a new phase criterion for finding the sound source location using an HRTF database obtained empirically in an anechoic chamber with the given platform. Using this criterion, we analyze the estimation performance of the proposed method in a household environment.

Pitch Contour Conversion Using Slanted Gaussian Normalization Based on Accentual Phrases

  • Lee, Ki-Young;Bae, Myung-Jin;Lee, Ho-Young;Kim, Jong-Kuk
    • Speech Sciences / v.11 no.1 / pp.31-42 / 2004
  • This paper presents methods that use Gaussian normalization to convert pitch contours based on prosodic phrases, along with experimental tests on a Korean database of 16 declarative sentences and the first sentences of the story 'The Three Little Pigs'. We propose a new conversion method that applies Gaussian normalization to the pitch deviation obtained by subtracting partial declination lines from the pitch contour. Using a partial declination line for each accentual phrase avoids the problem that Gaussian normalization based on the mean and standard deviation of an intonational phrase tends to lose local variability and therefore cannot convert the individual characteristics of a pitch contour from a source speaker to a target speaker. The experimental results show that this slanted Gaussian normalization, which subtracts declination lines from the pitch contour of each accentual phrase, modifies pitch contours more accurately than other methods using Gaussian normalization.
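
As a rough illustration of the two ideas in this abstract, the sketch below shows plain Gaussian normalization of pitch values and one possible reading of the "slanted" variant. The abstract does not give the exact formulation, so the least-squares line fit, the function names, and the `(slope, intercept)` form of the target declination line are assumptions.

```python
import numpy as np

def gaussian_normalize(f0, src_mean, src_std, tgt_mean, tgt_std):
    """Map source pitch values onto the target speaker's mean and spread."""
    return (f0 - src_mean) / src_std * tgt_std + tgt_mean

def slanted_normalize(f0_phrase, tgt_line, src_std, tgt_std):
    """Normalize one accentual phrase around its declination line.

    tgt_line is a hypothetical (slope, intercept) pair describing the
    target speaker's declination line for this phrase; src_std and
    tgt_std are the spreads of the pitch deviations around each line.
    """
    t = np.arange(len(f0_phrase))
    # Assumption: the source declination line is a least-squares fit.
    slope, intercept = np.polyfit(t, f0_phrase, 1)
    residual = f0_phrase - (slope * t + intercept)  # deviation from the line
    scaled = residual / src_std * tgt_std           # rescale local variation
    return scaled + tgt_line[0] * t + tgt_line[1]   # re-slant onto the target
```

The first function treats a whole phrase with one mean and one standard deviation. The second removes the downward slope first, so only the local deviation around the line is rescaled.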

Active Noise Control of Closed Rectangular Cavity using the FXLMS Algorithms (FXLMS 알고리듬을 이용한 사각밀폐공간의 능동소음제어)

  • Ryu, Kyung-Wan;Hong, Chin-Suk;Jeong, Wei-Bong
    • Proceedings of the Korean Society for Noise and Vibration Engineering Conference / 2009.04a / pp.247-249 / 2009
  • This paper investigates active noise control (ANC) of a rectangular cavity using a single-channel filtered-x least mean square (FXLMS) algorithm to reduce interior noise globally. Global reduction of interior noise generally requires multichannel active control. Here, however, we first examined the optimal location of the secondary speaker for producing a global reduction of the interior noise field. We then investigated the frequency characteristics of the reduction to determine the effective frequency band of the active control system. The results indicate that the secondary speaker should be located as close to the primary source as possible to obtain global reduction.
