• Title/Summary/Keyword: speaker variation

Method for Current-Driving of the Loudspeakers with Class D Audio Power Amplifiers Using Input Signal Pre-Compensation (입력 신호의 전치 보상을 이용한 D 급 음향 전력 증폭기의 스피커 전류 구동 방법)

  • Eun, Changsoo;Lee, Yu-chil
    • Journal of Korea Multimedia Society / v.21 no.9 / pp.1068-1075 / 2018
  • We propose a method for driving loudspeakers from Class D audio power amplifiers in current mode instead of the conventional voltage mode; current-mode driving has so far been impractical with feedback circuitry. Unlike analog audio amplifiers, Class D audio power amplifiers introduce a delay between the input and output signals, which makes it difficult to apply feedback circuitry for current-mode driving. The pre-distortion scheme used to compensate for the non-linearity of RF power amplifiers is adapted to remedy the effect of loudspeaker impedance variation in current driving. The method uses a speaker model as the pre-distorter to compensate for the variation of speaker impedance with frequency. The simulation and test results confirm the validity of the proposed method.

A Study on Speaker Identification by Difference Sum and Correlation Coefficients of Narrow-band Spectrum (좁은대역 스펙트럼의 차이값과 상관계수에 의한 화자확인 연구)

  • Yang, Byung-Gon;Kang, Sun-Mee
    • Speech Sciences / v.9 no.3 / pp.3-16 / 2002
  • We examined some problems in speaker identification procedures: transformation of acoustic parameters into auditory scales, invalid measurement values, and comparability of spectral energy values across the frequency range. To resolve those problems, we analyzed the acoustic spectral energy of three Korean numbers produced by ten female students from narrow-band spectrograms at 19 proportional time points of each voiced segment. Then, cells of the first five spectral matrices were averaged to form a matrix model for each speaker. The correlation coefficients and the sum of the absolute amplitude differences for each pair of the spectral models of the ten subjects were obtained. Also, some individual matrix models were compared to those of the same subject or of another subject with a similar spectral model. Results showed that for the numbers '2' and '9' subjects could not be clearly distinguished from one another, but for the number '4' there was some possibility of setting threshold values for speaker identification using the correlation coefficients and the sum of absolute differences. Further studies would be desirable on various combinations of the range of long-term average spectra and the degree of signal pre-emphasis.

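The two comparison measures named in the abstract above can be sketched as follows. This is a minimal illustration, assuming each speaker model is a small matrix of spectral energy values; the function names and the toy matrices are hypothetical and are not taken from the paper.

```python
import math


def average_matrices(matrices):
    """Cell-wise mean of several equally sized matrices (a speaker 'model')."""
    rows, cols = len(matrices[0]), len(matrices[0][0])
    return [
        [sum(m[r][c] for m in matrices) / len(matrices) for c in range(cols)]
        for r in range(rows)
    ]


def correlation_coefficient(a, b):
    """Pearson correlation between two matrices, compared cell by cell."""
    xs = [v for row in a for v in row]
    ys = [v for row in b for v in row]
    if len(xs) != len(ys):
        raise ValueError("matrices must have the same number of cells")
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def absolute_difference_sum(a, b):
    """Sum of absolute cell-wise differences between two matrices."""
    return sum(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))


# Toy values only (hypothetical, not data from the study).
model_a = [[1.0, 2.0], [3.0, 4.0]]
model_b = [[2.0, 4.0], [6.0, 8.0]]
similarity = correlation_coefficient(model_a, model_b)
difference = absolute_difference_sum(model_a, model_b)
```

A high correlation together with a small difference sum would indicate similar spectral models; any decision threshold would have to be chosen from real data.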

A Proposition of the Fuzzy Correlation Dimension for Speaker Recognition (화자인식을 위한 퍼지상관차원 제안)

  • Yoo, Byong-Wook;Kim, Chang-Seok;Park, Hyun-Sook
    • Journal of the Korean Institute of Telematics and Electronics S / v.36S no.1 / pp.115-122 / 1999
  • In this paper, we confirmed that the speech signal is a chaotic signal and analyzed its chaos dimension in order to use it as a speaker recognition parameter. To improve speaker identification and pattern recognition, we proposed the fuzzy correlation dimension, obtained by constructing a strange attractor that captures an individual's vocal tract characteristics well and applying a fuzzy membership function to the correlation dimension. Because the estimated correlation between the points making up an attractor is limited according to the space dimension value, the fuzzy correlation dimension absorbs the variation between the reference pattern attractor and the test pattern attractor. We investigated the validity of the fuzzy correlation dimension as a speaker recognition parameter by estimating the distance according to the average discrimination error for each speaker and reference pattern.

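For orientation, the ordinary (non-fuzzy) correlation sum that underlies the correlation dimension can be sketched as below. This is the standard Grassberger-Procaccia quantity on a time-delay embedding, not the fuzzy variant proposed in the paper; the function names, embedding parameters, and toy values are assumptions for illustration.

```python
import math


def delay_embed(signal, dim, delay):
    """Rebuild attractor points from a 1-D signal by time-delay embedding."""
    count = len(signal) - (dim - 1) * delay
    return [
        tuple(signal[i + k * delay] for k in range(dim))
        for i in range(count)
    ]


def correlation_sum(points, radius):
    """Fraction of distinct point pairs that lie closer than `radius`."""
    n = len(points)
    if n < 2:
        raise ValueError("need at least two points")
    close = 0
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(points[i], points[j]) < radius:
                close += 1
    return close / (n * (n - 1) / 2)


# Toy signal (hypothetical); a real analysis would use speech samples.
points = delay_embed([1, 2, 3, 4, 5], dim=2, delay=1)
fraction = correlation_sum(points, radius=1.5)
```

The correlation dimension is then estimated as the slope of log C(r) against log r over a range of small radii, which requires far more points than this toy input.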

An acoustic study on the duration of the mora in Japanese (일본어 특수박의 지속시간에 관한 음향음성학적 분석)

  • Kim Seonhi
    • MALSORI / no.38 / pp.113-124 / 1999
  • It is well known that Japanese prosodic structure assumes the mora below the syllable tier. Syllables with V or CV structure are counted as having one mora, whereas those with coda consonants /-pp, -tt, -kk, -ss, -N/ or long vowels are counted as having two moras in Japanese. This study measured the acoustic duration of these special moras ('tokusyuhaku') produced by Tokyo dialect speakers to see if they are isochronous with V or CV. It also examined the production of Korean (Seoul/Kyungsang dialect) and Chinese native speakers learning Japanese as a second language to examine how the learners' first language influences their second language. Finally, it examined how speakers of the Akita dialect, which is known as a syllabeme dialect in Japanese, produced them. The results showed that intra-speaker variation as well as inter-speaker variation was observed in the production by Akita dialect speakers. Production by native speakers of Chinese and of the Kyungsang dialect of Korean, which have vowel length contrasts in their phonological systems, showed a result similar to that of Tokyo dialect speakers, which implies the influence of the learners' first language on the acquisition of the second language.


A Study on Adaptive Model Updating and a Priori Threshold Decision for Speaker Verification System (화자 확인 시스템을 위한 적응적 모델 갱신과 사전 문턱치 결정에 관한 연구)

  • 진세훈;이재희;강철호
    • The Journal of the Acoustical Society of Korea / v.19 no.5 / pp.20-26 / 2000
  • In a speaker verification system, updating the HMM (hidden Markov model) parameters from a small amount of data and deciding the a priori threshold are crucial factors in dealing with long-term variability in people's voices. In this paper we present a speaker model updating technique that adapts to session-to-session intra-speaker variability, together with an a priori threshold determination technique. The proposed technique decreases the verification errors caused by session-to-session intra-speaker variability by adapting the speaker model parameters to new speech data through Baum-Welch re-estimation. In this study the a priori threshold is determined by a hybrid score measurement that combines the world-model-based technique and the cohort-model-based technique. The results show that the proposed technique leads to better performance and that the difference in performance between the a posteriori threshold decision approach and the proposed a priori threshold decision approach is small.


An acoustical analysis method of numeric sounds by Praat (Praat를 이용한 숫자음의 음향적 분석법)

  • Yang, Byung-Gon
    • Speech Sciences / v.7 no.2 / pp.127-137 / 2000
  • This paper presents a macro script for analyzing numeric sounds with the speech analysis shareware Praat, and analyzes those sounds as produced by three students who were born and raised in Pusan. Recording was done in a quiet office. To make a meaningful comparison, dynamic time points in relation to the total duration of the voiced segments were determined for measuring acoustical values. Results showed strong correlation coefficients between repeated productions of numeric sounds within and across the speakers. Very high coefficients were noticed among the diphthongal numbers (0 and 6), which usually show wide formant variation. This suggests that each speaker produced the numbers quite consistently. Also, the frequency differences between the three subjects were within a perceptually similar range. Identifying a speaker among others may require finding subtle individual differences within this range. Perceptual experiments with synthesized numeric sounds may help resolve the issue.


A Study on Speaker Recognition using the Peak and valley pitch detection and the Fuzzy (국부 봉우리와 골에 의한 피치 검출과 퍼지를 이용한 화자 인식에 관한 연구)

  • 김연숙;김희주;김경재
    • Journal of the Korea Institute of Information and Communication Engineering / v.8 no.1 / pp.213-219 / 2004
  • This paper proposes a speaker recognition algorithm that includes a pitch parameter based on peaks and valleys. The time-frequency hybrid method for pitch extraction is valuable in that it can improve resolution in the time domain and accuracy in the frequency domain at the same time. The method builds reference patterns using a membership function and performs vocal tract recognition of common characters using fuzzy pattern matching in order to accommodate the time variation width of non-linear utterances. For the proposed method, speaker recognition experiments are carried out using vowels and number sounds.

Vocal Tract Length Normalization for Speech Recognition (음성인식을 위한 성도 길이 정규화)

  • 지상문
    • Journal of the Korea Institute of Information and Communication Engineering / v.7 no.7 / pp.1380-1386 / 2003
  • Speech recognition performance is degraded by the variation in vocal tract length among speakers. In this paper, we use a vocal tract length normalization method in which the frequency axis of the short-time spectrum of a speaker's speech is scaled to minimize the effect of the speaker's vocal tract length on speech recognition performance. To normalize vocal tract length, we tried several frequency warping functions, such as linear and piece-wise linear functions. A variable-interval piece-wise linear warping function is proposed to effectively model the variation of the frequency axis scale caused by large variation in vocal tract length. Experimental results on TIDIGITS connected digits showed a dramatic reduction of the word error rate from 2.15% to 0.53% with the proposed vocal tract normalization.
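A common two-segment piece-wise linear warping function can be sketched as follows. This is a generic illustration of frequency warping for vocal tract length normalization, not the variable-interval function proposed in the paper; the function name, the 4000 Hz band edge, and the break ratio are assumptions.

```python
def piecewise_linear_warp(freq, alpha, f_max=4000.0, break_ratio=0.85):
    """Warp a frequency (Hz) by factor alpha, keeping f_max fixed.

    Below the break frequency the axis is scaled by alpha; above it a second
    linear segment joins (f_break, alpha * f_break) to (f_max, f_max) so that
    the warped axis still ends at the band edge.
    """
    if not 0.0 <= freq <= f_max:
        raise ValueError("freq must lie in [0, f_max]")
    f_break = break_ratio * f_max
    if freq <= f_break:
        return alpha * freq
    slope = (f_max - alpha * f_break) / (f_max - f_break)
    return alpha * f_break + slope * (freq - f_break)


# Hypothetical warping factor; in practice alpha is searched per speaker,
# for example over a grid, to maximize the recognizer's likelihood.
warped = [piecewise_linear_warp(f, alpha=1.1) for f in (0.0, 1000.0, 4000.0)]
```

Pinning the upper end of the axis is what distinguishes this from plain linear scaling, which would push high frequencies outside the analysis band when alpha is greater than 1.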

Speaker Recognition Using Dynamic Time Variation of Orthogonal Parameters (직교인자의 동적 특성을 이용한 화자인식)

  • 배철수
    • The Journal of Korean Institute of Communications and Information Sciences / v.17 no.9 / pp.993-1000 / 1992
  • Recently, many researchers have found that the speaker recognition rate is high when recognition is performed by statistical processing of orthogonal parameters, which are derived from the analysis of the speech signal and carry much of the speaker's identity. This method, however, has problems caused by vocalization speed and the time-varying nature of speech. To solve these problems, this paper proposes two methods of speaker recognition that combine the DTW algorithm with orthogonal parameters extracted by the Karhunen-Loève transform: one applies the orthogonal parameters as feature vectors to the DTW algorithm, and the other applies the orthogonal parameters to the optimal path. In addition, we compare the speaker recognition rates obtained from the two proposed methods with that of the conventional method of statistical processing of orthogonal parameters. The orthogonal parameters used in this paper are derived from both the linear prediction coefficients and the partial correlation coefficients of the speech signal.

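The DTW alignment mentioned in the abstract above can be sketched as below. This is the textbook dynamic time warping recursion with a Euclidean local cost, shown only to illustrate how sequences spoken at different speeds are compared; the function name and the toy feature vectors are assumptions, and the paper's orthogonal parameters are not reproduced.

```python
import math


def dtw_distance(seq_a, seq_b):
    """Accumulated cost of the best DTW alignment of two vector sequences."""
    if not seq_a or not seq_b:
        raise ValueError("both sequences must be non-empty")
    n, m = len(seq_a), len(seq_b)
    # cost[i][j] is the best accumulated cost of aligning the first i frames
    # of seq_a with the first j frames of seq_b.
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = math.dist(seq_a[i - 1], seq_b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # frame of seq_b is repeated
                cost[i][j - 1],      # frame of seq_a is repeated
                cost[i - 1][j - 1],  # both sequences advance
            )
    return cost[n][m]


# Toy one-dimensional features (hypothetical): the second utterance holds
# the middle frame for longer, as a slower speaker might.
reference = [(0.0,), (1.0,), (2.0,)]
test_utterance = [(0.0,), (1.0,), (1.0,), (2.0,)]
distance = dtw_distance(reference, test_utterance)
```

Because repeated frames are absorbed by the alignment, the slower utterance matches the reference exactly, which is the property that makes DTW useful against variation in speaking rate.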

Articulatory characteristics and variation of Korean laterals

  • Hwang, Young;Charles, Sherman;Lulich, Steven M.
    • Phonetics and Speech Sciences / v.11 no.1 / pp.19-27 / 2019
  • Lateral approximants are well known as having complex articulatory characteristics, which vary cross-linguistically, across speakers, and across utterances. However, less attention has been paid to the articulation of Korean laterals, which do not contrast with a rhotic and may thus exhibit greater-than-normal variability. The focus of this study is to investigate the general articulatory characteristics of the Korean lateral [l] as well as the articulatory variation using novel 3D ultrasound imaging methods. The results of this study revealed significant between-speaker variation and some vowel-dependent variation with regard to the articulation of the Korean lateral [l], which has not been reported previously. Even though all participants in this study showed an anterior occlusion, the place of articulation and the size of the occlusion varied greatly across speakers. The data also revealed that left-right asymmetry is present in the articulation of the Korean lateral. The individual variation of the Korean lateral [l] suggests that it has a large articulatory-acoustic space for variation, since it has no contrasting sound that causes perceptual confusion.