Search | Korea Science

Flexible selection of feature vectors for speaker identification (화자 인식을 위한 특징 벡터의 유연한 선택)

Yoon, Sang-Min;Park, Gyeong-Mi;Kim, Gil-Yeon;O, Yeong-Hwan
- Proceedings of the KSPS conference / 2007.05a / pp.45-48 / 2007
This paper proposes a flexible method for selecting feature vectors for speaker identification. In speaker identification, overlapping regions between speaker models lower accuracy. A recently proposed method discards overlapped feature vectors regardless of what causes the overlap. We suggest a new method that uses both the features that overlap among speakers and the non-overlapped features to mitigate the effects of the overlap.

Target Speaker Speech Restoration via Spectral bases Learning (주파수 특성 기저벡터 학습을 통한 특정화자 음성 복원)

Park, Sun-Ho;Yoo, Ji-Ho;Choi, Seung-Jin
- Journal of KIISE:Software and Applications / v.36 no.3 / pp.179-186 / 2009
This paper proposes a target speech extraction method that restores the speech signal of a target speaker from a noisy convolutive mixture of speech and an interference source. We assume that the target speaker is known and that his or her utterances are available at training time. To incorporate the additional information extracted from the training utterances into the separation, we combine convolutive blind source separation (CBSS) with non-negative decomposition techniques, e.g., a probabilistic latent variable model. The non-negative decomposition is used to learn a set of bases from the spectrogram of the training utterances, where the bases represent the spectral information corresponding to the target speaker. Based on the learned spectral bases, our method provides two postprocessing steps for CBSS. The channel selection step finds the CBSS output channel that predominantly contains the target speech. The reconstruction step recovers the original spectrogram of the target speech from the selected output channel so that the remaining interference and background noise are suppressed. Experimental results show that our method substantially improves the separation results of CBSS and, as a result, successfully recovers the target speech.

A Single Sensor Active Noise Control Considering The Characteristics of The Speaker and The Microphone (스피커와 마이크의 전달특성을 고려한 단일 센서 능동소음제어)

김현태;박장식
- Journal of Korea Multimedia Society / v.6 no.7 / pp.1131-1138 / 2003
Active noise control (ANC) is an approach to noise reduction that introduces a secondary noise source to interfere destructively with the unwanted noise. Generally, the performance of ANC is determined by how well the secondary noise tracks the unwanted noise. The secondary noise is generated by the cancelling speaker, and an error sensor picks up the error signal. The transfer function between the cancelling speaker and the error sensor is not flat and distorts the secondary noise. Consequently, the transfer function degrades the performance of ANC. In this paper, a single-sensor ANC that considers the characteristics of the speaker and the error sensor is proposed. To reduce distortion of the secondary noise, the transfer function is estimated by adaptive inverse modelling and the primary noise is estimated by a Kalman filter. Experimental results show that the proposed single-sensor ANC effectively attenuates noise.

Voice personality transformation using an orthogonal vector space conversion (직교 벡터 공간 변환을 이용한 음성 개성 변환)

Lee, Ki-Seung;Park, Kun-Jong;Youn, Dae-Hee
- Journal of the Korean Institute of Telematics and Electronics B / v.33B no.1 / pp.96-107 / 1996
A voice personality transformation algorithm using orthogonal vector space conversion is proposed in this paper. Voice personality transformation is the process of changing one person's acoustic features (source) to those of another person (target). In this paper, personality transformation is achieved by changing the LPC cepstrum coefficients, excitation spectrum and pitch contour. An orthogonal vector space conversion technique is proposed to transform the LPC cepstrum coefficients. The LPC cepstrum transformation is implemented by principal component decomposition, applying the Karhunen-Loeve transformation and the minimum mean-square error coordinate transformation (MSECT). Additionally, we propose a pitch contour modification method to transform the prosodic characteristics of any speaker. To do this, reference pitch patterns for the source and target speakers are first built up, and speaker's one. The experimental results show the effectiveness of the proposed algorithm in both subjective and objective evaluations.

Voice Conversion Using Linear Multivariate Regression Model and LP-PSOLA Synthesis Method (선형다변회귀모델과 LP-PSOLA 합성방식을 이용한 음성변환)

권홍석;배건성
- The Journal of the Acoustical Society of Korea / v.20 no.3 / pp.15-23 / 2001
This paper presents a voice conversion technique that modifies the utterance of a source speaker so that it sounds as if it were spoken by a target speaker. Feature parameter conversion methods that transform the vocal tract and prosodic characteristics between the source and target speakers are described. The vocal tract characteristics are transformed by modifying the LPC cepstral coefficients using Linear Multivariate Regression (LMR). The prosodic transformation changes the average pitch period between speakers and is applied to the residual signal using the LP-PSOLA scheme. Experimental results show that speech transformed by LMR and the LP-PSOLA synthesis method carries many of the characteristics of the target speaker.
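The LMR step in the abstract above can be illustrated with a minimal sketch. It assumes that source and target feature vectors are already time-aligned frame by frame (the abstract does not say how alignment is done) and that the mapping is an ordinary least-squares affine fit. The function names `solve`, `fit_lmr` and `convert_frame` and the toy 2-D data are hypothetical and are not taken from the paper.

```python
def solve(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination with partial pivoting.

    a is an n x n matrix and b an n x m matrix, both as lists of rows.
    Assumes a is non-singular (no check is made).
    """
    n = len(a)
    aug = [list(a[i]) + list(b[i]) for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def fit_lmr(source_frames, target_frames):
    """Least-squares fit of target ~= W^T [source; 1] over aligned frame pairs."""
    xs = [list(x) + [1.0] for x in source_frames]  # append a bias term
    d, m = len(xs[0]), len(target_frames[0])
    xtx = [[sum(x[i] * x[j] for x in xs) for j in range(d)] for i in range(d)]
    xty = [[sum(x[i] * y[k] for x, y in zip(xs, target_frames)) for k in range(m)]
           for i in range(d)]
    return solve(xtx, xty)  # (d x m) weight matrix; the last row is the bias


def convert_frame(weights, frame):
    """Map one source feature vector into the target speaker's space."""
    xb = list(frame) + [1.0]
    return [sum(xb[i] * weights[i][k] for i in range(len(xb)))
            for k in range(len(weights[0]))]


# Toy 2-D "cepstral" frames; the target is an exact affine map of the source.
source = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 1.0]]
target = [[2 * x0 + x1 + 1.0, x0 - x1] for x0, x1 in source]
weights = fit_lmr(source, target)
converted = convert_frame(weights, [3.0, 2.0])  # close to [9.0, 1.0]
```

With real LPC cepstral vectors the fit would not be exact, and the normal equations may need regularization when frames are few or highly correlated.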

Noise Attenuation Effect According to the Direction of Secondary Sound Source in Duct ANC System (Duct ANC System에서 부가음원 방향별 소음감소효과)

Lee, Eung-Suk;Lee, Hyung-Seok
- Transactions of the Korean Society for Noise and Vibration Engineering / v.19 no.3 / pp.251-260 / 2009
In this paper, we studied how the direction of the canceling speaker in an ANC system affects the attenuation of automobile exhaust noise. Automobile exhaust noise was recorded from a diesel engine at 800 rpm, 3500 rpm and 5000 rpm. Using acrylic ducts made for the experiment, the direction of the canceling speaker can be set to $30^{\circ}$, $90^{\circ}$ and $150^{\circ}$ against the primary noise flow. A DSP board with a Texas Instruments TMS320C6416 chip was used to control the ANC system. The ANC system used a Filtered-x LMS algorithm modified to compensate for a property of the DSP input signal and for the secondary-path effect. The experiments showed that the direction of the canceling speaker influences the noise reduction. The $150^{\circ}$ duct attenuated noise better than the $90^{\circ}$ or $30^{\circ}$ duct.
https://doi.org/10.5050/KSNVN.2009.19.3.251
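For readers unfamiliar with the Filtered-x LMS algorithm named in the abstract above, here is a minimal textbook-style sketch. It does not reproduce the paper's modification for the DSP input signal. The FIR path coefficients, the single-tone reference, the step size `mu` and the assumption that the secondary-path estimate equals the true secondary path are all illustrative choices made here.

```python
import math


def fir(coeffs, history):
    """Output of an FIR filter given the newest-first sample history."""
    return sum(c * h for c, h in zip(coeffs, history))


def fxlms(reference, primary, secondary, secondary_est, taps=16, mu=0.01):
    """Run Filtered-x LMS and return the residual at the error sensor."""
    w = [0.0] * taps                       # adaptive controller weights
    x_hist = [0.0] * max(taps, len(primary), len(secondary_est))
    y_hist = [0.0] * len(secondary)        # controller output history
    fx_hist = [0.0] * taps                 # filtered-reference history
    errors = []
    for x in reference:
        x_hist = [x] + x_hist[:-1]
        d = fir(primary, x_hist)           # unwanted noise at the error sensor
        y = fir(w, x_hist)                 # signal sent to the canceling speaker
        y_hist = [y] + y_hist[:-1]
        e = d - fir(secondary, y_hist)     # residual after cancellation
        fx = fir(secondary_est, x_hist)    # reference filtered by the path estimate
        fx_hist = [fx] + fx_hist[:-1]
        w = [wi + mu * e * f for wi, f in zip(w, fx_hist)]
        errors.append(e)
    return errors


# Illustrative setup: a single tone, short FIR paths, perfect path estimate.
n = 2000
reference = [math.sin(2 * math.pi * 0.05 * i) for i in range(n)]
primary = [0.0, 0.0, 0.8, 0.3]
secondary = [0.0, 0.6, 0.2]
errors = fxlms(reference, primary, secondary, secondary_est=secondary)
early_power = sum(e * e for e in errors[:100]) / 100
late_power = sum(e * e for e in errors[-100:]) / 100
```

Comparing `early_power` with `late_power` shows how far the residual falls once the controller has adapted. In a real duct the secondary path has to be measured or estimated, and a mismatch between `secondary` and `secondary_est` slows or destabilizes convergence.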

A New Type Speaker Utilizing a Magneto-rheological Fluid Diaphragm (자기유변유체 다이어프램을 이용한 새로운 타입의 스피커)

Park, Jhin Ha;Yoon, Ji Young;Kim, Seon Hye;Lee, Tae Hoon;Lee, Soo Hyuk;Choi, Seung Bok
- Transactions of the Korean Society for Noise and Vibration Engineering / v.27 no.2 / pp.182-188 / 2017
In this work, a new type of speaker featuring various resonant frequencies is proposed utilizing a magneto-rheological (MR) fluid, and its performance is evaluated in terms of the change in the field-dependent sound pressure level. To achieve this goal, the overall concept of the speaker system is first discussed, and a controllable diaphragm is then made using MR fluid, whose rheological properties such as viscosity are controllable by the magnitude of the magnetic field. Then, the proposed speaker system, consisting of the inner structure and the squeeze-mode MR diaphragm, is set up in an anechoic room. The effectiveness of the proposed speaker system is experimentally evaluated under two conditions: with and without the magnetic field. The experimental tests show that the sound pressure level at different sound sources can be controlled, which cannot be achieved with a single conventional speaker system.
https://doi.org/10.5050/KSNVE.2017.27.2.182

Effectiveness of Active Noise Control through Three-Dimensional Sound (입체음향 제작기법을 통한 능동소음제어 방법의 효율성)

Park, Junhong;Kim, Junejong;Min, Dongki
- Proceedings of the Korean Society for Noise and Vibration Engineering Conference / 2014.10a / pp.955-956 / 2014
Active noise control is a noise reduction method that generates an anti-phase control signal, played through a control speaker, to interfere destructively with the noise. The purpose of this paper is to create a virtual control source using the DBAP (Distance-Based Amplitude Panning) algorithm, which is one of the three-dimensional sound reproduction methods, and to verify experimentally a noise control method that uses the virtual control source. We compared active noise control using one control speaker with active noise control using the DBAP algorithm.
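As a rough illustration of how DBAP places a virtual source, the sketch below computes per-speaker gains from the distances between the virtual source and each speaker, following the commonly cited DBAP formulation with equal speaker weights. The speaker layout, the 6 dB rolloff, the `blur` term and the function name `dbap_gains` are assumptions made here; the abstract does not give the paper's actual configuration.

```python
import math


def dbap_gains(source, speakers, rolloff_db=6.0, blur=0.1):
    """Return one amplitude gain per speaker for a 2-D virtual source.

    Gains fall off with distance and are normalized so that the sum of
    squared gains is 1 (constant total power). blur keeps the distance
    non-zero when the source sits exactly on a speaker.
    """
    a = rolloff_db / (20.0 * math.log10(2.0))  # distance exponent
    dists = [math.sqrt((sx - source[0]) ** 2 + (sy - source[1]) ** 2 + blur ** 2)
             for sx, sy in speakers]
    k = 1.0 / math.sqrt(sum(1.0 / d ** (2.0 * a) for d in dists))
    return [k / d ** a for d in dists]


# Four speakers on the corners of a 2 m square, virtual source near the first.
speakers = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (2.0, 2.0)]
gains = dbap_gains((0.5, 0.5), speakers)
total_power = sum(g * g for g in gains)  # normalized to 1 by construction
```

For ANC, the anti-phase control signal would be scaled by these gains before being sent to each speaker, so that the control source appears to sit at the chosen virtual position.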

GMM based Nonlinear Transformation Methods for Voice Conversion

Vu, Hoang-Gia;Bae, Jae-Hyun;Oh, Yung-Hwan
- Proceedings of the KSPS conference / 2005.11a / pp.67-70 / 2005
Voice conversion (VC) is a technique for modifying the speech signal of a source speaker so that it sounds as if it were spoken by a target speaker. Most previous VC approaches used a GMM-based linear transformation function to convert the source spectral envelope to the target spectral envelope. In this paper, we propose several nonlinear GMM-based transformation functions in an attempt to deal with the over-smoothing effect of linear transformation. To obtain high-quality modifications of speech signals, our VC system is implemented using the Harmonic plus Noise Model (HNM) analysis/synthesis framework. Experimental results are reported on the English corpus MOCHA-TIMIT.

Performance Improvement of Speaker Recognition Using Enhanced Feature Extraction in Glottal Flow Signals and Multiple Feature Parameter Combination (Glottal flow 신호에서의 향상된 특징추출 및 다중 특징파라미터 결합을 통한 화자인식 성능 향상)

Kang, Jihoon;Kim, Youngil;Jeong, Sangbae
- Journal of the Korea Institute of Information and Communication Engineering / v.19 no.12 / pp.2792-2799 / 2015
In this paper, we utilize source mel-frequency cepstral coefficients (SMFCCs), skewness, and kurtosis extracted from glottal flow signals to improve speaker recognition performance. Because the high-band magnitude response of glottal flow signals is generally somewhat flat, the SMFCCs are extracted using the response below a predefined cutoff frequency. The extracted SMFCCs, skewness, and kurtosis are concatenated with conventional feature parameters. Dimensionality reduction by principal component analysis (PCA) and linear discriminant analysis (LDA) then follows, so that performance can be compared with conventional systems under equivalent conditions. The proposed recognition system outperformed the conventional system in large-scale speaker recognition experiments. The performance improvement was especially noticeable for small Gaussian mixtures.
https://doi.org/10.6109/jkiice.2015.19.12.2792
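The skewness and kurtosis features in the abstract above can be sketched as follows. The sketch assumes population (biased) moments and non-excess kurtosis computed per frame; the abstract does not state which definitions the paper uses. The toy frame, the placeholder `conventional_features` vector and the function name are hypothetical.

```python
def skewness_kurtosis(frame):
    """Return (skewness, kurtosis) of one signal frame from central moments.

    Uses population moments; kurtosis is not excess (a Gaussian gives 3).
    Assumes the frame is not constant, so the variance is non-zero.
    """
    n = len(frame)
    mean = sum(frame) / n
    m2 = sum((v - mean) ** 2 for v in frame) / n
    m3 = sum((v - mean) ** 3 for v in frame) / n
    m4 = sum((v - mean) ** 4 for v in frame) / n
    return m3 / m2 ** 1.5, m4 / m2 ** 2


# Toy symmetric frame standing in for a glottal flow segment.
frame = [-2.0, -1.0, 0.0, 1.0, 2.0]
skew, kurt = skewness_kurtosis(frame)

# Concatenate with a placeholder for the conventional feature vector.
conventional_features = [0.1, -0.3, 0.7]
combined = conventional_features + [skew, kurt]
```

The combined vector is what would then go through dimensionality reduction such as PCA and LDA before modelling.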

