Integrated Search | Korea Science

Noisy Speech Recognition Based on Noise-Adapted HMMs Using Speech Feature Compensation

Chung, Yong-Joo
- 융합신호처리학회논문지 / Vol. 15, No. 2 / pp.37-41 / 2014
The vector Taylor series (VTS) based method usually employs clean-speech hidden Markov models (HMMs) when compensating speech feature vectors or adapting the parameters of trained HMMs. It is well known that noisy-speech HMMs trained with Multi-condition TRaining (MTR) or the Multi-Model-based Speech Recognition framework (MMSR) perform better than clean-speech HMMs in noisy speech recognition. In this paper, we propose a method that uses noise-adapted HMMs in VTS-based speech feature compensation. We derive a novel mathematical relation between the training and test noisy speech feature vectors in the log-spectrum domain, and VTS is used to estimate the statistics of the test noisy speech. An iterative EM algorithm estimates the training noisy speech from the test noisy speech along with the noise parameters. The proposed method was applied to noise-adapted HMMs trained by MTR and MMSR and significantly reduced the relative word error rate in noisy speech recognition experiments on the Aurora 2 database.
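
As a rough illustration of the VTS idea mentioned in this abstract, the sketch below uses the standard log-spectral mismatch function y = x + log(1 + exp(n - x)) for clean speech x and additive noise n, expanded to first order around the means. This is the textbook clean-speech relation, not the paper's new relation between training and test noisy speech, which the abstract does not spell out. The function name `vts_noisy_stats` and the assumption that x and n are independent Gaussians in a single log-spectral channel are illustrative choices.

```python
import math


def vts_noisy_stats(mu_x, var_x, mu_n, var_n):
    """First-order VTS approximation for one log-spectral channel.

    Assumes (hypothetically) independent Gaussian clean speech x and noise n,
    combined through y = x + log(1 + exp(n - x)).
    """
    # Mismatch function evaluated at the expansion point (mu_x, mu_n).
    mu_y = mu_x + math.log1p(math.exp(mu_n - mu_x))
    # Partial derivative dy/dx at the expansion point; dy/dn equals 1 - g.
    g = 1.0 / (1.0 + math.exp(mu_n - mu_x))
    # Linearized variance of y under the independence assumption.
    var_y = g * g * var_x + (1.0 - g) ** 2 * var_n
    return mu_y, var_y, g
```

When the noise mean is far below the speech mean, g approaches 1 and the noisy statistics collapse to the clean ones; when the two means are equal, speech and noise contribute equally.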

신경망과 퍼지논리를 이용한 음소인식에 관한 연구 (A Study on Phoneme Recognition Using Neural Networks and Fuzzy Logic)

한정현;최두일
- 대한전기학회:학술대회논문집 / 대한전기학회 1998년도 하계학술대회 논문집 G / pp.2265-2267 / 1998
This paper studies fast speaker-adaptive speech recognition. To analyze the speech signal efficiently in the time domain and the time-frequency domain, it uses SCONN [1] together with speech signal processing suited to fast speaker adaptation. It then examines how well the system adapts by testing recognition on speech data that is input after a speaker-dependent recognition test.

Spectrum 강조특성을 이용한 음성신호에서 Voiced - Unvoiced - Silence 분류 (Voiced, Unvoiced, and Silence Classification of Human Speech Signals by Emphasis Characteristics of Spectrum)

배명수;안수길
- 한국음향학회지 / Vol. 4, No. 1 / pp.9-15 / 1985
In this paper, we describe a new algorithm for deciding whether a given segment of a speech signal should be classified as voiced speech, unvoiced speech, or silence, based on parameters measured on the signal. The parameters measured for the voiced-unvoiced classification are the areas of each zero-crossing interval, obtained by multiplying the magnitude by the inverse zero-crossing rate of the speech signal. The parameter employed for the unvoiced-silence classification is the sum of the positive areas over each four-millisecond interval of the high-frequency-emphasized speech signal.
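
The zero-crossing interval area described in this abstract can be sketched as follows. The code splits a frame at every sign change and sums the absolute sample values inside each interval. Voiced speech, with large amplitudes and few zero crossings, tends to give large areas, while unvoiced speech gives many small ones. The function name and the choice to treat zero samples as positive are assumptions. The abstract gives no decision thresholds, so none are shown.

```python
def zero_crossing_interval_areas(samples):
    """Return the summed magnitude of each run of same-sign samples."""
    areas = []
    current_area = 0.0
    current_sign = 0  # 0 means no sample has been seen yet
    for s in samples:
        sign = 1 if s >= 0 else -1  # assumption: zero counts as positive
        if current_sign != 0 and sign != current_sign:
            # A zero crossing closes the current interval.
            areas.append(current_area)
            current_area = 0.0
        current_sign = sign
        current_area += abs(s)
    if current_sign != 0:
        areas.append(current_area)  # close the final interval
    return areas
```

For example, `zero_crossing_interval_areas([1, 2, -1, -1, -1, 3])` yields three intervals, each with area 3.0.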

웨이브렛 변환을 이용한 음성신호의 유성음/무성음/묵음 분류 (Voiced/Unvoiced/Silence Classification of Speech Signal Using Wavelet Transform)

손영호;배건성
- 음성과학 / Vol. 4, No. 2 / pp.41-54 / 1998
Speech signals are classified, according to the characteristics of the waveform, as voiced sound, unvoiced sound, or silence. Voiced sound, produced by an air flow generated by the vibration of the vocal cords, is quasi-periodic, while unvoiced sound, produced by a turbulent air flow passing through a constriction in the vocal tract, is noise-like. Silence represents the ambient noise signal during the absence of speech. Many speech analysis systems need to decide whether a given segment of a speech waveform should be classified as voiced, unvoiced, or silence. In this paper, a voiced/unvoiced/silence classification algorithm using spectral change in the wavelet-transformed signal is proposed, and experimental results are presented and discussed.
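
To give a feel for how a wavelet transform separates the low-band content typical of voiced sound from the high-band, noise-like content typical of unvoiced sound, here is a minimal sketch using one level of the Haar transform. The Haar wavelet, the single decomposition level, and the function names are assumptions for illustration. The abstract does not say which wavelet or which spectral-change measure the authors use.

```python
import math


def haar_dwt(samples):
    """One level of the orthonormal Haar transform (pairs of samples)."""
    scale = 1.0 / math.sqrt(2.0)
    pairs = range(0, len(samples) - 1, 2)
    approx = [(samples[i] + samples[i + 1]) * scale for i in pairs]
    detail = [(samples[i] - samples[i + 1]) * scale for i in pairs]
    return approx, detail


def band_energies(samples):
    """Energy in the low band (approximation) and high band (detail)."""
    approx, detail = haar_dwt(samples)
    low = sum(a * a for a in approx)
    high = sum(d * d for d in detail)
    return low, high
```

A slowly varying frame puts almost all of its energy in the low band, a rapidly alternating frame puts it in the high band, and a silent frame has little energy in either.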

회의실내 유리창 진동의 도청에 대한 연구 (A Study on the Eavesdropping of the Glass Window Vibration in a Conference Room)

김석현;김윤호;허욱
- 산업기술연구 / Vol. 27, No. A / pp.55-60 / 2007
The possibility of eavesdropping is investigated for a coupled conference room and glass window system. Speech intelligibility analysis is performed on the sound recovered from the glass window vibration. Using an MLS (Maximum Length Sequence) signal as the sound source, the acceleration and velocity responses of the glass window are measured with an accelerometer and a laser Doppler vibrometer. The MTF (Modulation Transfer Function) is used to identify the speech transmission characteristics of the room and window system. The STI (Speech Transmission Index) is calculated from the MTF, and the speech intelligibility of the vibration sound is estimated. The speech intelligibilities obtained from the acceleration signal and the velocity signal are compared.
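
The step from MTF to STI can be illustrated with a simplified calculation. Each modulation transfer value m is converted to an apparent signal-to-noise ratio, 10 log10(m / (1 - m)), clipped to the range -15 dB to +15 dB, and mapped linearly onto 0 to 1. The sketch below then takes a plain average. The full STI procedure applies octave-band weighting and further corrections, which are omitted here. The function name and the clamping constant are assumptions.

```python
import math


def sti_from_mtf(m_values):
    """Simplified, unweighted STI estimate from modulation transfer values."""
    indices = []
    for m in m_values:
        # Keep m strictly inside (0, 1) so the logarithm stays finite.
        m = min(max(m, 1e-6), 1.0 - 1e-6)
        snr_db = 10.0 * math.log10(m / (1.0 - m))  # apparent SNR
        snr_db = min(max(snr_db, -15.0), 15.0)     # clip to +/- 15 dB
        indices.append((snr_db + 15.0) / 30.0)     # map to 0..1
    return sum(indices) / len(indices)
```

A modulation transfer value of 0.5 corresponds to an apparent SNR of 0 dB and therefore to an index of 0.5.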

마이크로폰 배열에서 독립벡터분석 기법을 이용한 잡음음성의 음질 개선 (Microphone Array Based Speech Enhancement Using Independent Vector Analysis)

왕씽양;전성일;배건성
- 말소리와 음성과학 / Vol. 4, No. 4 / pp.87-92 / 2012
Speech enhancement aims to improve speech quality by removing background noise from noisy speech. Independent vector analysis is a frequency-domain independent component analysis method that is known to be free from the frequency bin permutation problem in blind source separation from multi-channel inputs. This paper proposes a new method of microphone array based speech enhancement that combines independent vector analysis and beamforming. Independent vector analysis is used to separate the speech and noise components of multi-channel noisy speech, and delay-sum beamforming is used to determine which of the separated signals is the enhanced speech. To verify the effectiveness of the proposed method, experiments were carried out on computer-simulated multi-channel noisy speech at various signal-to-noise ratios, and both PESQ and output signal-to-noise ratio were obtained as objective speech quality measures. The experimental results show that the proposed method is superior to conventional microphone array based noise removal approaches such as GSC beamforming for speech enhancement.
https://doi.org/10.13064/KSSS.2012.4.4.087
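
The delay-sum beamforming component named in this abstract can be sketched in its simplest time-domain form: each microphone channel is advanced by its known arrival delay so that the target aligns across channels, and the aligned channels are averaged. Integer sample delays, the function name, and the omission of the independent vector analysis stage are all simplifying assumptions.

```python
def delay_and_sum(channels, delays):
    """Align channels by integer sample delays and average them.

    channels: list of equal-rate sample lists, one per microphone.
    delays:   arrival delay of the target in each channel, in samples.
    """
    # Only keep the span where every aligned channel still has samples.
    length = min(len(ch) - d for ch, d in zip(channels, delays))
    return [
        sum(ch[d + i] for ch, d in zip(channels, delays)) / len(channels)
        for i in range(length)
    ]
```

The target adds coherently after alignment, while noise that is uncorrelated across microphones is attenuated by the averaging.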

재귀적 지연추정기를 갖는 적응잡음제거 기법을 이용한 음성개선 (Speech Enhancement Using the Adaptive Noise Canceling Technique with a Recursive Time Delay Estimator)

강해동;배근성
- 전자공학회논문지B / Vol. 31B, No. 7 / pp.33-41 / 1994
A single-channel adaptive noise canceling (ANC) technique with a recursive time delay estimator (RTDE) is presented for removing the effects of additive noise on the speech signal. While the conventional method forms the reference signal for the adaptive filter using the pitch estimated on a frame basis from the input speech, the proposed method forms the reference signal using the delay estimated recursively on a sample-by-sample basis. As the RTDEs, recursion formulae for the autocorrelation function (ACF) and the average magnitude difference function (AMDF) are derived. The normalized least mean square (NLMS) and recursive least squares (RLS) algorithms are applied to adapt the filter coefficients. Experimental results with noisy speech demonstrate that the proposed method improves the perceived speech quality as well as the signal-to-noise ratio and cepstral distance compared with the conventional method.
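
For orientation, the average magnitude difference function (AMDF) used for delay estimation can be sketched in its ordinary block form: the lag at which the signal best matches a delayed copy of itself gives the pitch delay. The paper derives a recursive, sample-by-sample version, which the abstract does not reproduce, so the code below is the plain non-recursive definition. The function names and the lag search range are assumptions.

```python
def amdf(samples, lag):
    """Average magnitude difference between the signal and its lagged copy."""
    n = len(samples) - lag
    return sum(abs(samples[i + lag] - samples[i]) for i in range(n)) / n


def estimate_delay(samples, min_lag, max_lag):
    """Return the lag in [min_lag, max_lag] with the smallest AMDF."""
    return min(range(min_lag, max_lag + 1), key=lambda k: amdf(samples, k))
```

The search range should exclude multiples of the true period, because the AMDF of a periodic signal dips at every multiple.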

단순화된 다중 모드 방법을 이용한 음성 부호화기 (A Speech Coder using the Simplified Multi-mode Method)

강홍구
- 한국음향학회:학술대회논문집 / 한국음향학회 1995년도 제12회 음성통신 및 신호처리 워크샵 논문집 (SCAS 12권 1호) / pp.146-149 / 1995
This paper proposes an SM-CELP speech coder that applies a different excitation signal according to the characteristics of each speech segment at bit rates below 4 kbps. The speech signal is classified into two modes, stationary voice and others, using the average energy of the short-time speech and the residual signal after long-term prediction. A structured multi-pulse method is used for the excitation in mode A, and a Gaussian or pulse-like codebook is used in mode B. The 4.8 kbps DoD-CELP coder is used as the reference for evaluating the proposed coder. The proposed method shows a 1 to 2 dB higher segmental signal-to-noise ratio and better subjective quality without increasing the computational load.
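
The segmental signal-to-noise ratio quoted in this abstract is commonly computed by averaging per-frame SNR values in dB rather than taking one SNR over the whole utterance. The sketch below follows that common definition. The frame length, the function name, and the choice to skip frames with zero signal or zero error energy are assumptions, since the abstract does not describe the exact measurement setup.

```python
import math


def segmental_snr_db(reference, decoded, frame_len):
    """Mean of per-frame SNR values (dB) over non-overlapping frames."""
    frame_snrs = []
    for start in range(0, len(reference) - frame_len + 1, frame_len):
        ref = reference[start:start + frame_len]
        dec = decoded[start:start + frame_len]
        signal = sum(r * r for r in ref)
        noise = sum((r - d) ** 2 for r, d in zip(ref, dec))
        # Skip frames where the ratio is undefined (assumption).
        if signal > 0.0 and noise > 0.0:
            frame_snrs.append(10.0 * math.log10(signal / noise))
    return sum(frame_snrs) / len(frame_snrs) if frame_snrs else float("nan")
```

Because every frame contributes equally, quiet frames influence the result as much as loud ones.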

서브밴드 스케일링에 의한 음성신호의 피치변경법에 관한 연구 (A Study on the Pitch Alteration Technique by Subband Scaling in Speech Signal)

김영구;배명진
- 음성과학 / Vol. 10, No. 4 / pp.137-147 / 2003
Speech synthesis can be classified by synthesis method into waveform coding, source coding, and hybrid coding. Waveform coding in particular is suitable for high-quality synthesis. However, it is not well suited to synthesis in syllable or phoneme units because it does not separate the excitation and formant parts and handle them individually. A pitch alteration method is therefore needed to apply waveform coding to synthesis by rule. This study proposes a pitch alteration method that applies spectrum scaling after flattening the spectrum by subband linear approximation, in order to minimize spectral distortion. The proposed method is compared with the LPC, cepstrum, and lifter-function methods. The evaluation obtains the distribution of each flattened signal and measures the degree of spectral flatness: each flattened signal is normalized so that its highest point is zero, and the distribution of the zero-mean signal is calculated. The spectral distortion rate is then measured to assess the performance of the proposed method. The average spectral distortion rate stayed below 2.12%, so the proposed method outperforms the existing methods.

음성 신호의 주파수 영역에서의 주파수 대역별 공분산 행렬의 고유값 분석 (Analysis of Eigenvalues of Covariance Matrices of Speech Signals in Frequency Domain for Various Bands)

김선일
- 한국정보통신학회:학술대회논문집 / 한국정보통신학회 2016년도 춘계학술대회 / pp.293-296 / 2016
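A speech signal is a combination of consonant and vowel signals, but by nature vowels last longer than consonants. The correlation between blocks of a speech signal can therefore be regarded as fairly large overall. Even within the same speech signal, however, the correlation differs from one frequency band to another. The speech signal is divided into blocks of 128 samples, and the FFT of each block is computed. From the FFT values in each of several frequency bands, the covariance matrix with neighboring blocks is computed, and its eigenvalues are calculated. The first eigenvalue is related to the principal component. After obtaining the principal component for various frequency bands, we examine how its values are distributed across bands and analyze which band's covariance-matrix eigenvalues should be selected to obtain more stable results.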
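
The procedure in this abstract can be sketched under some loudly stated assumptions. The abstract does not say how the covariance matrix is formed, so the code below summarizes each 128-sample block by the summed DFT magnitude in a chosen band, pairs each block with its next neighbor, and computes the eigenvalues of the resulting 2x2 covariance matrix in closed form. The function names, the 2x2 construction, and the use of a direct DFT instead of an FFT library are all illustrative choices.

```python
import cmath
import math


def band_magnitude(block, lo_bin, hi_bin):
    """Sum of DFT magnitudes of one block over bins lo_bin..hi_bin."""
    n = len(block)
    total = 0.0
    for k in range(lo_bin, hi_bin + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(block))
        total += abs(coeff)
    return total


def symmetric_2x2_eigenvalues(var_a, var_b, cov):
    """Eigenvalues of [[var_a, cov], [cov, var_b]], largest first."""
    half_trace = (var_a + var_b) / 2.0
    root = math.sqrt(((var_a - var_b) / 2.0) ** 2 + cov ** 2)
    return half_trace + root, half_trace - root


def neighbor_covariance_eigenvalues(signal, lo_bin, hi_bin, block_len=128):
    """Eigenvalues of the covariance between neighboring block magnitudes."""
    blocks = [signal[i:i + block_len]
              for i in range(0, len(signal) - block_len + 1, block_len)]
    mags = [band_magnitude(b, lo_bin, hi_bin) for b in blocks]
    a, b = mags[:-1], mags[1:]  # each block paired with its next neighbor
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    var_a = sum((x - mean_a) ** 2 for x in a) / n
    var_b = sum((y - mean_b) ** 2 for y in b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / n
    return symmetric_2x2_eigenvalues(var_a, var_b, cov)
```

When neighboring blocks are strongly correlated in a band, the first eigenvalue dominates and the second is close to zero, which matches the abstract's remark that the first eigenvalue relates to the principal component.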

Search results: 1,174 items; processing time: 0.023 seconds
