Search | Korea Science

Speech Processing System Using a Noise Reduction Neural Network Based on FFT Spectrums

Choi, Jae-Seung
- Journal of information and communication convergence engineering / v.10 no.2 / pp.162-167 / 2012
This paper proposes a speech processing system based on a model of the human auditory system and a noise reduction neural network with fast Fourier transform (FFT) amplitude and phase spectrums for noise reduction under background noise environments. The proposed system reduces noise signals by using the proposed neural network based on FFT amplitude spectrums and phase spectrums, then implements auditory processing frame by frame after detecting voiced and transitional sections for each frame. The results of the proposed system are compared with the results of a conventional spectral subtraction method and minimum mean-square error log-spectral amplitude estimator at different noise levels. The effectiveness of the proposed system is experimentally confirmed based on measuring the signal-to-noise ratio (SNR). In this experiment, the maximal improvement in the output SNR values with the proposed method is approximately 11.5 dB better for car noise, and 11.0 dB better for street noise, when compared with a conventional spectral subtraction method.
https://doi.org/10.6109/jicce.2012.10.2.162
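The abstract above evaluates noise reduction by the improvement in output SNR. As a minimal sketch of how such a figure can be computed (not the paper's evaluation code), the Python below treats the difference between a processed signal and the clean reference as residual noise. The function name `snr_db` and the toy sine signals are hypothetical and chosen only for illustration.

```python
import math

def snr_db(clean, processed):
    """Return the SNR in dB, treating (processed - clean) as residual noise."""
    signal_power = sum(s * s for s in clean)
    noise_power = sum((p - s) ** 2 for s, p in zip(clean, processed))
    if noise_power == 0:
        return math.inf
    return 10.0 * math.log10(signal_power / noise_power)

# Hypothetical toy data: a clean sine plus an interfering tone.
clean = [math.sin(2 * math.pi * 5 * n / 100) for n in range(100)]
interference = [math.cos(2 * math.pi * 23 * n / 100) for n in range(100)]

# "noisy" is the system input; "enhanced" stands in for a system output
# whose residual noise amplitude is five times smaller.
noisy = [s + 0.5 * v for s, v in zip(clean, interference)]
enhanced = [s + 0.1 * v for s, v in zip(clean, interference)]

# SNR improvement in dB: output SNR minus input SNR.
improvement = snr_db(clean, enhanced) - snr_db(clean, noisy)
print(f"SNR improvement: {improvement:.2f} dB")
```

Comparing two methods at the same input noise level then amounts to comparing their `improvement` values.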

Practical Considerations for Hardware Implementations of the Auditory Model and Evaluations in Real World Noisy Environments

Kim, Doh-Suk;Jeong, Jae-Hoon;Lee, Soo-Young;Kil, Rhee M.
- The Journal of the Acoustical Society of Korea / v.16 no.1E / pp.15-23 / 1997
The Zero-Crossings with Peak Amplitudes (ZCPA) model, motivated by the human auditory periphery, was proposed to extract reliable features from speech signals even in noisy environments for robust speech recognition. In this paper, some practical considerations for digital hardware implementations of the ZCPA model are addressed and evaluated for recognition of speech corrupted by several real-world noises as well as white Gaussian noise. The infinite impulse response (IIR) filters that constitute the cochlear filterbank of the ZCPA are replaced by Hamming bandpass filters, whose frequency responses are less similar to biological neural tuning curves. Experimental results demonstrate that the detailed frequency response of the cochlear filters is not critical to performance. Also, the sensitivity of the model output to variations in microphone gain is investigated, and the results indicate good reliability of the ZCPA model.

Pitch Detection of Speech Signals Using Wavelet Transform (웨이브렛 변환을 이용한 음성 신호의 피치 검출)

Lee, Min-Woo;Sohn, Joon-Il;Choi, Dong-Woo;Beack, Seung-Hwa;Kim, Jin-Soo
- Proceedings of the KIEE Conference / 1995.11a / pp.149-153 / 1995
In this paper, the wavelet transform, with its multi-resolution property, is used to improve the accuracy of pitch estimation for speech signals. Pitch detection is based on the local maxima of the wavelet transform. The wavelet transform of a signal is a multiscale decomposition that is well localized in space and frequency. The proposed pitch detection algorithm is suitable for both low-pitched and high-pitched speakers.

Pitch Detection Using Variable LPF

Hong KEUM
- Proceedings of the Acoustical Society of Korea Conference / 1994.06a / pp.963-970 / 1994
In speech signal processing, it is very important to detect the pitch accurately. The pitch extraction algorithms proposed so far are not sufficient to detect fine pitch in speech signals. We therefore propose a new algorithm that takes advantage of G-peak extraction. The method finds the MZCI (maximum zero-crossing interval), which is defined as the cut-off bandwidth rate of the LPF (low-pass filter), and detects the pitch period of voiced signals. The algorithm performs robustly, with a gross error rate of 3.63% even in a 0 dB SNR environment. The gross error rate for clean speech is only 0.18%. It can also carry out the entire process quickly.
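The abstract above combines low-pass filtering with zero-crossing intervals to find the pitch period. The sketch below is a generic Python illustration of that idea only; it is not the G-peak or MZCI algorithm, whose details the abstract does not give. The moving-average filter, its fixed `width`, the median rule, and the synthetic test signal are all assumptions made for this example.

```python
import math
from statistics import median

def moving_average(x, width):
    """Crude low-pass filter: mean of the last `width` samples (assumed filter)."""
    out = []
    acc = 0.0
    for i, v in enumerate(x):
        acc += v
        if i >= width:
            acc -= x[i - width]
        out.append(acc / min(i + 1, width))
    return out

def upward_zero_crossings(x):
    """Indices where the signal goes from negative to non-negative."""
    return [i for i in range(1, len(x)) if x[i - 1] < 0.0 <= x[i]]

def pitch_period(x, width):
    """Median interval, in samples, between upward zero crossings after filtering."""
    crossings = upward_zero_crossings(moving_average(x, width))
    intervals = [b - a for a, b in zip(crossings, crossings[1:])]
    return median(intervals) if intervals else None

# Hypothetical voiced-like signal: 100 Hz fundamental plus its 5th harmonic.
fs = 8000
f0 = 100.0
x = [math.sin(2 * math.pi * f0 * n / fs)
     + 0.4 * math.sin(2 * math.pi * 5 * f0 * n / fs + 0.3)
     for n in range(1600)]

# A 16-sample window spans one full cycle of the 500 Hz harmonic and cancels it.
period = pitch_period(x, width=16)
print("estimated F0:", fs / period, "Hz")
```

A variable LPF, as named in the title, would adapt the filter to the signal instead of fixing `width` by hand as done here.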

On Altering the Pitch of Speech Signals in Waveform Coding -(Altering Method by the LPC and the Pitch Halving)- (음성 파형코딩의 음원피치 변경에 관한 연구 - LPC와 주기반분법에 의한 피치변경법 -)

민경중
- Proceedings of the Acoustical Society of Korea Conference / 1991.06a / pp.45-49 / 1991
In speech synthesis, high-quality waveform coding is mainly used for synthesis by analysis. However, it is difficult to apply waveform coding to synthesis by rule, because the parameters of this coding are not separated into excitation parameters and vocal tract parameters. In this paper, we propose a new pitch alteration method that can change the pitch periods in waveform coding. The proposed method expands the pitch period by the LPC synthesis method, and then compresses the period by the waveform halving technique. Thus, waveform coding can be applied to synthesis by rule in speech processing.

Application of the Wavelet transformation to denoising and analyzing the speech

Hung Phan Duy;Lan Huong Nguyen Thi;Ngoc Yen Pham Thi;Castelli Eric
- Proceedings of the IEEK Conference / summer / pp.249-253 / 2004
The wavelet transform (WT) has attracted many engineers and scientists because of its excellent properties. Its combination of a practical approach and a theoretical basis not only solves currently important problems, but also offers the potential to formulate and solve completely new problems. It has been shown that multi-resolution analysis with wavelet transforms is a good solution for speech analysis, and that thresholding of wavelet coefficients has a near-optimal noise reduction property for many classes of signals. This paper proposes applications of wavelets in speech processing: pitch detection, voiced-unvoiced (V-UV) decision, and denoising, with detailed algorithms and results.
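The abstract above refers to denoising by thresholding wavelet coefficients. As a minimal Python sketch of that general technique, the code below applies soft thresholding to the detail coefficients of a one-level Haar transform. The choice of the Haar wavelet, the single decomposition level, the threshold value, and all function names are assumptions for illustration; the abstract does not state which wavelet or threshold rule the paper uses.

```python
import math

def haar_forward(x):
    """One-level orthonormal Haar transform of an even-length sequence."""
    s = math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) / s for i in range(0, len(x), 2)]
    detail = [(x[i] - x[i + 1]) / s for i in range(0, len(x), 2)]
    return approx, detail

def haar_inverse(approx, detail):
    """Invert haar_forward, interleaving the reconstructed sample pairs."""
    s = math.sqrt(2.0)
    x = []
    for a, d in zip(approx, detail):
        x.extend([(a + d) / s, (a - d) / s])
    return x

def soft_threshold(coeffs, t):
    """Shrink each coefficient toward zero by t; zero those with magnitude <= t."""
    return [math.copysign(max(abs(c) - t, 0.0), c) for c in coeffs]

def denoise(x, t):
    """Threshold only the detail coefficients, then reconstruct."""
    approx, detail = haar_forward(x)
    return haar_inverse(approx, soft_threshold(detail, t))

# Hypothetical piecewise-constant signal with small fluctuations.
noisy = [4.0, 4.2, 4.0, 3.8, 1.0, 1.1, 1.0, 0.9]
smoothed = denoise(noisy, t=0.5)
print(smoothed)
```

With a threshold larger than every detail coefficient, each pair of samples collapses to its mean, while the large step between the two halves survives because it lives in the approximation coefficients.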

Classification of Pathological Voice Signal with Severe Noise Component

Li, Ta-O;Jo, Cheol-Woo
- Speech Sciences / v.10 no.4 / pp.107-115 / 2003
In this paper, we try to classify pathological voice signals with a severe noise component based on two parameters: the spectral slope and the ratio of energies in the harmonic and noise components (HNR). The spectral slope is obtained by a curve fitting method, and the HNR is computed in the cepstral quefrency domain. Speech data from normal subjects and patients are collected, diagnosed, and divided into three classes (normal, relatively less noisy, and severely noisy data). The mean values and standard deviations of the spectral slope and the HNR are computed and compared across the three kinds of data to characterize the severely noisy pathological voice signals and distinguish them from the others.

An Analysis Method of Strange Attractor for the Feature Extraction (음성 특징 추출을 위한 스트레인지 어트랙터의 분석 방법)

Kim, Tae-Sik
- Speech Sciences / v.9 no.2 / pp.147-155 / 2002
In speech processing, raw signals are usually presented in a 2D format. However, such presentation methods limit the characteristics that can be extracted from the signal. Generally, not much information can be detected from a 2D signal. The strange attractor from chaos theory provides a 3D presentation method. In recognition problems, the signal presentation method is very important because good features can be detected from a good presentation. This paper discusses a new feature extraction method that extracts features from one cycle of the strange attractor. A neural network is used to check whether the method extracts suitable features. The results show promising properties that can be applied to some areas of signal processing.

A Speaker Recognition Based on Strange Attractor with Vector Average (벡터 평균값을 갖는 스트레인지 어트랙터 기반 화자인식)

Kim, Tae-Sik
- Speech Sciences / v.8 no.3 / pp.133-142 / 2001
In speech processing, raw signals are usually presented in a 2D format, and different kinds of algorithms use that format to solve their problems. However, such presentation methods limit the characteristics that can be extracted from the signal, even when the algorithms are quite good. The basic reason is that not much information can be detected from a 2D signal. The strange attractor from chaos theory provides a 3D presentation method. In recognition problems, the signal construction method is very important because good features can be detected from a well-shaped attractor. This paper discusses a new presentation method that constructs the strange attractor in a different way. The normal strange attractor uses the time-delay idea, while the new method uses both time delay and vector average. This method provides good information that can be applied to the speaker recognition problem.
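The time-delay idea mentioned in the abstract above can be sketched in a few lines of Python: each 3D point is built from the signal value at a sample and at two later, equally spaced samples. The function name `delay_embed` and its parameters are hypothetical. The vector-average step of the paper is not described in the abstract, so it is not attempted here.

```python
def delay_embed(x, delay, dim=3):
    """Return points (x[n], x[n + delay], ..., x[n + (dim - 1) * delay])."""
    span = (dim - 1) * delay
    return [tuple(x[n + k * delay] for k in range(dim))
            for n in range(len(x) - span)]

# Hypothetical short signal: with delay=2, each 3D point takes every second sample.
points = delay_embed([0, 1, 2, 3, 4, 5], delay=2)
print(points)  # [(0, 2, 4), (1, 3, 5)]
```

Plotting such points for a voiced speech frame traces out the attractor trajectory from which features would then be extracted.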

Time-varying Estimation of Vocal Track Parameters During the Speech Transition Regions (음성천이구간에서의 성도 파라메타 시변추정에 관한 연구)

Choi, Hong-Sub
- The Journal of the Acoustical Society of Korea / v.16 no.2 / pp.101-106 / 1997
In this paper, a sample selective RLS (SSRLS) method is proposed, which aims to eliminate the influence of pitch bias. Its basic concept is as follows. First, it extracts the open glottis interval using the residual signals; then it estimates the formant values from the selected speech samples, excluding the open glottis interval. This method has some analogy with the SSLPS. Simulations are conducted on synthetic and real speech. From these results, we find the proposed method more useful than the conventional ones.

