• Title/Summary/Keyword: Speech signal processing

A Generalized Subspace Approach for Enhancing Speech Corrupted by Colored Noise Using Whitening Transformation (유색 잡음에 오염된 음성의 향상을 위한 백색 변환을 이용한 일반화 부공간 접근)

  • Lee, Jeong-Wook;Son, Kyung-Sik;Park, Jang-Sik;Kim, Hyun-Tae
    • Journal of the Korea Institute of Information and Communication Engineering / v.15 no.8 / pp.1665-1674 / 2011
  • In this paper, we propose an algorithm for enhancing speech corrupted by colored noise. When the colored noise is uncorrelated with the speech signal, a whitening transformation turns the colored noise into white noise. The transformed signal is then passed to the generalized subspace approach for speech enhancement. The spectral distortion that the whitening pre-processing introduces into the speech is undone by applying the inverse whitening transformation as post-processing. The performance of the proposed algorithm is confirmed by computer simulation, using car noise and multi-talker babble as the colored noises. On data from the AURORA and TIMIT databases, the proposed algorithm outperforms the previous approach in terms of SNR and SSD.
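The whitening and inverse-whitening steps named in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes the noise covariance is estimated from noise-only frames and factored with a Cholesky decomposition, and the helper name `whitening_pair`, the frame length, and the synthetic colored noise are all hypothetical. The subspace enhancement stage itself is not shown.

```python
import numpy as np

def whitening_pair(noise_cov):
    """Return (W, W_inv) such that W @ noise_cov @ W.T is the identity."""
    L = np.linalg.cholesky(noise_cov)  # noise_cov = L @ L.T, L lower triangular
    W = np.linalg.inv(L)               # whitening matrix (pre-processing)
    return W, L                        # L undoes the whitening (post-processing)

rng = np.random.default_rng(0)

# Hypothetical colored noise: white noise passed through a short FIR filter.
white = rng.standard_normal(20000)
colored = np.convolve(white, [1.0, 0.8, 0.5], mode="valid")

# Cut the noise into non-overlapping frames and estimate its covariance.
frame_len = 8
n_frames = len(colored) // frame_len
frames = colored[: n_frames * frame_len].reshape(n_frames, frame_len)
noise_cov = np.cov(frames, rowvar=False)

W, W_inv = whitening_pair(noise_cov)
whitened_cov = W @ noise_cov @ W.T     # close to the identity matrix

# Any frame y can be whitened as W @ y and restored afterwards as W_inv @ (W @ y).
y = frames[0]
restored = W_inv @ (W @ y)
```

In a full system the enhancement algorithm would run on `W @ y`, and `W_inv` would be applied to its output to remove the spectral distortion that whitening introduces.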

Transient Noise Reduction in Speech Signal Utilizing a Long-term Predictor (장구간 예측 필터를 이용한 음성 신호에서의 돌발 잡음 제거)

  • Choi, Min-Seok;Kang, Hong-Goo
    • The Journal of the Acoustical Society of Korea / v.31 no.1 / pp.29-38 / 2012
  • This paper presents a system for reducing transient noise in a speech signal. The proposed system uses a median filter to reduce the transient noise. Because the median filter can distort speech while reducing noise, a long-term prediction (LTP) filter is adopted as a pre-processor to minimize speech distortion. The speech information preserved by the LTP filter is re-synthesized after the noise is reduced. The paper demonstrates the weakness of a linear prediction (LP) filter and the superiority of the LTP filter for preserving the speech component in the presence of transient noise. With the proposed system, the output signal-to-noise ratio (SNR) improves by 8 dB in regions where both speech and noise are present, and the PESQ score increases by 1 point compared with the noisy input.
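To illustrate why a median filter suits transient noise, here is a minimal sketch of a sliding-window median applied to a short ramp with one impulsive outlier. The function name `median_filter`, the window width, and the toy signal are assumptions for illustration; the LTP pre-processing and re-synthesis stages described in the abstract are not shown.

```python
from statistics import median

def median_filter(signal, width=3):
    """Sliding-window median; windows are truncated at the signal edges."""
    if width % 2 == 0:
        raise ValueError("width must be odd")
    half = width // 2
    out = []
    for i in range(len(signal)):
        lo = max(0, i - half)
        hi = min(len(signal), i + half + 1)
        out.append(median(signal[lo:hi]))
    return out

# Toy signal: a slow ramp with a single transient spike at index 4.
clean = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8]
noisy = list(clean)
noisy[4] = 9.0

filtered = median_filter(noisy, width=3)
```

The spike is replaced by a neighboring value, but samples around it are also altered slightly, which is the kind of distortion the abstract says the LTP stage is meant to limit.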

Speech Basis Matrix Using Noise Data and NMF-Based Speech Enhancement Scheme (잡음 데이터를 활용한 음성 기저 행렬과 NMF 기반 음성 향상 기법)

  • Kwon, Kisoo;Kim, Hyung Young;Kim, Nam Soo
    • The Journal of Korean Institute of Communications and Information Sciences / v.40 no.4 / pp.619-627 / 2015
  • This paper presents a speech enhancement method using non-negative matrix factorization (NMF). In the training phase, a basis matrix for each source signal is obtained from a suitable database, and these basis matrices are used for source separation. The performance of speech enhancement therefore relies heavily on the basis matrices. The proposed method, in which the speech basis matrix is trained to yield a high reconstruction error for noise signals, performs better than standard NMF, in which the basis matrices are trained independently. For comparison, we propose another method and evaluate one previous method. In the experiments, performance is measured by perceptual evaluation of speech quality (PESQ) and signal-to-distortion ratio, and the proposed method outperforms the other methods.
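The standard NMF baseline mentioned in this abstract (fixed, independently trained bases used for separation) might look roughly like the sketch below. It is not the proposed method: the noise-aware training of the speech basis is not reproduced. The function names, the Euclidean multiplicative update, the soft mask, and the toy bases with disjoint frequency support are all assumptions chosen to keep the example small.

```python
import numpy as np

def nmf_activations(V, W, n_iter=200, eps=1e-12):
    """Estimate H >= 0 for V ~ W @ H with W held fixed (multiplicative updates)."""
    rng = np.random.default_rng(0)
    H = rng.random((W.shape[1], V.shape[1])) + eps
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
    return H

def enhance(V, W_speech, W_noise, eps=1e-12):
    """Separate a noisy magnitude spectrogram V using pre-trained bases."""
    W = np.hstack([W_speech, W_noise])
    H = nmf_activations(V, W)
    k = W_speech.shape[1]
    speech = W_speech @ H[:k]
    noise = W_noise @ H[k:]
    mask = speech / (speech + noise + eps)   # Wiener-like soft mask
    return mask * V

# Toy bases (4 frequency bins): speech lives in bins 0-1, noise in bins 2-3.
W_speech = np.array([[1.0], [0.5], [0.0], [0.0]])
W_noise = np.array([[0.0], [0.0], [1.0], [1.0]])

S = W_speech @ np.array([[1.0, 2.0, 3.0]])   # speech part of the spectrogram
N = W_noise @ np.array([[2.0, 1.0, 2.0]])    # noise part of the spectrogram
V = S + N

enhanced = enhance(V, W_speech, W_noise)
```

With real data the bases overlap in frequency, so separation quality depends on how distinct the speech and noise bases are, which is the dependence the abstract highlights.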

A Study on the Real Time Processing Technique of speech Signal (음성신호의 실시간 처리기법에 관한 연구)

  • Lee, Taek-Soo;Rhn, Chang;Kim, Sung-Nak;Rhee, Sang-Burm
    • Proceedings of the KIEE Conference / 1987.07b / pp.1094-1096 / 1987
  • Zero-crossing analysis techniques have been applied to speech recognition. The zero-crossing rate, level-crossing rate, and differentiated zero-crossing rate in the time domain were used to analyze speech signals. Speech samples could be stored in a memory buffer in real time.
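The three time-domain measures named above can be sketched in a few lines. The definitions below are common textbook forms and are assumed here rather than taken from the paper: the level-crossing rate is computed as the zero-crossing rate of the level-shifted frame, and the differentiated zero-crossing rate as the zero-crossing rate of the first difference. The test tones and sampling rate are hypothetical.

```python
import math

def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    if len(frame) < 2:
        return 0.0
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)

def level_crossing_rate(frame, level):
    """Crossings of a non-zero threshold, via a shifted zero-crossing count."""
    return zero_crossing_rate([x - level for x in frame])

def differentiated_zero_crossing_rate(frame):
    """Zero-crossing rate of the first difference of the frame."""
    return zero_crossing_rate([b - a for a, b in zip(frame, frame[1:])])

def tone(freq_hz, fs=8000, n=400, phase=0.1):
    """Hypothetical test tone; the phase offset avoids samples exactly at zero."""
    return [math.sin(2 * math.pi * freq_hz * i / fs + phase) for i in range(n)]

low = zero_crossing_rate(tone(200))    # low-pitched, voiced-like signal
high = zero_crossing_rate(tone(2000))  # high-frequency, fricative-like signal
```

A higher rate indicates more high-frequency content, which is why these measures are cheap enough for real-time use.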

Post Processing using Blind Signal Separation in Stereo Acoustic Echo Canceller (스테레오 음향반향제거기의 BSS 후처리방법)

  • Lee, Haeng Woo
    • Journal of Korea Society of Digital Industry and Information Management / v.10 no.1 / pp.131-138 / 2014
  • This paper describes a stereo acoustic echo canceller that uses blind signal separation for post-processing. The convergence speed of a stereo acoustic echo canceller deteriorates because the two residual signals, which serve as the update signals of the respective echo cancellers, are mixed. To solve this problem, we use the blind signal separation (BSS) method to separate the mixed signals after the echo cancellers. The BSS method can extract the source signals from the two input signals by iterative computation. We verified the performance of the proposed stereo acoustic echo canceller through simulations. The simulation results show that the stereo acoustic echo canceller using this algorithm operates stably without divergence in the normal state. When speech signals were input, the echo canceller achieved about 2 dB higher ERLE with the BSS post-processing than without it. The stereo echo canceller performed best when real voice signals were input.
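ERLE (echo return loss enhancement), the figure of merit quoted above, is conventionally the ratio in dB of the echo power at the microphone to the residual power after cancellation. The sketch below computes it over a whole signal; the function name `erle_db` and the toy signals are assumptions, and the paper may average over frames instead.

```python
import math

def erle_db(echo, residual):
    """ERLE in dB: echo power before cancellation over residual power after it."""
    echo_power = sum(x * x for x in echo)
    residual_power = sum(x * x for x in residual)
    return 10.0 * math.log10(echo_power / residual_power)

# Toy example: the canceller leaves a residual at one tenth of the echo amplitude.
echo = [1.0, -1.0, 1.0, -1.0]
residual = [0.1, -0.1, 0.1, -0.1]
erle = erle_db(echo, residual)
```

On this scale, a 2 dB gain corresponds to reducing the residual power by a factor of about 1.6.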

DNN based Speech Detection for the Media Audio (미디어 오디오에서의 DNN 기반 음성 검출)

  • Jang, Inseon;Ahn, ChungHyun;Seo, Jeongil;Jang, Younseon
    • Journal of Broadcast Engineering / v.22 no.5 / pp.632-642 / 2017
  • In this paper, we propose a DNN-based speech detection system that uses the acoustic characteristics and context information of media audio. Speech detection, which discriminates between the speech and non-speech contained in media audio, is a necessary pre-processing step for effective speech processing. However, because media audio includes various types of sound sources, it has been difficult to achieve high performance with conventional signal processing techniques. The proposed method improves speech detection performance by separating the harmonic and percussive components of the media audio and constructing a DNN input vector that reflects the acoustic characteristics and context information of the media audio. To verify the performance of the proposed system, a data set for speech detection was built from more than 20 hours of drama, and a publicly available 8-hour Hollywood movie data set was also acquired and used in the experiments. Cross-validation on the two data sets shows that the proposed system performs better than the conventional method.

A Study on the Extraction of the Excitation Pattern for Auditory Prosthesis (청각 보철을 위한 자극패턴 추출에 관한 연구)

  • Park, Sang-Hui;Yoon, Tae-Sung;Lee, Jae-Hyuk;Beack, Seunt-Hwa
    • Proceedings of the KIEE Conference / 1987.07b / pp.1322-1325 / 1987
  • In this study, the excitation pattern that can be sensed by a person with hearing loss caused by inner-ear damage is extracted, and the auditory speech signal processing procedure is simulated on a computer. The excitation pattern is extracted using a neural tuning model that satisfies the physiological characteristics of the inner ear, together with information extracted from the speech signal. The firing pattern is then extracted by feeding this excitation pattern into an auditory neural model. Using the extracted firing pattern, the possibility that the patient can sense the speech signal is studied by computer simulation.

A Study on Adaptive Algorithm Based on Wavelet Transform for Adaptive Noise Canceler Improvement (적응잡음제거기의 성능향상을 위한 웨이브렛 기반 적응알고리즘에 관한 연구)

  • 이채욱;김도형;오신범
    • Journal of Korea Society of Industrial Information Systems / v.7 no.2 / pp.68-73 / 2002
  • Many papers on adaptive algorithms based on LS (least squares) that improve convergence speed have already been presented. In this paper, we propose a wavelet-based adaptive algorithm that improves the convergence speed and reduces computational complexity, and we apply it to two kinds of adaptive noise cancelers, exploiting the characteristics of the speech signal. We compare the performance of the proposed algorithm with time-domain and frequency-domain adaptive algorithms through computer simulation of an adaptive noise canceler using synthesized speech. The results show that the proposed algorithm is suitable for adaptive signal processing applications involving speech or acoustic signals.

A Speaker Recognition Based on Strange Attractor with Vector Average (벡터 평균값을 갖는 스트레인지 어트랙터 기반 화자인식)

  • Kim, Tae-Sik
    • Speech Sciences / v.8 no.3 / pp.133-142 / 2001
  • In speech processing, raw signals have typically been presented in a 2D format, and various algorithms use this format to solve their problems. However, such presentation methods limit how well characteristics can be extracted from the signal, even when the algorithms are quite good. The basic reason is that not much information can be detected from a 2D signal. The strange attractor, from chaos theory, provides a 3D presentation method. In recognition problems, the signal construction method is very important because good features can be detected from well-shaped attractors. This paper discusses a new presentation method that constructs the strange attractor in a different way. Whereas the normal strange attractor uses the time-delay idea, the new method uses time delay together with a vector average. This method provides good information that can be applied to the speaker recognition problem.
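The conventional time-delay construction mentioned in this abstract maps a 1D signal x to 3D points (x[i], x[i+d], x[i+2d]). The sketch below shows that construction, plus one possible reading of "vector average" as the centroid of the embedded points. The abstract does not define the averaging step, so `vector_average` is purely an assumption and may differ from the paper's method; the delay and toy signal are also hypothetical.

```python
def delay_embed(signal, delay, dim=3):
    """Time-delay embedding: point i is (x[i], x[i+delay], ..., x[i+(dim-1)*delay])."""
    n_points = len(signal) - (dim - 1) * delay
    if n_points <= 0:
        raise ValueError("signal too short for this delay and dimension")
    return [tuple(signal[i + k * delay] for k in range(dim))
            for i in range(n_points)]

def vector_average(points):
    """Assumed reading of 'vector average': the centroid of the attractor points."""
    dim = len(points[0])
    return tuple(sum(p[k] for p in points) / len(points) for k in range(dim))

# Toy signal; a real input would be a frame of speech samples.
signal = list(range(10))
points = delay_embed(signal, delay=2)
center = vector_average(points)
```

Features for speaker recognition would then be derived from the shape of the point cloud, for example relative to `center`.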
