• Title/Summary/Keyword: Perceptual evaluation

Real-time implementation of the 2.4kbps EHSX Speech Coder Using a TMS320C6701™ DSP Core (TMS320C6701™을 이용한 2.4kbps EHSX 음성 부호화기의 실시간 구현)

  • 양용호;이인성;권오주
    • The Journal of Korean Institute of Communications and Information Sciences / v.29 no.7C / pp.962-970 / 2004
  • This paper presents an efficient implementation of the 2.4 kbps EHSX (Enhanced Harmonic Stochastic Excitation) speech coder on a TMS320C6701™ floating-point digital signal processor. The EHSX speech codec models the excitation signal with harmonic coding for voiced frames and CELP (Code Excited Linear Prediction) coding for unvoiced frames. We present optimization methods that reduce the complexity for real-time implementation. Filtering in the CELP algorithm accounts for most of the EHSX algorithm's complexity, and this cost is reduced by converting the program from floating-point variables to fixed-point variables. We also present efficient optimization methods, including code allocation that takes the DSP architecture into account and a low-complexity harmonic/pitch search algorithm in the encoder. Finally, we obtained a subjective quality score of MOS 3.28 from a speech quality test using PESQ (Perceptual Evaluation of Speech Quality, ITU-T Recommendation P.862) and achieved the goal of real-time operation of the EHSX codec.
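
The floating-point to fixed-point conversion mentioned in this abstract can be illustrated with a small FIR filter. This is only a sketch: the Q15 format, the rounding rule, and the function names `to_q15`, `fir_float`, and `fir_q15` are assumptions for illustration, not details taken from the paper.

```python
def to_q15(x):
    """Quantize a float in [-1, 1) to a signed 16-bit Q15 integer (assumed format)."""
    q = int(round(x * 32768))
    return max(-32768, min(32767, q))


def fir_float(coeffs, samples):
    """Reference FIR filter using floating-point arithmetic."""
    out = []
    for n in range(len(samples)):
        acc = 0.0
        for k, c in enumerate(coeffs):
            if n - k >= 0:
                acc += c * samples[n - k]
        out.append(acc)
    return out


def fir_q15(coeffs_q15, samples_q15):
    """Same filter using integer multiply-accumulate on Q15 data."""
    out = []
    for n in range(len(samples_q15)):
        acc = 0  # wide accumulator; each Q15 * Q15 product is in Q30
        for k, c in enumerate(coeffs_q15):
            if n - k >= 0:
                acc += c * samples_q15[n - k]
        y = (acc + (1 << 14)) >> 15  # round and scale Q30 back to Q15
        out.append(max(-32768, min(32767, y)))  # saturate to 16 bits
    return out


coeffs = [0.25, 0.5, 0.25]
samples = [0.0, 0.5, -0.5, 0.25, 0.125]
reference = fir_float(coeffs, samples)
fixed = fir_q15([to_q15(c) for c in coeffs], [to_q15(s) for s in samples])
max_error = max(abs(r - f / 32768) for r, f in zip(reference, fixed))
```

Comparing `max_error` against a tolerance is one way to check that the fixed-point version tracks the floating-point reference closely enough before moving it to the target processor.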

Robust Speech Enhancement Based on Soft Decision Employing Spectral Deviation (스펙트럼 변이를 이용한 Soft Decision 기반의 음성향상 기법)

  • Choi, Jae-Hun;Chang, Joon-Hyuk;Kim, Nam-Soo
    • Journal of the Institute of Electronics Engineers of Korea SP / v.47 no.5 / pp.222-228 / 2010
  • In this paper, we propose a new approach to noise estimation that incorporates spectral deviation into a soft decision scheme to enhance the intelligibility of degraded speech in non-stationary noisy environments. The conventional noise estimation technique based on soft decision estimates and updates the noise power spectrum using a fixed smoothing parameter chosen under the assumption of stationary noise, so it is difficult to obtain robust estimates of the noise power spectrum in non-stationary environments, such as a restaurant, where the spectral characteristics of the noise change constantly. We first classify the environment as stationary or non-stationary noise by analyzing the spectral deviation of the noise signal, and then adaptively estimate and update the noise power spectrum according to the classified noise type. The proposed algorithm is evaluated with ITU-T P.862 perceptual evaluation of speech quality (PESQ) under various ambient noise environments and performs better than the conventional method.
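
The idea of switching the smoothing parameter by noise type can be sketched as a first-order recursive update of the noise power spectrum. This is a simplified illustration, not the paper's estimator: the update rule, the threshold, the smoothing values in `SMOOTHING`, and the names `classify_noise` and `update_noise_psd` are all assumptions.

```python
# Hypothetical smoothing parameters: a smaller value tracks changes faster.
SMOOTHING = {"stationary": 0.98, "non_stationary": 0.90}


def classify_noise(spectral_deviation, threshold=1.0):
    """Label the environment from a spectral deviation value (threshold is assumed)."""
    return "non_stationary" if spectral_deviation > threshold else "stationary"


def update_noise_psd(noise_psd, noisy_psd, speech_absence_prob, smoothing):
    """One frame of a simplified soft-decision noise power update.

    noise_psd           previous noise power estimate per frequency bin
    noisy_psd           power of the current noisy frame per bin
    speech_absence_prob probability that speech is absent per bin (0..1)
    """
    updated = []
    for lam, y2, p0 in zip(noise_psd, noisy_psd, speech_absence_prob):
        # Trust the current frame only to the extent that speech is absent.
        expected_noise = p0 * y2 + (1.0 - p0) * lam
        updated.append(smoothing * lam + (1.0 - smoothing) * expected_noise)
    return updated


noise_type = classify_noise(spectral_deviation=2.5)
new_psd = update_noise_psd([1.0, 1.0], [2.0, 2.0], [1.0, 0.0], SMOOTHING[noise_type])
```

In the first bin speech is judged absent, so the estimate moves toward the new frame; in the second bin speech is judged present, so the estimate stays where it was.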

An Improved Speech Absence Probability Estimation based on Environmental Noise Classification (환경잡음분류 기반의 향상된 음성부재확률 추정)

  • Son, Young-Ho;Park, Yun-Sik;An, Hong-Sub;Lee, Sang-Min
    • The Journal of the Acoustical Society of Korea / v.30 no.7 / pp.383-389 / 2011
  • In this paper, we propose an improved speech absence probability estimation algorithm that applies environmental noise classification to speech enhancement. In the previous approach, the speech absence probability, which requires the a priori probability of speech absence, was derived from the microphone input signal and the noise signal using an estimated a posteriori SNR threshold. The proposed algorithm instead estimates the speech absence probability using a noise classification algorithm based on a Gaussian mixture model, so that the optimal parameters can be applied to each noise type, unlike the conventional fixed threshold and smoothing parameter. The performance of the proposed enhancement algorithm is evaluated by ITU-T P.862 PESQ (perceptual evaluation of speech quality) and a composite measure under various noise environments. The results verify that the proposed algorithm outperforms the conventional speech absence probability estimation algorithm.

Quality Assessment of Images Projected Using Multiple Projectors

  • Kakli, Muhammad Umer;Qureshi, Hassaan Saadat;Khan, Muhammad Murtaza;Hafiz, Rehan;Cho, Yongju;Park, Unsang
    • KSII Transactions on Internet and Information Systems (TIIS) / v.9 no.6 / pp.2230-2250 / 2015
  • Multiple projectors with partially overlapping regions can be used to project a seamless image on a large projection surface. With the advent of high-resolution photography, such systems are gaining popularity. Experts set up such projection systems by subjectively identifying the types of errors induced by the system in the projected images and rectifying them by optimizing (correcting) the parameters associated with the system. This requires substantial time and effort, thus making it difficult to set up such systems. Moreover, comparing the performance of different multi-projector display (MPD) systems becomes difficult because of the subjective nature of evaluation. In this work, we present a framework to quantitatively determine the quality of an MPD system and any image projected using such a system. We have divided the quality assessment into geometric and photometric qualities. For geometric quality assessment, we use Feature Similarity Index (FSIM) and distance-based Scale Invariant Feature Transform (SIFT). For photometric quality assessment, we propose to use a measure incorporating Spectral Angle Mapper (SAM), Intensity Magnitude Ratio (IMR) and Perceptual Color Difference (ΔE). We have tested the proposed framework and demonstrated that it provides an acceptable method for both quantitative evaluation of MPD systems and estimation of the perceptual quality of any image projected by them.
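
Two of the photometric quantities named in this abstract have standard textbook forms that can be sketched briefly. The abstract does not say which ΔE formula the authors use, so the CIE76 Euclidean distance in CIELAB is assumed here; the function names are illustrative, and the way the paper combines SAM, IMR, and ΔE into one measure is not reproduced.

```python
import math


def spectral_angle(a, b):
    """Spectral Angle Mapper: angle in radians between two non-zero color vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    cosine = max(-1.0, min(1.0, dot / (norm_a * norm_b)))  # guard against rounding
    return math.acos(cosine)


def delta_e_cie76(lab1, lab2):
    """Color difference as Euclidean distance in CIELAB (CIE76, assumed variant)."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(lab1, lab2)))


# Same hue at different brightness: the angle is zero although the pixels differ.
angle_same_hue = spectral_angle((1.0, 0.0, 0.0), (2.0, 0.0, 0.0))
color_diff = delta_e_cie76((50.0, 0.0, 0.0), (50.0, 3.0, 4.0))
```

The spectral angle ignores overall intensity, which is one plausible reason to pair it with a separate intensity-based term.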

A Study on the Performance of Noise Reduction using Multi-Microphones for Digital Hearing Aids (디지털 보청기를 위한 다중 마이크로폰을 이용한 잡음제거 성능 연구)

  • Kang, Hyun-Deok;Song, Young-Rok;Lee, Sang-Min
    • Journal of IKEEE / v.14 no.1 / pp.47-54 / 2010
  • In this study, we analyzed noise reduction in a noisy environment using 2, 3, 4 or 5 microphones in digital hearing aids. So that the results could apply to actual digital hearing aids, we built an experimental microphone set similar to a behind-the-ear (BTE) device and recorded the signals in each situation. We then reduced the noise in each recorded signal with a multi-microphone noise reduction algorithm. Comparing the SNR (Signal-to-Noise Ratio) and PESQ (Perceptual Evaluation of Speech Quality) measurements before and after noise reduction showed that the improvement in performance was highest when three or four microphones were used. Generally, when two or more microphones were used, performance increased as the number of microphones increased.
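
The before/after SNR comparison can be sketched as follows. The abstract does not state how SNR was measured, so this assumes a clean reference signal is available and treats the difference between the observed and clean signals as noise; `snr_db` and the sample values are illustrative.

```python
import math


def snr_db(clean, observed):
    """SNR in dB, treating (observed - clean) as the noise component."""
    signal_power = sum(s * s for s in clean)
    noise_power = sum((y - s) ** 2 for s, y in zip(clean, observed))
    return 10.0 * math.log10(signal_power / noise_power)


clean = [1.0, -1.0, 1.0, -1.0]
before = [1.5, -0.5, 1.5, -0.5]   # noisy recording (made-up values)
after = [1.1, -0.9, 1.1, -0.9]    # output of a noise reduction stage (made-up values)

improvement_db = snr_db(clean, after) - snr_db(clean, before)
```

A positive `improvement_db` means the processed signal is closer to the clean reference than the unprocessed recording.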

Speech Basis Matrix Using Noise Data and NMF-Based Speech Enhancement Scheme (잡음 데이터를 활용한 음성 기저 행렬과 NMF 기반 음성 향상 기법)

  • Kwon, Kisoo;Kim, Hyung Young;Kim, Nam Soo
    • The Journal of Korean Institute of Communications and Information Sciences / v.40 no.4 / pp.619-627 / 2015
  • This paper presents a speech enhancement method using non-negative matrix factorization (NMF). In the training phase, a basis matrix for each source signal is obtained from a suitable database, and these basis matrices are then used for source separation. The performance of speech enhancement therefore relies heavily on the basis matrices. The proposed method, in which the speech basis matrix is trained to give a high reconstruction error for noise signals, performs better than standard NMF, in which each basis matrix is trained independently. For comparison, we propose a second method and also evaluate one previous method. In the experiments, performance is evaluated by perceptual evaluation of speech quality and signal-to-distortion ratio, and the proposed method outperforms the other methods.

Multi-channel input-based non-stationary noise canceller for mobile devices (이동형 단말기를 위한 다채널 입력 기반 비정상성 잡음 제거기)

  • Jeong, Sang-Bae;Lee, Sung-Doke
    • Journal of the Korean Institute of Intelligent Systems / v.17 no.7 / pp.945-951 / 2007
  • Noise cancellation is essential for devices that use speech as an interface. In real environments, speech quality and recognition rates are degraded by ambient noise near the microphone. In this paper, we propose a noise cancellation algorithm that uses stereo microphones. The advantage of using multiple microphones is that direction information about the target source can be exploited. The proposed noise canceller is based on the Wiener filter. To estimate the filter, the frequency responses of the noise and the target speech must be known, and they are estimated by spectral classification in the frequency domain. The performance of the proposed algorithm is compared with that of the well-known Frost algorithm and the generalized sidelobe canceller (GSC) with an adaptation mode controller (AMC). The performance measures are the perceptual evaluation of speech quality (PESQ), which is the most widely used objective speech quality method, and speech recognition rates.
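
The Wiener filter step can be sketched per frequency bin with the textbook gain S/(S+N), where S and N are the estimated speech and noise power spectra. How the paper obtains those estimates (spectral classification using the stereo input) is not reproduced; the function names and the optional `floor` parameter are assumptions.

```python
def wiener_gain(speech_psd, noise_psd, floor=0.0):
    """Per-bin Wiener gain S / (S + N), with an optional lower bound."""
    gains = []
    for s, n in zip(speech_psd, noise_psd):
        total = s + n
        gain = s / total if total > 0.0 else 0.0
        gains.append(max(floor, gain))
    return gains


def apply_gain(noisy_spectrum, gains):
    """Scale each (possibly complex) frequency bin of the noisy input."""
    return [g * y for g, y in zip(gains, noisy_spectrum)]


gains = wiener_gain([4.0, 0.0, 1.0], [1.0, 1.0, 1.0])
enhanced = apply_gain([1 + 1j, 2.0, 3.0], gains)
```

Bins dominated by speech pass almost unchanged, while bins dominated by noise are attenuated toward zero; a small non-zero `floor` is a common way to limit audible artifacts.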

Speech Enhancement Based on Improved Minima Controlled Recursive Averaging Incorporating GSAP (전역 음성 부재 확률 기반의 향상된 최소값 제어 재귀평균기법을 이용한 음성 향상 기법)

  • Song, Ji-Hyun;Bang, Dong-Hyeouck;Lee, Sang-Min
    • Journal of the Institute of Electronics Engineers of Korea SP / v.49 no.1 / pp.104-111 / 2012
  • In this paper, we propose a novel method to improve the performance of improved minima controlled recursive averaging (IMCRA). An examination of various noise environments shows that IMCRA has a fundamental drawback in estimating the noise power at the offset regions of continuous speech signals. In particular, it is difficult to obtain robust estimates of the noise power in non-stationary noisy environments, such as babble noise, where the spectral characteristics change rapidly. To overcome this drawback, we apply the global speech absence probability (GSAP), conditioned on both the a priori SNR and the a posteriori SNR, to the speech detection algorithm of IMCRA. Using the ITU-T P.862 perceptual evaluation of speech quality (PESQ) and a composite measure test as performance criteria, we show that the proposed algorithm yields better results than the conventional IMCRA-based scheme under various noise environments. In particular, for babble noise at 5 dB, the proposed method produced a remarkable improvement over IMCRA (PESQ = 0.026, composite measure = 0.029).

Voice Activity Detection Using Modified Power Spectral Deviation Based on Teager Energy (Teager Energy 기반의 수정된 파워 스펙트럼 편차를 이용한 음성 검출)

  • Song, J.H.;Song, Y.R.;Shim, H.M.;Lee, S.M.
    • Journal of rehabilitation welfare engineering & assistive technology / v.8 no.1 / pp.41-46 / 2014
  • In this paper, we propose a novel voice activity detection (VAD) algorithm using feature vectors based on Teager energy (TE). Specifically, the power spectral deviation (PSD), which is used as the VAD feature in the IS-127 noise suppression algorithm, is obtained after the input signal is transformed by the Teager energy operator. In addition, a TE-based likelihood ratio is derived in each frame to modify the PSD for further VAD. The performance of the proposed VAD algorithm is evaluated by objective testing (total error rate, receiver operating characteristics, perceptual evaluation of speech quality) under various environments, and the proposed method yields better results than conventional VAD algorithms in non-stationary noise environments under 5 dB SNR (total error rate = 2.6% decrease, PESQ score = 0.053 improvement).
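
The discrete-time Teager energy operator referred to here is commonly written as ψ[x(n)] = x(n)² − x(n−1)·x(n+1). A minimal sketch follows; the function name and the test signal are illustrative, and the paper's subsequent PSD and likelihood-ratio steps are not reproduced.

```python
import math


def teager_energy(x):
    """Discrete Teager energy: psi[n] = x[n]^2 - x[n-1] * x[n+1].

    The output has two fewer samples than the input because the
    first and last samples lack a neighbor on one side.
    """
    return [x[n] * x[n] - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]


# For a pure tone A*cos(w*n) the operator returns the constant A^2 * sin(w)^2.
amplitude, omega = 2.0, 0.3
tone = [amplitude * math.cos(omega * n) for n in range(50)]
energy = teager_energy(tone)
expected = amplitude ** 2 * math.sin(omega) ** 2
```

Because the output depends on both amplitude and frequency, the operator responds to changes in either, which is a common motivation for using it as a front end for speech features.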

Global Soft Decision Based on Improved Speech Presence Uncertainty Tracking Method Incorporating Spectral Gradient (스펙트럼 변이 기반의 향상된 음성 존재 불확실성 추적 기법을 이용한 Global Soft Decision)

  • Kim, Jong-Woong;Chang, Joon-Hyuk
    • The Journal of the Acoustical Society of Korea / v.32 no.3 / pp.279-285 / 2013
  • In this paper, we propose a novel speech enhancement method that improves the performance of the conventional global soft decision by applying a spectral gradient method to the ratio of the a priori speech absence and presence probabilities (q). The conventional global soft decision scheme uses a fixed value of q in accordance with the assumed hypothesis, whereas the proposed algorithm improves the speech absence probability by adaptively varying q according to the presence or absence of speech in the previous two frames and the conditions on the spectral gradient value. Experimental results show that the proposed global soft decision method based on the spectral gradient yields better results than the conventional global soft decision technique, as measured by ITU-T P.862 PESQ (Perceptual Evaluation of Speech Quality).