• Title/Summary/Keyword: Objective Speech Quality Measure


Enhanced Adjustment Strategy of Masking Threshold for Speech Signals in Low Bit-Rate Audio Coding (저전송률 오디오 부호화에서 음성 신호의 성능 개선을 위한 마스킹 임계값 적응기법 향상)

  • Lee, Chang-Heon;Kang, Hong-Goo
    • The Journal of the Acoustical Society of Korea / v.29 no.1 / pp.62-68 / 2010
  • This paper proposes a new masking threshold adjustment strategy that improves the performance of low bit-rate audio coding on speech signals. After the formant regions are determined, the masking threshold is adjusted using the ratio of each sub-band's energy to the average energy of its formant. More quantization noise is allowed in bands with relatively large energy, while less distortion is allowed in spectral valley regions by allocating them more bits, which reflects the perceptual weighting concept widely used in speech coding. Objective speech quality measurements verify that the proposed method improves quality for speech input signals compared with the conventional method.

Quality Assessment of Telephone Speech with ATM Circuit Emulation Services (ATM 망을 통한 Circuit Emulation 서비스에서 전화음성의 품질평가)

  • Cho, Young-Soon;Seo, Jeong-Wook;Bae, Keun-Sung
    • Journal of the Korean Institute of Telematics and Electronics S / v.35S no.6 / pp.156-163 / 1998
  • The ATM network provides ATM CES (Circuit Emulation Services) with AAL1 for CBR (constant bit rate) services such as telephone speech. In this study, the quality of telephone speech carried by CES over ATM was assessed and discussed. To this end, interoperability between the ATM network and structured/unstructured DS1 links was modeled for simulation. For quality assessment of telephone speech, SNR and MOS were used as the objective and subjective measures, respectively. Experimental results show that an MOS of 4 and an SNR of 30 dB can be obtained for speech signals at a CLR of $10^{-3}$ or below.
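The abstract above uses SNR as its objective measure. As a minimal sketch of what such a measurement involves, the following Python function computes the overall SNR in dB between a reference signal and a degraded copy. The function name `snr_db` and the assumption that both signals are time-aligned sample lists of equal length are illustrative choices, not details taken from the paper.

```python
import math


def snr_db(reference, degraded):
    """Overall SNR in dB, treating (reference - degraded) as the noise.

    Assumes both inputs are time-aligned sequences of samples with
    the same length (an assumption of this sketch).
    """
    if len(reference) != len(degraded):
        raise ValueError("signals must have the same length")
    signal_energy = sum(x * x for x in reference)
    if signal_energy == 0.0:
        raise ValueError("reference signal has zero energy")
    noise_energy = sum((x - y) ** 2 for x, y in zip(reference, degraded))
    if noise_energy == 0.0:
        return math.inf  # identical signals: no measurable noise
    return 10.0 * math.log10(signal_energy / noise_energy)


if __name__ == "__main__":
    clean = [1.0, -1.0, 1.0, -1.0]
    noisy = [1.1, -1.1, 1.1, -1.1]  # each sample is off by 0.1
    print(f"SNR: {snr_db(clean, noisy):.2f} dB")
```

A single overall SNR weights loud and quiet passages together, which is one reason speech studies often report frame-based or perceptual measures alongside it.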


Conversational Quality Measurement System for Mobile VoIP Speech Communication (모바일 VoIP 음성통신을 위한 대화음질 측정 시스템)

  • Cho, Jae-Man;Kim, Hyoung-Gook
    • The Journal of The Korea Institute of Intelligent Transport Systems / v.10 no.4 / pp.71-77 / 2011
  • In this paper, we propose a conversational quality measurement (CQM) system that provides an objective QoS measure for high-quality mobile VoIP voice telecommunication. To measure conversational quality, the VoIP telecommunication system is implemented on two smartphones connected over VoIP. The VoIP telecommunication system consists of echo cancellation, noise reduction, speech encoding/decoding, packet generation with RTP (Real-time Transport Protocol), jitter buffer control, and POS (Play-out Schedule) with LC (Loss Concealment). The CQM system is connected to the microphone and speaker of each smartphone. The voice signal of each speaker is recorded and used to measure CE (Conversational Efficiency), CS (Conversational Symmetry), PESQ (Perceptual Evaluation of Speech Quality), and the CE-CS-PESQ correlation. We validate the CQM system by measuring CE, CS, and PESQ under various SNR, delay, and loss conditions of the IP network environment.

A 4 kbps PSI-VSELP Speech Coding Algorithm (4 kbps PSI-VSELP 음성 부호화 알고리듬)

  • Choi, Yong-Soo;Kang, Hong-Goo;Park, Sang-Wook;Youn, Dae-Hee
    • The Journal of the Acoustical Society of Korea / v.15 no.6 / pp.59-65 / 1996
  • This paper proposes a 4 kbps PSI-VSELP (Pitch Synchronous Innovation-Vector Sum Excited Linear Prediction) speech coder that produces speech equivalent to that of the conventional 4.8 kbps VSELP coder. Since 'half-rate' is defined differently from country to country, there may be a need to reduce the bit rate of the conventional half-rate coder. To minimize the degradation of speech quality caused by bit-rate reduction, it is desirable to perform bit allocation based on careful consideration of the effect of the various transmission parameters. This paper adopts this analytical approach for bit allocation at 4 kbps. To improve the quality of the VSELP coder at 4 kbps, the basis vectors, which play the most important role in performance, are optimized by an iterative closed-loop training process, and the PSI technique is employed in the VSELP coder. To demonstrate the performance of the proposed speech coder, we performed experiments under noiseless and error-free conditions. In the experiments, the proposed 4 kbps PSI-VSELP coder scored lower on the objective measure but higher on the subjective measure than the conventional 4.8 kbps VSELP coder.


Performance Comparison for Objective Measures of Speech Quality Evaluation in PCS Wireless Telephone Network (PCS 이동전화망에서의 객관적인 음질평가척도별 성능비교)

  • Kim Nag-Cheol;Kim Kwang-Soo;Jung Ho-Youl;Chung Hyun-Yeol
    • Proceedings of the Acoustical Society of Korea Conference / spring / pp.48-51 / 1999
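  • As a basic study toward developing an objective speech quality measure for PCS mobile telephony, this study applied the existing CD (Cepstral Distance), MSD (Mel Spectral Distance), BSD (Bark Spectral Distance), and PSQM (Perceptual Speech Quality Measure) measures and compared and analyzed their performance. The measures were applied to PCS speech data collected in a real environment, and the correlation between their outputs and the MOS results obtained from listeners' ratings was examined. In the experiments, BSD and PSQM showed correlations of 0.81 and 0.84, respectively, outperforming CD and MSD.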
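The study above ranks objective measures by how well their outputs correlate with listener MOS. A minimal sketch of that comparison, assuming the correlation is the ordinary Pearson coefficient (the abstract does not say which coefficient was used), is shown below. The function name and the sample scores are hypothetical and are not data from the paper.

```python
import math


def pearson_correlation(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0.0 or var_y == 0.0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    # Hypothetical per-condition scores, invented for illustration only.
    objective_scores = [1.9, 2.4, 3.1, 3.6, 4.2]
    listener_mos = [2.0, 2.3, 3.3, 3.5, 4.4]
    r = pearson_correlation(objective_scores, listener_mos)
    print(f"correlation with MOS: {r:.3f}")
```

Distance-type measures such as CD or BSD decrease as quality improves, so their raw correlation with MOS would be negative. Studies of this kind commonly report the magnitude, or first map the distance to a predicted MOS.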


Performance Comparison of Objective Measures for Speech Quality for Evaluation in CDMA Mobile Telephone (CDMA 이동전화 통화품질평가를 위한 객관적 음질평가척도별 성능 비교)

  • 이준희;김광수;윤정오
    • Proceedings of the Korea Society for Industrial Systems Conference / 2001.05a / pp.256-260 / 2001
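  • As a basic study toward developing an objective speech quality measure for distorted telephone speech that has passed through a digital mobile telephone (CDMA) channel environment, this paper evaluated the performance of existing objective measure algorithms: CD (Cepstral Distance), MSD (Mel Spectral Distance), BSD (Bark Spectral Distance), Modified BSD (MBSD), and PSQM (Perceptual Speech Quality Measure). The measures were applied to PCS speech data collected in a real mobile telephone environment, and their outputs were compared in terms of correlation with MU, a subjective speech quality assessment method. In the experiments, BSD, MBSD, and PSQM showed correlations of 0.80, 0.85, and 0.84, respectively, performing relatively better than CD and MSD.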
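Of the measures listed above, the cepstral distance is the simplest to state. The sketch below uses the widely cited form CD = (10 / ln 10) · sqrt(2 · Σ (c_k − ĉ_k)²), summed over cepstral coefficients k ≥ 1 of one frame. Whether the paper uses exactly this variant is an assumption. The function name is illustrative, and the cepstral coefficients are taken as given rather than derived from audio.

```python
import math


def cepstral_distance_db(cep_reference, cep_degraded):
    """Cepstral distance in dB between two frames.

    Both arguments are lists of cepstral coefficients c_1..c_p for one
    frame (c_0, the energy term, is assumed to be excluded already).
    """
    if len(cep_reference) != len(cep_degraded):
        raise ValueError("coefficient vectors must have the same order")
    squared_error = sum(
        (a - b) ** 2 for a, b in zip(cep_reference, cep_degraded)
    )
    return (10.0 / math.log(10.0)) * math.sqrt(2.0 * squared_error)


if __name__ == "__main__":
    # Hypothetical coefficient vectors, invented for illustration only.
    frame_ref = [0.52, -0.31, 0.12, 0.05]
    frame_deg = [0.48, -0.25, 0.10, 0.09]
    print(f"CD: {cepstral_distance_db(frame_ref, frame_deg):.3f} dB")
```

In practice the per-frame distance is averaged over all frames of an utterance before it is compared with subjective scores.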


Quantitative Evaluation of the Performance of Monaural FDSI Beamforming Algorithm using a KEMAR Mannequin (KEMAR 마네킹을 이용한 단이 보청기용 FDSI 빔포밍 알고리즘의 정량적 평가)

  • Cho, Kyeongwon;Nam, Kyoung Won;Han, Jonghee;Lee, Sangmin;Kim, Dongwook;Hong, Sung Hwa;Jang, Dong Pyo;Kim, In Young
    • Journal of Biomedical Engineering Research / v.34 no.1 / pp.24-33 / 2013
  • To enhance the speech perception of hearing aid users in noisy environments, most hearing aid devices adopt beamforming algorithms, such as the first-order differential microphone (DM1) and the two-stage directional microphone (DM2) algorithms, that preserve sounds from the direction of the interlocutor and reduce ambient sounds from other directions. However, these conventional algorithms show poor directionality in the low-frequency range. To enhance the speech perception of hearing aid users in the low-frequency range, our group previously proposed a fractional delay subtraction and integration (FDSI) algorithm and estimated its theoretical performance by computer simulation in an earlier article. In this study, we performed a KEMAR test in a non-reverberant room to compare the performance of the DM1, DM2, broadband beamforming (BBF), and proposed FDSI algorithms using several objective indices: signal-to-noise ratio (SNR) improvement, segmental SNR (seg-SNR) improvement, perceptual evaluation of speech quality (PESQ), and the Itakura-Saito measure (IS). Experimental results showed that the FDSI algorithm achieved -3.26 to 7.16 dB in SNR improvement, -1.94 to 5.41 dB in seg-SNR improvement, 1.49 to 2.79 in PESQ, and 0.79 to 3.59 in IS, giving the highest SNR and seg-SNR improvement and the lowest IS. We believe that the proposed FDSI algorithm has potential as a beamformer for digital hearing aid devices.
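The segmental SNR reported above differs from overall SNR in that it is computed per frame and then averaged. The sketch below shows one common formulation. The frame length of 160 samples and the clamping of per-frame values to the range -10 to 35 dB are conventional choices assumed here, not parameters taken from the paper, and the function name is illustrative.

```python
import math


def segmental_snr_db(reference, degraded, frame_len=160,
                     floor_db=-10.0, ceil_db=35.0):
    """Mean of per-frame SNR values in dB over non-overlapping frames.

    Frames with zero reference energy are skipped. Per-frame values are
    clamped to [floor_db, ceil_db] (assumed limits of this sketch).
    """
    if len(reference) != len(degraded):
        raise ValueError("signals must have the same length")
    frame_snrs = []
    for start in range(0, len(reference) - frame_len + 1, frame_len):
        ref = reference[start:start + frame_len]
        deg = degraded[start:start + frame_len]
        signal_energy = sum(x * x for x in ref)
        noise_energy = sum((x - y) ** 2 for x, y in zip(ref, deg))
        if signal_energy == 0.0:
            continue  # silent frame: SNR is undefined, so skip it
        if noise_energy == 0.0:
            snr = ceil_db  # noiseless frame saturates at the upper limit
        else:
            snr = 10.0 * math.log10(signal_energy / noise_energy)
        frame_snrs.append(min(max(snr, floor_db), ceil_db))
    if not frame_snrs:
        raise ValueError("no usable frames")
    return sum(frame_snrs) / len(frame_snrs)


if __name__ == "__main__":
    clean = [1.0, -1.0] * 4
    noisy = [1.1, -1.1, 1.1, -1.1, 2.0, -2.0, 2.0, -2.0]
    print(f"seg-SNR: {segmental_snr_db(clean, noisy, frame_len=4):.2f} dB")
```

An "improvement" figure like the one in the abstract would then be the difference between this value at the algorithm's output and at its input.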

Evaluation of a signal segregation by FDBM (FDBM의 음원분리 성능평가)

  • Lee, Chai-Bong
    • The Journal of the Korea institute of electronic communication sciences / v.8 no.12 / pp.1793-1802 / 2013
  • Various approaches to sound source segregation have been proposed. Among them, the frequency domain binaural model (FDBM) has the advantages of low computational load and effective howling cancellation. A binaural hearing assistance system based on FDBM has been proposed; this system can enhance the desired signal based on directivity information. Although FDBM has been evaluated in terms of signal-to-noise ratio (SNR) and the coherence function, the evaluation results do not always agree with human impressions. These evaluation methods provide physical measures and do not take human perception into account. Since a binaural hearing assistance system is one of the major applications, the quality of the segregated sound should be kept at a sufficient level. In this paper, the signal segregation performance of FDBM is evaluated by three objective methods, namely SNR, coherence, and Perceptual Evaluation of Speech Quality (PESQ), to discuss the characteristics of FDBM in sound source segregation. The simulation results show that FDBM improves the quality of the left and right channel signals to an equivalent level. The results also suggest that PESQ may provide a more useful measure than SNR and coherence of the segregation performance of FDBM. The PESQ evaluation results show the effects of the segregation parameters and indicate appropriate parameters under the given conditions.

Reliability of OperaVOX™ against Multi-Dimensional Voice Program to Assess Voice Quality before and after Laryngeal Microsurgery in Patient with Vocal Polyp (성대 용종 환자의 후두미세수술 전후 음성 평가에서 OperaVOX™와 Multi-Dimensional Voice Program 간의 신뢰도 연구)

  • Kim, Sun Woo;Kim, So Yean;Cho, Jae Kyung;Jin, Sung Min;Lee, Sang Hyuk
    • Journal of the Korean Society of Laryngology, Phoniatrics and Logopedics / v.31 no.2 / pp.71-77 / 2020
  • Background and Objectives: OperaVOX™ (Oxford Wave Research Ltd.) is a portable voice analysis software package designed for use with iOS devices. As a relatively cheap, portable, and easily accessible form of acoustic analysis, OperaVOX™ may be more clinically useful than laboratory-based software in many situations. The aim of this study was to evaluate the agreement between OperaVOX™ and the Multi-Dimensional Voice Program (MDVP; Computerized Speech Lab) in assessing voice quality before and after laryngeal microsurgery in patients with vocal polyps. Materials and Method: Twenty patients who had undergone laryngeal microsurgery for a vocal polyp were enrolled in this study. Preoperative and postoperative voices were assessed by acoustic analysis using MDVP and OperaVOX™. A five-second recording of the vowel /a/ was used to measure fundamental frequency (F0), jitter, shimmer, and noise-to-harmonic ratio (NHR). Results: Several acoustic parameters of MDVP and OperaVOX™ related to short-term variability showed significant improvement. In MDVP, the preoperative values of F0, jitter, shimmer, and NHR were 155.75 Hz (male: 125.37 Hz, female: 183.37 Hz), 2.20%, 6.28%, and 0.16, and the postoperative values were 164.34 Hz (male: 129.42 Hz, female: 199.26 Hz), 2.15%, 5.18%, and 0.14. In OperaVOX™, the preoperative values of F0, jitter, shimmer, and NHR were 168.26 Hz (male: 135.16 Hz, female: 201.37 Hz), 2.27%, 6.95%, and 0.26, and the postoperative values were 162.72 Hz (male: 128.267 Hz, female: 197.18 Hz), 1.71%, 5.36%, and 0.20. The intraclass correlation coefficient indicated high intersoftware agreement for F0, jitter, and shimmer. Conclusion: Our results showed that the short-term variability of acoustic parameters in both MDVP and OperaVOX™ was useful for the objective assessment of voice quality in patients who received laryngeal microsurgery. OperaVOX™ is comparable to MDVP and has high intersoftware reliability with MDVP in measuring F0, jitter, and shimmer.
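Jitter and shimmer, reported above as percentages, quantify cycle-to-cycle variability in pitch period and in amplitude. The sketch below uses the common "local" definition: the mean absolute difference between consecutive cycles divided by the mean value. MDVP and OperaVOX™ may each use their own variants, so this is an assumed definition rather than either product's algorithm. The function name and sample values are hypothetical, and extraction of cycles from audio is out of scope.

```python
def local_perturbation_percent(values):
    """Mean absolute cycle-to-cycle difference relative to the mean, in %.

    Pass glottal cycle durations to get local jitter, or per-cycle peak
    amplitudes to get local shimmer (definitions assumed by this sketch).
    """
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    mean_value = sum(values) / len(values)
    if mean_value == 0.0:
        raise ValueError("mean value must be non-zero")
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    mean_abs_diff = sum(diffs) / len(diffs)
    return 100.0 * mean_abs_diff / mean_value


if __name__ == "__main__":
    # Hypothetical measurements, invented for illustration only.
    periods_s = [0.00641, 0.00652, 0.00638, 0.00649, 0.00644]
    amplitudes = [0.81, 0.78, 0.84, 0.80, 0.79]
    print(f"jitter:  {local_perturbation_percent(periods_s):.2f} %")
    print(f"shimmer: {local_perturbation_percent(amplitudes):.2f} %")
```

Because both quantities are ratios of the same kind, one helper serves for either. A perfectly periodic voice yields 0% for both.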