• Title/Summary/Keyword: Mean Opinion Score (MOS)

Video Quality Metric Using One-Dimensional Histograms of Motion Vectors (움직임 벡터의 1차원 히스토그램을 이용한 비디오 화질 평가 척도)

  • Han, Ho-Sung;Kim, Dong-O;Park, Bae-Hong;Sim, Dong-Gyu
    • Journal of the Institute of Electronics Engineers of Korea SP / v.45 no.2 / pp.21-28 / 2008
  • This paper proposes a novel reduced-reference method for video quality assessment in which one-dimensional (1-D) histograms of motion vectors (MVs) are used as video features. The proposed method is more efficient than conventional methods in computation time, because the quality metric decodes MVs directly from the video stream during parsing instead of reconstructing the distorted video at the receiver. It is also efficient in data size, because the sender transmits 1-D histograms of MVs accumulated over the whole input video sequence, unlike conventional methods that assess each image independently. Histogram intersection and histogram difference are used to test the similarity between histograms. We compare the proposed method with conventional methods on 52 video clips coded at varying bit rates, image sizes, and frame rates. Experimental results show that the proposed method is more efficient than conventional methods and that its scores agree more closely with the mean opinion score (MOS) than those of conventional algorithms.
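
The two histogram similarity tests named in this abstract can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: the function names, the normalization to unit sum, and the use of a summed absolute difference are assumptions, since the abstract does not give the exact formulas.

```python
def normalize(hist):
    """Scale a histogram so its bins sum to 1 (assumed preprocessing step)."""
    total = float(sum(hist))
    if total <= 0:
        raise ValueError("histogram must contain at least one count")
    return [h / total for h in hist]


def histogram_intersection(h_ref, h_dist):
    """Sum of bin-wise minima of the normalized histograms (1.0 = identical)."""
    p, q = normalize(h_ref), normalize(h_dist)
    if len(p) != len(q):
        raise ValueError("histograms must have the same number of bins")
    return sum(min(a, b) for a, b in zip(p, q))


def histogram_difference(h_ref, h_dist):
    """Sum of bin-wise absolute differences (0.0 = identical); an assumed form."""
    p, q = normalize(h_ref), normalize(h_dist)
    if len(p) != len(q):
        raise ValueError("histograms must have the same number of bins")
    return sum(abs(a - b) for a, b in zip(p, q))
```

In a reduced-reference setting, `h_ref` would be the MV histogram sent by the sender and `h_dist` the one parsed from the received stream; both names are hypothetical.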

A Study on Improving Voice Quality and Pitch Searching of the VSELP Coder (VSELP 부호화기의 음질 및 주기탐색 개선에 관한 연구)

  • 성기철;문상재
    • The Journal of Korean Institute of Communications and Information Sciences / v.19 no.4 / pp.740-749 / 1994
  • This paper presents a method for improving the performance of the VSELP speech coder. A hybrid method is employed for pitch period searching; it reduces both the pitch search time and the pitch detection errors caused by quantization error in the encoder's excitation signal. The paper also adopts a pitch period enhancement filter and an adaptive first-order filter. As a result, the pitch period search time is reduced to 26%, and the MOS of the reconstructed speech signal increases from 3.19 to 4.04.

A Short-term and Long-term Usability Testing of the Speech Synthesizer for the People with Visual Impairments (시각장애인용 음성합성기에 대한 장/단기 사용성 평가)

  • Lee, H.Y.;Hong, K.H.
    • Journal of rehabilitation welfare engineering & assistive technology / v.9 no.1 / pp.53-60 / 2015
  • We conducted short-term and long-term usability tests on the built-in speech synthesizer of a screen reader for people with visual impairments. A total of 20 persons with visual impairments participated in the short-term test, and 10 of them also participated in the long-term test. Naturalness and clarity of the synthetic speech were evaluated by MOS, preference among various synthetic voices was examined through a preference test, and the users' satisfaction level and other requirements for the synthetic speech were evaluated through open feedback. We also examined naturalness, clarity, preference, and user requirements through the long-term test, and then compared and contrasted the long-term and short-term results.

Performance Evaluation of IDS on MANET under Grayhole Attack (그레이홀 공격이 있는 MANET에서 IDS 성능 분석)

  • Kim, Young-Dong
    • The Journal of the Korea institute of electronic communication sciences / v.11 no.11 / pp.1077-1082 / 2016
  • An IDS can be used as a countermeasure against malicious attacks that degrade network transmission performance by disrupting the MANET routing function. This paper examines the effect of an IDS on transmission performance in a MANET under grayhole attacks, which target only part of the transmitted packets, and offers some suggestions for an effective IDS. Computer simulation based on NS-2 is used for the performance analysis, with VoIP (Voice over Internet Protocol) as the application service. MOS (Mean Opinion Score), CCR (Call Connection Rate), and end-to-end delay are used as performance parameters, since they are standard transmission quality factors for voice transmission.

Design of Wideband Speech Coder Compatible with CS-ACELP (CS-ACELP와 호환성을 갖는 광대역 음성 부호화기 설계)

  • 김동주;이인성
    • The Journal of the Acoustical Society of Korea / v.19 no.4 / pp.52-57 / 2000
  • In this paper, we design a 16 kbps speech coder that is compatible with the CS-ACELP algorithm (G.729). The speech signal is sampled at 16 kHz, divided into two narrowband signals by a QMF filterbank, and decimated to 8 kHz. The lower-band signal is encoded by CS-ACELP and the upper-band signal by the Adaptive Transform Coding (ATC) algorithm. At the receiver, the two band signals are synthesized by the CS-ACELP and ATC decoders, respectively, and the reconstructed output is obtained by passing them through the QMF synthesis bank. The proposed wideband coder is compared with the ITU-T G.722 coder through a Mean Opinion Score (MOS) test.

Improved CycleGAN for underwater ship engine audio translation (수중 선박엔진 음향 변환을 위한 향상된 CycleGAN 알고리즘)

  • Ashraf, Hina;Jeong, Yoon-Sang;Lee, Chong Hyun
    • The Journal of the Acoustical Society of Korea / v.39 no.4 / pp.292-302 / 2020
  • Machine learning algorithms have made immense contributions in various fields, including sonar and radar applications. The recently developed Cycle-Consistency Generative Adversarial Network (CycleGAN), a variant of the GAN, has been used successfully for unpaired image-to-image translation. We present a modified CycleGAN for translating underwater ship engine sounds with high perceptual quality. The proposed network is composed of an improved generator trained to translate underwater audio from one vessel type to another, an improved discriminator that identifies data as real or fake, and a modified cycle-consistency loss function. Quantitative and qualitative analyses of the proposed CycleGAN are performed on the publicly available underwater dataset ShipsEar by comparing Mel-cepstral distortion, pitch contour matching, nearest neighbor comparison, and mean opinion score with existing algorithms. The results demonstrate the effectiveness of the proposed network.
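
Of the metrics listed in this abstract, Mel-cepstral distortion (MCD) is the easiest to illustrate. The sketch below uses the commonly cited definition, MCD = (10 / ln 10) · sqrt(2 · Σ (c_k − c'_k)²) in dB, averaged over frames. It assumes the two sequences are already time-aligned and that the 0th (energy) coefficient is excluded; the abstract does not say which variant the authors used, and the function name is hypothetical.

```python
import math


def mel_cepstral_distortion(ref_frames, syn_frames):
    """Mean MCD in dB over time-aligned frames of mel-cepstral coefficients.

    Each frame is a sequence of coefficients; index 0 (energy) is skipped,
    which is a common convention and an assumption here.
    """
    if not ref_frames or len(ref_frames) != len(syn_frames):
        raise ValueError("need two non-empty, equally long frame sequences")
    scale = 10.0 / math.log(10.0) * math.sqrt(2.0)
    total = 0.0
    for ref, syn in zip(ref_frames, syn_frames):
        if len(ref) != len(syn):
            raise ValueError("frames must have the same number of coefficients")
        # Squared Euclidean distance over coefficients 1..K
        sq_dist = sum((a - b) ** 2 for a, b in zip(ref[1:], syn[1:]))
        total += scale * math.sqrt(sq_dist)
    return total / len(ref_frames)
```

Lower values indicate that the translated audio is spectrally closer to the reference; identical inputs give 0 dB.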

Noise Reduction Using the Standard Deviation of the Time-Frequency Bin and Modified Gain Function for Speech Enhancement in Stationary and Nonstationary Noisy Environments

  • Lee, Soo-Jeong;Kim, Soon-Hyob
    • The Journal of the Acoustical Society of Korea / v.26 no.3E / pp.87-96 / 2007
  • In this paper, we propose a new noise reduction algorithm for stationary and nonstationary noisy environments. Our algorithm classifies the speech and noise signal contributions in time-frequency bins, and is not based on a spectral algorithm or a minimum statistics approach. It relies on calculating the ratio of the standard deviation of the noisy power spectrum in time-frequency bins to its normalized time-frequency average. We show that good quality can be achieved for the enhanced speech signal by choosing appropriate values for $\delta_t$ and $\delta_f$. The proposed method greatly reduces the noise while providing enhanced speech with lower residual noise and somewhat higher mean opinion score (MOS), background intrusiveness (BAK), and signal distortion (SIG) scores than conventional methods.

A Study on a Improvement of the Speech Quality by Spectrum Analysis with Variable Window in CELP Vocoder (가변 윈도우 스펙트럼 분석을 이용한 CELP 부호화기의 음질 향상에 관한 연구)

  • 나덕수;민소연;배명진
    • Proceedings of the IEEK Conference / 2000.06d / pp.106-109 / 2000
  • Two types of low-bit-rate vocoder have been proposed to date: the MBE type, which uses spectrum modeling, and the CELP type, which uses a hybrid coding method. Of the two, the CELP type has been studied more. In particular, much attention has been focused on the CELP vocoder since the emergence of Internet phone and PCS services in the domestic market. In this paper, we propose a new spectrum analysis algorithm with a variable window to improve speech quality in the CELP vocoder. In a CELP vocoder, the spectrum of the synthesized speech signal is distorted because fixed-size windows are used for spectrum analysis. We therefore measure the spectral leakage and adjust the window size to minimize it. Applying this method to G.723.1 ACELP yields an SD (Spectral Distortion) reduction of 0.084 dB, a residual energy reduction of 6.3%, and an MOS (Mean Opinion Score) improvement of 0.1.

Development of an Automatic Speech Quality Evaluator for Analog Cellular System (아날로그 셀룰라 시스템을 위한 자동 음질 평가기 개발)

  • 박상욱;최용수;정성교;윤대희;이충용
    • The Journal of the Acoustical Society of Korea / v.17 no.7 / pp.28-35 / 1998
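  • In this paper, we develop an automatic speech quality evaluation system for mobile phones that estimates subjective speech quality from an objective speech quality measure in an analog mobile telephone environment. To maintain call quality, it is very important to check the mobile telephone network continuously. Subjective speech quality assessment directly reflects human perception and is therefore an important measure of actual speech quality, but it consumes a great deal of manpower and time, which makes it unsuitable for continuous assessment across diverse regions. To solve this problem, an automatic evaluation system that predicts the subjective measure from an objective speech quality measure is essential. Through repeated experiments, we confirmed that BSD (Bark Spectral Distance) is highly correlated with the subjective speech quality measure. We measure the BSD between the original speech and the speech passed through the mobile telephone channel and, on that basis, develop an Automatic Speech Quality Evaluator that estimates the MOS (Mean Opinion Score).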

Speech Quality of a Sinusoidal Model Depending on the Number of Sinusoids

  • Seo, Jeong-Wook;Kim, Ki-Hong;Seok, Jong-Won;Bae, Keun-Sung
    • Speech Sciences / v.7 no.1 / pp.17-29 / 2000
  • STC (Sinusoidal Transform Coding) is a vocoding technique that uses a sinusoidal speech model to obtain high-quality speech at a low data rate. It models and synthesizes the speech signal with the fundamental frequency and its harmonics in the frequency domain. To reduce the data rate, the sinusoidal amplitudes and phases must be represented with as few peaks as possible while maintaining speech quality. As basic research toward a low-rate speech coding algorithm using the sinusoidal model, this paper investigates how speech quality depends on the number of sinusoids. Speech signals are reconstructed while varying the number of spectral peaks from 5 to 40, and their quality is evaluated using a spectral envelope distortion measure and MOS (Mean Opinion Score). Two approaches are used to obtain the spectral peaks: a conventional STFT (Short-Time Fourier Transform) and a multiresolution analysis method.
