• Title/Summary/Keyword: mean opinion score (MOS)


EVALUATION OF THE SYNTHETIC SPEECH QUALITY BY THE TD-PCULI METHOD

  • Kang, Chan-Hee;Shin, Yong-Jo;Kim, Yun-Seok;Kwon, Ki-Hyung;Chin, Yong-Ohk
    • Proceedings of the Acoustical Society of Korea Conference / 1994.06a / pp.977-983 / 1994
  • This paper evaluates the quality of synthetic speech produced by the proposed TD-PCULI speech synthesis method. For synthesis, parameters were extracted from Korean monosyllables by analyzing the speech waveforms in the time domain. A Korean data-format dictionary for synthesis-by-rule was built according to the frequencies in a large-vocabulary Korean pronunciation dictionary; it contains 19 V-type, 80 CV-type, 30 VC-type, and 100 CVC-type syllables. Using these, various Korean monosyllables, words, and sentences were synthesized. Ten syllables selected from each of the four Korean syllable types were tested with the objective MOS (Mean Opinion Score) evaluation method on four items (intelligibility, clearness, loudness, and naturalness) by a randomly selected group with no prior knowledge of the syllables. The possibility of modifying duration and F0 was also tested by changing the duration (150 msec, 300 msec, 500 msec, 700 msec, and 1 sec) and the central fundamental frequency (80 Hz, 118 Hz, 140 Hz, 170 Hz, and 200 Hz). The experiments show that the noise introduced during rule-based synthesis is removed to a very clear level and that the prosodic elements can be controlled well.
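
As a rough illustration of how a per-item MOS like the one described above is usually computed, the sketch below averages listener ratings for each of the four items and attaches a normal-approximation 95% confidence interval. The 5-point scale, the number of listeners, and every rating value are assumptions made for this example; the abstract does not state them.

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical 5-point ratings (1 = bad ... 5 = excellent) from five listeners
# for one synthesized syllable. The values are made up for illustration.
ratings = {
    "intelligibility": [4, 5, 4, 4, 3],
    "clearness": [4, 4, 3, 4, 4],
    "loudness": [3, 4, 4, 3, 4],
    "naturalness": [3, 3, 4, 2, 3],
}


def mos(scores):
    """Return (mean opinion score, 95% CI half-width) for a list of ratings."""
    m = mean(scores)
    # Normal approximation; with few listeners a t-quantile would be wider.
    half_width = 1.96 * stdev(scores) / sqrt(len(scores))
    return m, half_width


results = {item: mos(scores) for item, scores in ratings.items()}
for item, (m, ci) in results.items():
    print(f"{item:16s} MOS = {m:.2f} +/- {ci:.2f}")
```

Reporting the interval next to the mean matters with small listener panels, where a difference of a few tenths of a point can fall within the rating noise.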


A Perceptual Rate Control Algorithm with S-JND Model for HEVC Encoder (S-JND 모델을 사용한 주관적인 율 제어 알고리즘 기반의 HEVC 부호화 방법)

  • Kim, JaeRyun;Ahn, Yong-Jo;Lim, Woong;Sim, Donggyu
    • Journal of Broadcast Engineering / v.21 no.6 / pp.929-943 / 2016
  • This paper proposes a rate control algorithm based on the S-JND (Saliency-Just Noticeable Difference) model that takes perceptual visual quality into account. The algorithm uses the S-JND model to reflect both human visual sensitivity and human visual attention, two characteristics of the human visual system. When allocating bits at the CTU (Coding Tree Unit) level, the bit allocation model calculates the S-JND threshold of each CTU in a picture. Each CTU's threshold is then used to adaptively allocate a proper number of bits, so the proposed bit allocation model can improve perceptual visual quality. For performance evaluation, the algorithm was implemented on HM 16.9 and tested on Class B and Class C sequences under the CTC (Common Test Condition) RA (Random Access), Low-delay B, and Low-delay P configurations. Experimental results show that, on average, the proposed method reduces the bit-rate by 2.3% and improves BD-PSNR by 0.07 dB and bit-rate accuracy by 0.06%. In a DSCQS (Double Stimulus Continuous Quality Scale) based evaluation, the proposed method improved MOS by 0.03 over the conventional method.
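
The idea of threshold-driven CTU bit allocation can be sketched as follows. This is not the paper's model: the abstract does not give the mapping from S-JND threshold to bits, so the sketch assumes a simple inverse-threshold weighting (a CTU with a higher threshold tolerates more distortion and receives fewer bits). The function name `allocate_ctu_bits` and all numbers are hypothetical.

```python
def allocate_ctu_bits(picture_bits, jnd_thresholds):
    """Split a picture's bit budget across its CTUs.

    Assumption: the weight of a CTU is 1 / threshold, so CTUs where
    distortion is less noticeable get a smaller share of the budget.
    The paper's actual allocation model may differ.
    """
    weights = [1.0 / t for t in jnd_thresholds]
    total = sum(weights)
    return [picture_bits * w / total for w in weights]


# Hypothetical S-JND thresholds for four CTUs (arbitrary units).
thresholds = [2.0, 4.0, 4.0, 8.0]
bits = allocate_ctu_bits(8000, thresholds)

for t, b in zip(thresholds, bits):
    print(f"threshold {t:4.1f} -> {b:7.1f} bits")
```

Normalizing the weights keeps the sum of CTU allocations equal to the picture budget, which is what lets a scheme of this kind change perceptual quality without changing the target bit-rate.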

A Novel Video Quality Degradation Monitoring Scheme Over an IPTV Service with Packet Loss (IPTV 서비스에서 패킷손실에 의한 비디오품질 열화 모니터링 방법)

  • Kwon, Jae-Cheol;Oh, Seoung-Jun;Suh, Chang-Ryul;Chin, Young-Min
    • Journal of Broadcast Engineering / v.14 no.5 / pp.573-588 / 2009
  • In this paper, we propose a novel video quality degradation monitoring scheme, VR-VQMS (Visual Rhythm based Video Quality Monitoring Scheme), for an IPTV service prone to packet losses during network transmission. The proposed scheme quantifies the amount of quality degradation due to packet losses. It can be classified as an RR (reduced-reference) quality measurement scheme that uses, as feature information, the visual rhythm data of H.264-encoded video frames at a media server and of the reconstructed frames at a set-top box. Two scenarios, on-line and off-line VR-VQMS, are proposed as practical solutions. We define NPSNR (Networked Peak-to-peak Signal-to-Noise Ratio), a modification of the well-known PSNR, as a new objective quality metric, and derive several additional objective and subjective metrics from it to obtain statistics on the timing, duration, occurrence, and amount of quality degradation. Simulation results show that the proposed method closely approximates the results obtained from 2D video frames and gives a good estimate of subjective quality (i.e., the MOS (mean opinion score)) rated by 10 test observers. We expect that the proposed scheme can serve as a practical solution for monitoring the video quality experienced by individual customers in a commercial IPTV service, and that it can be implemented as a small, light agent program running on a resource-limited set-top box.
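
A minimal sketch of the reduced-reference idea described above: instead of comparing full frames, each side keeps only a thin slice of every frame (its visual rhythm) and the two slices are compared. The abstract does not define NPSNR or say which sampling line is used, so this example assumes diagonal sampling and falls back on the standard PSNR formula; the tiny frames are invented.

```python
import math


def visual_rhythm(frames):
    """Stack the main-diagonal pixels of each frame into one row per frame.

    Assumption: diagonal sampling. The abstract does not say which
    sampling line VR-VQMS uses.
    """
    rows = []
    for frame in frames:
        n = min(len(frame), len(frame[0]))
        rows.append([frame[i][i] for i in range(n)])
    return rows


def psnr(ref, test, peak=255):
    """Standard PSNR in dB between two equally sized 2D pixel arrays."""
    diffs = [(a - b) ** 2 for r, t in zip(ref, test) for a, b in zip(r, t)]
    mse = sum(diffs) / len(diffs)
    return math.inf if mse == 0 else 10 * math.log10(peak ** 2 / mse)


# Two tiny hypothetical 3x3 luma frames as encoded at the media server ...
sent = [
    [[10, 20, 30], [40, 50, 60], [70, 80, 90]],
    [[11, 21, 31], [41, 51, 61], [71, 81, 91]],
]
# ... and as reconstructed at the set-top box, with the second frame damaged.
received = [sent[0], [[11, 21, 31], [41, 0, 61], [71, 81, 0]]]

vr_sent = visual_rhythm(sent)
vr_received = visual_rhythm(received)
quality_db = psnr(vr_sent, vr_received)
print(f"PSNR on visual rhythm: {quality_db:.2f} dB")
```

Because only one pixel line per frame is exchanged, the feature data is far smaller than the video itself, which is what makes this kind of comparison plausible on a resource-limited set-top box.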

3D Visual Attention Model and its Application to No-reference Stereoscopic Video Quality Assessment (3차원 시각 주의 모델과 이를 이용한 무참조 스테레오스코픽 비디오 화질 측정 방법)

  • Kim, Donghyun;Sohn, Kwanghoon
    • Journal of the Institute of Electronics and Information Engineers / v.51 no.4 / pp.110-122 / 2014
  • As multimedia technologies develop, three-dimensional (3D) technologies are attracting increasing attention from researchers. In particular, video quality assessment (VQA) has become a critical issue in stereoscopic image/video processing applications. The human visual system (HVS) could play an important role in the measurement of stereoscopic video quality, yet existing VQA methods have done little to model the HVS for stereoscopic video. We seek to amend this by proposing a 3D visual attention (3DVA) model that simulates the HVS for stereoscopic video by combining multiple perceptual stimuli such as depth, motion, color, intensity, and orientation contrast. We use this 3DVA model for pooling on significant regions of very poor video quality, and we propose a no-reference (NR) stereoscopic VQA (SVQA) method. We validated the proposed SVQA method using subjective test scores from our own experiments and those reported by others. Our approach yields high correlation with the measured mean opinion score (MOS) as well as consistent performance in asymmetric coding conditions. Additionally, the 3DVA model is used to extract information for the region of interest (ROI). Subjective evaluations of the extracted ROI indicate that 3DVA-based ROI extraction outperforms the other compared extraction methods, which use spatial and/or temporal terms.
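
"High correlation with MOS" is commonly quantified with the Pearson linear correlation and the Spearman rank correlation between a metric's predictions and the measured MOS. The sketch below computes both in plain Python. The abstract does not say which coefficients the authors report, and the score lists are invented for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson linear correlation coefficient of two equal-length lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def ranks(values):
    """Ranks 1..n in ascending order; this sketch assumes there are no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    for rank, idx in enumerate(order, start=1):
        result[idx] = float(rank)
    return result


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical quality predictions and measured MOS for five test videos.
predicted = [0.82, 0.41, 0.65, 0.30, 0.93]
mos = [4.1, 2.6, 3.2, 2.0, 4.6]

plcc = pearson(predicted, mos)
srocc = spearman(predicted, mos)
print(f"PLCC = {plcc:.3f}, SROCC = {srocc:.3f}")
```

The two coefficients answer different questions: Pearson checks how linear the relation between prediction and MOS is, while Spearman only checks whether the metric orders the videos the same way viewers do.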