• Title/Summary/Keyword: Mean Opinion Score (MOS)

Bandwidth Expansion Method Using Spline Codebook Based Spectral Folding (Spline 코드북 기반의 spectral folding을 이용한 대역폭 확장 방법)

  • Park, Ji-Hoon;Han, Seung-Ho;Yang, Hee-Sik;Jeong, Sang-Bae;Hahn, Min-Soo
    • Proceedings of the KSPS conference / 2006.11a / pp.131-134 / 2006
  • The quality of narrowband speech $(0{\sim}4\,\mathrm{kHz})$ can be enhanced by bandwidth expansion, in which the high-band components are estimated. This paper proposes a bandwidth expansion method using spline codebook based spectral folding. For the performance evaluation, PESQ (Perceptual Evaluation of Speech Quality) scores are measured as the objective measure. In addition, MOS (Mean Opinion Score) and preference tests are performed as the subjective measures. The results show that the proposed method outperforms the existing spline-based one.
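
The spectral-folding step on its own can be sketched as follows. This is a minimal illustration that assumes plain zero insertion (upsampling by 2); the spline codebook that shapes the high-band envelope is not modeled, and `zero_insert` and `dft_magnitude` are illustrative names, not taken from the paper.

```python
import cmath
import math

def zero_insert(x):
    """Upsample by 2 by placing a zero after every sample."""
    y = []
    for s in x:
        y.extend([s, 0.0])
    return y

def dft_magnitude(x):
    """Magnitude of the DFT, computed directly (fine for short signals)."""
    n = len(x)
    return [abs(sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n)))
            for k in range(n)]

# A narrowband tone at bin 3 of a 16-sample frame.
x = [math.cos(2 * math.pi * 3 * n / 16) for n in range(16)]
mag = dft_magnitude(zero_insert(x))

# Bins up to the new Nyquist bin (16) that carry energy:
# bin 3 (the original tone) and bin 13 (its mirror image about bin 8, the old Nyquist).
print([k for k, m in enumerate(mag[:17]) if m > 1.0])
```

Zero insertion leaves the low band untouched and mirrors it into the high band, which is why a separate envelope model is needed to make the folded band sound natural.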

PROSODY CONTROL BASED ON SYNTACTIC INFORMATION IN KOREAN TEXT-TO-SPEECH CONVERSION SYSTEM

  • Kim, Yeon-Jun;Oh, Yung-Hwan
    • Proceedings of the Acoustical Society of Korea Conference / 1994.06a / pp.937-942 / 1994
  • A text-to-speech (TTS) conversion system can convert any words or sentences into speech. To synthesize speech the way humans produce it, careful prosody control, including intonation, duration, accent, and pause, is required. It helps listeners understand the speech clearly and makes the speech sound more natural. In this paper, a prosody control scheme that makes use of function-word information is proposed. Among the many factors of prosody, intonation, duration, and pause are closely related to syntactic structure, and their relations have been formalized and embodied in the TTS system. To evaluate speech synthesized with the proposed prosody control, the MOS (Mean Opinion Score) method, one of the subjective evaluation methods, was used. The synthesized speech was tested on 10 listeners, and each listener scored the speech between 1 and 5. The evaluation experiments show that the proposed prosody control helps the TTS system synthesize more natural speech.
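
The scoring described above reduces to averaging listener ratings. The sketch below shows that arithmetic with a rough confidence interval; the ratings are made-up values, and the normal-approximation interval is an assumption added for illustration, not something reported in the paper.

```python
from math import sqrt
from statistics import mean, stdev

def mean_opinion_score(ratings):
    """Return (MOS, 95% confidence half-width) for ratings on a 1-5 scale."""
    if not ratings or any(r < 1 or r > 5 for r in ratings):
        raise ValueError("ratings must be non-empty and lie between 1 and 5")
    mos = mean(ratings)
    # Normal approximation (z = 1.96); with only 10 listeners a Student-t
    # quantile would give a somewhat wider interval.
    half_width = 1.96 * stdev(ratings) / sqrt(len(ratings)) if len(ratings) > 1 else 0.0
    return mos, half_width

# Hypothetical ratings from 10 listeners for one synthesized sentence.
ratings = [4, 3, 4, 5, 3, 4, 4, 3, 5, 4]
mos, ci = mean_opinion_score(ratings)
print(f"MOS = {mos:.2f} +/- {ci:.2f}")
```

With a panel this small the interval is wide, so a difference between two systems is only meaningful if it clearly exceeds the half-width.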

Text-to-Speech Synthesizer with the Process of Minimizing Concatenation Distortion (접합 왜곡의 최소화 과정이 포함된 음성합성기)

  • 박훈재;김상훈;정재호
    • The Journal of the Acoustical Society of Korea / v.17 no.4 / pp.38-44 / 1998
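  • To make large speech synthesis databases easier to build, phoneme boundary segmentation is performed with speech recognition systems. However, when synthesized speech is generated directly from the automatic segmentation results, phoneme boundary errors cause considerable concatenation distortion. To solve this problem, this study seeks a suitable concatenation point that takes boundary errors into account when units are joined. Here, a suitable concatenation point means the join point at which spectral discontinuity is minimized. The performance of the proposed method was evaluated with an MOS (Mean Opinion Score) test on the synthesized speech and by comparing spectrogram shapes. The proposed method consists of two steps: first, selecting a reference pattern and two test patterns; second, finding a suitable concatenation point between the preceding and following test patterns. To compare spectrograms between patterns, this study used cepstrum parameters and the DTW (Dynamic Time Warping) algorithm as a pattern classifier. Listening tests showed that speech synthesized with the proposed algorithm had better quality than speech synthesized from automatically segmented units used as-is.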
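
The pattern comparison mentioned above relies on DTW over cepstral frames. A minimal sketch of the textbook DTW recursion follows; the Euclidean frame distance, the unconstrained step pattern, and the name `dtw_distance` are assumptions, since the abstract does not give these details.

```python
from math import dist, inf

def dtw_distance(ref, test):
    """Accumulated DTW cost between two sequences of feature vectors."""
    n, m = len(ref), len(test)
    # cost[i][j] = best accumulated cost aligning ref[:i] with test[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = dist(ref[i - 1], test[j - 1])  # Euclidean frame distance
            cost[i][j] = local + min(cost[i - 1][j],      # ref frame repeated
                                     cost[i][j - 1],      # test frame repeated
                                     cost[i - 1][j - 1])  # both advance
    return cost[n][m]

# Hypothetical one-dimensional "cepstral" frames; the second sequence is a
# time-stretched copy of the first, so the warped cost is zero.
reference = [[0.0], [1.0], [2.0]]
stretched = [[0.0], [1.0], [1.0], [2.0]]
print(dtw_distance(reference, stretched))
```

A lower accumulated cost means the test pattern is spectrally closer to the reference after time alignment, which is the property a join-point search can rank candidates by.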

A Study on Testing Image Quality on Facsimile (팩시밀리 화상품질 측정에 관한 연구)

  • Kwon, S.;Hwang, G.
    • Electronics and Telecommunications Trends / v.8 no.4 / pp.157-162 / 1993
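  • This study presents a method for measuring the image quality of Group 3 (G3) facsimile machines connected to the public switched telephone network, which uses analog signals. Images transmitted using CCITT (now ITU-TS) Standard Test Chart No. 2 were evaluated through a questionnaire survey, and the responses were quantified by the MOS (Mean Opinion Score) method. Correlation analysis of the questionnaire results showed that the items could be reduced to a single overall evaluation item. By analyzing the differences among the mean scores, the significance of the factors affecting facsimile image quality was tested. The t-test and the Van der Waerden scores method are presented as the significance tests. Groups whose mean scores did not differ significantly were then merged into one group, and a score histogram was computed for that group. This histogram was approximated by a normal distribution curve to examine the facsimile image quality rating.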
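
Van der Waerden scores replace each observation's rank with a standard normal quantile. The sketch below computes those scores with average ranks for ties; the tie handling, the sample values, and the function name are assumptions, and the full test statistic used in the study is not reproduced.

```python
from statistics import NormalDist

def van_der_waerden_scores(values):
    """Map each value to Phi^-1(rank / (N + 1)), using average ranks for ties."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        # Extend j to the end of the run of equal values.
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tied run
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    inv_cdf = NormalDist().inv_cdf
    return [inv_cdf(r / (n + 1)) for r in ranks]

# Hypothetical opinion scores pooled from two test conditions.
scores = van_der_waerden_scores([2, 3, 3, 4, 5, 1])
print([round(s, 3) for s in scores])
```

Comparing the mean normal score of each group then gives a rank-based alternative to the t-test that is less sensitive to the coarse 1 to 5 rating scale.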

The Implementation of the Language-Study-Headphone Strong to Noise Environment (소음 환경에서 강인한 어학용 헤드폰 구현)

  • Son, Jae-Hyeak;Shin, Jae-Ho
    • 한국정보통신설비학회:학술대회논문집 / 2005.08a / pp.397-405 / 2005
  • This paper presents a headphone system that adopts two algorithms to increase sound clarity and to separate the signal from a noisy environment. In adaptive signal processing, the LMS algorithm, a kind of steepest descent method, can be implemented with relatively simple calculations, so we use it to eliminate unwanted noise components in the proposed system. Furthermore, we generate early echoes using several delays and mix them into the signal, which can increase the clarity of the signal. In this paper, we show that the proposed system can be implemented in real time. The proposed system performs satisfactorily in a subjective assessment test based on the ITU-T MOS (Mean Opinion Score).
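
The LMS noise-cancelling idea can be sketched as below. This is a generic textbook LMS canceller, assuming a separate noise reference input; the tap count, step size, and the name `lms_cancel` are illustrative choices, not the paper's configuration.

```python
import random

def lms_cancel(primary, reference, taps=4, mu=0.01):
    """Return e[n] = primary[n] - w . x[n], the noise-reduced output."""
    w = [0.0] * taps
    out = []
    for n in range(len(primary)):
        # Most recent `taps` reference samples, newest first, zero-padded at the start.
        x = [reference[n - k] if n - k >= 0 else 0.0 for k in range(taps)]
        y = sum(wk * xk for wk, xk in zip(w, x))  # current noise estimate
        e = primary[n] - y                        # error = cleaned sample
        # Steepest-descent update on the instantaneous squared error.
        w = [wk + 2 * mu * e * xk for wk, xk in zip(w, x)]
        out.append(e)
    return out

# Hypothetical test: the primary input contains only scaled noise, so a
# converged filter should drive the output toward zero.
rng = random.Random(0)
noise = [rng.uniform(-1, 1) for _ in range(2000)]
primary = [0.5 * v for v in noise]
cleaned = lms_cancel(primary, noise)
print(max(abs(v) for v in cleaned[-100:]))
```

Each sample costs only a few multiply-adds per tap, which is what makes LMS attractive for real-time use in a headphone.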

Quality Evaluation of JPEG2000 Compressed Images in PACS Environments (PACS 환경에서 JPEG2000 압축 영상의 화질 평가)

  • Lee, Yong-Jai
    • Proceedings of the Korean Information Science Society Conference / 2005.07b / pp.682-684 / 2005
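  • Many hospitals have now adopted PACS and use it to good effect. Radiographic image information plays an important role in hospital care. A radiographic image is obtained by irradiating the body after the radiation dose is set by the tube voltage (kVp) and tube current (mAs), and image quality varies with kVp, mAs, and the thickness of the body. Images acquired in this way are interpreted and used for care, then compressed and archived after a certain period; the higher the compression ratio, the greater the savings in storage cost. The author therefore 1) acquired chest images under various CR and DR exposure conditions, applied JPEG 2000 compression, and evaluated the effect of the exposure conditions on the compressed images, and 2) used MOS (Mean Opinion Score) evaluation to propose an effective compression ratio that does not affect image interpretation.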

A Study on an Improvement of the Speech Quality with Variable Window in CELP Vocoder (가변 윈도우를 이용한 CELP 부호화기의 음질 향상에 관한 연구)

  • Ju, Sang-Gyu
    • Proceedings of the KAIS Fall Conference / 2010.05a / pp.265-268 / 2010
  • Two types of low bit rate vocoder have been proposed up to now: the MBE type, which uses spectrum modeling, and the CELP type, which uses a hybrid coding method. Of the two, the CELP type has been studied more. In particular, much effort has been concentrated on the CELP vocoder owing to the domestic emergence of Internet telephony and PCS. To improve the speech quality of the CELP vocoder, this paper proposes a new spectrum analysis algorithm with a variable window. In the CELP vocoder, the spectrum of the synthesized speech signal is distorted because a fixed-size window is used for spectrum analysis. We therefore measure the spectral leakage and adjust the window size to minimize it. Applying this method to G.723.1 ACELP, we obtain an SD (Spectral Distortion) reduction of 0.084 dB, a residual energy reduction of 6.3%, and an MOS (Mean Opinion Score) improvement of 0.1.
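
The abstract reports SD in dB without defining it. The sketch below assumes one common definition, the RMS difference between two power spectra on a dB scale; the paper may use a different variant, and `spectral_distortion_db` is an illustrative name.

```python
from math import log10, sqrt

def spectral_distortion_db(ref_power, test_power):
    """RMS difference between two power spectra, measured in dB."""
    if len(ref_power) != len(test_power) or not ref_power:
        raise ValueError("spectra must be non-empty and of equal length")
    # Per-bin level difference in dB; all power values must be positive.
    diffs = [10.0 * log10(p / q) for p, q in zip(ref_power, test_power)]
    return sqrt(sum(d * d for d in diffs) / len(diffs))

# Hypothetical power spectra of one frame before and after coding.
original = [1.00, 0.80, 0.50, 0.20]
coded = [0.95, 0.85, 0.45, 0.22]
print(f"SD = {spectral_distortion_db(original, coded):.3f} dB")
```

Under this definition the value is averaged over frames in practice, so a reduction of a few hundredths of a dB reflects a small but systematic improvement.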

A Novel Approach to a Robust A Priori SNR Estimator in Speech Enhancement (음성 향상에서 강인한 새로운 선행 SNR 추정 기법에 관한 연구)

  • Park, Yun-Sik;Chang, Joon-Hyuk
    • The Journal of the Acoustical Society of Korea / v.25 no.8 / pp.383-388 / 2006
  • This paper presents a novel approach to single-channel microphone speech enhancement in noisy environments. Widely used noise reduction techniques based on spectral subtraction are generally expressed as a spectral gain that depends on the signal-to-noise ratio (SNR). The well-known decision-directed (DD) estimator of Ephraim and Malah efficiently reduces musical noise under background noise conditions, but it delays the a priori SNR because the DD weights the speech spectrum component of the previous frame. As a result, the noise suppression gain, which depends on the a priori SNR estimated by the DD, matches the previous frame rather than the current one, and this degrades noise reduction performance during speech transient periods. We propose a computationally simple but effective speech enhancement technique based on a sigmoid-type function for the weight parameter of the DD. The proposed approach solves the delay problem in the main parameter, the a priori SNR of the DD, while maintaining the benefits of the DD. The performance of the proposed enhancement algorithm is evaluated by ITU-T P.862 Perceptual Evaluation of Speech Quality (PESQ), the Mean Opinion Score (MOS), and the speech spectrogram under various noise environments, and it yields better results than the DD with a fixed weight parameter.
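
The decision-directed estimate and an adaptive weight can be sketched as follows. `dd_a_priori_snr` follows the standard Ephraim and Malah form; `sigmoid_alpha` is a hypothetical stand-in, since the abstract does not give the paper's sigmoid, its input, or its parameters, and the default weight of 0.98 is likewise an assumed typical value.

```python
from math import exp

def dd_a_priori_snr(prev_amp_sq, noise_var, gamma, alpha=0.98):
    """Decision-directed estimate of the a priori SNR for one frequency bin.

    prev_amp_sq: squared speech amplitude estimated in the previous frame
    noise_var:   noise power estimate for this bin
    gamma:       a posteriori SNR of the current frame
    """
    return alpha * prev_amp_sq / noise_var + (1.0 - alpha) * max(gamma - 1.0, 0.0)

def sigmoid_alpha(delta, alpha_min=0.5, alpha_max=0.98, slope=2.0, threshold=2.0):
    """Hypothetical weight: near alpha_max for small frame-to-frame SNR change
    `delta`, falling toward alpha_min when the change is large (a transient)."""
    z = min(slope * (delta - threshold), 50.0)  # clamp to avoid exp overflow
    return alpha_min + (alpha_max - alpha_min) / (1.0 + exp(z))

# Hypothetical values for a speech onset: the a posteriori SNR jumps from 1 to 9.
alpha = sigmoid_alpha(delta=abs(9.0 - 1.0))
print(alpha, dd_a_priori_snr(prev_amp_sq=0.1, noise_var=1.0, gamma=9.0, alpha=alpha))
```

Lowering the weight during transients lets the current frame dominate the estimate, which is the delay problem the abstract describes; in stationary segments the weight stays high and keeps the musical-noise suppression of the DD.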

A Novel Perceptual No-Reference Video-Quality Measurement With the Histogram Analysis of Luminance and Chrominance (휘도, 색차의 분포도 분석을 이용한 인지적 무기준법 영상 화질 평가방법)

  • Kim, Yo-Han;Sung, Duk-Gu;Han, Jung-Hyun;Shin, Ji-Tae
    • Journal of Broadcast Engineering / v.14 no.2 / pp.127-133 / 2009
  • With advances in video technology, many researchers are interested in video quality assessment as a way to demonstrate the performance of proposed algorithms. Because the human visual system is too complex to be formulated exactly, research on video quality assessment is ongoing. No-reference video-quality assessment is suitable for various video streaming services because it requires no additional data or network capacity to perform the assessment. In this paper, we propose a novel no-reference video-quality assessment method based on estimating dynamic range distortion. To measure its performance, we obtain mean opinion score (MOS) data from a subjective video quality test using the ITU-T P.910 Absolute Category Rating (ACR) method, and we compare these data with the output of the proposed algorithm on 363 video sequences. Experimental results show that the proposed algorithm correlates more highly with the obtained MOS.
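
Agreement between an objective metric and MOS is usually summarized by a correlation coefficient. The sketch below computes Pearson correlation; the abstract does not say which coefficient was used, and the sample values are made up.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(vx * vy)

# Hypothetical metric outputs and subjective MOS for five sequences.
predicted = [2.1, 3.0, 3.8, 4.4, 1.5]
mos = [2.0, 3.2, 3.5, 4.6, 1.8]
print(f"r = {pearson(predicted, mos):.3f}")
```

A rank-based coefficient such as Spearman's is a common companion measure when the relation between metric and MOS is monotonic but not linear.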

Design of the Noise Suppressor Using Wavelet Transform (웨이블릿 변환을 이용한 잡음제거기 설계)

  • 원호진;김종학;이인성
    • The Journal of the Acoustical Society of Korea / v.20 no.7 / pp.37-46 / 2001
  • This paper proposes a new noise suppression method using wavelet transform analysis. A noise suppressor based on the wavelet transform is more effective against babble noise than one based on the short-time Fourier transform. We designed a new channel structure based on spectral subtraction of wavelet transform coefficients and used a wavelet mask pattern with higher time resolution at high frequencies. It adapted well to babble noise, which is non-stationary. To evaluate the performance of the proposed noise canceller, informal subjective listening tests (MOS tests) were performed in the background noise environments of mobile communication (car noise, street noise, babble noise). In the informal listening tests, the proposed noise suppression algorithm showed an MOS improvement of about 0.2 over the noise suppression algorithm of EVRC. The noise reduction achieved by the proposed method is also visible in the spectrogram of the speech signal.
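
Subtraction in the wavelet domain can be sketched as below. This is a deliberately simplified illustration assuming a one-level Haar transform and a fixed noise magnitude; the paper's channel structure, wavelet, and mask pattern are not given in the abstract, and all function names are illustrative.

```python
from math import sqrt

def haar_forward(x):
    """One-level Haar transform of an even-length sequence: (approximation, detail)."""
    s = sqrt(2.0)
    approx = [(x[i] + x[i + 1]) / s for i in range(0, len(x), 2)]
    detail = [(x[i] - x[i + 1]) / s for i in range(0, len(x), 2)]
    return approx, detail

def haar_inverse(approx, detail):
    """Invert haar_forward."""
    s = sqrt(2.0)
    x = []
    for a, d in zip(approx, detail):
        x.extend([(a + d) / s, (a - d) / s])
    return x

def subtract_magnitude(coeffs, noise_mag):
    """Shrink each coefficient's magnitude by noise_mag, flooring at zero and keeping the sign."""
    return [(1.0 if c >= 0 else -1.0) * max(abs(c) - noise_mag, 0.0) for c in coeffs]

def suppress(x, noise_mag):
    """Subtract an assumed noise magnitude from the detail band only, then reconstruct."""
    approx, detail = haar_forward(x)
    return haar_inverse(approx, subtract_magnitude(detail, noise_mag))

# Hypothetical frame: a slow ramp with small alternating noise on top.
frame = [0.0, 0.1, 1.0, 0.9, 2.0, 2.1, 3.0, 2.9]
print([round(v, 3) for v in suppress(frame, noise_mag=0.1)])
```

Because wavelet coefficients are localized in time, the noise estimate can be updated band by band and frame by frame, which suits non-stationary noise such as babble better than a fixed short-time Fourier resolution.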
