• Title/Summary/Keyword: 주관적 음향성능

Search Result 82, Processing Time 0.023 seconds

High-Band Codec for Bandwidth Scalable Wideband Speech Codec (대역폭 계층 구조의 광대역 음성 부호화기를 위한 상위 대역 부호화기 연구)

  • Kim Youngvo;Jeong Byounghak;Son Chang-Yong;Sung Ho-Sang;Park Hochong
    • The Journal of the Acoustical Society of Korea
    • /
    • v.24 no.7
    • /
    • pp.395-401
    • /
    • 2005
  • In this paper, the high-band codec for bandwidth scalable wideband speech codec is proposed. The wideband input speech signal is separated into low-band signal and high-band signal, and the low-band signal is encoded by the standard narrow-band speech codec and the high-band signal is encoded by the proposed codec. In the high-band codec. the signal is transformed into frequency domain by MLT on a subframe basis, and MLT coefficients are splitted into magnitude and sign for quantization. The magnitudes of MLT coefficients are arranged into several time-frequency bands and each band is quantized in 2D-DCT domain, where the low-band information is utilized for better performance. The sign of MLT coefficient is quantized based on a priority selection process with the weighting measurement. The objective and subjective performance of wideband speech codec including the proposed high-band codec is measured, and it is confirmed that the proposed codec has better performance than 32kbps G.722.1.

16kbps Windeband Sideband Speech Codec (16kbps 광대역 음성 압축기 개발)

  • 박호종;송재종
    • The Journal of the Acoustical Society of Korea
    • /
    • v.21 no.1
    • /
    • pp.5-10
    • /
    • 2002
  • This paper proposes new 16 kbps wideband speech codec with bandwidth of 7 kHz. The proposed codec decomposes the input speech signal into low-band and high-band signals using QMF (Quadrature Mirror Filter), then AMR (Adaptive Multi Rate) speech codec processes the low-band signal and new transform-domain codec based on G.722.1 wideband cosec compresses the high-band signal. The proposed codec allocates different number of bits to each band in an adaptive way according to the property of input signal, which provides better performance than the codec with the fixed bit allocation scheme. In addition, the proposed cosec processes high-band signal using wavelet transform for better performance. The performance of proposed codec is measured in a subjective method. and the simulations with various speech data show that the proposed coders has better performance than G.722 48 kbps SB-ADPCM.

Entropy-Constrained Temporal Decomposition (엔트로피 제한 조건을 갖는 시간축 분할)

  • Lee Ki-Seung
    • The Journal of the Acoustical Society of Korea
    • /
    • v.24 no.5
    • /
    • pp.262-270
    • /
    • 2005
  • In this paper, a new temporal decomposition method is proposed. where not oniy distortion but also entropy are involved in segmentation. The interpolation functions and the target feature vectors are determined by a dynamic Programing technique. where both distortion and entropy are simultaneously minimized. The interpolation functions are built by using a training speech corpus. An iterative method. where segmentation and estimation are iteratively performed. finds the locally optimum Points in the sense of minimizing both distortion and entropy. Simulation results -3how that in terms of both distortion and entropy. the Proposed temporal decomposition method Produced superior results to the conventional split vector-quantization method which is widely employed in the current speech coding methods. According to the results from the subjective listening test, the Proposed method reveals superior Performance in terms of qualify. comparing to the Previous vector quantization method.

A Probabilistic Combination Method of Minimum Statistics and Soft Decision for Robust Noise Power Estimation in Speech Enhancement (강인한 음성향상을 위한 Minimum Statistics와 Soft Decision의 확률적 결합의 새로운 잡음전력 추정기법)

  • Park, Yun-Sik;Chang, Joon-Hyuk
    • The Journal of the Acoustical Society of Korea
    • /
    • v.26 no.4
    • /
    • pp.153-158
    • /
    • 2007
  • This paper presents a new approach to noise estimation to improve speech enhancement in non-stationary noisy environments. The proposed method combines the two separate noise power estimates provided by the minimum statistics (MS) for speech presence and soft decision (SD) for speech absence in accordance with SAP (Speech Absence Probability) on a separate frequency bin. The performance of the proposed algorithm is evaluated by the subjective test under various noise environments and yields better results compared with the conventional MS or SD-based schemes.

Design of a 4kb/s ACELP Codec Using the Generalized AbS Principle (Generalized AbS 구조를 이용한 4kb/s ACELP 음성 부호화기의 설계)

  • 성호상;강상원
    • The Journal of the Acoustical Society of Korea
    • /
    • v.18 no.7
    • /
    • pp.33-38
    • /
    • 1999
  • In this paper, we combine a generalized analysis-by-synthesis (AbS) structure and an algebraic excitation scheme to propose a new 4kb/s speech codec. This codec partly uses the structure of G.729. We design a line spectrum pair (LSP) quantizer, an adaptive codebook, and an excitation codebook to fit the 4 kb/s bit rate. The codec has a 25㎳ algorithmic delay, which corresponds to a 20㎳ frame size and a 5㎳ lookahead. At the bit rates below 4kb/s, most CELP speech codecs using the AbS principle have a drawback that results a rapid degradation of speech quality. To overcome this drawback we use the generalized AbS structure which is efficient for the low bit rate speech codec. LP coefficients are converted to LSP and quantized using a predictive 2-stage VQ. A low complexity algebraic codebook which uses shifting method is used for the fixed codebook excitation, and gains of the adaptive codebook and the fixed codebook are quantized using the VQ. To evaluate the performance of the proposed codec A-B preference tests are done with the fixed rate 8kb/s QCELP. As the result of the test, the performance of the codec is similar to that of the fixed rate 8kb/s QCELP.

  • PDF

Vocal separation method using weighted β-order minimum mean square error estimation based on kernel back-fitting (커널 백피팅 알고리즘 기반의 가중 β-지수승 최소평균제곱오차 추정방식을 적용한 보컬음 분리 기법)

  • Cho, Hye-Seung;Kim, Hyoung-Gook
    • The Journal of the Acoustical Society of Korea
    • /
    • v.35 no.1
    • /
    • pp.49-54
    • /
    • 2016
  • In this paper, we propose a vocal separation method using weighted ${\beta}$-order minimum mean wquare error estimation (WbE) based on kernel back-fitting algorithm. In spoken speech enhancement, it is well-known that the WbE outperforms the existing Bayesian estimators such as the minimum mean square error (MMSE) of the short-time spectral amplitude (STSA) and the MMSE of the logarithm of the STSA (LSA), in terms of both objective and subjective measures. In the proposed method, WbE is applied to a basic iterative kernel back-fitting algorithm for improving the vocal separation performance from monaural music signal. The experimental results show that the proposed method achieves better separation performance than other existing methods.

On a Research of Improving the Performance of Voice Activity Detector in G.723.1 (G.723.1 음성 활동 검출 장치 성능 향상에 관한 연구)

  • JANG KyungA;KIM JeongJin;Chang YoungOh;HONG SeongHoon;BAE MyungJin
    • Proceedings of the Acoustical Society of Korea Conference
    • /
    • autumn
    • /
    • pp.53-56
    • /
    • 1999
  • ITU-T 국제 표준화 기구에서 인터넷 폰과 화상회의를 목적으로 개발된 G.723.1 음성 부호화기는 잡음 구간에서의 전송률을 낮추기 위한 방법으로 VAD(Voice Activity Detector)와 CNG(Comfort Noise Generator)를 사용하고 있다 이중 VAD는 최종적으로 현재 프레임의 에너지 레벨을 비교하여 음성의 활동 유무를 판정하고 있다. 하지만 G.723.1 VAD에서는 보다 안정적인 판정을 위해 음성 활동 구간 사이에 삽입되어 있는 묵음 구간에 대해서는 거의 대부분 음성이 활동하는 영역으로 판정을 하고 있다. 따라서 본 논문에서는 묵음 구간에 대해 보다 정확한 판정을 통하여 기존의 방법에 비해 전송율을 더욱 감소시킬 수 있는 방법을 제안한다. 실험에서는 묵음구간을 길게 조절한 문장을 사용하여 측정한 결과 평균 $46.8\%$ 정도의 전송율을 감소시킬 수 있었으며, 주관적인 음질평가의 경우 음질의 열하는 거의 발생하지 않았다.

  • PDF

Design of a Variable half rate speech codec (가변율 half rate 음성 부호화기의 설계)

  • 성호상
    • Proceedings of the Acoustical Society of Korea Conference
    • /
    • 1998.06e
    • /
    • pp.293-296
    • /
    • 1998
  • 본 논문에서는 다양한 멀티미디어 서비스를 위해 가변율 half rate 음성 부호화기를 설계하였다. 유, 무성음과 묵음의 구분을 위해 본 논문에서는 프레임 에너지와 음성 파라메터들을 이용한 효과적인 voicing 결정 알고리즘을 사용하였다. 유성음을 위한 half rate 음성 부호화기는 저속에서 좋은 특성을 보이는 generalized AbS구조를 이용하였다. LPC 계수는 LSP 계수로 변환한 후 predictive 2-stage VQ를 통해서 양자화하며, 여기 신호는 음질저하를 최소화하며 복잡도를 감소시킨 shift 방식의 대수적 고정 코드북 구조를 사용하고, 적응코드북과 여기코드북의 이득은 VQ로 양자화 하였다. 무성음을 위한 부호화기는 대부분이 유성음을 위한 부호화기와 동일하지만, 무성음에서는 피치간 상관도가 매우 낮으므로 피치 보간 방법을 사용하지 않고 개루프로 피치 lag를 찾은 후 전체 프레임에 사용한다. 1 kb/s 부호화기는 묵음 구간과 주변소음 구간에 사용되며 이 구간의 신호를 피치 성분이 미약한 주변소음들로 제한하고 이에 최적인 부음성 부호화기를 설계하였다. 최종적으로 완성된 가변율 half rate 부호화기는 voice activity factor(VAF)가 0.47인 시험음성에서 약 2.6 kb/s의 평균 전송률을 보였다. 주관적 음질 평가의 일환으로 IS-96 표준 코덱인 가변율 8 kb/s QCELP와 A-B preference 시험을 실시하였다. 시험 결과 평균전송률이 약 2배인 가변율 8 kb/s QCELP 보다 우수한 음질 성능을 보였다.

  • PDF

Improving Speech Quality of VoIP by Packet Prioritization (패킷 중요도 결정에 의한 VoIP 통화 품질 향상 기술)

  • Yoon, Jae-Yul;Park, Ho-Chong
    • The Journal of the Acoustical Society of Korea
    • /
    • v.29 no.5
    • /
    • pp.347-353
    • /
    • 2010
  • In VoIP system, the speech quality is seriously degraded due to packet loss, and the degree of degradation by each packet loss depends on the characteristics of the corresponding packet. Therefore, it is possible to improve the speech quality of VoIP by selectively controlling the packet to be lost during transmission based on the expected degradation by the loss of each packet. In this paper, a new scheme to improve speech quality of DiffServ-based VoIP by assigning priority to each packet is proposed, and a method to determine the priority of each packet is developed. The performance of proposed method was measured in packet loss environment based on Gilbert model, and it was verified both objectively and subjectively that the speech quality is improved by the proposed method.

Design of the Noise Suppressor Using Wavelet Transform (웨이블릿 변환을 이용한 잡음제거기 설계)

  • 원호진;김종학;이인성
    • The Journal of the Acoustical Society of Korea
    • /
    • v.20 no.7
    • /
    • pp.37-46
    • /
    • 2001
  • This paper proposes a new noise suppression method using the Wavelet transform analysis. The noise suppressor using the Wavelet transform shows the more effective advantages in a babble noise than one using the short-time Fourier transform. We designed a new channel structure based on spectral subtraction of Wavelet transform coefficients and used the Wavelet mask pattern with more higher time resolution in high frequency. It showed a good adaptation capability for babble noise with a non-stationary property. To evaluate the performance of proposed noise canceller, the informal subjective listening tests (Mos tests) were performed in background noise environments (car noise, street noise, babble noise) of mobile communication. The proposed noise suppression algorithm showed about MOS 0.2 performance improvements than the suppression algorithm of EVRC in informal listening tests. The noise reduction by the proposed method was shown in spectrogram of speech signal.

  • PDF