• Title/Summary/Keyword: WSOLA

A Fast Normalized Cross-Correlation Computation for WSOLA-based Speech Time-Scale Modification (WSOLA 기반의 음성 시간축 변환을 위한 고속의 정규상호상관도 계산)

  • Lim, Sangjun; Kim, Hyung Soon
    • The Journal of the Acoustical Society of Korea / v.31 no.7 / pp.427-434 / 2012
  • Waveform similarity overlap-add (WSOLA) is known as an efficient, high-quality algorithm for time-scale modification of speech signals. Its computational load is concentrated in the repeated normalized cross-correlation (NCC) calculations that evaluate the similarity between two signal waveforms. To reduce this complexity, the paper proposes a fast NCC computation method in which NCC is obtained from pre-calculated sum tables, eliminating the redundancy of repeated NCC calculations over adjacent regions. The denominator of the NCC is highly redundant irrespective of the time-scale factor, whereas the numerator is less redundant and its redundancy depends on both the time-scale factor and the optimal shift value, so it requires a more sophisticated algorithm for fast computation. Simulation results show that the proposed method reduces WSOLA execution time by about 40%, 47%, and 52% for time-scale compression, 2x expansion, and 3x expansion, respectively, while maintaining exactly the same speech quality as conventional WSOLA.
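
The sum-table idea for the NCC denominator can be illustrated with a minimal Python sketch. It is not the paper's implementation: the function names `window_energies` and `ncc_search` are hypothetical, and only the denominator (window energy) is taken from a pre-calculated table. The numerator is computed directly here, since the abstract does not describe the paper's numerator algorithm.

```python
import numpy as np

def window_energies(x, win_len):
    """Energy of every length-win_len window of x, read from a running-sum table."""
    x = np.asarray(x, dtype=np.float64)
    # table[k] holds the sum of x[0:k]**2, so table[0] == 0.
    table = np.concatenate(([0.0], np.cumsum(x * x)))
    # Energy of x[i:i+win_len] is table[i+win_len] - table[i].
    return table[win_len:] - table[:-win_len]

def ncc_search(x, template, start, max_shift):
    """Find the shift in [0, max_shift] that maximizes the NCC between
    template and x[start+shift : start+shift+len(template)]."""
    x = np.asarray(x, dtype=np.float64)
    template = np.asarray(template, dtype=np.float64)
    n = len(template)
    energies = window_energies(x, n)          # denominator terms, computed once
    t_energy = float(np.dot(template, template))
    best_shift, best_ncc = 0, -np.inf
    for shift in range(max_shift + 1):
        pos = start + shift
        # Numerator: plain dot product (assumption: no fast scheme applied here).
        num = float(np.dot(x[pos:pos + n], template))
        den = np.sqrt(energies[pos] * t_energy)
        ncc = num / den if den > 0 else 0.0
        if ncc > best_ncc:
            best_shift, best_ncc = shift, ncc
    return best_shift, best_ncc
```

With the table in place, each candidate shift needs one subtraction for its window energy instead of a fresh sum of `win_len` squares, which is the kind of redundancy the abstract attributes to the denominator.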

A Study about the Users's Preferred Playing Speeds on Categorized Video Content using WSOLA method (WSOLA를 이용한 동영상 미세배속 재생 서비스에 대한 콘텐츠별 배속 선호도 분석 연구)

  • Kim, I-Gil
    • Journal of Digital Contents Society / v.16 no.2 / pp.291-298 / 2015
  • In a fast-paced information technology environment, consumption of video content is shifting from one-way television viewing to VOD (Video on Demand) playback anywhere, anytime, on any device. This trend gives added importance to fine speed control, on top of the inherent strengths of the digital video signal. Many video players now provide a fine-speed-control function that can speed up the video to skip a boring part or slow it down to focus on an exciting scene. Audio is just as important as the picture for understanding speed-controlled video, so a number of audio-processing algorithms have been proposed to avoid pitch distortion during fine-speed-control playback. In this study, WSOLA (Waveform-Similarity-Based Overlap-Add), a well-known technique for prosodic modification of speech signals, is applied to analyze users' needs for fine-speed-control video playback. By surveying users' preferred speeds for categorized video content and analyzing the results, the paper concludes that a range of fine speed adjustments is needed to accommodate users' preferred ways of consuming video.

BS-PLC(Both Side-Packet Loss Concealment) for CELP Coder (CELP 부호화기를 위한 양방향 패킷 손실 은닉 알고리즘)

  • Lee In-Sung; Hwang Jeong-Joon; Jeong Gyu-Hyeok
    • Journal of the Institute of Electronics Engineers of Korea TC / v.42 no.12 / pp.127-134 / 2005
  • Robustness to lost packets is one of the most important quality measures for voice over IP (VoIP) networks, and recovering a lost packet from the received information is crucial to achieving it. This paper therefore proposes a method for recovering lost packets from the received information in real-time communication with a CELP coder. The proposed BS-PLC (Both Side Packet Loss Concealment), based on WSOLA (Waveform Similarity Overlap-Add), allows a lost packet to be recovered from both the 'previous' and the 'next' good packet, with the LP parameters and the excitation signal recovered separately. Bursts of packet loss are modeled with the Gilbert model. The proposed scheme is applied to G.729, the codec most used in VoIP, and is evaluated in terms of SNR (signal-to-noise ratio) and MOS (Mean Opinion Score). In simulation, the proposed scheme scores 0.3 higher in MOS and 2 dB higher in SNR than the error concealment procedure in the G.729 decoder at an average packet loss rate of 20%.
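
The burst-loss modeling mentioned above can be sketched as a two-state Markov chain. This is a simplified Gilbert model written for illustration, assuming every packet sent in the bad state is lost. The function names and the transition probabilities used in the usage note are hypothetical, since the abstract gives only the 20% average loss rate and not the model parameters.

```python
import random

def gilbert_loss_pattern(n_packets, p_good_to_bad, p_bad_to_good, seed=None):
    """Simulate a packet loss pattern with a simplified two-state Gilbert model.

    Returns a list of booleans, True meaning the packet is lost.
    Assumption: every packet sent while in the bad state is lost.
    """
    rng = random.Random(seed)
    lost = []
    bad = False  # start in the good (no-loss) state
    for _ in range(n_packets):
        if bad:
            # Stay in the bad state unless the chain recovers.
            bad = rng.random() >= p_bad_to_good
        else:
            # Enter the bad state with probability p_good_to_bad.
            bad = rng.random() < p_good_to_bad
        lost.append(bad)
    return lost

def stationary_loss_rate(p_good_to_bad, p_bad_to_good):
    """Long-run fraction of lost packets for the simplified model above."""
    return p_good_to_bad / (p_good_to_bad + p_bad_to_good)
```

For example, the illustrative values `p_good_to_bad=0.1` and `p_bad_to_good=0.4` give a long-run loss rate of 0.1 / (0.1 + 0.4) = 20%, with losses arriving in bursts whose mean length is 1 / 0.4 = 2.5 packets.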