
Lim, Sangjun;Kim, Hyung Soon
- The Journal of the Acoustical Society of Korea, v.31 no.7, pp.427-434, 2012
Waveform similarity based overlap-add (WSOLA) is known to be an efficient, high-quality algorithm for time scaling of speech signals. The computational load of WSOLA is concentrated in the repeated normalized cross-correlation (NCC) calculation used to evaluate the similarity between two signal waveforms. To reduce the computational complexity of WSOLA, this paper proposes a fast NCC computation method in which the NCC is obtained from pre-calculated sum tables, eliminating the redundancy of repeated NCC calculations in adjacent regions. While the denominator of the NCC has much redundancy irrespective of the time-scale factor, the numerator has less redundancy, and the amount of redundancy depends on both the time-scale factor and the optimal shift value, so it requires a more sophisticated algorithm for fast computation. The simulation results show that the proposed method reduces the WSOLA execution time by about 40%, 47%, and 52% for time-scale compression, 2-times expansion, and 3-times expansion, respectively, while maintaining exactly the same speech quality as conventional WSOLA.
https://doi.org/10.7776/ASK.2012.31.7.427
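The sum-table idea for the NCC denominator can be illustrated with a minimal Python sketch. It assumes the denominator is the square root of the product of the two window energies, and it uses a prefix-sum table of squared samples so that each window energy costs one subtraction. The function names are hypothetical, and the numerator is computed directly here; the paper's more sophisticated numerator scheme is not reproduced.

```python
from itertools import accumulate
from math import sqrt

def energy_table(x):
    """Prefix sums of squared samples: table[i] = sum of x[k]**2 for k < i."""
    return [0.0] + list(accumulate(v * v for v in x))

def window_energy(table, start, length):
    """Energy of x[start:start+length] from the table, in O(1)."""
    return table[start + length] - table[start]

def ncc(x, y, x_start, y_start, length, x_table, y_table):
    """Normalized cross-correlation of two windows.

    The denominator comes from the pre-calculated tables; the numerator
    is a plain dot product (not the paper's fast method).
    """
    num = sum(x[x_start + k] * y[y_start + k] for k in range(length))
    den = sqrt(window_energy(x_table, x_start, length)
               * window_energy(y_table, y_start, length))
    return num / den if den > 0 else 0.0

# Build the table once, then reuse it for every candidate shift.
signal = [1.0, 2.0, 3.0, 4.0, 3.0, 2.0, 1.0, 2.0]
table = energy_table(signal)
scores = [ncc(signal, signal, 0, shift, 3, table, table) for shift in range(5)]
```

Because neighbouring candidate windows overlap almost entirely, looking up their energies in one shared table avoids re-summing the same squared samples for every shift.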

Kim, I-Gil
- Journal of Digital Contents Society, v.16 no.2, pp.291-298, 2015
In a fast-paced information technology environment, consumption of video content is changing from one-way television viewing to VOD (Video on Demand) playback anywhere, anytime, on any device. This viewing trend gives additional importance to videos with fine speed control, in addition to the strength of the digital video signal. Currently, many video players provide a fine-speed-control function that can speed up the video to skip a boring part or slow it down to focus on an exciting scene. The audio information is just as important as the visual information for understanding the content of a speed-controlled video. Thus, a number of algorithms for fine-speed-control video playback have been proposed to solve the pitch distortion that arises in audio processing. In this study, WSOLA (Waveform-Similarity-Based Overlap-Add), a well-known technique for prosodic modification of speech signals, has been applied to analyze users' needs for fine-speed-control video playback. By surveying users' preferred speeds for categorized video content and analyzing the results, this paper proposes that various fine speed adjustments are needed to accommodate users' preferred video consumption.
https://doi.org/10.9728/dcs.2015.16.2.291

Lee In-Sung;Hwang Jeong-Joon;Jeong Gyu-Hyeok
- Journal of the Institute of Electronics Engineers of Korea TC, v.42 no.12, pp.127-134, 2005
Robustness to lost packets is one of the most important quality measures for voice over IP (VoIP) networks. Recovering a lost packet from the received information is crucial to achieving this robustness. This paper therefore proposes a method for recovering lost packets from the received information in real-time communication with a CELP coder. The proposed BS-PLC (Both Side Packet Loss Concealment), based on WSOLA (Waveform Similarity Overlap-Add), allows a lost packet to be recovered from both the 'previous' and 'next' good packets, with the LP parameters and the excitation signal recovered separately. Bursts of packet loss are modeled by the Gilbert model. The proposed scheme is applied to G.729, which is widely used in VoIP, and is evaluated through SNR (signal-to-noise ratio) and MOS (Mean Opinion Score) tests. In simulation, the proposed scheme achieves a MOS 0.3 higher and an SNR 2 dB higher than the error concealment procedure in the G.729 decoder at a 20% average packet loss rate.
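The Gilbert burst-loss model mentioned in the abstract can be sketched as a two-state Markov chain. The Python example below is a minimal illustration, assuming the simple form of the model in which every packet sent in the bad state is lost. The function names and the transition probabilities (0.05 and 0.2, chosen so the long-run loss rate is 20%) are assumptions for illustration; the abstract does not give the parameters the authors used.

```python
import random

def gilbert_loss_pattern(n_packets, p_good_to_bad, p_bad_to_good, seed=0):
    """Return a list of booleans, True where a packet is lost.

    Two-state Markov chain: packets in the 'bad' state are lost,
    packets in the 'good' state are received (simple Gilbert model).
    """
    rng = random.Random(seed)
    bad = False  # start in the good state
    lost = []
    for _ in range(n_packets):
        if bad:
            if rng.random() < p_bad_to_good:
                bad = False
        else:
            if rng.random() < p_good_to_bad:
                bad = True
        lost.append(bad)
    return lost

def stationary_loss_rate(p_good_to_bad, p_bad_to_good):
    """Long-run fraction of lost packets: p / (p + q)."""
    return p_good_to_bad / (p_good_to_bad + p_bad_to_good)

# Assumed parameters: p = 0.05, q = 0.2 gives a 20% average loss rate,
# with a mean burst length of 1 / q = 5 packets.
pattern = gilbert_loss_pattern(100_000, 0.05, 0.2)
empirical_rate = sum(pattern) / len(pattern)
```

Unlike independent random loss at the same average rate, this chain produces runs of consecutive lost packets, which is the condition a concealment scheme that uses both the previous and the next good packet has to cope with.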