http://dx.doi.org/10.4218/etrij.12.0211.0408

Recovery of Lost Speech Segments Using Incremental Subspace Learning

Huang, Jianjun (Institute of Command Automation, PLA University of Science and Technology)
Zhang, Xiongwei (Institute of Command Automation, PLA University of Science and Technology)
Zhang, Yafei (Institute of Command Automation, PLA University of Science and Technology)

Publication Information

ETRI Journal / v.34, no.4, 2012, pp. 645-648

Abstract

An incremental subspace learning scheme for online recovery of lost speech segments is presented. Our contributions are twofold. First, the recovery problem is transformed, via nonnegative matrix factorization, into a problem of interpolating time-varying gains. Second, incremental nonnegative matrix factorization is employed to allow online processing and to track the evolution of speech statistics. The effectiveness of the proposed scheme is confirmed by experimental results.
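
The first idea in the abstract (recasting recovery as interpolation of NMF gains) can be illustrated with a minimal sketch. This is not the authors' algorithm: it assumes a batch NMF with standard multiplicative updates for the Euclidean cost, plain linear interpolation of each gain trajectory, and a magnitude spectrogram as input. The function names `nmf` and `recover_lost_frames` and all parameters are hypothetical, and the incremental (online) update is not shown.

```python
import numpy as np

def nmf(V, rank, n_iter=200, seed=0, eps=1e-9):
    """Factor a nonnegative matrix V (freq x frames) as W @ H.

    Uses multiplicative updates for the Euclidean cost (an assumption;
    the record does not state which cost function the paper uses).
    """
    rng = np.random.default_rng(seed)
    n_freq, n_frames = V.shape
    W = rng.random((n_freq, rank)) + eps   # spectral basis vectors
    H = rng.random((rank, n_frames)) + eps  # time-varying gains
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

def recover_lost_frames(V, lost, rank=4, n_iter=200):
    """Estimate spectra of lost frames by interpolating NMF gains.

    V    : (freq, frames) nonnegative magnitude spectrogram; the
           contents of lost columns are ignored.
    lost : boolean mask of length frames, True where a frame is missing.
    """
    lost = np.asarray(lost, dtype=bool)
    received = np.flatnonzero(~lost)
    missing = np.flatnonzero(lost)
    # Learn the basis and gains from the received frames only.
    W, H_rx = nmf(V[:, received], rank, n_iter)
    # Linearly interpolate each gain trajectory across the gaps
    # (an illustrative choice of interpolator).
    H = np.empty((rank, V.shape[1]))
    H[:, received] = H_rx
    for k in range(rank):
        H[k, missing] = np.interp(missing, received, H_rx[k])
    # Rebuild only the lost frames; received frames pass through unchanged.
    V_hat = V.copy()
    V_hat[:, missing] = W @ H[:, missing]
    return V_hat, W, H
```

Under these assumptions, calling `recover_lost_frames(V, lost, rank=2)` on a spectrogram whose lost columns are zeroed returns a copy in which those columns are replaced by `W @ H` evaluated at the interpolated gains. An online variant would update `W` and `H` frame by frame rather than refactoring the whole matrix.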

Keywords

Packet loss concealment (PLC); nonnegative matrix factorization (NMF); incremental subspace learning
