http://dx.doi.org/10.4218/etrij.12.0211.0408

Recovery of Lost Speech Segments Using Incremental Subspace Learning  

Huang, Jianjun (Institute of Command Automation, PLA University of Science and Technology)
Zhang, Xiongwei (Institute of Command Automation, PLA University of Science and Technology)
Zhang, Yafei (Institute of Command Automation, PLA University of Science and Technology)
Publication Information
ETRI Journal / v.34, no.4, 2012, pp. 645-648
Abstract
An incremental subspace learning scheme for recovering lost speech segments online is presented. Our contributions in this work are twofold. First, the recovery problem is transformed into an interpolation problem over the time-varying gains obtained via nonnegative matrix factorization. Second, incremental nonnegative matrix factorization is employed to allow online processing and to track the evolution of speech statistics. Experimental results confirm the effectiveness of the proposed scheme.
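The first idea in the abstract, recasting recovery as interpolation of NMF gains, can be illustrated with a rough sketch. This is not the authors' algorithm: it assumes a magnitude spectrogram `V` (frequency by time), uses plain batch NMF with Euclidean multiplicative updates rather than the incremental variant, and uses linear interpolation for the gains. The names `nmf`, `recover_gap`, `lost`, and `rank` are hypothetical and chosen only for this illustration.

```python
import numpy as np

def nmf(V, rank, n_iter=200, seed=0, eps=1e-9):
    """Batch NMF, V ~= W @ H, via multiplicative updates (Euclidean cost).
    Assumption: a stand-in for the incremental NMF named in the abstract."""
    rng = np.random.default_rng(seed)
    n_freq, n_frames = V.shape
    W = rng.random((n_freq, rank)) + eps    # spectral basis vectors
    H = rng.random((rank, n_frames)) + eps  # time-varying gains
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

def recover_gap(V, lost, rank=8):
    """Fill lost frames (columns) of V by interpolating the NMF gains.
    Assumption: linear interpolation; the paper may use another scheme."""
    lost = np.asarray(lost, dtype=bool)
    known = np.flatnonzero(~lost)    # indices of received frames
    missing = np.flatnonzero(lost)   # indices of lost frames
    # Learn the basis and gains from the received frames only.
    W, H_known = nmf(V[:, known], rank)
    # Interpolate each component's gain trajectory across the gap.
    H_missing = np.vstack([np.interp(missing, known, h) for h in H_known])
    V_hat = V.copy()
    V_hat[:, missing] = W @ H_missing  # resynthesize the lost magnitudes
    return V_hat

# Synthetic demo: two spectral components with smoothly varying gains.
rng = np.random.default_rng(1)
t = np.linspace(0.0, 1.0, 60)
H0 = np.vstack([1 + 0.5 * np.sin(2 * np.pi * t),
                1 + 0.5 * np.cos(2 * np.pi * t)])
W0 = rng.random((16, 2))
V = W0 @ H0
lost = np.zeros(60, dtype=bool)
lost[25:30] = True                   # pretend five frames were lost
V_hat = recover_gap(V, lost, rank=2)
```

The sketch only recovers magnitudes. Turning them back into a waveform would still require a phase estimate, which is outside what the abstract describes.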
Keywords
Packet loss concealment (PLC); nonnegative matrix factorization (NMF); incremental subspace learning