Recovery of Lost Speech Segments Using Incremental Subspace Learning

Huang, Jianjun; Zhang, Xiongwei; Zhang, Yafei

doi:10.4218/etrij.12.0211.0408

ETRI Journal

Volume 34, Issue 4 / Pages 645-648 / 2012 / 1225-6463 (pISSN) / 2233-7326 (eISSN)

Electronics and Telecommunications Research Institute (한국전자통신연구원)


Huang, Jianjun (Institute of Command Automation, PLA University of Science and Technology)
Zhang, Xiongwei (Institute of Command Automation, PLA University of Science and Technology)
Zhang, Yafei (Institute of Command Automation, PLA University of Science and Technology)

Received: 2011.09.22
Accepted: 2012.03.23
Published: 2012.08.30

https://doi.org/10.4218/etrij.12.0211.0408

Abstract

We present an incremental subspace learning scheme for recovering lost speech segments online. The contributions of this work are twofold. First, nonnegative matrix factorization is used to transform the recovery problem into the problem of interpolating time-varying gains. Second, incremental nonnegative matrix factorization is employed to allow online processing and to track the evolution of the speech statistics. Experimental results confirm the effectiveness of the proposed scheme.
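
To make the first idea concrete, the following is a minimal sketch and not the authors' implementation. It assumes a magnitude spectrogram V (frequency bins by frames) is factored as V ≈ WH, where the columns of W are spectral bases and the rows of H are the time-varying gains. Lost frames are then estimated by interpolating each gain row across the gap. The function names nmf and recover_gap, the rank, the standard multiplicative updates, and the use of linear interpolation are all assumptions made for illustration. The sketch is batch rather than incremental, so it does not reproduce the online scheme described in the abstract.

```python
import numpy as np

def nmf(V, rank, n_iter=200, seed=0, eps=1e-9):
    """Factor a nonnegative matrix V (bins x frames) as W @ H.

    Uses the standard multiplicative updates for the Frobenius-norm
    objective (an assumption; the paper's exact objective is not given here).
    """
    rng = np.random.default_rng(seed)
    W = rng.random((V.shape[0], rank)) + eps
    H = rng.random((rank, V.shape[1])) + eps
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

def recover_gap(V, lost, rank=4):
    """Estimate the lost frames (columns) of V by interpolating NMF gains.

    `lost` lists the indices of the missing frames; the contents of those
    columns in V are ignored.
    """
    lost = np.asarray(lost)
    known = np.setdiff1d(np.arange(V.shape[1]), lost)
    # Learn spectral bases and gains from the frames that were received.
    W, H_known = nmf(V[:, known], rank)
    # Interpolate each gain trajectory linearly across the gap.
    H_lost = np.vstack([np.interp(lost, known, H_known[k]) for k in range(rank)])
    V_hat = V.copy()
    V_hat[:, lost] = W @ H_lost
    return V_hat

# Hypothetical usage on synthetic data with slowly varying gains.
t = np.arange(40)
H0 = np.vstack([1.0 + 0.5 * np.sin(t / 8.0), 1.0 + 0.5 * np.cos(t / 8.0)])
W0 = np.random.default_rng(1).random((16, 2)) + 0.1
V_true = W0 @ H0
lost_frames = np.arange(18, 23)
V_obs = V_true.copy()
V_obs[:, lost_frames] = 0.0          # frames lost in transmission
V_rec = recover_gap(V_obs, lost_frames, rank=2)
```

Only the magnitude spectrogram is handled here. Resynthesizing a waveform would additionally require a phase estimate, which this sketch leaves out.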



