Robust Speech Hash Function

  • Chen, Ning (School of Communication and Information Engineering, Shanghai University)
  • Wan, Wanggen (School of Communication and Information Engineering, Shanghai University)
  • Received : 2009.07.27
  • Accepted : 2009.12.18
  • Published : 2010.04.30

Abstract

In this letter, we present a new speech hash function based on non-negative matrix factorization (NMF) of linear prediction coefficients (LPCs). First, linear prediction analysis is applied to the speech signal to obtain its LPCs, which represent the frequency-shaping attributes of the vocal tract. NMF is then performed on the LPCs to capture the local features of the speech, which are used to generate the hash vector. Experimental results demonstrate the effectiveness of the proposed hash function in terms of discrimination and robustness against various types of content-preserving signal processing manipulations.
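
The pipeline outlined in the abstract (LPC analysis, then NMF, then hash generation) might be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the abstract does not give the frame length, LPC order, NMF rank, how the LPCs are made non-negative, or how the hash bits are derived. The sketch assumes 256-sample Hamming-windowed frames, order-10 LPCs from the autocorrelation method, absolute values of the LPCs as the non-negative input, Lee and Seung's multiplicative updates for a rank-2 NMF, and median thresholding of the coefficient matrix to obtain bits. All function names are hypothetical.

```python
import numpy as np

def lpc_frames(signal, order=10, frame_len=256, hop=128):
    """Return an (order x n_frames) matrix of LPCs, one column per frame."""
    window = np.hamming(frame_len)
    cols = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len] * window
        # Autocorrelation at lags 0..order.
        r = np.array([frame[:frame_len - k] @ frame[k:] for k in range(order + 1)])
        # Light regularization so silent or near-singular frames stay solvable.
        r[0] = r[0] * (1 + 1e-6) + 1e-9
        # Solve the normal (Yule-Walker) equations R a = r[1:].
        R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
        cols.append(np.linalg.solve(R, r[1:]))
    return np.array(cols).T

def nmf(V, rank=2, n_iter=200, seed=0):
    """Factor non-negative V into W @ H with multiplicative updates."""
    rng = np.random.default_rng(seed)
    W = rng.random((V.shape[0], rank)) + 1e-3
    H = rng.random((rank, V.shape[1])) + 1e-3
    eps = 1e-12  # avoids division by zero
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

def speech_hash(signal, order=10, rank=2):
    """Binary hash from the NMF coefficient matrix of the LPC magnitudes."""
    # Assumption: LPCs can be negative, so their magnitudes are used for NMF.
    V = np.abs(lpc_frames(signal, order))
    _, H = nmf(V, rank)
    # Assumption: one bit per coefficient, set when above that row's median.
    bits = H > np.median(H, axis=1, keepdims=True)
    return bits.astype(np.uint8).ravel()

def bit_error_rate(h1, h2):
    """Fraction of differing bits between two equal-length hashes."""
    return float(np.mean(h1 != h2))
```

Under these assumptions, two recordings would be compared by computing `bit_error_rate(speech_hash(a), speech_hash(b))` and checking it against a threshold chosen empirically. Because NMF is only unique up to scaling and permutation of its factors, a fixed random seed is used here so that the same input always yields the same hash.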
