References
- J. Sohn, N. S. Kim and W. Sung, "A statistical model-based voice activity detection," IEEE Signal Processing Letters, vol. 6, no. 1, pp. 1-3, January 1999.
- M. Hoffman, Z. Li, and D. Khataniar, "GSC-based spatial voice activity detection for enhanced speech coding in the presence of competing speech," IEEE Trans. on Speech and Audio Processing, vol. 9, no. 2, pp. 175-179, March 2001. https://doi.org/10.1109/89.902284
- S. F. Boll, "Suppression of acoustic noise in speech using spectral subtraction," IEEE Trans. Acoust., Speech, and Signal Processing, vol. ASSP-27, no. 2, pp. 113-120, April 1979.
- L. F. Lamel, L. R. Rabiner, A. E. Rosenberg, and J. G. Wilpon, "An improved endpoint detector for isolated word recognition," IEEE Trans. Acoust., Speech, and Signal Processing, vol. ASSP-29, no. 4, pp. 777-785, August 1981.
- B.-F. Wu, "Robust endpoint detection algorithm based on the adaptive band-partitioning spectral entropy in adverse environments," IEEE Trans. Speech and Audio Processing, vol. 13, no. 5, pp. 762-775, September 2005. https://doi.org/10.1109/TSA.2005.851909
- L. Armani, M. Matassoni, M. Omologo, and P. Svaizer, "Use of a CSP-based voice activity detector for distant-talking ASR," in Proceedings of EUROSPEECH, Geneva, 2003.
- G. Kim and N. I. Cho, "Voice activity detection using phase vector in microphone array," Electronics Letters, vol. 43, no. 14, pp. 783-784, July 2007. https://doi.org/10.1049/el:20070780
- H. Yehia, P. Rubin, and E. Vatikiotis-Bateson, "Quantitative association of vocal-tract and facial behavior," Speech Communication, vol. 26, no. 1, pp. 23-43, August 1998. https://doi.org/10.1016/S0167-6393(98)00048-X
- P. Liu and Z. Wang, "Voice activity detection using visual information," in Proceedings of ICASSP, pp. 609-612, Montreal, Canada, May 2004.
- T. Cootes, G. Edwards, and C. Taylor, "Active appearance models," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 23, no. 6, pp. 681-685, June 2001. https://doi.org/10.1109/34.927467
- A. Aubrey, B. Rivet, Y. Hicks, L. Girin, J. Chambers, and C. Jutten, "Two novel visual voice activity detectors based on appearance models and retinal filtering," in Proceedings of EUSIPCO, September 2007.
- S. Siatras, N. Nikolaidis, M. Krinidis, and I. Pitas, "Visual lip activity detection and speaker detection using mouth region intensities," IEEE Trans. Circuits and Systems for Video Technology, vol. 19, no. 1, pp. 133-137, January 2009. https://doi.org/10.1109/TCSVT.2008.2009262
- R. Navarathna, D. Dean, P. Lucey, S. Sridharan, and C. Fookes, "Cascading appearance-based features for visual voice activity detection," in Proceedings of International Conference on Audio-Visual Speech Processing, Hakone, Japan, September 2010.
- A. Aubrey, Y. Hicks, and J. Chambers, "Visual voice activity detection with optical flow," IET Image Processing, vol. 4, no. 4, pp. 463-472, December 2010. https://doi.org/10.1049/iet-ipr.2009.0042
- S. Tamura, K. Iwano, and S. Furui, "Multi-modal speech recognition using optical-flow analysis for lip images," J. VLSI Signal Process. Syst., vol. 36, pp. 117-124, February 2004. https://doi.org/10.1023/B:VLSI.0000015091.47302.07
- D. Sun, S. Roth, and M. Black, "Secrets of optical flow estimation and their principles," in Proceedings of IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), pp. 2432-2439, San Francisco, USA, June 2010.
- B. Lucas and T. Kanade, "An iterative image registration technique with an application to stereo vision," in Proceedings of the 7th International Joint Conference on Artificial Intelligence (IJCAI), pp. 674-679, April 1981.
- R. Navarathna, D. Dean, S. Sridharan, C. Fookes, and P. Lucey, "Visual voice activity detection using frontal versus profile views," in Proceedings of the International Conference on Digital Image Computing: Techniques and Applications, December 2011.
- P. Viola and M. Jones, "Robust real-time object detection," Second International Workshop on Statistical and Computational Theories of Vision-Modeling, Learning, Computing, and Sampling, Vancouver, Canada, July 2001.
- Y. Freund and R. E. Schapire, "A decision-theoretic generalization of on-line learning and an application to boosting," in Computational Learning Theory: Eurocolt, Springer-Verlag, pp. 23-37, 1995.
- E. Skodras and N. Fakotakis, "An unconstrained method for lip detection in color images," in Proceedings of ICASSP, Prague, Czech Republic, 2011.
- G. Fanelli, J. Gall, and L. Van Gool, "Hough transform-based mouth localization for audio-visual speech recognition," British Machine Vision Conference, 2009.
- X. Liu, Y. Cheung, M. Li, and H. Liu, "A lip contour extraction method using localized active contour model with automatic parameter selection," 20th Int. Conf. on Pattern Recognition (ICPR), August 2010.