References
- N.W.D. Evans, C. Fredouille, and J.F. Bonastre, "Speaker Diarization Using Unsupervised Discriminant Analysis of Inter-Channel Delay Features," Proc. IEEE Int. Conf. Acoustics, Speech, Signal Process. (ICASSP), 2009, pp. 4061-4064.
- J. Pelecanos and S. Sridharan, "Feature Warping for Robust Speaker Verification," A Speaker Odyssey - The Speaker Recognition Workshop, Crete, Greece, 2001, pp. 213-218.
- P. Ouellet, G. Boulianne, and P. Kenny, "Flavors of Gaussian Warping," Proc. Interspeech, 2005, pp. 2957-2960.
- R. Sinha et al., "The Cambridge University March 2005 Speaker Diarization System," Proc. Interspeech, 2005, pp. 2437-2440.
- X. Zhu et al., "Speaker Diarization: From Broadcast News to Lectures," Machine Learning for Multimodal Interaction, 2006, pp. 396-406.
- G. Friedland et al., "Prosodic and Other Long-Term Features for Speaker Diarization," IEEE Trans. Audio, Speech, Language Process., vol. 17, no. 5, 2009, pp. 985-993. https://doi.org/10.1109/TASL.2009.2015089
- G. Friedland et al., "Fusing Short Term and Long Term Features for Improved Speaker Diarization," Proc. IEEE Int. Conf. Acoustics, Speech, Signal Process. (ICASSP), 2009, pp. 4077-4080.
- X. Serra, "Musical Sound Modeling with Sinusoids Plus Noise," Studies on New Music Research: Musical Signal Processing, C. Roads et al., Eds., The Netherlands: Swets & Zeitlinger, 1997, pp. 91-122.
- R.J. McAulay and T.F. Quatieri, "Magnitude-Only Reconstruction Using a Sinusoidal Speech Model," Proc. ICASSP, 1984, pp. 1-27.
- C. Cao et al., "Harmonic Structure Features for Robust Speaker Recognition against Channel Effect," 2nd Int. Symp. Inf. Sci. Eng., 2009, pp. 451-454.
- C. Wooters and M. Huijbregts, "The ICSI RT07s Speaker Diarization System," Multimodal Technologies for Perception of Humans, 2008, pp. 509-519.
- C. Fredouille and G. Senay, "Technical Improvements of the E-HMM Based Speaker Diarization System for Meeting Records," Machine Learning for Multimodal Interaction, May 2006, pp. 359-370.
- Y. Zhou et al., "An Improved Speaker Diarization System for Multiple Distance Microphone Meetings," 5th Int. Conf. Intell. Computation Technol. Autom., 2012, pp. 80-83.
- A. Adami et al., "Qualcomm-ICSI-OGI Features for ASR," Proc. 7th Int. Conf. Spoken Language Process., 2002, pp. 21-24.
- BeamformIt toolkit. http://www.xavieranguera.com/beamformit/
- C. Wooters et al., "Toward Robust Speaker Segmentation: ICSI-SRI Fall 2004 Diarization System," Proc. Rich Transcription Workshop (RT-04), 2004.
- J. Ajmera, I. Lapidot, and I. McCowan, "Unknown Multiple Speaker Clustering Using HMM," Int. Conf. Spoken Language Process., 2002, pp. 573-576.
- S.S. Chen and P.S. Gopalakrishnan, "Speaker, Environment and Channel Change Detection and Clustering via the Bayesian Information Criterion," Proc. DARPA Broadcast News Transcription Understanding Workshop, 1998, pp. 127-132.
- C. Cao et al., "Singing Melody Extraction in Polyphonic Music by Harmonic Tracking," Proc. 8th Int. Conf. Music Inf. Retrieval, 2007, pp. 373-374.
- D. Imseng and G. Friedland, "Tuning-Robust Initialization Methods for Speaker Diarization," IEEE Trans. Audio, Speech, Language Process., vol. 18, no. 8, 2010, pp. 2028-2037. https://doi.org/10.1109/TASL.2010.2040796
- NIST Rich Transcription Fall 2004 evaluation. http://nist.gov/speech/tests/rt/rt2004/fall