1 |
Y. Zhou et al., "An Improved Speaker Diarization System for Multiple Distance Microphone Meetings," 5th Int. Conf. Int. Computation Technol. Autom., 2012, pp. 80-83.
|
2 |
A. Adami et al., "Qualcomm-ICSI-OGI Features for ASR," Proc. 7th Int. Conf. Spoken Language Process., 2002, pp. 21-24.
|
3 |
BeamformIt toolkit. http://www.xavieranguera.com/beamformit/
|
4 |
C. Wooters et al., "Toward Robust Speaker Segmentation: ICSI-SRI Fall 2004 Diarization System," Proc. Rich Transcription Workshop (RT-04), 2004.
|
5 |
J. Ajmera, I. Lapidot, and I. McCowan, "Unknown Multiple Speaker Clustering Using HMM," Int. Conf. Spoken Language Process., 2002, pp. 573-576.
|
6 |
S.S. Chen and P.S. Gopalakrishnan, "Speaker, Environment and Channel Change Detection and Clustering via the Bayesian Information Criterion," Proc. DARPA Broadcast News Transcription Understanding Workshop, 1998, pp. 127-132.
|
7 |
C. Cao et al., "Singing Melody Extraction in Polyphonic Music by Harmonic Tracking," Proc. 8th Int. Conf. Music Inf. Retrieval, 2007, pp. 373-374.
|
8 |
D. Imseng and G. Friedland, "Tuning-Robust Initialization Methods for Speaker Diarization," IEEE Trans. Audio, Speech, Language Process., vol. 18, no. 8, 2010, pp. 2028-2037.
|
9 |
http://nist.gov/speech/tests/rt/rt2004/fall
|
10 |
N.W.D. Evans, C. Fredouille, and J.F. Bonastre, "Speaker Diarization Using Unsupervised Discriminant Analysis of Inter-Channel Delay Features," Proc. IEEE Int. Conf. Acoustics, Speech, Signal Process., ICASSP, 2009, pp. 4061-4064.
|
11 |
J. Pelecanos and S. Sridharan, "Feature Warping for Robust Speaker Verification," A Speaker Odyssey - The Speaker Recognition Workshop, Crete, Greece, 2001, pp. 213-218.
|
12 |
P. Ouellet, G. Boulianne, and P. Kenny, "Flavors of Gaussian Warping," Proc. Interspeech, 2005, pp. 2957-2960.
|
13 |
R. Sinha et al., "The Cambridge University March 2005 Speaker Diarization System," Proc. Interspeech, 2005, pp. 2437-2440.
|
14 |
X. Zhu et al., "Speaker Diarization: From Broadcast News to Lectures," Machine Learning for Multimodal Interaction, 2006, pp. 396-406.
|
15 |
G. Friedland et al., "Prosodic and Other Long-Term Features for Speaker Diarization," IEEE Trans. Audio, Speech, Language Process., vol. 17, no. 5, 2009, pp. 985-993.
|
16 |
C. Cao et al., "Harmonic Structure Features for Robust Speaker Recognition against Channel Effect," 2nd Int. Symp. Inf. Sci. Eng., 2009, pp. 451-454.
|
17 |
G. Friedland et al., "Fusing Short Term and Long Term Features for Improved Speaker Diarization," IEEE Int. Conf. Acoustics, Speech, Signal Process., 2009, pp. 4077-4080.
|
18 |
X. Serra, "Musical Sound Modeling with Sinusoids Plus Noise," Studies on New Music Research: Musical Signal Processing, C. Roads et al., Eds., The Netherlands: Swets & Zeitlinger, 1997, pp. 91-122.
|
19 |
R.J. McAulay and T.F. Quatieri, "Magnitude-Only Reconstruction Using a Sinusoidal Speech Model," Proc. ICASSP, 1984, pp. 1-27.
|
20 |
C. Wooters and M. Huijbregts, "The ICSI RT07s Speaker Diarization System," Multimodal Technologies for Perception of Humans, 2008, pp. 509-519.
|
21 |
C. Fredouille and G. Senay, "Technical Improvements of the E-HMM Based Speaker Diarization System for Meeting Records," Machine Learning for Multimodal Interaction, May 2006, pp. 359-370.
|