References
- A. Martin and M. Przybocki, "Speaker Recognition in a Multispeaker Environment," Proc. Eur. Conf. Speech Commun. Technol., vol. 2, 2001, pp. 787-790.
- S. Meignier et al., "Step-by-Step and Integrated Approaches in Broadcast News Speaker Diarization," Comput. Speech Language, vol. 20, no. 2-3, 2006, pp. 303-330. https://doi.org/10.1016/j.csl.2005.08.002
- T.H. Nguyen, H. Li, and E.S. Chng, "Cluster Criterion Functions in Spectral Subspace and Their Application in Speaker Clustering," Proc. ICASSP, 2009, pp. 4085-4088.
- K. Iso, "Speaker Clustering Using Vector Quantization and Spectral Clustering," Proc. ICASSP, 2010, pp. 4986-4989.
- M. Kotti, V. Moschou, and C. Kotropoulos, "Speaker Segmentation and Clustering," Signal Process., vol. 88, no. 5, 2008, pp. 1091-1124. https://doi.org/10.1016/j.sigpro.2007.11.017
- S. Kwon and S. Narayanan, "Unsupervised Speaker Indexing Using Generic Models," IEEE Trans. Speech Audio Process., vol. 13, 2004, pp. 1004-1013.
- M. Davy et al., "Supervised Classification Using MCMC Methods," Proc. ICASSP, 2000, pp. 33-36.
- K. Markov and S. Nakamura, "Never-Ending Learning with Dynamic Hidden Markov Network," Proc. Interspeech, 2007, pp. 1437-1440.
- D.A. Reynolds, "Speaker Identification and Verification Using Gaussian Mixture Speaker Models," Speech Commun., vol. 17, no. 1-2, 1995, pp. 91-108. https://doi.org/10.1016/0167-6393(95)00009-D
- K. Markov and S. Nakamura, "Improved Novelty Detection for Online GMM Based Speaker Diarization," Proc. Interspeech, 2008, pp. 363-366.
- M. Zamalloa et al., "Low Latency Online Speaker Tracking on the AMI Corpus of Meeting Conversations," Proc. ICASSP, 2010, pp. 4962-4965.
- A.K. Noulas and B.J.A. Krose, "Online Multimodal Speaker Diarization," Int. Conf. Multimodal Interfaces, 2007, pp. 350-357.
- J. Schmalenstroeer et al., "Fusing Audio and Video Information for Online Speaker Diarization," Proc. ASRU, 2007, pp. 1163-1166.
- C. Vaquero, O. Vinyals, and G. Friedland, "A Hybrid Approach to Online Speaker Diarization," Proc. Interspeech, 2010, pp. 2638-2641.
- C. Wooters and M. Huijbregts, "The ICSI RT07s Speaker Diarization System," Proc. RT Meeting Recognition Evaluation Workshop, 2007.
- D.A. Reynolds, T.F. Quatieri, and R.B. Dunn, "Speaker Verification Using Adapted Gaussian Mixture Models," Digit. Signal Process., vol. 10, no. 1-3, 2000, pp. 19-41. https://doi.org/10.1006/dspr.1999.0361
- R. Kuhn et al., "Rapid Speaker Adaptation in Eigenvoice Space," IEEE Trans. Speech Audio Process., vol. 8, no. 4, 2000, pp. 695-707. https://doi.org/10.1109/89.876308
- C.H. Huang, J.T. Chien, and H.M. Wang, "A New Eigenvoice Approach to Speaker Adaptation," Proc. Int. Symp. Chinese Spoken Language Process., 2004, pp. 109-112.
- X. Anguera et al., "Frame Purification for Cluster Comparison in Speaker Diarization," Proc. 2nd Int. Workshop Multimodal User Authentication, 2006, pp. 135-139.
- I.T. Jolliffe, Principal Component Analysis, Springer-Verlag, 1986.
- A. Dempster, N. Laird, and D. Rubin, "Maximum Likelihood from Incomplete Data via the EM Algorithm," J. Royal Statistical Soc., series B, vol. 39, no. 1, 1977, pp. 1-38.
- S. Berrani, L. Amsaleg, and P. Gros, "Robust Content-Based Image Searches for Copyright Protection," Proc. ACM Workshop Multimedia Databases, 2003, pp. 70-77.
- P. Zezula et al., "Similarity Search: The Metric Space Approach," Adv. Database Syst., vol. 32, 2006, pp. 23-38.
- S. Kullback and R.A. Leibler, "On Information and Sufficiency," Annals of Mathematical Statistics, vol. 22, no. 1, 1951, pp. 79-86. https://doi.org/10.1214/aoms/1177729694
- M.H. Moattar and M.M. Homayounpour, "A Weighted Feature Voting Approach for Robust and Real-Time Voice Activity Detection," ETRI J., vol. 33, no. 1, 2011, pp. 99-109. https://doi.org/10.4218/etrij.11.1510.0158
- J. Garofolo et al., "NIST Rich Transcription 2002 Evaluation: A Preview," Proc. Language Resources Evaluation Conf., May 2002.
- Y.K. Muthusamy et al., "The OGI Multi-language Telephone Speech Corpus," Proc. ICSLP, vol. 2, 1992, pp. 895-898.
- M. Bijankhan, Great Farsdat Database, Technical report, Iran Research Center on Intelligent Signal Processing, 2002.
- The 2009 (RT-09) Rich Transcription Evaluation Plan, http://www.itl.nist.gov/iad/mig//tests/rt/2009/docs/rt09-meetingeval-plan-v2.pdf, last accessed on Dec. 6, 2010.
- L.R. Rabiner and B.H. Juang, Fundamentals of Speech Recognition, Englewood Cliffs, NJ: Prentice-Hall, 1993.
- W. Wang et al., "A Decision-Tree-Based Online Speaker Clustering," Lect. Notes Comput. Sci., vol. 4477, 2007, pp. 555-562.
- K. Chen et al., "Fast Speaker Adaptation Using Eigenspace-Based Maximum Likelihood Linear Regression," Proc. ICSLP, 2000, pp. 742-745.
- B. Mak, J.T. Kwok, and S. Ho, "A Study of Various Composite Kernels for Kernel Eigenvoice Speaker Adaptation," Proc. ICASSP, vol. 1, 2004, pp. 325-328.