Determining the Optimal Number of Signal Clusters Using Iterative HMM Classification

  • Received : 2018.04.20
  • Accepted : 2018.05.09
  • Published : 2018.06.30

Abstract

In this study, we propose an iterative clustering algorithm that automatically partitions a set of unlabeled voice signal data into an optimal number of clusters and generates an HMM for each cluster. During clustering, the likelihood of each candidate clustering is computed through iterative HMM training and testing while the number of clusters is varied for the given data, and the maximum likelihood estimation method is used to determine the optimal number of clusters. We tested the effectiveness of this clustering algorithm on a small-vocabulary digit clustering task: when the unsupervised decoded output of the optimal clustering was mapped to the ground-truth transcription, the two were highly correlated.
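The procedure summarized in the abstract can be illustrated with a rough sketch. It is not the authors' implementation and rests on several assumptions: a one-state Gaussian over scalar frames stands in for the per-cluster HMM (a real system would substitute Baum-Welch training and forward-algorithm scoring), the likelihood used for model selection is assumed to be computed on held-out sequences, and all function names (`fit`, `loglik`, `cluster`, `select_num_clusters`) and the synthetic data are hypothetical.

```python
import math
import random


def fit(seqs, var_floor=1e-3):
    """Fit a one-state Gaussian to pooled frames (stand-in for HMM training)."""
    frames = [x for s in seqs for x in s]
    mean = sum(frames) / len(frames)
    var = sum((x - mean) ** 2 for x in frames) / len(frames)
    return mean, max(var, var_floor)


def loglik(model, seq):
    """Log-likelihood of one sequence (stand-in for HMM forward scoring)."""
    mean, var = model
    return sum(-0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)
               for x in seq)


def cluster(seqs, k, rng, n_iter=20):
    """Alternate between training one model per cluster and reassigning."""
    labels = [i % k for i in range(len(seqs))]  # balanced random start
    rng.shuffle(labels)
    models = []
    for _ in range(n_iter):
        models = []
        for c in range(k):
            members = [s for s, lab in zip(seqs, labels) if lab == c]
            # An empty cluster is re-seeded from one random sequence.
            models.append(fit(members or [rng.choice(seqs)]))
        # Each sequence moves to the model that scores it highest.
        new_labels = [max(range(k), key=lambda c: loglik(models[c], s))
                      for s in seqs]
        if new_labels == labels:
            break
        labels = new_labels
    return models, labels


def total_loglik(models, seqs):
    """Sum over sequences of the best per-model log-likelihood."""
    return sum(max(loglik(m, s) for m in models) for s in seqs)


def select_num_clusters(train, held_out, candidates, seed=0):
    """Pick the candidate K with the highest held-out log-likelihood."""
    rng = random.Random(seed)
    scores = {}
    for k in candidates:
        models, _ = cluster(train, k, rng)
        scores[k] = total_loglik(models, held_out)
    best_k = max(scores, key=scores.get)
    return best_k, scores


# Synthetic usage example: three groups of short scalar sequences.
data_rng = random.Random(1)


def make(mu, n):
    return [[data_rng.gauss(mu, 0.5) for _ in range(15)] for _ in range(n)]


train = make(0, 10) + make(5, 10) + make(10, 10)
held_out = make(0, 5) + make(5, 5) + make(10, 5)
best_k, scores = select_num_clusters(train, held_out, range(1, 7))
print("selected K:", best_k)
```

Scoring on held-out sequences is a design choice of this sketch: likelihood measured on the training data itself tends to keep rising as clusters are added, so it would not single out one value of K on its own.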
