References
- Y. Zhang and J. R. Glass, "Unsupervised spoken keyword spotting via segmental DTW on Gaussian posteriorgrams," Proc. ASRU. 398-403 (2009).
- G. Mantena and K. Prahallad, "Use of articulatory bottle-neck features for query-by-example spoken term detection in low resource scenarios," Proc. ICASSP. 7128-7132 (2014).
- H. Lim, Y. Kim, Y. Kim, and H. Kim, "CNN-based bottleneck feature for noise robust query-by-example spoken term detection," Proc. APSIPA. 1278-1281 (2017).
- G. Chen, C. Parada, and T. N. Sainath, "Query-by-example keyword spotting using long short-term memory networks," Proc. ICASSP. 5236-5240 (2015).
- S. Settle and K. Livescu, "Discriminative acoustic word embeddings: Recurrent neural network-based approaches," Proc. SLT. 503-510 (2016).
- M. Jung, H. Lim, J. Goo, Y. Jung, and H. Kim, "Additional shared decoder on Siamese multi-view encoders for learning acoustic word embeddings," Proc. ASRU. 629-636 (2019).
- H. Lim, Y. Kim, J. Goo, and H. Kim, "Interlayer selective attention network for robust personalized wake-up word detection," IEEE Signal Process. Lett. 27, 126-130 (2020). https://doi.org/10.1109/LSP.2019.2959902
- Y. Ganin, H. Ajakan, H. Larochelle, F. Laviolette, and V. Lempitsky, "Domain-adversarial training of neural networks," J. Mach. Learn. Res. 17, 1-35 (2016).
- E. Tzeng, J. Hoffman, K. Saenko, and T. Darrell, "Adversarial discriminative domain adaptation," Proc. CVPR. 7167-7176 (2017).
- Z. Pei, Z. Cao, M. Long, and J. Wang, "Multi-adversarial domain adaptation," Proc. AAAI. 3934-3941 (2018).
- R. Wang, M. Utiyama, A. Finch, L. Liu, K. Chen, and E. Sumita, "Sentence selection and weighting for neural machine translation domain adaptation," IEEE/ACM Trans. Audio, Speech, Lang. Process. 26, 1727-1741 (2018). https://doi.org/10.1109/TASLP.2018.2837223
- A. Tripathi, A. Mohan, S. Anand, and M. Singh, "Adversarial learning of raw speech features for domain invariant speech recognition," Proc. ICASSP. 5959-5963 (2018).
- S. Sun, C. F. Yeh, M. Y. Hwang, M. Ostendorf, and L. Xie, "Domain adversarial training for accented speech recognition," Proc. ICASSP. 4854-4858 (2018).
- S. Mirsamadi and J. H. Hansen, "Multi-domain adversarial training of neural network acoustic models for distant speech recognition," Speech Commun. 106, 21-30 (2019). https://doi.org/10.1016/j.specom.2018.10.010
- Y. Ganin and V. Lempitsky, "Unsupervised domain adaptation by backpropagation," Proc. ICML. 1180-1189 (2015).
- D. B. Paul and J. M. Baker, "The design for the Wall Street Journal-based CSR corpus," Proc. Workshop Speech and Natural Lang. 357-362 (1992).
- D. Dean, S. Sridharan, R. Vogt, and M. Mason, "The QUT-NOISE-TIMIT corpus for the evaluation of voice activity detection algorithms," Proc. Interspeech. 3110-3113 (2010).
- H. G. Hirsch and D. Pearce, "The Aurora experimental framework for the performance evaluation of speech recognition systems under noisy conditions," Proc. ISCA ITRW ASR. 181-188 (2000).
- M. Abadi, P. Barham, J. Chen, Z. Chen, A. Davis, J. Dean, M. Devin, S. Ghemawat, G. Irving, M. Isard, M. Kudlur, J. Levenberg, R. Monga, S. Moore, D. G. Murray, B. Steiner, P. Tucker, V. Vasudevan, P. Warden, M. Wicke, Y. Yu, and X. Zheng, "TensorFlow: Large-scale machine learning on heterogeneous systems," Proc. USENIX OSDI. 265-283 (2016).
- D. Kingma and J. Ba, "Adam: A method for stochastic optimization," Proc. ICLR. 1-15 (2015).
- K. Hajian-Tilaki, "Receiver operating characteristic (ROC) curve analysis for medical diagnostic test evaluation," Caspian J. Intern. Med. 4, 627-635 (2013).