Acknowledgement
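This research was supported by a 2018 research grant from the Ministry of Trade, Industry and Energy and the Korea Evaluation Institute of Industrial Technology (KEIT) (10080681).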
References
- S. Zhao, X. Xiao, Z. Zhang, T. N. T. Nguyen, X. Zhong, B. Ren, L. Wang, D. L. Jones, E. Chng, and H. Li, "Robust speech recognition using beamforming with adaptive microphone gains and multichannel noise reduction," Proc. IEEE Workshop on ASRU. 460-467 (2015).
- Y. Tachioka, T. Narita, I. Miura, T. Uramoto, N. Monta, S. Uenohara, K. Furuya, S. Watanabe, and J. Le Roux, "Coupled initialization of multi-channel nonnegative matrix factorization based on spatial and spectral information," Proc. 2017 INTERSPEECH, 2461-2465 (2017).
- D. Kitamura, N. Ono, H. Sawada, H. Kameoka, and H. Saruwatari, "Determined blind source separation unifying independent vector analysis and non-negative matrix factorization," IEEE Trans. on Audio, Speech, and Lang. Process. 24, 1626-1641 (2016). https://doi.org/10.1109/TASLP.2016.2577880
- T. Van den Bogaert, S. Doclo, J. Wouters, and M. Moonen, "Speech enhancement with multichannel Wiener filter techniques in multimicrophone binaural hearing aids," J. Acoust. Soc. Am. 125, 360-371 (2009). https://doi.org/10.1121/1.3023069
- E. A. Habets, J. Benesty, S. Gannot, and I. Cohen, "The MVDR beamformer for speech enhancement," Proc. Speech Processing in Modern Communication, 225-254 (2010).
- E. Warsitz and R. Haeb-Umbach, "Blind acoustic beamforming based on generalized eigenvalue decomposition," IEEE Trans. on Audio, Speech, and Lang. Process. 15, 1529-1539 (2007). https://doi.org/10.1109/TASL.2007.898454
- S. Gannot and I. Cohen, "Speech enhancement based on the general transfer function GSC and postfiltering," IEEE Trans. on Speech and Audio Process. 12, 561-571 (2004). https://doi.org/10.1109/TSA.2004.834599
- J. Heymann, L. Drude, A. Chinaev, and R. Haeb-Umbach, "BLSTM supported GEV beamformer frontend for the 3rd CHiME challenge," Proc. IEEE Workshop on ASRU. 444-451 (2015).
- C. Deng, H. Song, Y. Zhang, Y. Sha, and X. Li, "DNN-based mask estimation integrating spectral and spatial features for robust beamforming," Proc. ICASSP. 4647-4651 (2020).
- Y. Liu, A. Ganguly, K. Kamath, and T. Kristjansson, "Neural network based time-frequency masking and steering vector estimation for two-channel MVDR beamforming," Proc. ICASSP. 6717-6721 (2018).
- N. Shankar, G. S. Bhat, and I. M. Panahi, "Real-time dual-channel speech enhancement by VAD assisted MVDR beamformer for hearing aid applications using smartphone," Proc. 42nd Annual Int. Conf. of the IEEE EMBC. 952-955 (2020).
- Y. Zhou, Y. Chen, Y. Ma, and H. Liu, "A real-time dual-microphone speech enhancement algorithm assisted by bone conduction sensor," Sensors, 20, 5050 (2020). https://doi.org/10.3390/s20185050
- T. Higuchi, N. Ito, S. Araki, T. Yoshioka, M. Delcroix, and T. Nakatani, "Online MVDR beamformer based on complex Gaussian mixture model with spatial prior for noise robust ASR," IEEE Trans. on Audio, Speech, and Lang. Process. 25, 780-793 (2017). https://doi.org/10.1109/TASLP.2017.2665341
- J. Barker, R. Marxer, E. Vincent, and S. Watanabe, "The third 'CHiME' speech separation and recognition challenge: Dataset, task and baselines," Proc. 2015 IEEE Workshop on ASRU. 504-511 (2015).
- Z. Rafii, A. Liutkus, F.-R. Stöter, S. I. Mimilakis, and R. Bittner, "MUSDB18 - a corpus for music separation" (2017).
- J. Heymann, L. Drude, and R. Haeb-Umbach, "Neural network based spectral mask estimation for acoustic beamforming," Proc. IEEE ICASSP. 196-200 (2016).
- E. Warsitz and R. Haeb-Umbach, "Blind acoustic beamforming based on generalized eigenvalue decomposition," IEEE Trans. on Audio, Speech, and Lang. Process. 15, 1529-1539 (2007). https://doi.org/10.1109/TASL.2007.898454
- J. S. Lim and A. V. Oppenheim, "Enhancement and bandwidth compression of noisy speech," Proc. IEEE 67, 1586-1604 (1979).
- D. Gala, A. Vasoya, and V. M. Misra, "Speech enhancement combining spectral subtraction and beamforming techniques for microphone array," Proc. Int. Conf. and Workshop on Emerging Trends in Technology, 163-166 (2010).
- Y. Takahashi, Y. Uemura, H. Saruwatari, K. Shikano, and K. Kondo, "Structure selection algorithm for less musical-noise generation in integration systems of beamforming and spectral subtraction," Proc. 2009 IEEE/SP 15th Workshop on Statistical Signal Processing, 701-704 (2009).
- S. Karimian-Azari and T. H. Falk, "Modulation spectrum based beamforming for speech enhancement," Proc. 2017 IEEE WASPAA. 91-95 (2017).
- H. Saruwatari, S. Kurita, K. Takeda, F. Itakura, T. Nishikawa, and K. Shikano, "Blind source separation combining independent component analysis and beamforming," EURASIP J. Advances in Signal Processing, 2003, 569270 (2003). https://doi.org/10.1155/S1110865703305104
- Google WebRTC, https://webrtc.org/ (Last viewed September 1, 2021).
- Google Web Speech API, https://wicg.github.io/speechapi/ (Last viewed September 1, 2021).