Acknowledgement
This research was supported by the Research Grant of Kwangwoon University in 2022, by the National Research Foundation of Korea (NRF) grant funded by the Korea government (Ministry of Science and ICT) (NRF-2021R1F1A1059233) in 2021, and by the Korea Institute for Advancement of Technology (KIAT) grant funded by the Korea government (Ministry of Trade, Industry and Energy) in 2022 (P0017124, HRD Program for Industrial Innovation).