A COVID-19 Diagnosis Model based on Various Transformations of Cough Sounds

  • Minkyung Kim (DataWorld Co., Ltd.)
  • Gunwoo Kim (Department of Business Administration, Hanbat National University)
  • Keunho Choi (Department of Business Administration, Hanbat National University)
  • Received : 2023.04.25
  • Accepted : 2023.08.01
  • Published : 2023.09.30

Abstract

COVID-19, which emerged in Wuhan, China in November 2019, spread beyond China in 2020 and had spread worldwide by March 2020. For a highly contagious virus such as COVID-19, prevention and active treatment of confirmed cases are important, but because the virus spreads so quickly, it is even more important to identify infections promptly and block further transmission. However, the PCR test used to confirm infection is costly and time-consuming, and although self-test kits are easy to obtain, their price makes frequent testing a burden. If COVID-19 positivity could be determined from the sound of a cough, anyone could check their infection status easily, anytime and anywhere, which would offer substantial advantages in speed and cost. This study therefore aimed to develop a classification model that identifies COVID-19 infection from cough sounds. Cough sounds were first vectorized using MFCC, Mel-Spectrogram, spectral contrast, and Spectrogram features. To ensure the quality of the cough data, noisy recordings were removed based on their signal-to-noise ratio (SNR), and only the cough segments were extracted from each audio file through chunking. Using the extracted features, models for classifying COVID-19 positive and negative cases were trained with the XGBoost, LightGBM, and FCNN algorithms, which are commonly used for classification, and their performance was compared. A further comparative experiment examined model performance when cough sounds were converted into multidimensional vectors versus into images. The LightGBM model that used basic health-status information together with all of the multidimensional vector features obtained through MFCC, Mel-Spectrogram, spectral contrast, and Spectrogram achieved the highest accuracy, 0.74.
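The abstract names two preprocessing steps, SNR-based removal of noisy recordings and chunk-based extraction of cough segments, without giving their details. The following is a minimal sketch of how such a step might look, not the authors' implementation. The function names, the frame length, the energy threshold, and the 10 dB cutoff are all hypothetical choices made for illustration, and the SNR here is a simple estimate that treats everything outside the detected chunks as noise.

```python
import math
from typing import List, Tuple

Chunk = Tuple[int, int]  # (start, end) sample indices, end exclusive


def rms(samples: List[float]) -> float:
    """Root-mean-square amplitude; 0.0 for an empty list."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def find_chunks(signal: List[float], frame_len: int = 4,
                threshold: float = 0.1, min_frames: int = 2) -> List[Chunk]:
    """Find runs of consecutive frames whose RMS reaches the threshold.

    frame_len, threshold, and min_frames are illustrative values only.
    """
    energies = [rms(signal[i:i + frame_len])
                for i in range(0, len(signal) - frame_len + 1, frame_len)]
    chunks: List[Chunk] = []
    start = None
    # A trailing 0.0 sentinel closes a run that reaches the end of the signal.
    for idx, energy in enumerate(energies + [0.0]):
        if energy >= threshold and start is None:
            start = idx
        elif energy < threshold and start is not None:
            if idx - start >= min_frames:
                chunks.append((start * frame_len, idx * frame_len))
            start = None
    return chunks


def estimate_snr_db(signal: List[float], chunks: List[Chunk]) -> float:
    """Estimate SNR in dB as RMS inside the chunks over RMS outside them."""
    active_idx = set()
    for begin, end in chunks:
        active_idx.update(range(begin, end))
    active = [s for i, s in enumerate(signal) if i in active_idx]
    noise = [s for i, s in enumerate(signal) if i not in active_idx]
    active_rms, noise_rms = rms(active), rms(noise)
    if active_rms == 0.0:
        return float("-inf")  # nothing detected above the threshold
    if noise_rms == 0.0:
        return float("inf")   # no measurable background noise
    return 20.0 * math.log10(active_rms / noise_rms)


def extract_clean_coughs(signal: List[float],
                         min_snr_db: float = 10.0) -> List[List[float]]:
    """Return the cough segments, or [] if the recording is too noisy.

    The 10 dB default is an assumed cutoff, not a value from the paper.
    """
    chunks = find_chunks(signal)
    if estimate_snr_db(signal, chunks) < min_snr_db:
        return []
    return [signal[begin:end] for begin, end in chunks]
```

In a full pipeline, segments that pass this filter would then go to feature extraction (MFCC, Mel-Spectrogram, and so on) before classification. Real recordings would need frame lengths on the order of tens of milliseconds rather than the four-sample frames used here.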
