Audio-based COVID-19 diagnosis using separable transformer

  • Seungtae Kang (School of Electronic and Electrical Engineering, Kyungpook National University)
  • Gil-Jin Jang (School of Electronic and Electrical Engineering, Kyungpook National University)
  • Received : 2023.04.18
  • Accepted : 2023.05.17
  • Published : 2023.05.31

Abstract

In this paper, we propose an efficient method for the rapid diagnosis of COVID-19 using voice alone. To relax the computation-time and large-scale training-data requirements of existing deep learning based methods, a novel Strided Convolution Separable Transformer (SC-SepTr) is proposed by modifying the conventional Separable Transformer (SepTr) for audio signal recognition, greatly reducing the number of parameters as well as the memory and computational requirements. Experiments on the public audio dataset Coswara show that the proposed method performs rapid diagnosis while guaranteeing Area Under the Curve (AUC) performance even with a relatively small amount of training data.
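The abstract only names the architecture, so the following is a minimal, illustrative sketch of the general idea: a strided convolution first shrinks the spectrogram token grid, and a separable transformer block then attends along the time axis and the frequency axis in turn. It assumes PyTorch; all class names (StridedPatchEmbed, SeparableAttentionBlock, SCSepTrSketch), dimensions, strides, and depths are hypothetical placeholders and do not reproduce the authors' SC-SepTr implementation.

```python
# Illustrative sketch only: strided patch embedding followed by axis-separable
# self-attention over a spectrogram. Hyperparameters are assumptions, not the
# authors' published configuration.
import torch
import torch.nn as nn


class StridedPatchEmbed(nn.Module):
    """Downsample a (batch, 1, freq, time) spectrogram with a strided convolution."""
    def __init__(self, dim=64, stride=4):
        super().__init__()
        self.proj = nn.Conv2d(1, dim, kernel_size=stride, stride=stride)

    def forward(self, x):                      # x: (B, 1, F, T)
        return self.proj(x)                    # (B, dim, F//stride, T//stride)


class SeparableAttentionBlock(nn.Module):
    """Apply self-attention along one axis (time or frequency) at a time."""
    def __init__(self, dim=64, heads=4):
        super().__init__()
        self.time_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.freq_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm1 = nn.LayerNorm(dim)
        self.norm2 = nn.LayerNorm(dim)

    def forward(self, x):                      # x: (B, dim, F, T)
        b, d, f, t = x.shape
        # Attend over time: treat each frequency row as its own sequence.
        xt = x.permute(0, 2, 3, 1).reshape(b * f, t, d)
        n1 = self.norm1(xt)
        xt = xt + self.time_attn(n1, n1, n1)[0]
        x = xt.reshape(b, f, t, d)
        # Attend over frequency: treat each time column as its own sequence.
        xf = x.permute(0, 2, 1, 3).reshape(b * t, f, d)
        n2 = self.norm2(xf)
        xf = xf + self.freq_attn(n2, n2, n2)[0]
        return xf.reshape(b, t, f, d).permute(0, 3, 2, 1)   # back to (B, dim, F, T)


class SCSepTrSketch(nn.Module):
    """Strided embedding + separable attention + binary (COVID / non-COVID) head."""
    def __init__(self, dim=64, depth=2):
        super().__init__()
        self.embed = StridedPatchEmbed(dim)
        self.blocks = nn.ModuleList(SeparableAttentionBlock(dim) for _ in range(depth))
        self.head = nn.Linear(dim, 2)

    def forward(self, spec):                   # spec: (B, 1, F, T) log-mel spectrogram
        x = self.embed(spec)
        for blk in self.blocks:
            x = blk(x)
        return self.head(x.mean(dim=(2, 3)))   # global average pool, then classify
```

Attending along each axis separately scales roughly with F·T·(F+T) rather than (F·T)², and the strided embedding shortens both sequences further; this illustrates where the speed and memory savings described in the abstract could come from.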

Acknowledgments

This work was supported by the Government-wide R&D Fund for Infectious Disease Research, funded by the Ministry of the Interior and Safety of the Republic of Korea (Project No. 20016180, 100 %).
