Evaluating Chest Abnormalities Detection: YOLOv7 and Detection Transformer with CycleGAN Data Augmentation

  • Yoshua Kaleb Purwanto (Department of Computer Engineering, Dongseo University) ;
  • Suk-Ho Lee (Department of Computer Engineering, Dongseo University) ;
  • Dae-Ki Kang (Department of Computer Engineering, Dongseo University)
  • Received : 2024.05.06
  • Accepted : 2024.05.21
  • Published : 2024.06.30

Abstract

In this paper, we investigate the comparative performance of two leading object detection architectures, YOLOv7 and Detection Transformer (DETR), across varying levels of data augmentation using CycleGAN. Our experiments focus on chest scan images within the context of biomedical informatics, specifically targeting the detection of abnormalities. The study reveals that YOLOv7 consistently outperforms DETR across all levels of augmented data, maintaining better performance even with 75% augmented data. Additionally, YOLOv7 demonstrates significantly faster convergence, requiring approximately 30 epochs compared to DETR's 300 epochs. These findings underscore the superiority of YOLOv7 for object detection tasks, especially in scenarios with limited data and when rapid convergence is essential. Our results provide valuable insights for researchers and practitioners in the field of computer vision, highlighting the effectiveness of YOLOv7 and the importance of data augmentation in improving model performance and efficiency.
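The experiments described above train each detector on datasets containing varying proportions of CycleGAN-generated images (e.g., 75% augmented data). As a minimal sketch of how such a mixed training set might be assembled, the hypothetical helper below keeps all real chest X-rays and samples just enough synthetic images so that a target fraction of the final set is CycleGAN output; the function name, arguments, and sampling strategy are illustrative assumptions, not the authors' actual pipeline.

```python
import math
import random

def mix_training_set(real_images, synthetic_images, aug_fraction, seed=0):
    """Combine real images with CycleGAN-generated ones so that roughly
    `aug_fraction` of the returned training set is synthetic.

    All real images are kept; synthetic images are sampled so that
    n_syn / (n_real + n_syn) ~= aug_fraction.
    """
    if not 0.0 <= aug_fraction < 1.0:
        raise ValueError("aug_fraction must be in [0, 1)")
    n_real = len(real_images)
    # Solve s / (n_real + s) = aug_fraction for the synthetic count s.
    n_syn = 0
    if aug_fraction > 0.0:
        n_syn = math.ceil(aug_fraction * n_real / (1.0 - aug_fraction))
    n_syn = min(n_syn, len(synthetic_images))  # cannot exceed the pool
    rng = random.Random(seed)  # fixed seed for reproducible splits
    mixed = list(real_images) + rng.sample(synthetic_images, n_syn)
    rng.shuffle(mixed)
    return mixed
```

For example, with 100 real scans and `aug_fraction=0.75`, the helper draws 300 synthetic images, yielding a 400-image set in which three quarters are CycleGAN output.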

Acknowledgement

This work was supported by Dongseo University, "Dongseo Cluster Project" Research Fund of 2023 (DSU-20230004).

References

  1. C.-Y. Wang, A. Bochkovskiy, and H.-Y. M. Liao, "YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors," Jul. 2022. DOI: https://doi.org/10.48550/arXiv.2207.02696
  2. J. Redmon, S. Divvala, R. Girshick, and A. Farhadi, "You Only Look Once: Unified, Real-Time Object Detection," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016. DOI: https://doi.org/10.48550/arXiv.1506.02640
  3. J. Redmon and A. Farhadi, "YOLOv3: An Incremental Improvement," Apr. 2018. DOI: https://doi.org/10.48550/arXiv.1804.02767
  4. A. Bochkovskiy, C.-Y. Wang, and H.-Y. M. Liao, "YOLOv4: Optimal Speed and Accuracy of Object Detection," Apr. 2020. DOI: https://doi.org/10.48550/arXiv.2004.10934
  5. N. Carion, F. Massa, G. Synnaeve, N. Usunier, A. Kirillov, and S. Zagoruyko, "End-to-end object detection with transformers," in European Conference on Computer Vision (ECCV), Springer, 2020, pp. 213-229. DOI: https://doi.org/10.48550/arXiv.2005.12872
  6. A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, L. Kaiser, and I. Polosukhin, "Attention is all you need," in Advances in Neural Information Processing Systems (NeurIPS), 2017. DOI: https://doi.org/10.48550/arXiv.1706.03762
  7. A. Dosovitskiy, L. Beyer, A. Kolesnikov, D. Weissenborn, X. Zhai, T. Unterthiner, M. Dehghani, M. Minderer, G. Heigold, S. Gelly, et al., "An image is worth 16x16 words: Transformers for image recognition at scale," Oct. 2020. DOI: https://doi.org/10.48550/arXiv.2010.11929
  8. Z. Chen, Y. Duan, W. Wang, J. He, T. Lu, J. Dai, and Y. Qiao, "Vision Transformer Adapter for Dense Predictions," in Proceedings of the 11th International Conference on Learning Representations (ICLR), Feb. 2023. DOI: https://doi.org/10.48550/arXiv.2205.08534
  9. H. Q. Nguyen et al., "VinDr-CXR: An Open Dataset of Chest X-rays with Radiologist's Annotations," Jan. 2022. DOI: https://doi.org/10.48550/arXiv.2012.15029
  10. J. Zhu, T. Park, P. Isola, and A. A. Efros, "Unpaired image-to-image translation using cycle-consistent adversarial networks," in Proceedings of the IEEE International Conference on Computer Vision (ICCV), 2017. DOI: https://doi.org/10.48550/arXiv.1703.10593
  11. R. Zhang, P. Isola, and A. A. Efros, "Colorful image colorization," in European Conference on Computer Vision (ECCV), 2016. DOI: https://doi.org/10.48550/arXiv.1603.08511
  12. L. A. Gatys, A. S. Ecker, and M. Bethge, "Image style transfer using convolutional neural networks," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016.
  13. M. Tan and Q. V. Le, "EfficientDet: Scalable and efficient object detection," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2020. DOI: https://doi.org/10.48550/arXiv.1911.09070
  14. J. Hosang, R. Benenson, P. Dollar, and B. Schiele, "Learning non-maximum suppression," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017. DOI: https://doi.org/10.48550/arXiv.1705.02950
  15. D. Bahdanau, K. Cho, and Y. Bengio, "Neural machine translation by jointly learning to align and translate," in Proceedings of the 3rd International Conference on Learning Representations (ICLR), Sep. 2014. DOI: https://doi.org/10.48550/arXiv.1409.0473