Challenges and Future Directions for Large Language Models in Source Code Vulnerability Detection

  • Subin Yun (Dept. of Electrical and Computer Engineering and Inter-University Semiconductor Research Center, Seoul National University) ;
  • Hyunjun Kim (Dept. of Electrical and Computer Engineering and Inter-University Semiconductor Research Center, Seoul National University) ;
  • Yunheung Paek (Dept. of Electrical and Computer Engineering and Inter-University Semiconductor Research Center, Seoul National University)
  • Published: 2024.10.31

Abstract

Detecting vulnerabilities in source code is essential for maintaining software security, but traditional methods like static and dynamic analysis often struggle with the complexity of modern software systems. Large Language Models (LLMs), such as GPT-4, have emerged as promising tools due to their ability to learn programming language patterns from extensive datasets. However, their application in vulnerability detection faces significant hurdles. This paper explores the key challenges limiting the effectiveness of LLMs in this domain, including limited understanding of code context, scarcity of high-quality training data, accuracy and reliability issues, constrained context windows, and lack of interpretability. We analyze how these factors impede the models' ability to detect complex vulnerabilities and discuss their implications for security-critical applications. To address these challenges, we propose several directions for improvement: developing specialized and diverse datasets, integrating LLMs with traditional static analysis tools, enhancing model architectures for better code comprehension, fostering collaboration between AI systems and human experts, and improving the interpretability of model outputs. By pursuing these strategies, we aim to enhance the capabilities of LLMs in vulnerability detection, contributing to the development of more secure and robust software systems.
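To illustrate one of the proposed directions — pairing a lightweight static pre-filter with an LLM so that only suspicious code regions are sent to the model's constrained context window — the following is a minimal Python sketch. The regex rules, prompt wording, and character budget here are illustrative assumptions, not part of the paper; a real pipeline would use a proper static analyzer and an actual LLM API.

```python
import re

# Toy static pre-filter: flag lines that call known-risky C APIs.
# These regex rules are illustrative only; a real pipeline would use
# a full static analysis tool as the paper suggests.
RISKY_CALLS = re.compile(r"\b(strcpy|strcat|sprintf|gets|system)\s*\(")

def flag_suspicious_lines(source: str):
    """Return (line_number, line) pairs matching a risky-call pattern."""
    return [(i + 1, line)
            for i, line in enumerate(source.splitlines())
            if RISKY_CALLS.search(line)]

def build_prompt(source: str, max_chars: int = 2000) -> str:
    """Assemble an LLM prompt containing only the flagged lines,
    truncated to a crude character budget that stands in for the
    model's limited context window."""
    flagged = flag_suspicious_lines(source)
    header = "Review the following lines for memory-safety vulnerabilities:\n"
    body = "\n".join(f"line {n}: {l.strip()}" for n, l in flagged)
    return (header + body)[:max_chars]

if __name__ == "__main__":
    snippet = (
        "void copy(char *dst, const char *src) {\n"
        "    strcpy(dst, src);  /* no bounds check */\n"
        "}\n"
    )
    print(build_prompt(snippet))
```

The design point is the division of labor: the cheap static pass narrows the search space so the expensive, context-limited LLM only reasons about candidate regions, which is one way to mitigate both the context-window and reliability issues discussed above.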

Keywords

Funding

This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (RS-2023-00277326). This work was supported by the BK21 FOUR program of the Education and Research Program for Future ICT Pioneers, Seoul National University, in 2024. This work was partly supported by the Institute of Information & Communications Technology Planning & Evaluation (IITP) grant funded by the Korea government (MSIT) (No. 2020-0-01840, Analysis on technique of accessing and acquiring user data in smartphone, 0.5) and by the Korea Evaluation Institute of Industrial Technology (KEIT) grant funded by the Korea government (MOTIE) (No. 2020-0-01840, Analysis on technique of accessing and acquiring user data in smartphone, 0.5). This work was supported by the Institute of Information & Communications Technology Planning & Evaluation (IITP) under the artificial intelligence semiconductor support program to nurture the best talents (IITP-2023-RS-2023-00256081) grant funded by the Korea government (MSIT). This research was supported by the MSIT (Ministry of Science and ICT), Korea, under the ITRC (Information Technology Research Center) support program (IITP-2023-2020-0-01602) supervised by the IITP (Institute for Information & Communications Technology Planning & Evaluation). This research was supported by the Korea Planning & Evaluation Institute of Industrial Technology (KEIT) grant funded by the Korea government (MOTIE) (No. RS-2024-00406121, Development of an Automotive Security Vulnerability-based Threat Analysis System (R&D)).

References

  1. Z. Li, D. Zou, S. Xu, X. Ou, H. Jin, S. Wang, and Z. Deng, "VulDeePecker: A deep learning-based system for vulnerability detection," Proceedings of the 25th Annual Network and Distributed System Security Symposium (NDSS), 2018.
  2. C. Zhang, H. Liu, J. Zeng, K. Yang, Y. Li, and H. Li, "Prompt-enhanced software vulnerability detection using ChatGPT," Proceedings of the 2024 IEEE/ACM 46th International Conference on Software Engineering: Companion Proceedings, pp. 276-277, 2024.
  3. S. Ullah, M. Han, S. Pujar, H. Pearce, A. Coskun, and G. Stringhini, "LLMs cannot reliably identify and reason about security vulnerabilities (yet?): A comprehensive evaluation, framework, and benchmarks," IEEE Symposium on Security and Privacy, 2024.
  4. X. Zhou, T. Zhang, and D. Lo, "Large language model for vulnerability detection: Emerging results and future directions," 2024 International Conference on Software Engineering (ICSE), New Ideas and Emerging Results (NIER) Track, IEEE, 2024.
  5. Y. Ding, Y. Fu, O. Ibrahim, C. Sitawarin, X. Chen, B. Alomair, D. Wagner, B. Ray, and Y. Chen, "Vulnerability detection with code language models: How far are we?" arXiv preprint arXiv:2403.18624, 2024.