Bug Report Quality Prediction for Enhancing Performance of Information Retrieval-based Bug Localization

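  • 김미수 (Department of Electrical and Computer Engineering, Sungkyunkwan University);
  • 안준 (Department of Electrical and Computer Engineering, Sungkyunkwan University);
  • 이은석 (Department of Computer Engineering, Sungkyunkwan University)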
  • Received : 2017.04.10
  • Accepted : 2017.06.09
  • Published : 2017.08.15

Abstract

Bug reports are essential documents for developers to localize and fix bugs. These reports contain information about software bugs or failures that occur during the software operation and maintenance phases. Information Retrieval-based Bug Localization (IR-BL) techniques have been proposed to reduce the time and cost developers spend resolving bug reports. However, when a low-quality bug report is submitted, the performance of such techniques can degrade significantly. To address this problem, we propose a quality prediction method that selects low-quality bug reports. The method defines the Quality property of a Bug report as a Query (Q4BaQ) and predicts the quality of bug reports using machine learning. We evaluated the proposed method on three open-source projects. The experimental results show that the proposed method achieved an average F-measure of 87.31% and outperformed previous prediction techniques by up to 6.62% in F-measure. Finally, combining the proposed method with a traditional automatic query reformulation method improved MRR and MAP by 0.9% and 1.3%, respectively.

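A bug report is a document that contains information about defects occurring during the software maintenance phase, and it is essential for developers who need to fix those defects. Information retrieval-based bug localization techniques have been proposed to shorten the time developers spend tracing a defect in order to resolve a bug report. However, when a low-quality bug report is submitted, written with content that is not useful for information retrieval, bug localization performance degrades significantly. This paper proposes a quality prediction method for selecting low-quality bug reports. In this process, we define the quality properties of a bug report as a query and predict quality using machine learning. When applied to open-source projects, the proposed method predicted quality 6.62% more accurately on average than existing quality prediction techniques. In addition, applying the proposed prediction technique together with an automatic query reformulation technique to an existing bug localization technique improved bug localization accuracy by 1.3%, confirming that the proposed quality prediction technique can help improve the performance of information retrieval-based bug localization.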
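The abstract reports results as F-measure, MRR, and MAP. The following is a minimal sketch of how these metrics are commonly computed, not the authors' evaluation code. It assumes that F-measure means the balanced F1 score and that average precision is normalized by the number of buggy files per report. The file names and rankings are hypothetical.

```python
from typing import List, Set


def f_measure(precision: float, recall: float) -> float:
    """Balanced F-measure (F1): harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def reciprocal_rank(ranked: List[str], relevant: Set[str]) -> float:
    """1 / rank of the first relevant (buggy) file, or 0 if none is retrieved."""
    for rank, name in enumerate(ranked, start=1):
        if name in relevant:
            return 1.0 / rank
    return 0.0


def average_precision(ranked: List[str], relevant: Set[str]) -> float:
    """Mean of precision@k over the ranks k at which a relevant file appears.

    Assumption: normalized by the total number of relevant files.
    """
    if not relevant:
        return 0.0
    hits = 0
    total = 0.0
    for rank, name in enumerate(ranked, start=1):
        if name in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant)


def mean(values: List[float]) -> float:
    return sum(values) / len(values) if values else 0.0


# Hypothetical ranked results for two bug reports used as queries.
queries = [
    (["A.java", "B.java", "C.java"], {"B.java"}),
    (["X.java", "Y.java", "Z.java"], {"X.java", "Z.java"}),
]

mrr = mean([reciprocal_rank(r, rel) for r, rel in queries])
map_score = mean([average_precision(r, rel) for r, rel in queries])
print(f"MRR={mrr:.4f} MAP={map_score:.4f}")
```

Under these definitions, MRR rewards placing the first buggy file near the top of the ranking, while MAP also accounts for where the remaining buggy files appear.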

Acknowledgement

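Supported by: National Research Foundation of Korea (한국연구재단), Institute for Information & communications Technology Promotion (정보통신기술진흥센터)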
