A Developer Recommendation Technique Based on Topic Model and Social Network

  • Received : 2014.03.25
  • Reviewed : 2014.05.27
  • Published : 2014.08.15

Abstract

Software systems keep growing in size and complexity, and large numbers of bug reports are submitted to bug repositories every day, which increases developers' workload. In the bug triage process, a triager assigns each bug report to a fixer (developer) so that the defect can be resolved quickly and correctly. In practice, however, many bug reports are assigned to unsuitable developers and must then be reassigned to others. This happens because the triager either did not fully understand the submitted report or did not have an accurate picture of every developer's expertise, and it costs developers considerable time and effort in software maintenance. To address this problem, we propose a developer recommendation method based on a topic model and a social network. First, we build a set of topics from existing bug reports. Next, when a new bug report (from the test data set) arrives, we select the most similar topic(s) and extract the developers who participated in them. Finally, we apply social network analysis to those developers' behavior (comment and commit activity) and recommend the appropriate developers. Performance comparison experiments on open source projects show that our approach is more effective for bug triage than the related studies we compared against.
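The three steps in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the topics are hand-written word counts rather than the output of a trained topic model, similarity is plain cosine similarity, and the social network analysis is reduced to a weighted sum of comment and commit counts. The names `TOPICS`, `cosine`, `recommend`, and `commit_weight`, along with all data values, are hypothetical and chosen only for illustration.

```python
from collections import Counter
from math import sqrt

# Hypothetical toy data: each topic holds word counts plus the activity
# (comments, commits) of the developers who took part in its bug reports.
TOPICS = {
    "ui": {
        "words": Counter({"button": 3, "dialog": 2, "layout": 2}),
        "activity": {
            "alice": {"comments": 5, "commits": 2},
            "bob": {"comments": 1, "commits": 0},
        },
    },
    "build": {
        "words": Counter({"compile": 3, "classpath": 2, "jar": 2}),
        "activity": {
            "carol": {"comments": 2, "commits": 4},
            "bob": {"comments": 3, "commits": 1},
        },
    },
}


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def recommend(report_text, topics=TOPICS, top_k=2, commit_weight=2.0):
    """Pick the most similar topic, then rank its developers by activity."""
    words = Counter(report_text.lower().split())
    # Step 2: select the topic most similar to the new bug report.
    best = max(topics, key=lambda name: cosine(words, topics[name]["words"]))
    # Step 3 (simplified assumption): score each participating developer
    # by comments plus weighted commits within that topic.
    scores = {
        dev: act["comments"] + commit_weight * act["commits"]
        for dev, act in topics[best]["activity"].items()
    }
    ranked = sorted(scores, key=lambda dev: (-scores[dev], dev))
    return best, ranked[:top_k]
```

Calling `recommend("jar missing from classpath during compile")` matches the toy `build` topic and ranks `carol` ahead of `bob`, because commits are weighted more heavily than comments in this sketch. A real system would replace the hand-written topics with a model trained on the bug repository and the linear score with a proper network measure.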

Project Information

Host institution of the research project : University of Seoul
