Fig. 1. The workflow of the proposed approach.
Table 1. Preprocessing results
Table 2. Example of annotation
Table 3. Results of Top-25 and Top-250 identification