Improvement of Classification Accuracy on Success and Failure Factors in Software Reuse using Feature Selection

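  • 김영옥 (Department of Computer Engineering, Gangneung-Wonju National University) ;
  • 권기태 (Department of Computer Engineering, Gangneung-Wonju National University)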
  • Received : 2012.10.04
  • Accepted : 2012.12.06
  • Published : 2013.04.30

Abstract

Feature selection is an important issue in machine learning and pattern recognition. Given the source data, it finds the subset of the data that yields the best classification performance. That is, it extracts only the features most closely related to the purpose of the classification and generates a new data set from them. In this paper, we ran experiments to find a feature subset that improves classification accuracy for success and failure factors in software reuse. Comparing the results with existing studies, we found that classification with the feature subset selected in this study gave the best classification accuracy.
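The idea of searching for a feature subset that maximizes classification accuracy can be illustrated with a minimal sketch. It assumes a greedy forward-selection wrapper around a 1-nearest-neighbour classifier scored by leave-one-out accuracy; this is not necessarily the method used in the paper. The function names `loo_accuracy` and `forward_select` and the toy data are hypothetical and are not taken from the software reuse data set.

```python
from typing import List, Sequence, Tuple


def loo_accuracy(rows: Sequence[Sequence[float]],
                 labels: Sequence[int],
                 features: Sequence[int]) -> float:
    """Leave-one-out accuracy of a 1-nearest-neighbour classifier
    that only looks at the given feature indices."""
    correct = 0
    for i, row in enumerate(rows):
        best_dist, best_label = None, None
        for j, other in enumerate(rows):
            if i == j:
                continue  # hold out the sample being classified
            dist = sum((row[f] - other[f]) ** 2 for f in features)
            if best_dist is None or dist < best_dist:
                best_dist, best_label = dist, labels[j]
        correct += int(best_label == labels[i])
    return correct / len(rows)


def forward_select(rows: Sequence[Sequence[float]],
                   labels: Sequence[int]) -> Tuple[List[int], float]:
    """Greedily add the feature that most improves accuracy;
    stop as soon as no remaining feature gives an improvement."""
    n_features = len(rows[0])
    selected: List[int] = []
    best_acc = 0.0
    while len(selected) < n_features:
        candidates = [f for f in range(n_features) if f not in selected]
        scored = [(loo_accuracy(rows, labels, selected + [f]), f)
                  for f in candidates]
        acc, feat = max(scored, key=lambda pair: pair[0])
        if acc <= best_acc:
            break
        selected.append(feat)
        best_acc = acc
    return selected, best_acc


# Invented toy data: feature 0 separates the classes, features 1 and 2 do not.
# Label 1 stands for a successful project, 0 for a failed one (assumption).
rows = [
    [1.0, 0.3, 5.0],
    [0.9, 0.8, 1.0],
    [1.1, 0.1, 3.0],
    [0.0, 0.4, 4.9],
    [0.1, 0.9, 1.1],
    [-0.1, 0.2, 3.1],
]
labels = [1, 1, 1, 0, 0, 0]

selected, accuracy = forward_select(rows, labels)
all_features_accuracy = loo_accuracy(rows, labels, [0, 1, 2])
print("selected features:", selected, "accuracy:", accuracy)
print("accuracy with all features:", all_features_accuracy)
```

On this toy data the selected subset scores higher than the full feature set, because the uninformative features dominate the distance computation when they are included. A wrapper like this is simple but costly on large data, which is why filter methods such as correlation-based selection are often used instead.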
