http://dx.doi.org/10.9716/KITS.2012.11.1.211

Software Quality Classification using Bayesian Classifier  

Hong, Euy-Seok (School of IT, Sungshin Women's University)
Publication Information
Journal of Information Technology Services / v.11, no.1, 2012, pp. 211-221
Abstract
Many metric-based classification models have been proposed to predict the fault-proneness of software modules. This paper presents two prediction models based on the Bayesian classifier, one of the most popular modern classification algorithms. A Bayesian model, grounded in Bayesian probability theory, is a promising technique for software quality prediction because it can represent uncertainty using probabilities and can partly incorporate expert knowledge into the training data. The two models, Naïve Bayes (NB) and Bayesian Belief Network (BBN), are constructed, and dimensionality reduction is performed on the training and test data before model evaluation. The prediction accuracy of each model is evaluated using two prediction error measures, Type I error and Type II error, and compared with two well-known prediction models: a backpropagation neural network model and a support vector machine model. The results show that the prediction performance of the BBN model is slightly better than that of the NB model. On the data set with ambiguity, the BBN model's prediction accuracy is not as good as that of the compared models, but on the data set without ambiguity it outperforms them.
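As a rough illustration of the evaluation described in the abstract, the sketch below trains a small Naïve Bayes classifier on module metrics and reports the two error measures. It is not the paper's implementation. The Gaussian likelihood, the two metrics (size and complexity), the toy data, and the function names `fit_gaussian_nb`, `predict`, and `error_rates` are all assumptions made for illustration. The sketch also assumes the common convention that a Type I error is a non-fault-prone module classified as fault-prone and a Type II error is a fault-prone module classified as non-fault-prone, each reported as a rate over the modules of the corresponding actual class.

```python
import math
from collections import defaultdict

def fit_gaussian_nb(X, y):
    """Estimate the prior and per-feature mean/variance for each class."""
    by_class = defaultdict(list)
    for row, label in zip(X, y):
        by_class[label].append(row)
    model = {}
    for label, rows in by_class.items():
        n = len(rows)
        columns = list(zip(*rows))
        means = [sum(col) / n for col in columns]
        # A small constant keeps every variance strictly positive.
        variances = [sum((v - m) ** 2 for v in col) / n + 1e-9
                     for col, m in zip(columns, means)]
        model[label] = (n / len(X), means, variances)
    return model

def predict(model, row):
    """Pick the class with the highest log posterior, treating features as independent."""
    def log_posterior(prior, means, variances):
        score = math.log(prior)
        for v, m, s2 in zip(row, means, variances):
            score += -0.5 * math.log(2 * math.pi * s2) - (v - m) ** 2 / (2 * s2)
        return score
    return max(model, key=lambda label: log_posterior(*model[label]))

def error_rates(actual, predicted):
    """Return (Type I rate, Type II rate); label 1 = fault-prone, 0 = not fault-prone."""
    false_pos = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 1)
    false_neg = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 0)
    return false_pos / actual.count(0), false_neg / actual.count(1)

# Hypothetical modules described by [size metric, complexity metric].
X_train = [[10, 1], [12, 2], [15, 2], [11, 1], [80, 12], [95, 15], [70, 10], [88, 14]]
y_train = [0, 0, 0, 0, 1, 1, 1, 1]
X_test = [[13, 2], [75, 11], [14, 1], [90, 13]]
y_test = [0, 1, 0, 1]

model = fit_gaussian_nb(X_train, y_train)
predictions = [predict(model, row) for row in X_test]
type1, type2 = error_rates(y_test, predictions)
print("Type I error rate:", type1, "Type II error rate:", type2)
```

Reporting the two rates separately matters because the two mistakes usually carry different costs: a Type II error lets a fault-prone module go unexamined, while a Type I error only spends review effort on a module that did not need it.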
Keywords
Software Quality; Prediction Model; Naive Bayes; Bayesian Belief Network
Citations & Related Records
Times Cited By KSCI: 2