http://dx.doi.org/10.22937/IJCSNS.2022.22.10.1

Centroid and Nearest Neighbor based Class Imbalance Reduction with Relevant Feature Selection using Ant Colony Optimization for Software Defect Prediction  

B., Kiran Kumar (Department of Information Technology, Kakatiya Institute of Technology & Science)
Gyani, Jayadev (Department of CS, College of Computer and Information Sciences, Majmaah University)
Y., Bhavani (Department of Information Technology, Kakatiya Institute of Technology & Science)
P., Ganesh Reddy (Department of Information Technology, Kakatiya Institute of Technology & Science)
T., Nagasai Anjani Kumar (Department of Information Technology, Kakatiya Institute of Technology & Science)
Publication Information
International Journal of Computer Science & Network Security / v.22, no.10, 2022, pp. 1-10
Abstract
Software defect prediction (SDP) is currently one of the most active research areas in software engineering. Early detection of defects lowers the cost of software and improves its reliability. Machine learning techniques are widely used to build SDP models based on programming measures. The majority of defect prediction models in the literature suffer from class imbalance and high dimensionality. In this paper, we propose the Centroid and Nearest Neighbor based Class Imbalance Reduction (CNNCIR) technique, which considers dataset distribution characteristics to balance the defective and non-defective records in imbalanced datasets. The proposed approach is compared with the Synthetic Minority Oversampling Technique (SMOTE). The high-dimensionality problem is addressed by selecting relevant features with Ant Colony Optimization (ACO). We analyze six open-source software defect datasets from the PROMISE repository using nine different classifiers and evaluate them with seven performance measures. The results show that the proposed CNNCIR method with ACO-based feature selection outperforms SMOTE in the majority of cases.
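For readers unfamiliar with the SMOTE baseline named above, the following is a minimal sketch of SMOTE-style oversampling: each synthetic minority record is placed on the line segment between an existing minority record and one of its nearest minority neighbours. It does not reproduce CNNCIR, whose details are not given in this abstract. The function name, the parameters `n_new`, `k`, and `seed`, and the toy data are illustrative assumptions.

```python
import math
import random


def smote_like_oversample(minority, n_new, k=3, seed=0):
    """Return n_new synthetic rows interpolated between minority-class rows.

    Illustrative sketch only; assumes `minority` holds at least two
    numeric rows of equal length.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Candidate neighbours: every other minority row, closest first.
        others = [row for j, row in enumerate(minority) if j != i]
        neighbours = sorted(others, key=lambda row: math.dist(base, row))[:k]
        neighbour = rng.choice(neighbours)
        # Random position along the segment from base to neighbour.
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Toy "defective" (minority) records with two hypothetical metrics each.
defective = [[1.0, 2.0], [1.5, 1.8], [2.0, 2.2], [1.2, 2.5]]
new_rows = smote_like_oversample(defective, n_new=6)
```

Because every synthetic row is a convex combination of two existing minority rows, it stays within the range already spanned by the minority class. A method that takes the dataset distribution into account, as the abstract says CNNCIR does, would choose new points differently.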
Keywords
Ant Colony Optimization; Class imbalance; Feature selection; Oversampling