• Title/Summary/Keyword: Defect prediction models

An Experiment for Determining Threshold of Defect Prediction Models using Object Oriented Metrics (객체지향 메트릭을 이용한 결함 예측 모형의 임계치 설정에 관한 실험)

  • Kim, Yun-Kyu;Chae, Heung-Seok
    • Journal of KIISE:Computing Practices and Letters, v.15 no.12, pp.943-947, 2009
  • To support efficient management of software verification and validation activities, many defect prediction models based on object-oriented metrics have been proposed. Applying such a model requires choosing a threshold value, but this is difficult because we cannot actually know where the defects are. We therefore performed a series of experiments on threshold determination, in which we applied defect prediction models to systems other than the one used to build them. Specifically, we applied three models (the Olague, Zhou, and Gyimothy models) to four different systems. We found that prediction capability varied considerably with the chosen threshold value. Further study on determining an appropriate threshold is therefore needed to improve the applicability of defect prediction models.
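The sensitivity to the threshold described in this abstract can be illustrated with a minimal Python sketch. It assumes a model that outputs a defect probability per class and flags a class as defective when the probability is at or above the threshold. The function names, probabilities, and labels are hypothetical and are not taken from the paper.

```python
def confusion_counts(probabilities, labels, threshold):
    """Count TP, FP, FN, TN when items with probability >= threshold are flagged."""
    tp = fp = fn = tn = 0
    for p, y in zip(probabilities, labels):
        predicted = p >= threshold
        if predicted and y:
            tp += 1
        elif predicted and not y:
            fp += 1
        elif not predicted and y:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def precision_recall(probabilities, labels, threshold):
    """Return (precision, recall) at the given threshold; 0.0 when undefined."""
    tp, fp, fn, _ = confusion_counts(probabilities, labels, threshold)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Hypothetical model outputs and ground truth (1 = defective); illustrative only.
probs = [0.9, 0.8, 0.6, 0.4, 0.3, 0.1]
labels = [1, 0, 1, 1, 0, 0]

for t in (0.25, 0.5, 0.75):
    print(t, precision_recall(probs, labels, t))
```

Even on this toy data, raising the threshold trades recall for precision, which is why a threshold tuned on one system may not transfer to another.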

Defect Severity-based Defect Prediction Model using CL

  • Lee, Na-Young;Kwon, Ki-Tae
    • Journal of the Korea Society of Computer and Information, v.23 no.9, pp.81-86, 2018
  • Software defect severity is very important in new projects and in projects with limited historical data. In general software defect prediction, however, label information for the training set is very difficult to collect, and cross-project defect prediction requires a large amount of data. In this paper, an unclassified data set with defect severity is clustered according to the distribution ratio, and a defect severity-based prediction model is proposed by way of labeling. The proposed model applies CLAMI to JM1 and PC4, the defect severity-based NASA data sets with the least ambiguity, and is evaluated by comparing its accuracy (ACC) with that obtained on the original data. In the experiments, the proposed model improved on existing defect severity-based prediction models by 0.15 (15%) on JM1 and 0.12 (12%) on PC4.

Development of a New Cluster Index for Semiconductor Wafer Defects and Simulation - Based Yield Prediction Models (변동계수를 이용한 반도체 결점 클러스터 지표 개발 및 수율 예측)

  • Park, Hang-Yeob;Jun, Chi-Hyuck;Hong, Yu-Shin;Kim, Soo-Young
    • Journal of Korean Institute of Industrial Engineers, v.21 no.3, pp.371-385, 1995
  • The yield of semiconductor chips depends not only on the average defect density but also on the distribution of defects over a wafer, which motivates the use of a cluster index. This paper briefly reviews the existing yield prediction models and proposes a new cluster index, which uses information about defect locations on a wafer in terms of the coefficient of variation. An extensive simulation is performed under a variety of defect distributions, and a yield prediction model is derived through regression analysis relating the yield to the proposed cluster index and the average number of defects per chip. The performance of the proposed simulation-based yield prediction model is compared with that of the well-known negative binomial model.
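For orientation, the negative binomial yield model used as the baseline here is commonly written as Y = (1 + λ/α)^(-α), where λ is the average number of defects per chip and α is a clustering parameter (smaller α means stronger clustering). The Python sketch below shows only this textbook baseline and its Poisson limit. It is not the paper's proposed cluster index or regression model, and the function and parameter names are illustrative assumptions.

```python
import math


def poisson_yield(avg_defects_per_chip):
    """Yield when defects are spread uniformly at random (no clustering)."""
    return math.exp(-avg_defects_per_chip)


def negative_binomial_yield(avg_defects_per_chip, alpha):
    """Negative binomial yield; alpha > 0 is the clustering parameter."""
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    return (1.0 + avg_defects_per_chip / alpha) ** (-alpha)


# Illustrative values only: one defect per chip on average.
lam = 1.0
for alpha in (0.5, 1.0, 5.0, 1e6):
    print(alpha, negative_binomial_yield(lam, alpha))
print("poisson", poisson_yield(lam))
```

For the same average defect count, stronger clustering (smaller α) gives a higher predicted yield, because defects concentrate on fewer chips. As α grows, the model approaches the Poisson yield.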

An Experimental Study of Generality of Software Defects Prediction Models based on Object Oriented Metrics (객체지향 메트릭 기반인 결함 예측 모형의 범용성에 관한 실험적 연구)

  • Kim, Tae-Yeon;Kim, Yun-Kyu;Chae, Heung-Seok
    • The KIPS Transactions:PartD, v.16D no.3, pp.407-416, 2009
  • To support efficient management of software verification and validation activities, much research has been conducted on predicting defects in early phases, and a number of defect prediction models have been proposed. However, the generality of these models has not been studied experimentally on other software systems: most prediction models were applied only to the same system that had been used to build them. We therefore performed an experiment to explore the generality of major prediction models, applying three defect prediction models to three different systems. We could not confirm the generality of their defect prediction capability. Our analysis attributes this to differences in metric distributions between the systems.

Software Defect Prediction Based on SAINT (SAINT 기반의 소프트웨어 결함 예측)

  • Sriman Mohapatra;Eunjeong Ju;Jeonghwa Lee;Duksan Ryu
    • The Transactions of the Korea Information Processing Society, v.13 no.5, pp.236-242, 2024
  • Software Defect Prediction (SDP) enhances the efficiency of software development by proactively identifying modules likely to contain errors. A major challenge in SDP is improving prediction performance. Recent research has applied deep learning techniques to the field of SDP, with the SAINT model particularly gaining attention for its outstanding performance in analyzing structured data. This study compares the SAINT model with other leading models (XGBoost, Random Forest, CatBoost) and investigates the latest deep learning techniques applicable to SDP. SAINT consistently demonstrated superior performance, proving effective in improving defect prediction accuracy. These findings highlight the potential of the SAINT model to advance defect prediction methodologies in practical software development scenarios, and were achieved through a rigorous methodology including cross-validation, feature scaling, and comparative analysis.

A Comparative Experiment of Software Defect Prediction Models using Object Oriented Metrics (객체지향 메트릭을 이용한 결함 예측 모형의 실험적 비교)

  • Kim, Yun-Kyu;Kim, Tae-Yeon;Chae, Heung-Seok
    • Journal of KIISE:Computing Practices and Letters, v.15 no.8, pp.596-600, 2009
  • To support efficient management of software verification and validation activities, many defect prediction models based on object-oriented metrics have been proposed. They usually adopt logistic regression analysis and report a prediction correctness of about 60-70%. We performed a similar experiment with Eclipse 3.3 to check their prediction effectiveness. However, the results show a correctness of about 40%, much lower than originally reported. We also found that univariate logistic regression analysis produces better results than multivariate logistic regression analysis.

Centroid and Nearest Neighbor based Class Imbalance Reduction with Relevant Feature Selection using Ant Colony Optimization for Software Defect Prediction

  • B., Kiran Kumar;Gyani, Jayadev;Y., Bhavani;P., Ganesh Reddy;T, Nagasai Anjani Kumar
    • International Journal of Computer Science & Network Security, v.22 no.10, pp.1-10, 2022
  • Software defect prediction (SDP) is currently one of the most active research areas in software engineering. Early detection of defects lowers the cost of software and improves its reliability. Machine learning techniques are widely used to create SDP models based on programming measures. The majority of defect prediction models in the literature have problems with class imbalance and high dimensionality. In this paper, we propose the Centroid and Nearest Neighbor based Class Imbalance Reduction (CNNCIR) technique, which considers dataset distribution characteristics to generate symmetry between defective and non-defective records in imbalanced datasets. The proposed approach is compared with SMOTE (Synthetic Minority Oversampling Technique). The high-dimensionality problem is addressed by selecting relevant features with the Ant Colony Optimization (ACO) technique. We used nine different classifiers to analyze six open-source software defect datasets from the PROMISE repository, and evaluated them with seven performance measures. The results show that the proposed CNNCIR method with ACO-based feature selection outperforms SMOTE in the majority of cases.
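As background for the SMOTE baseline mentioned in this abstract, the following is a simplified Python sketch of SMOTE-style oversampling: each synthetic minority record is placed at a random point on the line segment between a minority record and one of its k nearest minority neighbours. It is not the CNNCIR method, whose details the abstract does not give. The function name, parameters, and data are hypothetical.

```python
import math
import random


def smote_like(minority, n_new, k=3, seed=0):
    """Generate n_new synthetic points by interpolating between minority neighbours.

    Simplified illustration: requires at least two minority points.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # The k nearest other minority points, by Euclidean distance.
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment, in [0, 1)
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Hypothetical defective-class records described by two metrics; illustrative only.
defective = [(1.0, 2.0), (2.0, 3.0), (3.0, 5.0), (2.5, 4.0)]
new_points = smote_like(defective, n_new=4)
print(new_points)
```

Because every synthetic record lies between two existing minority records, the oversampled class stays within the region the real defective records already occupy.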

Software Quality Prediction based on Defect Severity (결함 심각도에 기반한 소프트웨어 품질 예측)

  • Hong, Euy-Seok
    • Journal of the Korea Society of Computer and Information, v.20 no.5, pp.73-81, 2015
  • Most software fault prediction studies have focused on binary classification models that predict whether an input entity has faults or not. However, the ability to predict entity fault-proneness in various severity categories is more useful, because not all faults have the same severity. In this paper, we propose fault prediction models at different fault severity levels using traditional size and complexity metrics. They are ternary classification models trained with four machine learning algorithms. Empirical analysis is performed on two NASA public data sets with accuracy as the performance measure. The evaluation results show that the backpropagation neural network model outperforms the other models on both data sets, with accuracy scores of about 81% and 88% respectively.

Data Segmentation for a Better Prediction of Quality in a Multi-stage Process

  • Kim, Eung-Gu;Lee, Hye-Seon;Jun, Chi-Hyuek
    • Journal of the Korean Data and Information Science Society, v.19 no.2, pp.609-620, 2008
  • A multi-stage manufacturing process may contain several parallel pieces of equipment with the same function that affect product quality differently and differ significantly in defect rate. Product quality may depend on which equipment a product was processed on as well as on its process variable values. Applying a single model that ignores these equipment differences may distort both the prediction of the defect rate and the identification of the important quality variables affecting it. We propose a procedure for data segmentation to be used when constructing models for predicting the defect rate or for identifying major process variables influencing product quality. The proposed procedure is based on principal component analysis and analysis of variance, and a case study with a PDP manufacturing process shows that it predicts the defect rate better.

Ambiguity Analysis of Defectiveness in NASA MDP Data Sets (NASA MDP 데이터 집합의 결함도 모호성 분석)

  • Hong, Euyseok
    • Journal of Information Technology Services, v.12 no.2, pp.361-371, 2013
  • Public domain defect data sets, such as the NASA data sets available from the NASA MDP and PROMISE repositories, make it possible to compare the results of different defect prediction models on the same data. This means that repeatable and general prediction models can be built. However, some recent studies have raised questions about the quality of two versions of the NASA data sets and have produced new, cleaned data sets by applying their own data cleaning processes. We find that the NASA MDP versions offer two ways to determine the defectiveness of a module (0 or 1), and that the two give different results in some cases. To our knowledge, this serious problem has not been addressed in previous studies. To handle this ambiguity, we define two kinds of module defectiveness and two conditions that can be used to identify the ambiguous cases. We meticulously analyze 5 of the 13 NASA projects using our ambiguity analysis method. The results show that JM1 and PC4 are the best projects, with few ambiguous cases.