Search | Korea Science

Severity-based Software Quality Prediction using Class Imbalanced Data

Hong, Euy-Seok;Park, Mi-Kyeong
- Journal of the Korea Society of Computer and Information / v.21 no.4 / pp.73-80 / 2016
Most fault prediction models have class imbalance problems because training data usually contains many more non-fault class modules than fault class ones. This imbalanced distribution makes it difficult for the models to learn from the minority class module data. Data imbalance is much higher when severity-based fault prediction is used, because high-severity fault modules are a smaller subset of the fault modules. In this paper, we propose severity-based models to solve these problems using three sampling methods: Resample, SpreadSubSample and SMOTE. Empirical results show that the Resample method has typical over-fitting problems, and the SpreadSubSample method cannot enhance the prediction performance of the models. Unlike these two methods, the SMOTE method shows good performance in terms of AUC and FNR values. In particular, the J48 decision tree model using SMOTE outperforms the other prediction models.
https://doi.org/10.9708/jksci.2016.21.4.073
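
The oversampling idea behind SMOTE and the FNR metric mentioned in the abstract above can be illustrated with a minimal sketch. This is not the authors' implementation (the names Resample, SpreadSubSample and SMOTE suggest Weka filters were used); the function names `smote` and `false_negative_rate`, the neighbour count `k`, and the seed are assumptions made for illustration.

```python
import random

def smote(minority, n_new, k=2, seed=0):
    """Create n_new synthetic minority-class samples.

    Each synthetic point is interpolated between a randomly chosen
    minority sample and one of its k nearest minority neighbours.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base_idx = rng.randrange(len(minority))
        base = minority[base_idx]
        # Other minority samples, sorted by squared Euclidean distance to base.
        others = [p for i, p in enumerate(minority) if i != base_idx]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(base, neighbour)))
    return synthetic

def false_negative_rate(y_true, y_pred):
    """FNR = missed faulty modules / all faulty modules (label 1 = faulty)."""
    positives = sum(1 for t in y_true if t == 1)
    if positives == 0:
        return 0.0
    missed = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return missed / positives
```

Because every synthetic point lies on a segment between two existing minority samples, the method adds new examples instead of duplicating old ones, which is one common explanation for why it over-fits less than plain resampling.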

Machine Learning Process for the Prediction of the IT Asset Fault Recovery (IT자산 장애처리의 사전 예측을 위한 기계학습 프로세스)

Moon, Young-Joon;Rhew, Sung-Yul;Choi, Il-Woo
- KIPS Transactions on Software and Data Engineering / v.2 no.4 / pp.281-290 / 2013
IT assets are a core part of supporting an organization's management objectives, and resolving IT asset faults quickly is very important. In this study, a fault recovery prediction technique is proposed that uses existing fault data to address IT asset faults. The proposed technique is as follows. First, the existing fault recovery data were pre-processed and classified by fault recovery type; second, a rule was established for mapping keywords between the classified fault recovery types and the reported data; and third, a machine learning process was presented that predicts the fault recovery method based on the established rule. To verify the effectiveness of the proposed machine learning process, 33,000 computer fault records collected at company A over six months were tested. The hit rate for fault recovery prediction was approximately 72%, and it increased to 81% via continuous machine learning.
https://doi.org/10.3745/KTSDE.2013.2.4.281
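
The keyword-mapping rule and the hit rate described in the abstract above can be sketched roughly as follows. The abstract does not give the actual rules, so the function names `predict_recovery` and `hit_rate`, the substring-matching strategy, and any keywords passed in are hypothetical.

```python
def predict_recovery(report, rules, default="unknown"):
    """Return the recovery type of the first rule keyword found in the report.

    rules maps a keyword (lower case) to a fault recovery type; the
    first-match strategy is an assumption, not the paper's rule set.
    """
    text = report.lower()
    for keyword, recovery_type in rules.items():
        if keyword in text:
            return recovery_type
    return default

def hit_rate(reports, actual_types, rules):
    """Fraction of reports whose predicted recovery type matches the actual one."""
    hits = sum(
        1
        for report, actual in zip(reports, actual_types)
        if predict_recovery(report, rules) == actual
    )
    return hits / len(reports)
```

In this reading, "continuous machine learning" would correspond to refining the rule table as new fault records arrive and re-measuring the hit rate.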

Software Fault Prediction at Design Phase

Singh, Pradeep;Verma, Shrish;Vyas, O.P.
- Journal of Electrical Engineering and Technology / v.9 no.5 / pp.1739-1745 / 2014
Prediction of fault-prone modules continues to attract researchers' interest due to its significant impact on software development cost. The most important goal of such techniques is to correctly identify the modules where faults are most likely to be present in the early phases of the software development lifecycle. Various software metrics related to module-level fault data have been successfully used for prediction of fault-prone modules. The goal of this research is to predict faulty modules at the design phase using the design metrics of modules and the faults related to those modules. We have analyzed the effect of pre-processing and different machine learning schemes on eleven projects from the NASA Metrics Data Program, which offers design metrics and their related faults. Using seven machine learning and four preprocessing techniques, we confirmed that models built from design metrics are surprisingly good at fault proneness prediction. The results show that Naive Bayes or Voting Feature Intervals with discretization should be chosen for different data sets, as they outperformed the other schemes among the 28 evaluated. Naive Bayes and Voting Feature Intervals achieved AUC > 0.7 on average across the eleven projects. Our proposed framework is effective and can predict an acceptable level of faults at the design phase.
https://doi.org/10.5370/JEET.2014.9.5.1739

A study on Reliability Enhancement Method and the Prediction Model Construction of Medium-Voltage Customers Causing Distribution Line Fault Using Data Mining Techniques (데이터 마이닝 기법을 이용한 특별고압 파급고장 발생가능 고객 예측모델 구축 및 신뢰도 향상방안에 관한 연구)

Bae, Sung-Hwan;Kim, Ja-Hee;Hong, Jung-Sik;Lim, Han-Seung
- The Transactions of The Korean Institute of Electrical Engineers / v.58 no.10 / pp.1869-1880 / 2009
Distribution line faults have gradually decreased thanks to efforts to improve the quality of electrical materials and distribution system maintenance. However, faults caused by medium-voltage customers have gradually increased despite considerable effort. The problem is that we do not know which customer will cause a fault. This paper presents a concept for finding these customers using data mining techniques, based on the accumulated fault records of medium-voltage customers. It also proposes the construction of a prediction model for medium-voltage customers causing distribution line faults, and methods to enhance the reliability of the distribution system. We expect to effectively reduce faults resulting from medium-voltage customers, which account for 30% of total faults.

The Computer Fault Prediction and Diagnosis Fuzzy Expert System (컴퓨터 고장 예측 및 진단 퍼지 전문가 시스템)

최성운
- Journal of Korean Society of Industrial and Systems Engineering / v.23 no.54 / pp.155-165 / 2000
Fault diagnosis is a systematic and unified method for finding faults based on observed data that contain noise. This paper presents fault prediction and diagnosis using a fuzzy expert system technique to handle uncertainties efficiently from a predictive perspective. We apply fuzzy event tree analysis to a computer system and build a fault prediction and diagnosis fuzzy expert system that predicts and diagnoses system errors before they occur.

Development of Kalman Hybrid Redundancy for Sensor Fault-Tolerant of Safety Critical System (Safety Critical 시스템의 센서 결함 허용을 위한 Kalman Hybrid Redundancy 개발)

Kim, Man-Ho;Lee, Suk;Lee, Kyung-Chang
- Journal of Institute of Control, Robotics and Systems / v.14 no.11 / pp.1180-1188 / 2008
As many systems depend on electronics, concern for fault tolerance is growing rapidly in safety-critical systems such as intelligent vehicles. To make systems fault tolerant, there has been a body of research, mainly from the aerospace field, including predictive hybrid redundancy by Lee. Although predictive hybrid redundancy has a fault-tolerant mechanism that satisfies the fault tolerance requirements of safety-critical systems such as x-by-wire systems, it suffers from variability in prediction performance depending on the input features of the system. As an alternative to the prediction method of predictive hybrid redundancy for robust fault tolerance, Kalman prediction has attracted some attention because it is well known and widely used; the resulting structure is called Kalman hybrid redundancy. In addition, several numerical simulation results are given in which Kalman hybrid redundancy outperforms the predictive smoothing voter.
https://doi.org/10.5302/J.ICROS.2008.14.11.1180

A Study on Constructing the Prediction System Using Data Mining Techniques to Find Medium-Voltage Customers Causing Distribution Line Faults (특별고압 수전설비 관리에 데이터 마이닝 기법을 적용한 파급고장 발생가능고객 예측시스템 구현 연구)

Bae, Sung-Hwan;Kim, Ja-Hee;Lim, Han-Seung
- The Transactions of The Korean Institute of Electrical Engineers / v.58 no.12 / pp.2453-2461 / 2009
Faults caused by medium-voltage customers have increased and now make up a larger share of total distribution faults despite considerable effort. In a previous paper, we proposed a fault prediction model and a fault prevention method for these distribution line faults. However, this prediction model cannot be applied directly in the field because there is no practical program for predicting which customers will cause distribution line faults. This paper presents a method for constructing a data warehouse in an ERP system and a program that applies data mining techniques to medium-voltage customers' electric facility management to find customers who cause distribution line faults. We expect that this data warehouse and prediction program can effectively reduce faults resulting from medium-voltage customer facilities.

Analyzing Machine Learning Techniques for Fault Prediction Using Web Applications

Malhotra, Ruchika;Sharma, Anjali
- Journal of Information Processing Systems / v.14 no.3 / pp.751-770 / 2018
Web applications are indispensable in the software industry and continuously evolve to meet newer criteria and/or include new functionalities. However, despite quality assurance via testing, the presence of defects hinders straightforward development. Several factors contribute to defects, and they are often minimized at high expense in terms of man-hours. Thus, detecting fault proneness in the early phases of software development is important, and a fault prediction model for identifying fault-prone classes in a web application is highly desirable. In this work, we compare 14 machine learning techniques to analyse the relationship between object-oriented metrics and fault prediction in web applications. The study is carried out using various releases of the Apache Click and Apache Rave datasets. En route to the predictive analysis, the input basis set for each release is first optimized using the filter-based correlation feature selection (CFS) method. It is found that the LCOM3, WMC, NPM and DAM metrics are the most significant predictors. The statistical analysis of these metrics also conforms well with the CFS evaluation and affirms the role of these metrics in the defect prediction of web applications. The overall predictive ability of the different fault prediction models is first ranked using the Friedman technique and then statistically compared using Nemenyi post-hoc analysis. The results not only uphold the predictive capability of machine learning models for faulty classes in web applications, but also show that ensemble algorithms are most appropriate for defect prediction in Apache datasets. Further, we also derive a consensus between the metrics selected by the CFS technique and the statistical analysis of the datasets.
https://doi.org/10.3745/JIPS.04.0077
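
The Friedman ranking step mentioned in the abstract above can be sketched as follows. This is a minimal illustration of the standard statistic without tie correction, not the authors' code; the function names and the layout of `scores` (one row per dataset release, one column per model, higher is better) are assumptions.

```python
def average_ranks(scores):
    """Mean rank of each model over all datasets (rank 1 = best score).

    scores[i][j] is the score of model j on dataset i; tied scores
    share the average of the ranks they span.
    """
    n, k = len(scores), len(scores[0])
    totals = [0.0] * k
    for row in scores:
        order = sorted(range(k), key=lambda j: -row[j])
        pos = 0
        while pos < k:
            end = pos
            while end + 1 < k and row[order[end + 1]] == row[order[pos]]:
                end += 1
            shared = (pos + end) / 2 + 1  # average of 1-based ranks pos+1..end+1
            for j in order[pos:end + 1]:
                totals[j] += shared
            pos = end + 1
    return [t / n for t in totals]

def friedman_statistic(scores):
    """Friedman chi-square: 12N / (k(k+1)) * (sum(R_j^2) - k(k+1)^2 / 4)."""
    n, k = len(scores), len(scores[0])
    ranks = average_ranks(scores)
    return 12 * n / (k * (k + 1)) * (sum(r * r for r in ranks) - k * (k + 1) ** 2 / 4)
```

A large statistic relative to the chi-square distribution with k - 1 degrees of freedom indicates that the models do not all perform alike, which is the usual precondition for a pairwise post-hoc comparison such as Nemenyi's.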

A study on the prediction method of the real fault distance using probability to the relay data of transmission line fault location (송전선로 거리표정치에 대한 실 고장거리의 확률적 예측방안)

Lee, Y.H.;Back, D.H.;Jang, S.H.
- Proceedings of the KIEE Conference / 2006.07a / pp.10-11 / 2006
The fault location is obtained from the distance relay that detects a fault on the transmission line. Transmission line crews then track down the fault location and its cause. However, because the fault location reported by the distance relay contains error, the real and reported fault locations differ. For this reason, the inspection time needed to find the fault location can become longer. In this paper, we propose a statistical (regression) analysis method, based on the historical fault location data for each type of relay and the corresponding real fault distance data, to address this problem. A regression equation is found through the regression analysis, and the real fault distance is calculated by substituting the relay fault location into that equation. By predicting the fault location in this way, the inspection time for the transmission line can be reduced.
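
The regression step described in the abstract above can be sketched with ordinary least squares on a single predictor. The abstract does not state the form of the regression equation, so the straight-line model and the function names `fit_line` and `predict_real_distance` are assumptions; in practice one such fit would be made per relay type.

```python
def fit_line(relay_km, real_km):
    """Least-squares fit of real_km = intercept + slope * relay_km."""
    n = len(relay_km)
    mean_x = sum(relay_km) / n
    mean_y = sum(real_km) / n
    sxx = sum((x - mean_x) ** 2 for x in relay_km)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(relay_km, real_km))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

def predict_real_distance(relay_reading_km, intercept, slope):
    """Substitute a new relay fault location into the fitted equation."""
    return intercept + slope * relay_reading_km
```

The fitted intercept and slope capture the systematic error of one relay type, so crews can start the inspection from the corrected distance instead of the raw relay reading.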

Performance Comparison of GPS Fault Detection and Isolation via Pseudorange Prediction Model based Test Statistics

Yoo, Jang-Sik;Ahn, Jong-Sun;Lee, Young-Jae;Sung, Sang-Kyung
- Journal of Electrical Engineering and Technology / v.7 no.5 / pp.797-806 / 2012
Fault detection and isolation (FDI) algorithms provide fault monitoring methods for GPS measurements to isolate abnormal signals from the GPS satellites or in the signal acquired by the receiver. To monitor faults, FDI generates test statistics and declares a fault when a statistic exceeds a designed threshold. For this problem of fault detection and isolation, this paper presents and evaluates position domain integrity monitoring methods by formulating various pseudorange prediction methods and investigating the resulting test statistics. In particular, precise measurements such as carrier phase and Doppler rate are employed under the assumption of a fault-free carrier signal. The presented position domain algorithm consists of the following steps. First, a common pseudorange prediction formula is defined with the proposed variations in the pseudorange differential update. Next, a threshold computation is proposed based on the test statistic distribution, taking the elevation angle into account. Then, by examining the test statistics, fault detection and isolation is performed for each satellite channel. To verify the performance, simulations using the presented fault detection methods are carried out for an ideal and a real fault case, respectively.
https://doi.org/10.5370/JEET.2012.7.5.797
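
The predict, compare, and threshold pattern described in the abstract above can be sketched per satellite channel as follows. The abstract does not give the paper's prediction formulas or threshold model, so everything specific here is an assumption: the prediction (previous pseudorange plus the carrier-phase range change), the elevation weighting `sigma0 / sin(elevation)`, the multiplier `k`, and the function name `detect_faults`.

```python
import math

def detect_faults(prev_rho, delta_phase, measured_rho, elevation_deg,
                  sigma0=1.0, k=5.0):
    """Return the sorted IDs of satellites whose test statistic exceeds its threshold.

    All inputs are dicts keyed by satellite ID, in metres except
    elevation_deg (degrees).
    """
    flagged = []
    for sat, measured in measured_rho.items():
        # Assumed prediction: last pseudorange advanced by the carrier-phase delta.
        predicted = prev_rho[sat] + delta_phase[sat]
        test_statistic = abs(measured - predicted)
        # Assumed noise model: low-elevation satellites get a looser threshold.
        threshold = k * sigma0 / math.sin(math.radians(elevation_deg[sat]))
        if test_statistic > threshold:
            flagged.append(sat)
    return sorted(flagged)
```

Testing each channel separately is what allows isolation as well as detection: the faulty satellite is identified by which statistic crosses its threshold.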
