• Title/Summary/Keyword: fault prediction

Severity-based Software Quality Prediction using Class Imbalanced Data

  • Hong, Euy-Seok;Park, Mi-Kyeong
    • Journal of the Korea Society of Computer and Information
    • /
    • Vol. 21, No. 4
    • /
    • pp.73-80
    • /
    • 2016
  • Most fault prediction models suffer from class imbalance because training data usually contains many more non-fault modules than fault modules. This imbalanced distribution makes it difficult for the models to learn the minority class. The imbalance is even greater in severity-based fault prediction, because high-severity fault modules are a smaller subset of the fault modules. In this paper, we propose severity-based models that address these problems using three sampling methods: Resample, SpreadSubSample and SMOTE. Empirical results show that the Resample method has typical over-fitting problems, and the SpreadSubSample method cannot enhance the prediction performance of the models. Unlike these two methods, SMOTE shows good performance in terms of AUC and FNR values. In particular, the J48 decision tree model using SMOTE outperforms the other prediction models.
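
The SMOTE idea named in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation or the Weka filter they used: the function name `smote`, the parameter `k`, and the sample metric values are all assumptions made for illustration. Each synthetic point is placed on the line segment between a minority-class sample and one of its nearest minority-class neighbours.

```python
import math
import random

def smote(minority, n_synthetic, k=3, seed=0):
    """Create n_synthetic points by interpolating between minority samples.

    Illustrative sketch only: `minority` is a list of equal-length tuples.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_synthetic):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base sample (excluding itself)
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic

# Hypothetical metric vectors (e.g. size, complexity) of fault-class modules.
faulty = [(1.0, 2.0), (1.5, 1.8), (2.0, 2.5), (1.2, 2.2)]
new_points = smote(faulty, n_synthetic=6)
print(len(new_points), "synthetic fault-class samples")
```

Because every synthetic point is an interpolation, it always stays inside the bounding box of the existing minority samples; this is the sense in which SMOTE differs from simply duplicating minority rows.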

Machine Learning Process for the Prediction of the IT Asset Fault Recovery

  • 문영준;류성열;최일우
    • KIPS Transactions on Software and Data Engineering
    • /
    • Vol. 2, No. 4
    • /
    • pp.281-290
    • /
    • 2013
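  • IT assets are a core area supporting an organization's business objectives, and it is very important to support rapid handling when an IT asset fault occurs. This study presents a fault-recovery prediction technique, based on existing fault data, for resolving IT asset faults when they occur. The proposed technique first preprocesses existing fault-recovery data and classifies it by fault-recovery type; second, it establishes rules that map the classified fault-recovery types to keywords in the reports received after a fault occurs; and third, it presents a machine learning process that uses these rules to predict the fault-recovery method in advance once a fault has occurred. To validate the proposed machine learning process, we experimented with about 33,000 computer-equipment fault records received over six months at Company A. The hit rate of fault-recovery prediction was about 72%, and it improved to 81% through continued machine learning.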

Software Fault Prediction at Design Phase

  • Singh, Pradeep;Verma, Shrish;Vyas, O.P.
    • Journal of Electrical Engineering and Technology
    • /
    • Vol. 9, No. 5
    • /
    • pp.1739-1745
    • /
    • 2014
  • Prediction of fault-prone modules continues to attract researchers' interest due to its significant impact on software development cost. The most important goal of such techniques is to correctly identify the modules where faults are most likely to be present in the early phases of the software development lifecycle. Various software metrics related to module-level fault data have been successfully used for prediction of fault-prone modules. The goal of this research is to predict faulty modules at the design phase using design metrics of modules and the faults related to those modules. We have analyzed the effect of preprocessing and different machine learning schemes on eleven projects from the NASA Metrics Data Program, which offers design metrics and their related faults. Using seven machine learning and four preprocessing techniques, we confirmed that models built from design metrics are surprisingly good at fault-proneness prediction. The results show that Naive Bayes or Voting Feature Intervals with discretization should be chosen for different data sets, as they performed best among the 28 schemes. Naive Bayes and Voting Feature Intervals achieved AUC > 0.7 on average across the eleven projects. Our proposed framework is effective and can predict faults at an acceptable level at the design phase.

A study on Reliability Enhancement Method and the Prediction Model Construction of Medium-Voltage Customers Causing Distribution Line Fault Using Data Mining Techniques

  • 배성환;김자희;홍정식;임한승
    • The Transactions of the Korean Institute of Electrical Engineers
    • /
    • Vol. 58, No. 10
    • /
    • pp.1869-1880
    • /
    • 2009
  • Distribution line faults have gradually decreased thanks to efforts to improve the quality of electrical materials and distribution system maintenance. However, faults caused by medium-voltage customers have gradually increased despite many efforts. The problem is that we do not know which customer will cause a fault. This paper presents a concept for finding these customers using data mining techniques, based on the accumulated fault records of medium-voltage customers. It also suggests how to construct a prediction model of medium-voltage customers causing distribution line faults, and methods to enhance the reliability of the distribution system. We expect that this can effectively reduce faults resulting from medium-voltage customers, which account for 30% of total faults.

The Computer Fault Prediction and Diagnosis Fuzzy Expert System

  • 최성운
    • Journal of the Society of Korea Industrial and Systems Engineering
    • /
    • Vol. 23, No. 54
    • /
    • pp.155-165
    • /
    • 2000
  • Fault diagnosis is a systematic and unified method for finding faults based on noisy observed data. This paper presents fault prediction and diagnosis using a fuzzy expert system technique to handle uncertainties efficiently from a predictive perspective. We apply fuzzy event tree analysis to a computer system and build a fuzzy expert system for fault prediction and diagnosis that predicts and diagnoses system errors before they occur.

Development of Kalman Hybrid Redundancy for Sensor Fault-Tolerant of Safety Critical System

  • 김만호;이석;이경창
    • Journal of Institute of Control, Robotics and Systems
    • /
    • Vol. 14, No. 11
    • /
    • pp.1180-1188
    • /
    • 2008
  • As many systems depend on electronics, concern for fault tolerance is growing rapidly in safety-critical systems such as intelligent vehicles. To make systems fault tolerant, there has been a body of research, mainly from the aerospace field, including the predictive hybrid redundancy by Lee. Although predictive hybrid redundancy has a fault-tolerant mechanism that satisfies the fault tolerance requirements of safety-critical systems such as x-by-wire systems, it suffers from variability in prediction performance depending on the input features of the system. As an alternative to the prediction method of predictive hybrid redundancy for robust fault tolerance, Kalman prediction has attracted some attention because it is well known and often used; the resulting structure is called Kalman hybrid redundancy. In addition, several numerical simulation results are given in which the Kalman hybrid redundancy outperforms with a predictive smoothing voter.

A Study on Constructing the Prediction System Using Data Mining Techniques to Find Medium-Voltage Customers Causing Distribution Line Faults

  • 배성환;김자희;임한승
    • The Transactions of the Korean Institute of Electrical Engineers
    • /
    • Vol. 58, No. 12
    • /
    • pp.2453-2461
    • /
    • 2009
  • Faults caused by medium-voltage customers have increased and now make up a larger share of total distribution faults despite many efforts. In the previous paper, we suggested a fault prediction model and a fault prevention method for these distribution line faults. However, we cannot directly apply this prediction model in the field, because we do not have a useful program to predict the customers causing distribution line faults. This paper presents a method for constructing a data warehouse in an ERP system and a program that applies data mining techniques to find customers who cause distribution line faults, for use in managing medium-voltage customers' electric facilities. We expect that this data warehouse and prediction program can effectively reduce faults resulting from medium-voltage customer facilities.

Analyzing Machine Learning Techniques for Fault Prediction Using Web Applications

  • Malhotra, Ruchika;Sharma, Anjali
    • Journal of Information Processing Systems
    • /
    • Vol. 14, No. 3
    • /
    • pp.751-770
    • /
    • 2018
  • Web applications are indispensable in the software industry and continuously evolve to meet newer criteria and/or include new functionalities. However, despite quality assurance via testing, the presence of defects hinders straightforward development. Several factors contribute to defects, which are often minimized at high expense in terms of man-hours. Thus, detection of fault proneness in the early phases of software development is important, and a fault prediction model for identifying fault-prone classes in a web application is highly desired. In this work, we compare 14 machine learning techniques to analyse the relationship between object-oriented metrics and fault prediction in web applications. The study is carried out using various releases of the Apache Click and Apache Rave datasets. En route to the predictive analysis, the input basis set for each release is first optimized using the filter-based correlation feature selection (CFS) method. It is found that the LCOM3, WMC, NPM and DAM metrics are the most significant predictors. The statistical analysis of these metrics also shows good conformity with the CFS evaluation and affirms the role of these metrics in the defect prediction of web applications. The overall predictive ability of the different fault prediction models is first ranked using the Friedman technique and then statistically compared using Nemenyi post-hoc analysis. The results not only uphold the predictive capability of machine learning models for faulty classes in web applications, but also show that ensemble algorithms are most appropriate for defect prediction in Apache datasets. Further, we also derive a consensus between the metrics selected by the CFS technique and the statistical analysis of the datasets.

A study on the prediction method of the real fault distance using probability to the relay data of transmission line fault location

  • 이용희;백두현;장석한
    • Proceedings of the KIEE Conference
    • /
    • Proceedings of the KIEE 37th Summer Conference 2006, Vol. A
    • /
    • pp.10-11
    • /
    • 2006
  • The fault location is obtained from the distance relay that detects a transmission line fault. Transmission line crews then track down the fault location and its cause. However, because the fault location reported by the distance relay contains error, the real and reported fault locations differ. For this reason, the inspection time needed to find the fault location can be longer. In this paper, we propose a statistical (regression) analysis method, based on each relay type's historical fault location data and the real fault distance data, to address this problem. A regression equation is found through regression analysis, and the real fault distance is calculated by putting the relay fault location into that equation. By predicting the fault location in this way, the inspection time for transmission lines can be reduced.
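
The regression step described in the abstract above can be sketched as an ordinary least-squares line fit. This is an illustration under assumptions, not the paper's model: the abstract does not state the form of the regression equation, so a simple linear form is assumed here, and the names `fit_line` and `predict_real_distance` and all distance values are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical history for one relay type: the distance reported by the
# relay and the real fault distance found by the line crew, both in km.
relay_km = [10.0, 20.0, 30.0, 40.0]
real_km = [10.5, 20.9, 31.3, 41.7]

slope, intercept = fit_line(relay_km, real_km)

def predict_real_distance(relay_distance_km):
    """Put a new relay reading into the fitted regression equation."""
    return slope * relay_distance_km + intercept

print(round(predict_real_distance(25.0), 2))
```

Fitting one line per relay type, as the abstract suggests, would mean calling `fit_line` once for each type's historical records.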

Performance Comparison of GPS Fault Detection and Isolation via Pseudorange Prediction Model based Test Statistics

  • Yoo, Jang-Sik;Ahn, Jong-Sun;Lee, Young-Jae;Sung, Sang-Kyung
    • Journal of Electrical Engineering and Technology
    • /
    • Vol. 7, No. 5
    • /
    • pp.797-806
    • /
    • 2012
  • Fault detection and isolation (FDI) algorithms provide fault monitoring methods for GPS measurements to isolate abnormal signals from the GPS satellites or in the signal acquired by the receiver. To monitor faults, FDI generates test statistics and declares a fault when a statistic exceeds a designed threshold. For this problem, this paper presents and evaluates position domain integrity monitoring methods by formulating various pseudorange prediction methods and investigating the resulting test statistics. In particular, precise measurements such as carrier phase and Doppler rate are employed under the assumption of a fault-free carrier signal. The presented position domain algorithm consists of the following steps. First, a common pseudorange prediction formula is defined with the proposed variations in the pseudorange differential update. Next, a threshold computation is proposed from the test statistics distribution, considering the elevation angle. Then, by examining the test statistics, fault detection and isolation is performed for each satellite channel. To verify the performance, simulations using the presented fault detection methods are carried out for an ideal fault case and a real fault case.
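
The per-channel threshold test outlined in the abstract above can be sketched as follows. This is a simplified illustration, not the paper's algorithm: the test statistic is assumed to be the absolute difference between measured and predicted pseudorange, and the elevation dependence is assumed to follow a `1 / sin(elevation)` noise model. The names `threshold` and `detect_faults`, the constants `sigma0` and `k`, and all channel values are hypothetical.

```python
import math

def threshold(elevation_deg, sigma0=1.0, k=5.0):
    """Detection threshold in metres.

    Assumed noise model: the standard deviation grows as 1 / sin(elevation),
    so low-elevation satellites get a looser threshold.
    """
    return k * sigma0 / math.sin(math.radians(elevation_deg))

def detect_faults(channels):
    """Return the satellite IDs whose test statistic exceeds the threshold."""
    flagged = []
    for sat_id, (measured, predicted, elevation_deg) in channels.items():
        statistic = abs(measured - predicted)  # assumed test statistic
        if statistic > threshold(elevation_deg):
            flagged.append(sat_id)
    return sorted(flagged)

# Hypothetical channels: (measured pseudorange m, predicted pseudorange m,
# elevation angle in degrees).
channels = {
    3: (20_000_102.0, 20_000_100.0, 60.0),   # residual 2 m
    7: (21_500_050.0, 21_500_000.0, 45.0),   # residual 50 m
    11: (22_300_008.0, 22_300_000.0, 30.0),  # residual 8 m
}
print(detect_faults(channels))
```

Isolation in this sketch is simply the list of flagged channels; the paper's position domain formulation and its pseudorange prediction variants are not reproduced here.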