http://dx.doi.org/10.3745/KIPSTB.2008.15-B.2.137

Fault Localization for Self-Managing Based on Bayesian Network  

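Piao, Shun-Shan (Department of Electrical and Computer Engineering, Sungkyunkwan University)
Park, Jeong-Min (Department of Computer Engineering, Sungkyunkwan University)
Lee, Eun-Seok (School of Information and Communication Engineering, Sungkyunkwan University)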
Abstract
Fault localization plays a significant role in large distributed systems because it automatically identifies the root cause of observed faults. It thereby supports self-managing, which remains an open topic in managing and controlling complex distributed systems to improve their reliability. Although recent research has introduced many artificial intelligence techniques in support of fault localization, especially for increasingly complex ubiquitous environments, the diagnosis and prediction functions they provide are limited. In this paper, we propose a fault localization approach for self-managing in performance evaluation that improves system reliability by learning from and analyzing real-time streams of system performance events. We use probabilistic reasoning based on Bayes' rule to provide an effective mechanism for managing and evaluating system performance parameters automatically. Moreover, because diverse and complex fault reasoning domains involve a large number of factors, we develop an efficient method that extracts the parameters most strongly related to the observed problems and ranks them. The resulting node ordering lists are used in network modeling, which improves learning efficiency. The approach enables us to diagnose the most probable causal factor responsible for an underlying performance problem, so that a treatment can be applied afterwards, and to predict the system's situation, so that potential abnormalities can be avoided through pretreatment. An experimental application to system performance analysis, together with evaluations of efficiency and accuracy, indicates that the proposed approach is promising for the performance evaluation domain.
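The two steps described in the abstract, ranking parameters by their relationship to an observed problem and then reasoning about the most probable cause with Bayes' rule, can be illustrated with a minimal sketch. The abstract does not state which dependency measure the authors use, so empirical mutual information is an assumption here, and the parameter names, toy records, priors, and likelihoods are all hypothetical.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Empirical mutual information (in bits) between two discrete sequences."""
    n = len(xs)
    px, py = Counter(xs), Counter(ys)
    pxy = Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), c in pxy.items():
        # p(x, y) * log2( p(x, y) / (p(x) * p(y)) ), written with raw counts
        mi += (c / n) * log2(c * n / (px[x] * py[y]))
    return mi


def rank_parameters(records, target):
    """Order parameters by decreasing mutual information with the target.

    Assumption: mutual information stands in for the paper's
    (unspecified) relevance measure.
    """
    ys = [r[target] for r in records]
    names = [k for k in records[0] if k != target]
    scores = {k: mutual_information([r[k] for r in records], ys) for k in names}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


def posterior(prior, likelihood, evidence):
    """Bayes' rule: P(cause | evidence) is proportional to
    P(evidence | cause) * P(cause), normalized over all causes."""
    joint = {c: prior[c] * likelihood[c][evidence] for c in prior}
    total = sum(joint.values())
    return {c: p / total for c, p in joint.items()}


# Hypothetical performance events: "slow" is the observed problem.
records = [
    {"cpu": "high", "disk": "ok", "slow": True},
    {"cpu": "high", "disk": "full", "slow": True},
    {"cpu": "low", "disk": "ok", "slow": False},
    {"cpu": "low", "disk": "full", "slow": False},
]
ranking = rank_parameters(records, "slow")  # node ordering list

# Hypothetical priors and likelihoods for three candidate causes.
prior = {"cpu_overload": 0.2, "disk_full": 0.1, "none": 0.7}
likelihood = {
    "cpu_overload": {"slow": 0.9},
    "disk_full": {"slow": 0.6},
    "none": {"slow": 0.05},
}
post = posterior(prior, likelihood, "slow")
most_probable_cause = max(post, key=post.get)
```

In this toy data the "cpu" parameter fully determines "slow" while "disk" is independent of it, so "cpu" comes first in the ordering. A full Bayesian network would replace the single-step posterior with inference over the learned structure.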
Keywords
Fault Localization; Node Ordering List; Performance Evaluation; Preprocessing; Probabilistic Dependency Analysis; Self-Managing