
Fault Recovery and Optimal Checkpointing Strategy for Dual Modular Redundancy Real-time Systems  

Kwak, Seong-Woo (Dept. of EE, Keimyung University)
Abstract
In this paper, we propose a new checkpointing strategy for dual modular redundancy (DMR) real-time systems. At every checkpoint, the execution results from the two processors and the result saved at the previous checkpoint are compared to detect faults. We devise a checkpoint operation algorithm that recovers from both transient and permanent faults. We also develop a Markov model for optimizing the proposed checkpointing strategy, and from it derive the probability that a task completes successfully within its deadline. The optimal number of checkpoints is the number that maximizes this probability.
Keywords
Checkpoint; Dual Modular Redundancy; Real-time Task; Fault Recovery
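
The idea of choosing the number of checkpoints that maximizes the probability of meeting the deadline can be illustrated with a minimal sketch. This is not the Markov model of the paper, which is not reproduced here. It is a simplified toy model under stated assumptions: the task is split into n equal segments, each followed by a checkpoint of fixed overhead; transient faults arrive independently on each of the two processors at a constant rate; a fault in either processor during a segment forces that segment to be re-executed; permanent faults are ignored. All function and parameter names are hypothetical.

```python
from math import comb, exp, floor


def success_probability(n, task_time, deadline, overhead, fault_rate):
    """Probability of finishing n segments before the deadline (toy model).

    Assumptions (not taken from the paper): equal-length segments, a fixed
    checkpoint overhead per segment, Poisson transient faults with rate
    `fault_rate` on each of the two processors, and rollback of the whole
    segment whenever either processor is hit.
    """
    segment = task_time / n + overhead           # time for one attempt
    max_attempts = floor(deadline / segment)     # attempts that fit in the deadline
    if max_attempts < n:
        return 0.0                               # even a fault-free run is too slow
    # A segment attempt succeeds only if neither processor sees a fault.
    p = exp(-2.0 * fault_rate * segment)
    # The task succeeds if at least n of the available attempts succeed.
    return sum(
        comb(max_attempts, j) * p**j * (1.0 - p) ** (max_attempts - j)
        for j in range(n, max_attempts + 1)
    )


def optimal_checkpoints(task_time, deadline, overhead, fault_rate, n_max=50):
    """Return the n in 1..n_max that maximizes the success probability."""
    return max(
        range(1, n_max + 1),
        key=lambda n: success_probability(
            n, task_time, deadline, overhead, fault_rate
        ),
    )


if __name__ == "__main__":
    best = optimal_checkpoints(
        task_time=10.0, deadline=20.0, overhead=0.5, fault_rate=0.05
    )
    print(best, success_probability(best, 10.0, 20.0, 0.5, 0.05))
```

The sketch shows the trade-off behind the optimization: more checkpoints shorten the work lost to each rollback but add overhead, so the success probability typically peaks at an intermediate number of checkpoints.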