Determination of Optimal Checkpoint Intervals for Real-Time Tasks Using Distributed Fault Detection

(Original Korean title: 분산 고장 탐지 방식을 이용한 실시간 태스크에서의 최적 체크포인터 구간 선정)

  • Kwak, Seong Woo (Department of Electronic Engineering, Keimyung University)
  • Yang, Jung-Min (School of Electronics Engineering, Kyungpook National University)
  • Received : 2016.04.18
  • Accepted : 2016.05.23
  • Published : 2016.06.25

Abstract

Checkpoint placement is an effective fault-tolerance technique against transient faults: when a fault is detected, the task is re-executed from the latest checkpoint. In this paper, we propose a new checkpoint placement strategy that separates the data saving and fault detection processes, which conventional checkpoints perform together. Several fault detection processes are performed within one checkpoint interval to reduce the latency between the occurrence and the detection of a fault. We address how to place the fault detection processes so as to maximize the probability that a task completes successfully within its given deadline. We develop a Markov chain model for a real-time task with the proposed checkpoints, and derive the optimal fault detection and checkpoint intervals through simulations that compute the probability of successful execution.
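
To illustrate the idea of separating fault detection from data saving, the following is a minimal Monte Carlo sketch, not the Markov chain model developed in the paper. Every modeling choice here is an assumption made for illustration: transient faults arrive as a Poisson process with rate `lam`, the task's execution time `E` is split into `n` equal checkpoint intervals, each interval contains `m` equally spaced fault detection points costing `t_d` each, saving state costs `t_s`, a fault is caught at the next detection point, and faults during saving are ignored. All function and parameter names are hypothetical.

```python
import math
import random


def simulate_once(E, D, n, m, lam, t_s, t_d, rng):
    """Return True if one simulated run of the task meets deadline D.

    Assumed model (illustrative only): a fault in a sub-segment is
    detected at the detection point closing that sub-segment, and the
    task then rolls back to the start of the current checkpoint interval.
    """
    seg = E / (n * m)              # useful work between two detection points
    step = seg + t_d               # work plus one fault detection process
    p_fault = 1.0 - math.exp(-lam * step)  # P(at least one fault in a step)
    elapsed = 0.0
    for _ in range(n):             # one pass per checkpoint interval
        interval_done = False
        while not interval_done:
            interval_done = True
            for _ in range(m):     # detection points inside the interval
                elapsed += step
                if elapsed > D:
                    return False   # deadline missed
                if rng.random() < p_fault:
                    interval_done = False  # fault detected: roll back
                    break
        elapsed += t_s             # save state at the checkpoint
        if elapsed > D:
            return False
    return True


def success_probability(E, D, n, m, lam, t_s, t_d, runs=2000, seed=0):
    """Estimate the probability of finishing within the deadline."""
    rng = random.Random(seed)      # fixed seed for repeatable estimates
    hits = sum(simulate_once(E, D, n, m, lam, t_s, t_d, rng)
               for _ in range(runs))
    return hits / runs


def best_configuration(E, D, lam, t_s, t_d, max_n=6, max_m=6, runs=2000):
    """Grid-search (n, m) for the highest estimated success probability."""
    candidates = ((success_probability(E, D, n, m, lam, t_s, t_d, runs), n, m)
                  for n in range(1, max_n + 1)
                  for m in range(1, max_m + 1))
    return max(candidates)         # (probability, n, m)
```

For example, `best_configuration(10.0, 14.0, 0.05, 0.2, 0.1)` returns an estimated probability together with the `(n, m)` pair that produced it. Under these assumptions, a larger `m` shortens the detection latency at the price of more detection overhead, which is the trade-off the abstract describes.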
