• Title/Summary/Keyword: checkpointing


A Study of Optimal Checkpointing Interval in Real-Time Systems (실시간 시스템에서의 효과적인 Checkpointing Interval에 대한 연구)

  • 변계섭;김재훈
    • Proceedings of the Korean Information Science Society Conference / 2000.04a / pp.15-17 / 2000
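  • In real-time systems, unexpected faults degrade performance. To guard against this, checkpointing, a backward error recovery technique, can be used to guarantee predictable results even when faults occur. Unlike in non-real-time systems, checkpointing in real-time systems must satisfy timing constraints, so the checkpointing interval must be chosen differently from the interval that is optimal for non-real-time systems. We confirmed through simulation how performance varies with the checkpoint interval and analyzed the results.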


A Study on Optimal Checkpointing Interval in Real-Time Systems (실시간 시스템에서의 효과적인 체크포인트 간격에 대한 연구)

  • 변계섭;김재훈
    • The Journal of Korean Institute of Communications and Information Sciences / v.26 no.7A / pp.1220-1226 / 2001
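  • In real-time systems, unexpected faults degrade performance. To guard against this, checkpointing, a backward error recovery technique, can be used to guarantee predictable results even when faults occur. Unlike in non-real-time systems, checkpointing in real-time systems must satisfy timing constraints, so the checkpointing interval must be chosen differently from the interval that is optimal for non-real-time systems. In this paper, we confirm through simulation how performance differs between real-time and non-real-time systems as the checkpoint interval varies, and we analyze the results.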
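
The contrast described in this abstract can be illustrated with a small Monte Carlo sketch. It is not the authors' simulator: the fault model (Poisson arrivals), the equal-length segments, the zero recovery cost, and every name and parameter value below (`run_task`, `evaluate`, `work`, `ckpt_cost`, `fault_rate`, `deadline`) are assumptions made for illustration. For each number of checkpoints it reports both the mean completion time (the usual non-real-time metric) and the fraction of runs that finish by the deadline (the real-time metric), so the two optima can be compared.

```python
import random


def run_task(work, n_segments, ckpt_cost, fault_rate, rng):
    """Simulate one task execution and return its completion time.

    Assumed model: the task is split into n_segments equal segments, each
    followed by a checkpoint costing ckpt_cost. Faults arrive as a Poisson
    process with rate fault_rate; a fault rolls the task back to the last
    checkpoint, so the current segment (and its checkpoint) is redone.
    Recovery itself is assumed to cost no extra time.
    """
    seg = work / n_segments + ckpt_cost
    elapsed = 0.0
    for _ in range(n_segments):
        while True:
            if fault_rate > 0:
                t_fault = rng.expovariate(fault_rate)
            else:
                t_fault = float("inf")
            if t_fault >= seg:
                elapsed += seg       # segment and checkpoint completed
                break
            elapsed += t_fault       # work lost before the rollback
    return elapsed


def evaluate(work, n_segments, ckpt_cost, fault_rate, deadline,
             trials=20000, seed=0):
    """Return (mean completion time, fraction of runs meeting the deadline)."""
    rng = random.Random(seed)
    times = [run_task(work, n_segments, ckpt_cost, fault_rate, rng)
             for _ in range(trials)]
    mean_time = sum(times) / trials
    on_time = sum(t <= deadline for t in times) / trials
    return mean_time, on_time


if __name__ == "__main__":
    # Hypothetical parameters chosen only to make the trade-off visible.
    for n in (1, 2, 5, 10, 20, 40):
        mean_time, on_time = evaluate(100.0, n, 1.0, 0.01, 130.0)
        print(f"n={n:2d}  mean={mean_time:7.2f}  P(meet deadline)={on_time:.3f}")
```

Under this model, the number of checkpoints that minimizes the mean completion time need not be the one that maximizes the probability of meeting the deadline, which is the kind of difference the abstract refers to.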


RELIABILITY ANALYSIS OF CHECKPOINTING MODEL WITH MULTIPLE VERIFICATION MECHANISM

  • Lee, Yutae
    • Bulletin of the Korean Mathematical Society / v.56 no.6 / pp.1435-1445 / 2019
  • We consider a checkpointing model for silent errors, where a checkpoint is taken every fixed number of verifications. Assuming generally distributed i.i.d. inter-occurrence times of errors, we derive the reliability of the model as a function of the number of verifications between two checkpoints and the duration of work interval between two verifications.

An Efficient Checkpointing Method for Mobile Hosts via the Software Agent (이동 기기에 적합한 소프트웨어 에이전트 기반의 효율적 체크포인팅 기법)

  • Lim, Sung-Chae
    • The KIPS Transactions:PartA / v.15A no.2 / pp.111-118 / 2008
  • With advances in mobile communication systems, the need for distributed applications running on multiple mobile devices is growing. Compared with traditional applications on fixed networks, such applications are more exposed to hardware failures of the mobile device and to communication disruptions, so a recovery mechanism suited to them is crucial. Checkpointing is widely used for this purpose, to restart interrupted applications. In this paper, we devise an efficient checkpointing method that uses a software agent executed at the mobile support station. The agent, called the checkpointing agent, supports the concept of rollback distance (R-distance), which bounds the maximum number of local checkpoints that are rolled back. By means of the R-distance, our method can prevent undesirable domino effects and heavy checkpoint overhead while providing high flexibility in checkpoint creation.

Design of Main-Memory Database Prototype System using Fuzzy Checkpoint Technique in Real-Time Environment (실시간 시스템에서 퍼지 검사점을 이용한 주기억 데이터베이스 프로토타입 시스템의설계)

  • Park, Yong-Mun;Lee, Chan-Seop;Choe, Ui-In
    • The Transactions of the Korea Information Processing Society / v.7 no.6 / pp.1753-1765 / 2000
  • As computer applications expand, real-time application environments that must process as many transactions as possible within their deadlines, such as stock trading systems and ATM switching systems, have become more common. Conventional database systems cannot handle soft real-time applications because they lack predictability and perform poorly at meeting transaction deadlines. Transactions that access data on secondary storage cannot satisfy real-time requirements because of disk delay. This paper designs a main-memory database prototype system suited to real-time applications; because all data is loaded into the main-memory database, the system can produce results quickly without disk I/O. The paper proposes improved techniques for logging, checkpointing, and recovery in this environment. To improve system performance, (a) the proposed redo technique reduces the frequency of log analysis and redo processing after a system failure, and (b) an improved fuzzy checkpointing scheme maintains database consistency. A performance model consisting of two parts is proposed: the first part evaluates log processing time for recovery and compares it with other work, and the second part examines checkpointing behavior.


Fault Recovery and Optimal Checkpointing Strategy for Dual Modular Redundancy Real-time Systems (중복구조 실시간 시스템에서의 고장 극복 및 최적 체크포인팅 기법)

  • Kwak, Seong-Woo
    • Journal of the Institute of Electronics Engineers of Korea TC / v.44 no.7 s.361 / pp.112-121 / 2007
  • In this paper, we propose a new checkpointing strategy for dual modular redundancy real-time systems. At every checkpoint, the execution results from the two processors and the result saved at the previous checkpoint are compared to detect faults. We devise an operating algorithm at checkpoints that recovers from transient as well as permanent faults. We also develop a Markov model for optimizing the proposed checkpointing strategy. The probability of successful task execution within its deadline is derived from the Markov model. The optimal number of checkpoints is the number that maximizes this success probability.

Performance Comparisons of Duplex Scheme and Checkpointing Scheme for Fault-Tolerant Real-Time Systems (결함허용 실시간 시스템을 위한 이중화 기법과 체크포인팅 기법의 성능 비교)

  • Im, Seong-Hwa;Kim, Jae-Hun;Kim, Seong-Su
    • The Transactions of the Korea Information Processing Society / v.6 no.9 / pp.2533-2539 / 1999
  • Two schemes are widely used for fault-tolerant systems: the duplex system, which relies on physical redundancy, and the checkpointing scheme, which rolls back to the last checkpoint after a failure. Average execution time and availability are important measures of the performance of fault-tolerant systems. In fault-tolerant real-time systems with a time constraint, however, meeting the time constraint, rather than reducing the average execution time, is the most important factor in performance evaluation. We analyze and compare the performance of the two fault-tolerant schemes (the duplex system and the checkpointing scheme) for real-time applications.


New execution model for CAPE using multiple threads on multicore clusters

  • Do, Xuan Huyen;Ha, Viet Hai;Tran, Van Long;Renault, Eric
    • ETRI Journal / v.43 no.5 / pp.825-834 / 2021
  • Based on its simplicity and user-friendly characteristics, OpenMP has become the standard model for programming on shared-memory architectures. Checkpointing-aided parallel execution (CAPE) is an approach that utilizes the discontinuous incremental checkpointing technique (DICKPT) to translate and execute OpenMP programs on distributed-memory architectures automatically. Currently, CAPE implements the OpenMP execution model by utilizing the DICKPT to distribute parallel jobs and their data to slave machines, and then collects the results after executing these distributed jobs. Although this model has been proven to be effective in terms of performance and compatibility with OpenMP on distributed-memory systems, it cannot fully exploit the capabilities of multicore processors. This paper presents a novel execution model for CAPE that utilizes two levels of parallelism. In the proposed model, we add another level of parallelism in the form of multithreaded processes on slave machines with the goal of better exploiting their multicore CPUs. Initial experimental results presented near the end of this paper demonstrate that this model provides significantly enhanced CAPE performance.

A Study for Checkpointing Schemes based on a TMR System (TMR 시스템 기반의 Checkpointing 기법에 관한 연구)

  • Kim, Tae-Wook;Kang, Myung-Seok;Kim, Hag-Bae
    • Proceedings of the Korea Information Processing Society Conference / 2003.11a / pp.397-400 / 2003
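  • TMR (Triple Modular Redundancy) is one of the representative fault-tolerance techniques; it has the simplest structure and statically exploits spatial redundancy (H/W and S/W). When a TMR structure fails, recovering the TMR system by re-executing the part of the program that holds incorrect results, or by restarting the entire program, generally takes considerable time. To overcome this drawback and recover from TMR failures effectively, this paper proposes a technique that combines time and space redundancy by applying checkpoints to rollback and roll-forward, which are another form of time redundancy.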


Power-aware Real-time Task Scheduling in Dependable Embedded Systems (신뢰도를 요구하는 임베디드 시스템에서의 저전력 태스크 스케쥴링)

  • Kim, Kyong Hoon;Kim, Yuna;Kim, Jong
    • IEMEK Journal of Embedded Systems and Applications / v.3 no.1 / pp.25-29 / 2008
  • In this paper, we provide an adaptive power-aware checkpointing scheme for fixed priority-based DVS scheduling in dependable real-time systems. In the provided scheme, we analyze the minimum number of tolerable faults of a task and the optimal checkpointing interval in order to meet the deadline and guarantee its specified reliability. The energy-efficient voltage level at a fault arrival is also analyzed and used in the recovery of the faulty task.
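
As a rough illustration of how a checkpointing interval can be tied to a number of tolerated faults, the following worked derivation uses a common worst-case model. It is a sketch under stated assumptions, not the analysis in this paper: the symbols $W$ (fault-free execution time), $C$ (cost of one checkpoint), $k$ (number of faults to tolerate), $n$ (number of equal segments), and $D$ (deadline) are introduced here for illustration, each fault is assumed to force re-execution of one segment of length $W/n$, and voltage scaling is ignored.

```latex
% Worst-case response time with n equal segments and k faults
R(n) = W + nC + k\,\frac{W}{n}

% Minimize over n (treated as continuous)
\frac{dR}{dn} = C - \frac{kW}{n^{2}} = 0
\quad\Longrightarrow\quad
n^{*} = \sqrt{\frac{kW}{C}},
\qquad
\frac{W}{n^{*}} = \sqrt{\frac{WC}{k}}

% Resulting worst-case response time and deadline condition
R(n^{*}) = W + 2\sqrt{kWC} \le D
```

Under these assumptions, the largest $k$ for which $W + 2\sqrt{kWC} \le D$ holds bounds the number of faults the task can tolerate while still meeting its deadline.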
