• Title/Summary/Keyword: rollback

A Multistriped Checkpointing Scheme for the Fault-tolerant Cluster Computers (다중 분할된 구조를 가지는 클러스터 검사점 저장 기법)

  • Chang, Yun-Seok
    • The KIPS Transactions:PartA / v.13A no.7 s.104 / pp.607-614 / 2006
  • Checkpointing schemes should reduce process delay by managing each node's checkpoints to fit the network load, thereby improving the performance of processes running on a cluster system that writes checkpoints to its global stable storage. For this reason, a cluster system with a single I/O space on a distributed RAID chooses a suitable checkpointing scheme to obtain the maximum I/O performance and the best rollback recovery efficiency. In this paper, we improve the striped checkpointing scheme with a dynamic stripe group size that adapts to the network bandwidth variation at the time of checkpointing. To analyze the performance of the multistriped checkpointing scheme, we ran the Linpack HPC benchmark with MPI on our own cluster system with up to 512 virtual nodes. The benchmark results show that the multistriped checkpointing scheme outperforms the striped checkpointing scheme in checkpoint writing efficiency and rollback recovery under heavy system load.

An Efficient Checkpointing Method for Mobile Hosts via the Software Agent (이동 기기에 적합한 소프트웨어 에이전트 기반의 효율적 체크포인팅 기법)

  • Lim, Sung-Chae
    • The KIPS Transactions:PartA / v.15A no.2 / pp.111-118 / 2008
  • With advances in mobile communication systems, the need for distributed applications running on multiple mobile devices is growing steadily. Because such applications are more exposed to hardware failures of the mobile device or communication disruptions than traditional applications in fixed networks, it is crucial to develop a recovery mechanism suited to them. Checkpointing is widely used for this purpose, to restart interrupted applications. In this paper, we devise an efficient checkpointing method that adopts a software agent executed at the mobile support station. The agent, called the checkpointing agent, supports the concept of rollback-distance (R-distance), which bounds the maximum number of local checkpoints that are rolled back. By means of the R-distance, our method can prevent undesirable domino effects and heavy checkpoint overhead, while providing high flexibility in checkpoint creation.

New Z-Cycle Detection Algorithm Using Communication Pattern Transformation for the Minimum Number of Forced Checkpoints (통신 유형 변형을 이용하여 검사점 생성 개수를 개선한 검사점 Z-Cycle 검출 기법)

  • Woo Namyoon;Yeom Heon Young;Park Taesoon
    • Journal of KIISE:Computer Systems and Theory / v.31 no.12 / pp.692-703 / 2004
  • Communication-induced checkpointing (CIC) is one of the checkpointing techniques that provide fault tolerance for distributed systems. Independent checkpoints that each distributed process takes without coordination are likely to be useless. Useless checkpoints, which cannot belong to any consistent global checkpoint set, induce nondeterministic rollback. To prevent useless checkpoints, CIC forces processes to take additional checkpoints at the proper moments. The number of these forced checkpoints is the main source of failure-free overhead in CIC. In this paper, we present two new CIC protocols that satisfy the 'No Z-Cycle (NZC)' property. The proposed protocols reduce the number of forced checkpoints compared to existing protocols, at the cost of increased message delay. Our simulation results with synthetic data show that the proposed protocols have lower failure-free overhead than the existing protocols. Additionally, we show that the classical 'index-based checkpointing' protocols are inefficient in constructing a consistent global cut in distributed executions.

A Dynamic Checkpoint Scheduling Scheme for Fault Tolerant Distributed Computing Systems (결함 내성 분산 시스템에서의 동적 검사점 스케쥴링 기법)

  • Park, Tae-Soon
    • Journal of KIISE:Computer Systems and Theory / v.29 no.2 / pp.75-86 / 2002
  • Selecting the optimal checkpointing interval is a critical issue in implementing a checkpointing recovery scheme for fault-tolerant distributed systems. This paper presents a new scheme that allows a process to select the proper checkpointing interval dynamically. A process in the system evaluates the cost of checkpointing and possible rollback for each checkpointing interval and selects the proper time interval for the next checkpoint. Unlike other schemes, the overhead incurred by both the checkpointing and the rollback activities is considered in the cost evaluation, and the current communication pattern is reflected in the selection of the checkpointing interval. Moreover, the proposed scheme requires no extra message communication for the checkpointing interval selection and can easily be incorporated into existing checkpointing coordination schemes.
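
The cost trade-off described in this abstract can be illustrated with a minimal sketch. It assumes a classical first-order cost model (checkpoint cost amortized over the interval, plus an expected rollback loss of half an interval per failure), which is not the paper's scheme and ignores communication patterns. The function names and parameters are hypothetical.

```python
def expected_overhead_rate(interval, checkpoint_cost, failure_rate, restart_cost=0.0):
    """Expected overhead per unit of useful work (assumed first-order model).

    checkpoint_cost / interval : time spent writing checkpoints
    failure_rate * (interval / 2 + restart_cost) : expected rollback loss,
        assuming a failure strikes, on average, halfway through an interval
    """
    checkpointing = checkpoint_cost / interval
    rollback = failure_rate * (interval / 2.0 + restart_cost)
    return checkpointing + rollback


def choose_interval(candidates, checkpoint_cost, failure_rate, restart_cost=0.0):
    """Pick the candidate interval with the lowest expected overhead."""
    return min(
        candidates,
        key=lambda t: expected_overhead_rate(
            t, checkpoint_cost, failure_rate, restart_cost
        ),
    )


# Example: checkpoint cost 2 time units, one failure per 100 time units.
best = choose_interval(range(1, 101), checkpoint_cost=2.0, failure_rate=0.01)
```

Under this assumed model the minimum lies near sqrt(2 * checkpoint_cost / failure_rate), which is 20 for the example values. A dynamic scheme such as the one described would re-evaluate the costs before each checkpoint rather than fix the interval once.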

Blank Design for Optimized Thickness Distribution for Axi-symmetric Superplastic Blow Forming (축대칭 초소성 블로성형의 두께분포 최적화를 위한 블랭크 설계)

  • 이정민;홍성석;김용환
    • Transactions of Materials Processing / v.8 no.1 / pp.92-100 / 1999
  • A procedure is proposed for determining the initial thickness distribution in order to produce a specified final thickness distribution in axisymmetric superplastic blow forming processes. A weighted parameter is introduced to improve the simple ad $d_traction method, and the initial blank thickness distribution is obtained by optimizing the weighted parameter. The method is applied to a superplastic free bulging process with a uniform final thickness distribution to confirm its validity. Optimum initial blank thickness distributions are obtained for arbitrary axisymmetric superplastic blow forming processes such as dome, cone, and cylindrical cup forming with die contact. It is concluded that the ad $d_traction method with a weighted parameter is an effective method for designing an optimum blank thickness distribution.

Determination of Optimal Checkpoint Interval for RM Scheduled Real-time Tasks (RM 스케줄링된 실시간 태스크에서의 최적 체크 포인터 구간 선정)

  • Kwak, Seong-Woo;Jung, Young-Joo
    • The Transactions of The Korean Institute of Electrical Engineers / v.56 no.6 / pp.1122-1129 / 2007
  • For a system with multiple real-time tasks with different deadlines, it is very difficult to find the optimal checkpoint interval because task scheduling must be taken into account. In this paper, we determine the optimal checkpoint interval for multiple real-time tasks scheduled by the RM (Rate Monotonic) algorithm. Faults are assumed to occur according to a Poisson distribution. Checkpoints are inserted at equal distances within a task, but the distance may differ from task to task. When a fault occurs, the task rolls back to the latest checkpoint and re-executes from that checkpoint. We derive an equation for the maximum slack time of each task and determine the number of checkpoint intervals that can be re-executed for fault recovery. An equation for checking the schedulability of the tasks is also derived. Based on these equations, we find the probability that all tasks complete within their deadlines. The checkpoint intervals that maximize this probability are optimal.
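
The idea of choosing the checkpoint count that maximizes the probability of meeting a deadline can be sketched as follows. This is a simplified single-task illustration, not the paper's RM analysis. It assumes that the slack is whatever remains of the deadline after execution and checkpoint overhead, that each fault costs one re-executed interval, and that the number of faults over the deadline window is Poisson distributed. All names and parameters are hypothetical.

```python
import math


def poisson_cdf(k, mean):
    """P(N <= k) for N ~ Poisson(mean)."""
    return sum(math.exp(-mean) * mean**i / math.factorial(i) for i in range(k + 1))


def success_probability(deadline, exec_time, n_intervals, overhead, fault_rate):
    """Probability of finishing by the deadline (assumed single-task model)."""
    slack = deadline - exec_time - n_intervals * overhead
    if slack < 0:
        return 0.0  # not schedulable even without faults
    interval = exec_time / n_intervals + overhead
    # Number of checkpoint intervals that can be re-executed within the slack.
    reexecutable = int(slack // interval)
    return poisson_cdf(reexecutable, fault_rate * deadline)


def best_interval_count(deadline, exec_time, overhead, fault_rate, max_n=50):
    """Checkpoint interval count that maximizes the success probability."""
    return max(
        range(1, max_n + 1),
        key=lambda n: success_probability(
            deadline, exec_time, n, overhead, fault_rate
        ),
    )


# Example: deadline 10, execution time 6, overhead 0.1 per checkpoint.
n_best = best_interval_count(10.0, 6.0, 0.1, 0.2)
```

More checkpoints shorten each re-executed interval but consume slack through overhead, so the probability peaks at an intermediate count. Extending the sketch to several RM-scheduled tasks would require per-task slack that accounts for preemption by higher-priority tasks, which the paper derives and this sketch does not.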

An analysis of the potential impact of various ozone regulatory standards on mortality

  • Kim, Yong-Ku
    • Journal of the Korean Data and Information Science Society / v.22 no.1 / pp.125-136 / 2011
  • Ground-level ozone, an air pollutant monitored by the Environmental Protection Agency (EPA), damages human health by irritating the respiratory system, reducing lung function, damaging lung cells, and aggravating asthma and other chronic conditions. In March 2008, the EPA strengthened ozone standards by lowering the acceptable limit from 84 parts per billion to 75 parts per billion. Here, epidemiologic data are used to study the effects of ozone regulation on human health and to assess how various regulatory standards for ozone may affect nonaccidental mortality, including respiratory-related deaths during ozone season. The assessment uses statistical methods based on hierarchical Bayesian models to predict the potential effects of the different regulatory standards. It also analyzes the variability of the results and how they are affected by different modeling assumptions. We focus on the technical and statistical approach to assessing the relationship between new ozone regulations and mortality, whereas other studies have detailed the relationship between ozone and human mortality. We show a statistical correlation between ozone regulations and mortality, with lower limits of acceptable ozone linked to a decrease in deaths, and project that mortality is expected to decrease when ozone regulatory limits are lowered.

An Application-Level Fault Tolerant System For Synchronous Parallel Computation (동기 병렬연산을 위한 응용수준의 결함 내성 연산시스템)

  • Park, Pil-Seong
    • Journal of Internet Computing and Services / v.9 no.5 / pp.185-193 / 2008
  • The MTBF (mean time between failures) of large-scale parallel systems is known to be only on the order of several hours, so large computations sometimes waste a huge amount of CPU time. However, MPI (Message Passing Interface), the de facto standard for message-passing parallel programming, offers no means of handling this problem. In this paper, we propose an application-level fault-tolerant computation system, built purely on the current MPI standard without using any non-standard fault-tolerant MPI library, that can be used for general scientific synchronous parallel computation.

Design and Implementation of a Recovery Method for High Dimensional Index Structures (고차원 색인구조를 위한 회복기법의 설계 및 구현)

  • Song, Seok-Il;Lee, Seok-Hui;Yu, Jae-Su
    • The Transactions of the Korea Information Processing Society / v.7 no.7 / pp.2008-2019 / 2000
  • In this paper, we propose a recovery method for high-dimensional index structures. It efficiently recovers transactions, including reinsert operations, that need undo or rollback due to system failures or transaction failures. It is based on the WAL (Write-Ahead Logging) protocol. We apply the method to the FCIR-Tree and implement it on MiDAS-III, the storage system of a multimedia DBMS called BADA-III. We also show through performance evaluation that the recovery method with our algorithm recovers reinsert operations more efficiently than the method without it.

Design and Implementation of EJB 2.1 Timer Service (EJB 2.1 타이머 서비스 설계 및 구현)

  • 정숭욱;이경호;김중배
    • Proceedings of the Korean Information Science Society Conference / 2003.10c / pp.247-249 / 2003
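  • EJB (Enterprise Java Beans), the core of J2EE (Java2 Enterprise Edition), the web application server specification, is a server-side component programming model for writing business logic as components in a web environment in order to increase reusability. EJB 2.1 adds features such as web services, a timer service, and an EJB QL upgrade to the functions described in EJB 2.0. The timer service invokes a specific method of an EJB bean at each specified time. In addition, when a timer is associated with a transaction, the timer service must support rollback of the timer within that transaction context, and it must support recovery of existing timers when the system restarts after a failure. This paper examines the requirements for the timer service set out in the EJB 2.1 specification and discusses how the timer service was implemented in the E504 EJB server developed at ETRI.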