• Title/Summary/Keyword: Fault-Tolerance by Replication

An adaptive fault tolerance strategy for cloud storage

  • Xiai, Yan;Dafang, Zhang;Jinmin, Yang
    • KSII Transactions on Internet and Information Systems (TIIS) / v.10 no.11 / pp.5290-5304 / 2016
  • With the massive growth of data, the failure probability of cloud storage nodes keeps increasing. A single fault tolerance strategy, such as replication or erasure codes, has unavoidable disadvantages and cannot meet today's fault tolerance needs. Therefore, an adaptive hybrid redundant fault tolerance strategy is proposed that, according to file access frequency and size, dynamically switches between the replication scheme and the erasure code scheme throughout the file lifecycle. The experimental results show that the proposed scheme not only saves storage space (reduced by 32% compared with replication) but also ensures fast recovery from node failures (increased by 42% compared with erasure codes).
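
The switching idea in this abstract can be illustrated with a minimal sketch. The thresholds, the rule "hot or small files are replicated", and the names `FileStats`, `choose_scheme`, and `storage_overhead` are assumptions made for illustration; the abstract does not give the paper's actual policy or parameters.

```python
from dataclasses import dataclass


@dataclass
class FileStats:
    size_bytes: int
    reads_per_day: float


# Hypothetical thresholds; the abstract does not state concrete values.
HOT_READS_PER_DAY = 100.0
SMALL_FILE_BYTES = 4 * 1024 * 1024


def choose_scheme(stats: FileStats) -> str:
    """Pick 'replication' for hot or small files, otherwise 'erasure'."""
    if stats.reads_per_day >= HOT_READS_PER_DAY:
        return "replication"
    if stats.size_bytes <= SMALL_FILE_BYTES:
        return "replication"
    return "erasure"


def storage_overhead(scheme: str, replicas: int = 3, k: int = 6, m: int = 3) -> float:
    """Bytes stored per byte of user data.

    Replication stores `replicas` full copies; a (k, m) erasure code
    stores k data fragments plus m parity fragments.
    """
    if scheme == "replication":
        return float(replicas)
    return (k + m) / k
```

Re-running `choose_scheme` whenever a file's statistics change would move it between schemes over its lifecycle. With the default parameters assumed here, 3-way replication stores 3 bytes per user byte and a (6, 3) code stores 1.5.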

A Design of Low Power MAC Operator with Fault Tolerance (에러 내성을 갖는 저전력 MAC 연산기 설계)

  • Jung, Han-Sam;Ku, Sung-Kwan;Chung, Ki-Seok
    • Journal of the Institute of Electronics Engineers of Korea SD / v.45 no.11 / pp.50-55 / 2008
  • As more DSP functionality is integrated into embedded mobile devices, power consumption and device reliability have emerged as crucial issues. As the complexity of mobile embedded designs increases rapidly, verifying the functionality of mobile devices has become extremely difficult. Designs with error (fault) tolerance are therefore often required, since this capability lets the design operate properly even in the presence of some errors. However, designs with fault tolerance may suffer from significant power overhead, since fault tolerance is often achieved by resource replication. In this paper, we propose a low-power, fault-tolerant MAC (multiply-and-accumulate) design. The proposed MAC design is based on multiple barrel shifters, since MAC designs built from barrel shifters and adders are known to be excellent in terms of power consumption.

A Verification of Replicated Operation In P2P Computing (P2P 컴퓨팅에서 중복 수행 결과의 정확성 검증 기법)

  • Park, Chan Yeol
    • The Journal of Korean Association of Computer Education / v.7 no.3 / pp.35-43 / 2004
  • Internet-based P2P computing with independent machines suffers from frequent disconnections and security threats caused by the departure, failure, network diversity, or anonymity of participating machines. Replication of shared resources is used to address these issues in many studies and implementations. We propose an operational replication scheme for sharing computing resources in P2P computing; the scheme verifies the correctness of operations against faults and security threats. These verifications are carried out periodically on replicated and dependent working units without global message exchanges over the whole system. The verified working units are treated as checkpoints and can thus be put to practical use for fault tolerance with rollback recovery.
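
One common way to check the results of replicated work units is majority voting. The sketch below shows that general technique only; the abstract does not say which verification rule the paper uses, and the function name `verify_replicated_result` is hypothetical.

```python
from collections import Counter
from typing import Hashable, Optional, Sequence


def verify_replicated_result(results: Sequence[Hashable]) -> Optional[Hashable]:
    """Return the value reported by a strict majority of replicas.

    Returns None when no value reaches a strict majority, in which case
    the work unit cannot be accepted as a verified checkpoint.
    """
    if not results:
        return None
    value, count = Counter(results).most_common(1)[0]
    return value if count > len(results) // 2 else None
```

A work unit whose result passes this check could then be recorded as a checkpoint, and a unit that fails it could be rolled back to the last verified one.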

A Research to Enhance the Fault Tolerance of the CORBA Based Traffic Information Systems (CORBA 기반 교통정보시스템의 Fault Tolerance 향상을 위한 연구)

  • Seh, Woon-Suk;Ryu, Kwang-Taek;Lee, Eun-Seok
    • The KIPS Transactions:PartD / v.10D no.6 / pp.991-998 / 2003
  • There are many ways to enhance the fault tolerance of CORBA-based real-time systems, depending on the viewpoint. Among them, this paper presents a method that keeps services running seamlessly when object faults occur in CORBA-based systems while they process real-time information. Specifically, the paper examines how to deal efficiently with object faults that occur in 3-tier architecture environments. Objects can be replicated as one way to enhance fault tolerance against such faults. In addition, the paper shows a method that ultimately enhances fault tolerance and maintains service continuity by providing a way for the system to keep running until the faults of the FT-CORBA-based system are recovered.

Efficient Replication Protocols for Mobile Agent Systems (이동 에이전트 시스템을 위한 효율적인 중복 프로토콜)

  • Ahn, Jin-Ho
    • Journal of KIISE:Computer Systems and Theory / v.33 no.12 / pp.907-917 / 2006
  • In this paper, we propose a strategy to improve the fault tolerance and scalability of replicated services in mobile agent systems by applying an appropriate passive replication protocol to each replicated service according to whether the service is deterministic or non-deterministic. For this purpose, two passive replication protocols, PRPNS and PRPDS, are designed for non-deterministic and deterministic services respectively. Both allow visiting mobile agents to be forwarded to, and execute their tasks on, any node running a service agent, not necessarily the primary agent. In particular, in the PRPDS protocol, after a backup service agent has received a mobile agent request and obtained its delivery sequence number from the primary service agent, the backup is responsible for processing the request and coordinating with the other replica service agents. Therefore, our strategy using the two proposed protocols can promise high scalability for replicated services that a large number of mobile agents attempt to access in mobile agent systems. Our simulation results show that the proposed strategy performs much better than one using only the traditional passive replication protocol.
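
The PRPDS flow described in this abstract (a backup obtains a delivery sequence number from the primary, then processes the request and coordinates with the other replicas) can be sketched roughly as follows. This is a simplified in-memory illustration under assumed names (`ServiceReplica`, `PrimaryReplica`, `handle_at_backup`); it omits networking, failure handling, and whatever else the actual protocol specifies.

```python
from typing import Dict, List, Sequence, Tuple


class ServiceReplica:
    """A replica that applies requests strictly in sequence-number order."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.pending: Dict[int, str] = {}
        self.applied: List[Tuple[int, str]] = []
        self.next_to_apply = 0

    def deliver(self, seq: int, request: str) -> None:
        # Buffer the request, then apply every request that is now in order.
        self.pending[seq] = request
        while self.next_to_apply in self.pending:
            req = self.pending.pop(self.next_to_apply)
            self.applied.append((self.next_to_apply, req))
            self.next_to_apply += 1


class PrimaryReplica(ServiceReplica):
    """The primary additionally hands out delivery sequence numbers."""

    def __init__(self, name: str) -> None:
        super().__init__(name)
        self._next_seq = 0

    def assign_sequence(self) -> int:
        seq = self._next_seq
        self._next_seq += 1
        return seq


def handle_at_backup(backup: ServiceReplica, primary: PrimaryReplica,
                     others: Sequence[ServiceReplica], request: str) -> int:
    """A backup receives a request, orders it via the primary, and coordinates."""
    seq = primary.assign_sequence()
    backup.deliver(seq, request)
    for replica in others:
        replica.deliver(seq, request)
    return seq
```

Because the service is assumed deterministic, every replica that applies the same requests in the same sequence order ends in the same state, so the primary only has to order requests rather than execute all of them first.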

Mobile Agent Location Management Protocol for Spatial Replication-based Approach in Mobile Agent Computing Environments (이동 에이전트 컴퓨팅 환경에서 공간적 복제 기반 기법을 위한 이동 에이전트 위치관리 프로토콜)

  • Yoon, Jun-Weon;Choi, Sung-Jin;Ahn, Jin-Ho
    • The KIPS Transactions:PartA / v.13A no.5 s.102 / pp.455-464 / 2006
  • In multi-regional mobile agent computing environments, the spatial replication-based approach may be used as a representative mobile agent fault-tolerance technique because it allows agent execution to make progress without blocking even when agents fail. However, to apply this approach to real mobile agent-based computing systems, it is essential to minimize the overhead of locating and managing the mobile agents replicated at each stage. This paper presents a new mobile agent location management protocol, SRLM, to solve this problem. The proposed protocol allows only the primary among all the replicated workers of each stage to register with its regional server, and thereby significantly reduces location updating and message delivery overheads compared with previous protocols. The protocol also addresses the location management problem that arises when a new primary is elected among the remaining workers of a stage after the primary worker fails.
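
The registration rule described here (only the stage's primary registers with the regional server, and a newly elected primary re-registers after a failure) can be sketched as below. The class names and the election rule (promote the next remaining worker) are assumptions for illustration; the abstract does not describe SRLM's actual election or message formats.

```python
from typing import Dict, List


class RegionalServer:
    """Tracks one location entry per stage: that stage's current primary."""

    def __init__(self) -> None:
        self.primary_of: Dict[str, str] = {}
        self.registrations = 0  # counts location update messages received

    def register(self, stage_id: str, worker_id: str) -> None:
        self.primary_of[stage_id] = worker_id
        self.registrations += 1

    def locate(self, stage_id: str) -> str:
        return self.primary_of[stage_id]


class Stage:
    """A stage with replicated workers; workers[0] is the primary."""

    def __init__(self, stage_id: str, workers: List[str], server: RegionalServer) -> None:
        if not workers:
            raise ValueError("a stage needs at least one worker")
        self.stage_id = stage_id
        self.workers = list(workers)
        self.server = server
        # Only the primary registers, not every replicated worker.
        server.register(stage_id, self.workers[0])

    def fail_primary(self) -> str:
        """Drop the failed primary, promote the next worker, and re-register."""
        self.workers.pop(0)
        if not self.workers:
            raise RuntimeError("no workers left at stage " + self.stage_id)
        self.server.register(self.stage_id, self.workers[0])
        return self.workers[0]
```

With three workers per stage, this costs one registration at setup instead of three, plus one more for each primary failure.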

Analytic Model for Optimal Checkpoints in Mobile Real-time Systems

  • Lim, Sung-Hwa;Lee, Byoung-Hoon;Kim, Jai-Hoon
    • KSII Transactions on Internet and Information Systems (TIIS) / v.10 no.8 / pp.3689-3700 / 2016
  • It is not practically feasible to apply hardware-based fault-tolerant schemes, such as hardware replication, in mobile devices. Therefore, software-based fault-tolerance techniques, such as checkpoint and rollback schemes, are required. In checkpoint and rollback schemes, the optimal checkpoint interval should be applied to obtain the best performance. Most previous studies focused on minimizing the expected execution time or response time for completing a given task. Currently, most mobile applications run in real-time environments. Therefore, it is essential for mobile devices to employ optimal checkpoint intervals determined by the real-time constraints of tasks. In this study, we tackle the problem of determining the optimal inter-checkpoint interval of checkpoint and rollback schemes to maximize the deadline meet ratio in real-time systems, and we build a probabilistic cost model for it. From this cost model, we can numerically find the optimal checkpoint interval using mathematical tools. The performance of the proposed solution is evaluated using analytical estimates.
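
For context, the execution-time-minimizing objective that the abstract attributes to earlier studies has a well-known first-order answer, Young's approximation: the interval is roughly the square root of twice the checkpoint cost times the mean time between failures. The sketch below computes that classic baseline only; it is not the deadline-oriented cost model proposed in this paper, and the function name is hypothetical.

```python
import math


def young_interval(checkpoint_cost: float, mtbf: float) -> float:
    """Young's first-order approximation of the checkpoint interval.

    checkpoint_cost: time needed to save one checkpoint.
    mtbf: mean time between failures, in the same time unit.
    Returns sqrt(2 * checkpoint_cost * mtbf), which approximately
    minimizes expected execution time when checkpoint_cost << mtbf.
    """
    if checkpoint_cost <= 0 or mtbf <= 0:
        raise ValueError("checkpoint_cost and mtbf must be positive")
    return math.sqrt(2.0 * checkpoint_cost * mtbf)
```

For example, a checkpoint cost of 2 seconds and an MTBF of 100 seconds give an interval of 20 seconds. A deadline-driven model like the one in this paper may select a different interval, since it optimizes the probability of finishing on time rather than the mean completion time.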

An ORB Extension for support of Fault-Tolerant CORBA (고장감내 CORBA를 지원하기 위한 객체중개자의 확장)

  • Shin, Bum-Joo;Son, Duk-Joo;Kim, Myung-Joon
    • Journal of KIISE:Computing Practices and Letters / v.7 no.2 / pp.121-131 / 2001
  • In a CORBA application, the failure of the network and/or the node on which a server object executes is a single point of system failure. One possible way to overcome this problem is to replicate server objects on several independent nodes. The replicated objects executing the same tasks are called an object group. To provide fault tolerance for server objects, this paper proposes and implements a new CORBA model that supports object groups based on active replication. The proposed model not only provides interoperability with existing CORBA applications but also minimizes the additional application interface required to support object groups, because it uses IIOP to exchange messages between client and server. The paper also extends the IDL structure. Depending on the application logic, this makes it possible to prevent the performance degradation caused by consistency maintenance. At present, the paper supports only active replication, but it can easily be extended to provide warm and/or cold passive replication without modifying the architecture required for active replication.
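
Active replication with an object group can be illustrated by a small in-process sketch: every request goes to all replicas, and a crashed replica is masked as long as one replica answers. The `ObjectGroup` class is hypothetical and models replicas as plain Python callables; it does not use or represent any CORBA or ORB API.

```python
from typing import Callable, List, Sequence


class ObjectGroup:
    """Invoke every replica for each request (active replication)."""

    def __init__(self, replicas: Sequence[Callable[[int], int]]) -> None:
        self.replicas: List[Callable[[int], int]] = list(replicas)

    def invoke(self, arg: int) -> int:
        results = []
        for replica in self.replicas:
            try:
                results.append(replica(arg))
            except Exception:
                # A failed replica is masked by the remaining ones.
                continue
        if not results:
            raise RuntimeError("all replicas in the object group failed")
        # Replicas are assumed deterministic, so any reply is valid.
        return results[0]
```

Because all replicas execute every request, no state transfer is needed on failover. The cost is that each request is processed once per replica.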

Update Propagation Protocol Using Tree of Replicated Data Items in Partially Replicated Databases

  • Bae, Misook;Hwang, Buhyun
    • Proceedings of the IEEK Conference / 2002.07c / pp.1859-1862 / 2002
  • Data replication is used to increase availability, improve system performance, and enhance the fault tolerance of a system. The approach in this paper requires information about the location of the primary site of each data item's replicas. In partially replicated databases, the replicas of each data item are organized hierarchically into a tree whose root is the primary replica. This eliminates useless propagation, since updates are propagated only to sites that hold replicas, following the hierarchy of the data. Our algorithm uses timestamps to schedule transactions so that the execution order of updates at each primary site is identical at all sites. With our algorithm, consistent data are supplied, and the performance of read-only transactions can be improved by using the tree structure of the replicas of each data item.
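
The tree-based propagation described in this abstract can be sketched as follows: an update enters at the primary (the root) and travels only along tree edges, so sites without a replica of the item are never contacted. The `ReplicaSite` class and the "newer timestamp wins" rule are illustrative assumptions, not the paper's actual scheduling algorithm.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class ReplicaSite:
    name: str
    children: List["ReplicaSite"] = field(default_factory=list)
    # item -> (timestamp, value); an update with an older timestamp is ignored
    store: Dict[str, Tuple[int, str]] = field(default_factory=dict)

    def apply(self, item: str, timestamp: int, value: str) -> List[str]:
        """Apply an update here if it is newer, then forward it down the tree.

        Returns the names of the sites visited, in propagation order.
        """
        current = self.store.get(item)
        if current is None or timestamp > current[0]:
            self.store[item] = (timestamp, value)
        visited = [self.name]
        for child in self.children:
            visited.extend(child.apply(item, timestamp, value))
        return visited
```

Calling `apply` on the root replica of an item reaches exactly the sites in that item's tree. Because each site compares timestamps before writing, a late-arriving older update cannot overwrite a newer value.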

Method for Group Communication Support in CORBA using OCI (OCI를 이용한 CORBA에서의 그룹 통신 지원 방법)

  • Nam, Duk-Yun;Lee, Dong-Man
    • Journal of KIISE:Computing Practices and Letters / v.8 no.4 / pp.399-410 / 2002
  • Group communication is one of the key components supporting object replication. CORBA provides little support for the fault tolerance and high availability that object replication can deliver. Existing approaches do not allow group communication protocols to be plugged transparently into CORBA so that CORBA application programmers can exploit them directly. They either require modification of CORBA or the OS, or leave no room for incorporating group communication transport protocols into CORBA. In this paper, we propose a generic group communication framework that allows transparent plug-in of various group communication protocols with no modification of standard CORBA. To this end, we extend the Open Communications Interface (OCI) to support interoperability, reuse of existing group communication protocols, and independence from the ORB and OS. The proposed approach can also be applied to various group communication protocols.