• Title/Abstract/Keyword: Fault-tolerance

Design Technique and Application for Distributed Recovery Block Using the Partitioning Operating System Based on Multi-Core System (멀티코어 기반 파티셔닝 운영체제를 이용한 분산 복구 블록 설계 기법 및 응용)

  • Park, Hansol
    • Journal of IKEEE / v.19 no.3 / pp.357-365 / 2015
  • Recently, embedded systems such as those in aircraft and automobiles have been developed with modular architectures instead of federated architectures because of SWaP (Size, Weight and Power) issues. In addition, partitioning operating systems that support multiple logical nodes based on the partition concept have recently appeared. The distributed recovery block is a fault-tolerance design scheme applicable to mission-critical real-time systems; it supports real-time takeover via real-time synchronization between participating nodes. Because of this real-time synchronization, a single-core computer is not suitable for a partition-based distributed recovery block design. A multi-core, AMP (Asymmetric Multi-Processing) based partition architecture is required to apply the distributed recovery block design scheme. In this paper, we propose a design scheme for the distributed recovery block on a multi-core partitioning operating system with a supervised-AMP architecture. We implement a flight control simulator for avionics to check the feasibility of our design scheme.
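
The takeover idea behind a distributed recovery block can be illustrated with a minimal sketch. It shows only the generic scheme (a primary node and a shadow node each run a different try block, and an acceptance test decides whose output is used), not the partition-based design proposed in the paper. The names `run_try`, `drb_step`, and the example acceptance test are hypothetical.

```python
from typing import Callable, Optional, Tuple

TryBlock = Callable[[float], float]
AcceptanceTest = Callable[[float], bool]


def run_try(try_block: TryBlock, accept: AcceptanceTest, x: float) -> Optional[float]:
    """Run one try block; return its result only if the acceptance test passes."""
    try:
        result = try_block(x)
    except Exception:
        return None  # a crash counts as a failed try
    return result if accept(result) else None


def drb_step(primary_try: TryBlock, alternate_try: TryBlock,
             accept: AcceptanceTest, x: float) -> Tuple[Optional[float], str]:
    """One cycle of a two-node recovery block (hypothetical helper).

    In a real system both nodes would run concurrently on separate cores or
    partitions; they are called one after the other here to keep the sketch short.
    """
    primary_result = run_try(primary_try, accept, x)    # primary node
    shadow_result = run_try(alternate_try, accept, x)   # shadow node
    if primary_result is not None:
        return primary_result, "primary"
    if shadow_result is not None:
        return shadow_result, "shadow"  # shadow takes over from the failed primary
    return None, "failed"
```

For example, with an acceptance test such as `lambda r: abs(r) <= 100.0`, a primary try block that returns an out-of-range value is rejected and the shadow node's result is used instead.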

Efficient Replication Protocols for Mobile Agent Systems (이동 에이전트 시스템을 위한 효율적인 중복 프로토콜)

  • Ahn, Jin-Ho
    • Journal of KIISE:Computer Systems and Theory / v.33 no.12 / pp.907-917 / 2006
  • In this paper, we propose a strategy to improve the fault-tolerance and scalability of replicated services in mobile agent systems by applying an appropriate passive replication protocol to each replicated service according to whether the service is deterministic or non-deterministic. For this purpose, two passive replication protocols, PRPNS and PRPDS, are designed for non-deterministic and deterministic services respectively. Both allow visiting mobile agents to be forwarded to, and execute their tasks on, any node running a service agent, not necessarily the primary agent. In particular, in the PRPDS protocol, after a backup service agent has received a mobile agent request and obtained its delivery sequence number from the primary service agent, the backup is responsible for processing the request and coordinating with the other replica service agents. Therefore, our strategy using the two proposed protocols can provide high scalability for replicated services that a large number of mobile agents attempt to access in mobile agent systems. Our simulation results show that the proposed strategy performs much better than one using only the traditional passive replication protocol.

Mobile Agent Location Management Protocol for Spatial Replication-based Approach in Mobile Agent Computing Environments (이동 에이전트 컴퓨팅 환경에서 공간적 복제 기반 기법을 위한 이동 에이전트 위치관리 프로토콜)

  • Yoon, Jun-Weon;Choi, Sung-Jin;Ahn, Jin-Ho
    • The KIPS Transactions:PartA / v.13A no.5 s.102 / pp.455-464 / 2006
  • In multi-regional mobile agent computing environments, a spatial replication-based approach may be used as a representative mobile agent fault-tolerance technique because it allows agent execution to make progress without blocking even when agents fail. However, to apply this approach to real mobile agent-based computing systems, it is essential to minimize the overhead of locating and managing the mobile agents replicated at each stage. This paper presents a new mobile agent location management protocol, SRLM, to solve this problem. The proposed protocol allows only the primary among all the replicated workers of each stage to register with its regional server, and thus significantly reduces location updating and message delivery overheads compared with previous protocols. The protocol also addresses the location management problem incurred by electing a new primary among the remaining workers at a stage when the primary worker fails.

Term Clustering and Duplicate Distribution for Efficient Parallel Information Retrieval (효율적인 병렬정보검색을 위한 색인어 군집화 및 분산저장 기법)

  • 강재호;양재완;정성원;류광렬;권혁철;정상화
    • Journal of KIISE:Software and Applications / v.30 no.1_2 / pp.129-139 / 2003
  • The PC cluster architecture is considered a cost-effective alternative to existing supercomputers for realizing a high-performance information retrieval (IR) system. To implement an efficient IR system on a PC cluster, it is essential to achieve maximum parallelism by distributing the data to the local hard disks of the PCs in such a way that the disk I/O and the subsequent computation are spread as evenly as possible across all the PCs. If the terms in the inverted index file can be classified into clusters of closely related terms, the parallelism can be maximized by distributing them to the PCs in an interleaved manner. One of the goals of this research is the development of methods for automatically clustering the terms based on the likelihood of their co-occurrence in the same query. In this paper, we also propose a method for duplicate distribution of inverted index records among the PCs to achieve fault-tolerance as well as dynamic load balancing. Experiments with a large corpus revealed the efficiency and effectiveness of our method.
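
A minimal sketch of interleaved placement with one duplicate per term may help make the idea concrete. The abstract does not state the placement rule, so the round-robin assignment and the choice of the next PC for the duplicate are assumptions, and `place_terms` is a hypothetical name.

```python
from typing import Dict, List, Tuple


def place_terms(clusters: List[List[str]], num_pcs: int) -> Dict[str, Tuple[int, int]]:
    """Map each term to (primary PC, duplicate PC).

    Assumption: terms are dealt out round-robin so that terms of the same
    cluster land on different PCs, and the duplicate goes to the next PC.
    """
    if num_pcs < 2:
        raise ValueError("duplicate distribution needs at least two PCs")
    placement: Dict[str, Tuple[int, int]] = {}
    slot = 0
    for cluster in clusters:
        for term in cluster:
            primary = slot % num_pcs
            duplicate = (primary + 1) % num_pcs  # survives the loss of the primary PC
            placement[term] = (primary, duplicate)
            slot += 1
    return placement
```

With this layout a query whose terms come from one cluster reads from several disks in parallel, and the record of any single failed PC can still be read from its neighbor.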

HyperCerts : Privacy-Enhanced OTP-Based Educational Certificate Blockchain System (HyperCerts : 개인정보를 고려한 OTP 기반 디지털 졸업장 블록체인 시스템)

  • Jung, Seung Wook
    • Journal of the Korea Institute of Information Security & Cryptology / v.28 no.4 / pp.987-997 / 2018
  • Blockchain is tamper-free, so many applications are being developed to leverage this property. The MIT Media Lab proposed BlockCerts, an educational certificate blockchain system, to solve the problems of legacy certificate verification. Existing educational certificate blockchain systems are based on public blockchains such as Bitcoin and Ethereum, so in principle any entity can participate as an educational institute. Moreover, the existing educational certificate blockchain systems utilize the integrity of the blockchain but do not provide confidentiality for the educational certificate. This paper proposes a digital certificate system based on a private blockchain, named HyperCerts. Only trusted entities can participate in the private blockchain network, Hyperledger, as issuers of digital certificates. Furthermore, because practical Byzantine fault tolerance is used as the consensus algorithm, HyperCerts dramatically reduces the latency of issuing digital certificates and the required computing power. HyperCerts stores the hash value of the digital certificate in the ledger, which protects personal information against breaches by malicious entities in the private blockchain.
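
Storing a hash rather than the certificate itself can be sketched as follows. This is an illustration only: the abstract does not name the hash function or the certificate encoding, so SHA-256 over canonical JSON is an assumption, the in-memory set stands in for the ledger, and `certificate_digest`, `issue`, and `verify` are hypothetical names.

```python
import hashlib
import json
from typing import Set


def certificate_digest(certificate: dict) -> str:
    """Hash a certificate in a canonical form (assumed: sorted-key JSON, SHA-256)."""
    canonical = json.dumps(certificate, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def issue(ledger: Set[str], certificate: dict) -> str:
    """Record only the digest; the personal data never enters the ledger."""
    digest = certificate_digest(certificate)
    ledger.add(digest)
    return digest


def verify(ledger: Set[str], certificate: dict) -> bool:
    """A presented certificate is valid if its digest was recorded at issuance."""
    return certificate_digest(certificate) in ledger
```

A verifier who receives the full certificate from its holder can recompute the digest and check it against the ledger, while a ledger participant who sees only the digest learns nothing readable about the holder.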

Improving Fault Tolerance for High-capacity Shared Distributed File Systems using the Rotational Lease Under Network Partitioning (대용량 공유 분산 화일 시스템에서 망 분할 시 순환 리스를 사용한 고장 감내성 향상)

  • Tak, Byung-Chul;Chung, Yon-Dohn;Kim, Myoung-Ho
    • Journal of KIISE:Databases / v.32 no.6 / pp.616-627 / 2005
  • In a shared-storage file system, systems can directly access the shared storage device through a specialized data-only subnetwork, unlike in a network-attached file server system. In this shared-storage architecture, data consistency is maintained by a designated set of lock servers, which use a control network to send and receive lock information. Furthermore, a lease mechanism is introduced to cope with control network failures. However, when the control network is partitioned, participating systems can no longer make progress after the lease term expires until the network recovers. This paper addresses this limitation and proposes a method that allows partitioned systems to make progress while the control network is partitioned. The proposed method periodically gives each participating system a predefined lease term in rotation. It is also shown that the proposed mechanism always preserves data consistency.
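
One way to picture a rotational lease is as a fixed time-slot schedule that every system can compute locally, without the control network. The sketch below is a simplified reading of the abstract, not the paper's protocol: it assumes a prearranged ordering of systems, roughly synchronized clocks, and a guard interval at slot boundaries, and the names `lease_holder` and `may_access` are hypothetical.

```python
def lease_holder(now: float, num_systems: int, lease_term: float) -> int:
    """Return the index of the system that owns the lease at time `now`."""
    slot = int(now // lease_term)
    return slot % num_systems


def may_access(system_id: int, now: float, num_systems: int,
               lease_term: float, guard: float = 0.0) -> bool:
    """True if `system_id` may touch shared storage at time `now`.

    The guard interval (an assumption) keeps a system idle near slot
    boundaries so that small clock differences cannot make leases overlap.
    """
    offset = now % lease_term  # time elapsed inside the current slot
    inside_safe_window = guard <= offset < lease_term - guard
    return lease_holder(now, num_systems, lease_term) == system_id and inside_safe_window
```

Because at most one index is returned for any instant, no two systems hold the lease at the same time, which is the property that preserves consistency in this simplified model.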

Compound Backup Technique using Hot-Cold Data Classification in the Distributed Memory System (분산메모리시스템에서의 핫콜드 데이터 분류를 이용한 복합 백업 기법)

  • Kim, Woo Chur;Min, Dong Hee;Hong, Ji Man
    • Smart Media Journal / v.4 no.3 / pp.16-23 / 2015
  • As IT advances, data processing systems are required to handle and process large amounts of data. However, existing On-Disk systems are limited in their ability to process rapidly growing data. For that reason, In-Memory systems, which store and manage data in fast memory rather than on hard disk, are being used. Although an In-Memory system processes data quickly, it needs fault-tolerance techniques because the volatility of memory creates a risk of data loss. These fault-tolerance techniques degrade the performance of the In-Memory system. In this paper, we classify data into Hot and Cold data in consideration of the data usage characteristics of the In-Memory system and propose a compound backup technique to ensure data persistence. The proposed technique increases persistence and reduces the performance degradation.

New Z-Cycle Detection Algorithm Using Communication Pattern Transformation for the Minimum Number of Forced Checkpoints (통신 유형 변형을 이용하여 검사점 생성 개수를 개선한 검사점 Z-Cycle 검출 기법)

  • Woo Namyoon;Yeom Heon Young;Park Taesoon
    • Journal of KIISE:Computer Systems and Theory / v.31 no.12 / pp.692-703 / 2004
  • Communication-induced checkpointing (CIC) is one of the checkpointing techniques that provide fault tolerance for distributed systems. Independent checkpoints that each distributed process produces without coordination are likely to be useless. Useless checkpoints, which cannot belong to any consistent global checkpoint set, induce nondeterminant rollback. To prevent useless checkpoints, CIC forces processes to take additional checkpoints at the proper moments. The number of those forced checkpoints is the main source of failure-free overhead in CIC. In this paper, we present two new CIC protocols that satisfy the 'No Z-Cycle (NZC)' property. The proposed protocols reduce the number of forced checkpoints compared to existing protocols, at the cost of increased message delay. Our simulation results with synthetic data show that the proposed protocols have lower failure-free overhead than the existing protocols. Additionally, we show that the classical 'index-based checkpointing' protocols are inefficient in constructing a consistent global cut in distributed executions.

A Multi-path QoS Routing Protocol for the OFDM-TDMA Mesh Networks (OFDM-TDMA 메쉬 네트워크를 위한 다중경로 QoS 라우팅 프로토콜)

  • Choi, Jungwook;Lee, Hyukjoon
    • The Journal of The Korea Institute of Intelligent Transport Systems / v.14 no.1 / pp.57-67 / 2015
  • A large amount of work has been done on routing, MAC, QoS, capacity, location service, cooperative communication, fault tolerance, mobility models, and various applications of mesh networks, thanks to their cost-effective deployment and their flexibility in extending wireline services. Although multi-path routing protocols have been proposed to provide QoS and fault-tolerance, to the best of our knowledge no significant results that support both have been reported in the literature, even though both are often required in military and public safety applications. In this paper, we present a novel routing protocol for a mesh network based on the OFDM-TDMA collision-free MAC. The protocol discovers and maintains multiple paths, which allows packets blocked by a link failure to be retransmitted and forwarded through an alternative next-hop node, such that the delay-capacity tradeoff is reduced and the reliability is enhanced. Simulation results show that the proposed protocol performs well in terms of both QoS and delivery ratio.

A Robust Energy Saving Data Dissemination Protocol for IoT-WSNs

  • Kim, Moonseong;Park, Sooyeon;Lee, Woochan
    • KSII Transactions on Internet and Information Systems (TIIS) / v.12 no.12 / pp.5744-5764 / 2018
  • In Wireless Sensor Networks (WSNs) for Internet of Things (IoT) environments, fault tolerance is a fundamental issue because of the strict energy constraints of sensor nodes. In this paper, a robust energy-saving data dissemination protocol for IoT-WSNs is proposed. Minimizing energy consumption and dissemination delay based on signal strength plays an important role in our scheme. The representative dissemination protocol SPIN (Sensor Protocols for Information via Negotiation) overcomes the overlapping-data problem of the classical Flooding scheme. However, SPIN does not consider the distance between nodes, so dissemination energy consumption becomes a more important problem. To minimize energy consumption, the shortest path between sensors should be considered when disseminating data through the entire IoT-WSN. The SPMS (Shortest Path Mined SPIN) scheme creates routing tables using the Bellman-Ford method and forwards data in a multi-hop manner to optimize power consumption and delay. Because of these properties, it is very hard to avoid heavy traffic when routing information is updated. Additionally, node failures in SPMS can be caused by the frequent use of certain sensors on the shortest path, so the network lifetime may shorten quickly. In contrast, our scheme is resilient to these failures because it employs an energy-aware concept. The dissemination delay of the proposed protocol, which works without a routing table, is similar to that of the shortest path-based SPMS. In addition, because our protocol does not require a routing table, which needs many control packets, it prevents excessive control message generation. Finally, the proposed scheme outperforms previous schemes in terms of data transmission success ratio, so our protocol could be appropriate for IoT-WSN environments.
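
The Bellman-Ford method mentioned for SPMS can be sketched as below. This is the textbook algorithm on a small weighted graph, not SPMS itself or the proposed protocol; the link costs standing in for transmission energy, and the names `undirected` and `bellman_ford`, are assumptions for illustration.

```python
from typing import Dict, Iterable, List, Optional, Tuple

Edge = Tuple[str, str, float]


def undirected(links: Iterable[Edge]) -> List[Edge]:
    """Expand each wireless link into two directed edges of equal cost."""
    edges: List[Edge] = []
    for u, v, cost in links:
        edges.append((u, v, cost))
        edges.append((v, u, cost))
    return edges


def bellman_ford(nodes: List[str], edges: List[Edge],
                 source: str) -> Tuple[Dict[str, float], Dict[str, Optional[str]]]:
    """Return the cheapest cost from `source` to every node and each node's predecessor."""
    dist: Dict[str, float] = {n: float("inf") for n in nodes}
    pred: Dict[str, Optional[str]] = {n: None for n in nodes}
    dist[source] = 0.0
    for _ in range(len(nodes) - 1):  # at most |V| - 1 relaxation rounds
        updated = False
        for u, v, cost in edges:
            if dist[u] + cost < dist[v]:
                dist[v] = dist[u] + cost
                pred[v] = u
                updated = True
        if not updated:
            break  # converged early
    return dist, pred
```

On a three-node network where the direct link A-C costs 5 and the links A-B and B-C cost 1 each, the table routes traffic from A to C through B. Every change in link cost requires the relaxation rounds to be repeated, which illustrates why routing-table updates generate extra traffic.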