• Title/Summary/Keyword: Fault-Tolerance Computing System

An Efficient Mutual Exclusion Protocol in a Mobile Computing Environment

  • Park, Sung-Hoon
    • International Journal of Contents
    • /
    • v.2 no.4
    • /
    • pp.25-30
    • /
    • 2006
  • The mutual exclusion (MX) paradigm can serve as a building block in many practical problems, such as group communication, atomic commitment, and replicated data management, where the exclusive use of an object might be useful. The problem has been widely studied by the research community, in part because many distributed protocols need a mutual exclusion protocol. However, despite its usefulness, to our knowledge no work has been devoted to this problem in a mobile computing environment. In this paper, we describe a solution to the mutual exclusion problem for mobile computing systems. The solution is based on a token-based mutual exclusion algorithm.

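As a rough illustration of the token-based idea named in the abstract above, the following minimal sketch simulates a token circulating around a logical ring. The ring layout, the `TokenRing` class, and its method names are assumptions made for illustration only; the abstract does not say how the paper's protocol passes the token or how it handles host mobility, and none of that is modeled here.

```python
class TokenRing:
    """Toy simulation: only the current token holder may enter the critical section."""

    def __init__(self, n):
        self.n = n                  # number of processes on the (hypothetical) ring
        self.holder = 0             # process currently holding the token
        self.wants = [False] * n    # pending critical-section requests
        self.log = []               # order in which processes entered the critical section

    def request(self, pid):
        """Process pid asks for the critical section."""
        self.wants[pid] = True

    def step(self):
        """The holder enters the critical section if it asked, then forwards the token."""
        if self.wants[self.holder]:
            self.log.append(self.holder)
            self.wants[self.holder] = False
        self.holder = (self.holder + 1) % self.n


ring = TokenRing(4)
ring.request(2)
ring.request(1)
for _ in range(4):      # one full circulation of the token
    ring.step()
```

Because exactly one token exists, at most one process can be in the critical section at any time, and every request is served within one circulation of the token.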

Mobile Agent Location Management Protocol for Spatial Replication-based Approach in Mobile Agent Computing Environments (이동 에이전트 컴퓨팅 환경에서 공간적 복제 기반 기법을 위한 이동 에이전트 위치관리 프로토콜)

  • Yoon, Jun-Weon;Choi, Sung-Jin;Ahn, Jin-Ho
    • The KIPS Transactions:PartA
    • /
    • v.13A no.5 s.102
    • /
    • pp.455-464
    • /
    • 2006
  • In multi-regional mobile agent computing environments, the spatial replication-based approach may be used as a representative mobile agent fault-tolerance technique because it allows agent execution to make progress without blocking, even in case of agent failures. However, to apply this approach to real mobile agent-based computing systems, it is essential to minimize the overhead of locating and managing the mobile agents replicated at each stage. This paper presents a new mobile agent location management protocol, SRLM, to solve this problem. The proposed protocol allows only the primary among all the replicated workers of each stage to register with its regional server, and thus significantly reduces location updating and message delivery overheads compared with previous protocols. The protocol also addresses the location management problem that arises when a new primary is elected among the remaining workers at a stage after the primary worker fails.

Construction of the Multiple Processing Unit by De Bruijn Graph (De Bruijn 그래프에 의한 다중처리기 구성)

  • Park, Chun-Myoung
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.10 no.12
    • /
    • pp.2187-2192
    • /
    • 2006
  • This paper presents a method of constructing the universal multiple processing element unit (UMPEU) using the De Bruijn graph. The method is as follows. First, we propose transformation operators for constructing the De Bruijn UMPEU using properties of the graph. Second, we construct the transformation table of the De Bruijn graph using these transformation operators. Finally, we construct the De Bruijn graph using the transformation table. The proposed UMPEU is able to construct the De Bruijn graph for any prime number and integer value of finite fields. The UMPEU can also be applied to fault-tolerant computing systems, pipeline classes, parallel processing networks, and switching functions and their circuits.
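For readers unfamiliar with the structure, a De Bruijn graph can be sketched with the standard shift-and-append definition. This is not the paper's transformation-operator or transformation-table method, which the abstract does not detail; the function name `de_bruijn_graph` and the parameters `k` (alphabet size) and `n` (word length) are assumptions chosen for illustration.

```python
from itertools import product


def de_bruijn_graph(k, n):
    """Adjacency dict of the De Bruijn graph over k symbols and words of length n.

    Each node is a length-n tuple; its successors are obtained by dropping the
    first symbol and appending any one of the k symbols.
    """
    nodes = product(range(k), repeat=n)
    return {v: [v[1:] + (s,) for s in range(k)] for v in nodes}


graph = de_bruijn_graph(2, 3)   # binary words of length 3: 8 nodes, 16 edges
```

Every node has exactly `k` successors and `k` predecessors, so the graph offers several alternative routes between processing elements, which is one reason this topology appears in fault-tolerant interconnection designs.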

Development of Virtual Parallel Processing System for Flexible Task Allocation on the Web (웹 환경에서 유연성 있는 작업 할당을 위한 가상 병렬 처리 시스템 개발)

  • 정권호;송은하;정영식
    • Journal of Korea Multimedia Society
    • /
    • v.3 no.3
    • /
    • pp.320-332
    • /
    • 2000
  • The Web forms a vast virtual system made up of all the computers connected to the network. Large problems that demand high cost-performance and substantial computing power can be solved by processing them in parallel on the many idle systems on the Internet. However, running a parallel system in a global environment that spans the whole Internet, rather than a single network, requires us to consider heterogeneous computing resources, accessibility, and reliability. In this paper, we present the WebImg system, which harnesses the power of web computing, and show a flexible task allocation strategy for heterogeneous hosts. We also evaluate its performance. Moreover, the proposed task allocation strategy provides fault tolerance by controlling host status at any time.

Migration Mechanism Supporting Efficient Fault-Tolerance on Agent Platform (에이전트 플랫폼에서의 효율적인 결함-허용을 제공하는 이주 기법)

  • Seo, Dong-Min;Yun, Jong-Hyeon;Yeo, Myung-Ho;Yoo, Jae-Soo;Cho, Ki-Hyung
    • The Journal of the Korea Contents Association
    • /
    • v.7 no.9
    • /
    • pp.89-99
    • /
    • 2007
  • With the development of Internet technology, network application services based on a large number of network nodes have attracted attention. However, such application services require a much larger network size and much more traffic than current networks support. Developing them requires efficient solutions, not merely improvements in the processing time of the network infrastructure. In this paper, to contribute to the improvement of network computing technology, we design and implement agent platform software based on agent technology that performs work independently and asynchronously on a network and platform. The proposed agent platform software supports scalability to accommodate a rapidly growing number of network hosts, adaptability to changing environments, and availability through fault tolerance.

Adaptive Scheduling Technique Based on Reliability in Cloud Computing Environment (클라우드 컴퓨팅 환경에서 신뢰성 기반 적응적 스케줄링 기법)

  • Cho, In-Seock;Yu, Heon-Chang
    • The Journal of Korean Association of Computer Education
    • /
    • v.14 no.2
    • /
    • pp.75-82
    • /
    • 2011
  • Cloud computing is a computing paradigm that provides services to users anywhere, anytime, in a virtualized form composed of large computing resources based on the Internet or an intranet. In cloud computing environments, system reliability is an important factor because many applications handle large amounts of data. In this paper, we propose a reliability-based adaptive scheduling technique with fault tolerance that manages resource variability and resolves problems (changes in user requirements, failure occurrence) in a cloud computing environment. Furthermore, we verified the performance of the proposed scheduling technique through experiments in the CloudSim simulator.

Fault Tolerant System based on Recovery Agents (회복 에이전트 기반 결함 포용 시스템)

  • Lee, Hwa-Min;Jung, Soon-Young;Yu, Heon-Chang
    • The Journal of Korean Association of Computer Education
    • /
    • v.5 no.2
    • /
    • pp.21-28
    • /
    • 2002
  • This paper proposes a new approach to rollback-recovery using multiple agents in distributed computing systems. Previous rollback-recovery protocols depend on the underlying communication and operating systems, which degrades computing performance in distributed computing systems. By using multiple agents, we propose a rollback-recovery protocol that is independent of the operating system. We define three kinds of agents. The first is a recovery agent that performs the rollback-recovery protocol after a failure. The second is an information agent that constructs domain knowledge as fault-tolerance rules and information during failure-free operation. The third is a facilitator agent that controls efficient communication between agents. We also simulated the proposed rollback-recovery protocol using Java and an agent communication language in a CORBA environment.

DOVE : A Distributed Object System for Virtual Computing Environment (DOVE : 가상 계산 환경을 위한 분산 객체 시스템)

  • Kim, Hyeong-Do;Woo, Young-Je;Ryu, So-Hyun;Jeong, Chang-Sung
    • Journal of KIISE:Computing Practices and Letters
    • /
    • v.6 no.2
    • /
    • pp.120-134
    • /
    • 2000
  • In this paper we present a Distributed Object oriented Virtual computing Environment, called DOVE, which consists of autonomous distributed objects interacting with one another via method invocations based on a distributed object model. To a user, DOVE logically presents a set of heterogeneous hosts connected by a network as a single virtual computer, as if objects at remote sites resided in one virtual computer. By supporting efficient parallelism, heterogeneity, group communication, a single global name service, and fault tolerance, it provides a transparent and easy-to-use programming environment for parallel applications. Efficient parallelism is supported by diverse remote method invocation, multiple method invocation for object groups, a multi-threaded architecture, and synchronization schemes. Heterogeneity is achieved by automatic data marshalling and unmarshalling, and an easy-to-use and transparent programming environment is provided by stub and skeleton objects generated by the DOVE IDL compiler, and by the object life control and naming service of the object manager. Autonomy of distributed objects, a multi-layered architecture, and decentralized approaches to hierarchical naming service and object management make DOVE more extensible and scalable. Also, fault tolerance is provided by fault detection in objects using a timeout mechanism, and by fault notification using asynchronous exception handling methods.

Integrating Resilient Tier N+1 Networks with Distributed Non-Recursive Cloud Model for Cyber-Physical Applications

  • Okafor, Kennedy Chinedu;Longe, Omowunmi Mary
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.16 no.7
    • /
    • pp.2257-2285
    • /
    • 2022
  • Cyber-physical systems (CPS) have been growing exponentially due to improved cloud-datacenter infrastructure-as-a-service (CDIaaS). Incremental expandability (scalability), Quality of Service (QoS) performance, and reliability are currently the automation focus on healthy Tier 4 CDIaaS. However, stable QoS is yet to be fully addressed in Cyber-physical data centers (CP-DCS). Also, balanced agility and flexibility for the application workloads need urgent attention. There is a need for a resilient and fault-tolerant scheme in terms of CPS routing service, including Pod cluster reliability analytics, that meets QoS requirements. Motivated by these concerns, our contributions are fourfold. First, a Distributed Non-Recursive Cloud Model (DNRCM) is proposed to support cyber-physical workloads for remote lab activities. Second, an efficient QoS stability model with Routh-Hurwitz criteria is established. Third, an evaluation of the CDIaaS DCN topology is validated for handling large-scale traffic workloads. Network Function Virtualization (NFV) with Floodlight SDN controllers was adopted for the implementation of DNRCM with an embedded rule-base in Open vSwitch engines. Fourth, QoS evaluation is carried out experimentally. Considering the non-recursive queuing delays with SDN isolation (logical), a lower queuing delay (19.65%) is observed. Without logical isolation, the average queuing delay is 80.34%. Without logical resource isolation, the fault tolerance yields 33.55%, while with logical isolation, it yields 66.44%. In terms of throughput, DNRCM, recursive BCube, and DCell offered 38.30%, 36.37%, and 25.53% respectively. Similarly, the DNRCM had an improved incremental scalability profile of 40.00%, while BCube and Recursive DCell had 33.33% and 26.67% respectively. In terms of service availability, the DNRCM offered 52.10% compared with recursive BCube and DCell, which yielded 34.72% and 13.18% respectively. The average delays obtained for DNRCM, recursive BCube, and DCell are 32.81%, 33.44%, and 33.75% respectively. Finally, workload utilization for DNRCM, recursive BCube, and DCell yielded 50.28%, 27.93%, and 21.79% respectively.
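The Routh-Hurwitz criterion mentioned in the abstract above can be illustrated with a small sketch that builds the Routh array of a characteristic polynomial and checks the signs of its first column. The abstract does not give the paper's QoS stability model, so the example polynomials below are hypothetical, and the function names are assumptions. The special cases where a zero appears in the first column are deliberately not handled.

```python
def routh_first_column(coeffs):
    """First column of the Routh array; coeffs are listed highest degree first."""
    n = len(coeffs)
    width = (n + 1) // 2
    row0 = [float(c) for c in coeffs[0::2]]
    row1 = [float(c) for c in coeffs[1::2]]
    row0 += [0.0] * (width - len(row0))   # pad both rows to the same width
    row1 += [0.0] * (width - len(row1))
    rows = [row0, row1]
    for _ in range(n - 2):
        prev2, prev1 = rows[-2], rows[-1]
        if prev1[0] == 0:
            raise ValueError("zero in first column: special case not handled in this sketch")
        new_row = [
            (prev1[0] * prev2[j + 1] - prev2[0] * prev1[j + 1]) / prev1[0]
            for j in range(width - 1)
        ] + [0.0]
        rows.append(new_row)
    return [row[0] for row in rows]


def is_stable(coeffs):
    """True when the first column has no sign change, i.e. all roots lie in the left half-plane."""
    column = routh_first_column(coeffs)
    return all(x > 0 for x in column) or all(x < 0 for x in column)


# Hypothetical polynomials: (s+1)(s+2)(s+3) and (s+3)(s^2 - 2s + 8)
stable_case = is_stable([1, 6, 11, 6])
unstable_case = is_stable([1, 1, 2, 24])
```

The first polynomial has all roots in the left half-plane, so its first column keeps one sign; the second has a complex pair with positive real part, which shows up as sign changes in the first column.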

Design and Implementation of a Multi-Intelligent Agent based Platform for a Bio-Inspired System (생태계 모방 시스템을 위한 멀티 지능형 에이전트 기반의 플랫폼 설계 및 구현)

  • Moon, Joo-Sun;Nang, Jong-Ho
    • Journal of KIISE:Computing Practices and Letters
    • /
    • v.13 no.7
    • /
    • pp.545-549
    • /
    • 2007
  • The Bio-Inspired System focuses on the creation of an effective system model for massive network applications and is being widely developed. However, it is difficult to implement three features in such a system: scalability, adaptability, and survivability. To solve this problem, we designed the Ecogent, a multi-intelligent agent, and a Bio-Platform to address these three features. The Bio-Inspired System Platform consists of an ERS (Ecogent Runtime Services) Platform and a Bio-Platform. The ERS Platform provides the basic functions of mobile agents, such as Registration, Life Cycle, Migration, Communication, Location, and Fault Tolerance. The Bio-Platform includes the functions of Evolution Control and Stigmergy Control to address evolution and adaptation.