Partially Decentralized Passive Replication Algorithm

  • Jinho Ahn (Department of Computer Science, Division of Information Science, Kyonggi University)
  • Published : 2005.12.01

Abstract

This paper presents a partially decentralized passive replication algorithm for deterministic servers in message-passing distributed systems. The algorithm allows any backup server, not only the primary, to take responsibility for processing a client request it receives and for coordinating with the other replica servers, once it has obtained the delivery sequence number of the request from the primary. Thanks to this desirable feature, the algorithm, combined with conventional load-balancing techniques, can efficiently avoid extreme load conditions on the primary. It can therefore provide better scalability for deterministic replicated server systems than traditional passive replication algorithms. Simulation results indicate that the proposed algorithm reduces the average response time of a client request by $16.5\%$ to $52.3\%$ compared with traditional algorithms.
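The request flow described in the abstract can be sketched as a small single-process simulation. This is only an illustration under stated assumptions, not the paper's algorithm: the names `Replica`, `Group`, `deliver_update`, and `handle_request` are hypothetical, the replicated state is assumed to be a simple key-value counter store, message passing is replaced by direct method calls, and failures and primary election are not modeled.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """One deterministic server replica (hypothetical model) with a key-value state."""
    name: str
    state: dict = field(default_factory=dict)
    next_seq: int = 0                              # next sequence number to apply
    pending: dict = field(default_factory=dict)    # seq -> (key, value), buffered

    def deliver_update(self, seq, update):
        """Buffer a state update, then apply buffered updates in sequence order."""
        self.pending[seq] = update
        while self.next_seq in self.pending:
            key, value = self.pending.pop(self.next_seq)
            self.state[key] = value
            self.next_seq += 1


class Group:
    """A replica group; the first name is assumed to be the primary."""

    def __init__(self, names):
        self.replicas = {n: Replica(n) for n in names}
        self.primary = names[0]
        self._counter = 0                          # sequence counter owned by the primary
        self.handled = {n: 0 for n in names}       # requests processed per server

    def _assign_seq(self):
        """Stands in for the round trip to the primary that orders the request."""
        seq = self._counter
        self._counter += 1
        return seq

    def handle_request(self, receiver, key, delta):
        """The receiving server (primary or backup) orders, processes, and propagates."""
        seq = self._assign_seq()                   # 1. obtain the delivery sequence number
        server = self.replicas[receiver]
        if server.next_seq != seq:                 # earlier updates must already be applied
            raise RuntimeError("receiver has not applied all earlier updates")
        new_value = server.state.get(key, 0) + delta   # 2. process the request locally
        for replica in self.replicas.values():     # 3. send the state update to all replicas
            replica.deliver_update(seq, (key, new_value))
        self.handled[receiver] += 1
        return new_value                           # 4. reply to the client


group = Group(["primary", "backup1", "backup2"])
group.handle_request("backup1", "x", 5)
group.handle_request("backup2", "x", 2)
group.handle_request("primary", "y", 1)
```

In this sketch the primary still assigns every sequence number, but the processing work in step 2 is spread over whichever servers receive the requests, which is the load-relief effect the abstract attributes to combining the algorithm with load balancing.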
