A Recovery Scheme of Single Node Failure using Version Caching in Database Sharing Systems

데이타베이스 공유 시스템에서 버전 캐싱을 이용한 단일 노드 고장 회복 기법

  • Published : 2004.08.01

Abstract

A database sharing system (DSS) couples multiple computing nodes for high-performance transaction processing, with every node sharing the database at the disk level. When nodes in a DSS fail, a recovery algorithm must restore the database to a consistent state. Recovery in a DSS takes longer than in a single-node database system because the log records scattered across several nodes must first be merged, and the REDO tasks are then performed using the merged log records. In this paper, we propose a two version caching (2VC) algorithm that improves on the cache fusion algorithm introduced in Oracle 9i Real Application Cluster (ORAC). 2VC achieves faster database recovery by eliminating the need for merged log records in the case of a single node failure. It also improves the performance of normal transaction processing by reducing the unnecessary disk force overhead that occurs in ORAC.
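
To make the cost of conventional recovery concrete, the following is a minimal Python sketch of the merged-log REDO step that the abstract says 2VC avoids for a single node failure. It is not the 2VC algorithm or ORAC's implementation. The record layout (`LogRecord`), the globally ordered `lsn` field, and the helpers `merge_logs` and `redo` are all assumptions made for illustration; the abstract does not specify any of them.

```python
import heapq
from dataclasses import dataclass


@dataclass(frozen=True)
class LogRecord:
    lsn: int       # assumed: a sequence number that is ordered across all nodes
    node: str      # node that wrote the record
    page_id: str   # page the update applies to
    value: str     # after-image of the page (hypothetical, simplified)


def merge_logs(*node_logs):
    """Merge per-node logs, each already sorted by lsn, into one ordered stream."""
    return heapq.merge(*node_logs, key=lambda rec: rec.lsn)


def redo(disk_pages, merged):
    """Replay merged records onto a copy of the disk state.

    disk_pages maps page_id -> (page_lsn, value). A record is applied only
    if it is newer than the page version already on disk.
    """
    pages = dict(disk_pages)
    for rec in merged:
        page_lsn, _ = pages.get(rec.page_id, (0, None))
        if rec.lsn > page_lsn:
            pages[rec.page_id] = (rec.lsn, rec.value)
    return pages


# Two nodes updated the same pages, so neither local log alone is sufficient.
log_a = [LogRecord(1, "A", "P1", "a1"), LogRecord(4, "A", "P2", "a4")]
log_b = [LogRecord(2, "B", "P1", "b2"), LogRecord(3, "B", "P2", "b3")]
disk = {"P1": (1, "a1"), "P2": (0, "init")}

recovered = redo(disk, merge_logs(log_a, log_b))
print(recovered)
```

In this toy setup both pages were updated by both nodes, so the recovering node has to read and interleave every node's log before it can replay anything. That merge pass is the overhead the abstract identifies as the reason DSS recovery is slower than recovery in a single-node system.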
