Replication of Checkpoints in Recoverable DSM Systems

J. Brzeziński and M. Szychowiak (Poland)


distributed shared memory, recovery, checkpointing


This paper presents a new checkpoint replication technique for object-based Distributed Shared Memory (DSM) systems. Integrated with a coherence protocol for the atomic consistency model, the technique provides high availability of shared objects despite multiple node and communication failures, while introducing little overhead. It ensures fast recovery after multiple node failures and enables a DSM system to circumvent network partitioning, as long as a majority partition can be formed.

