M. Fairén and A. Vinacua (Spain)
Distributed applications, Software development tools,
Fault-tolerance, Process state recovery.
Complex applications may benefit from spreading out their
computational needs over the nodes of a network. However,
in doing so, they become more prone to failure because of
communications disruption or single-node failures.
ATLAS, a framework supporting the development of
such distributed applications with minimal programming
effort, provides simple transaction-style mechanisms to recover
from such failures, or even a total crash. This article
is an overview of the design criteria followed and the mechanisms
implemented in ATLAS to do so.