
Application controlled checkpointing coordination for fault-tolerant distributed computing systems

Scribed by Taesoon Park; Heon Y. Yeom


Publisher: Elsevier Science
Year: 2000
Language: English
File size: 826 KB
Volume: 26
Category: Article
ISSN: 0167-8191


Synopsis


To provide fault tolerance in distributed systems, checkpointing has been widely used, and much research has aimed at reducing the overhead of checkpointing coordination. In this paper, we present a new checkpointing coordination scheme in which the application controls the coordination activity by exploiting the communication pattern of the application program. Unlike previous solutions, which do not use the communication pattern of the cooperating processes, the scheme can reduce both the coordination effort and the number of forced checkpoints. Extensive simulations were performed to evaluate the proposed scheme, and we conclude that it significantly reduces the coordination overhead compared with the existing loose coordination scheme.


SIMILAR VOLUMES


Distributed replication mechanism for bu
Hideaki Hirayama; Toshio Shirakihara; Tatsunori Kanai | Article | 2000 | John Wiley and Sons | English | 161 KB

A distributed fault tolerant middleware system called ARTEMIS (Advanced Reliable disTributed Environment MIddleware System) was developed for the purpose of building fault tolerant systems without modifying either the source code or the binary code of application programs in open systems. In ARTEMIS