
Design, implementation and evaluation of ICARE: an efficient recoverable DSM

✍ Scribed by A.-M. Kermarrec; C. Morin; M. Banâtre


Publisher: John Wiley and Sons
Year: 1998
Language: English
File size: 348 KB
Volume: 28
Category: Article
ISSN: 0038-0644


✦ Synopsis


In light of the increasing throughput of local area networks, Networks Of Workstations (NOWs) that provide a Distributed Shared Memory (DSM) have become a convenient and cheaper alternative to parallel architectures for parallel scientific applications. However, the probability that a failure occurs in such a system, made up of a large number of components, must not be neglected, especially for long-running applications. This paper presents the design, implementation and performance evaluation of ICARE, a page-based recoverable DSM implemented on top of an ATM-based NOW running the CHORUS microkernel. ICARE relies on a Backward Error Recovery (BER) mechanism, and provides a way to combine efficiency and high availability. Storing checkpoints in volatile memory provides a low-cost fault-tolerance mechanism, as well as the opportunity to exploit the symbiotic relationship between the data replication implemented in DSM systems and that needed for fault tolerance. Furthermore, ICARE efficiently implements transparent process rollback recovery. Performance evaluations show the efficiency of the ICARE prototype that implements the proposed algorithms.


📜 SIMILAR VOLUMES


The implementation and evaluation of the
✍ Susan D. Urban; Ling Fu; Jami J. Shah 📂 Article 📅 1999 🏛 John Wiley and Sons 🌐 English ⚖ 425 KB

Many computer applications today require some form of distributed computing to allow different software components to communicate. Several different commercial products now exist based on the Common Object Request Broker Architecture (CORBA) of the Object Management Group. The use of such tools, how