
An adaptive approach to achieving hardware and software fault tolerance in a distributed computing environment

✍ Scribed by A. Bondavalli; S. Chiaradonna; F. Di Giandomenico; J. Xu


Publisher: Elsevier Science
Year: 2002
Tongue: English
Weight: 356 KB
Volume: 47
Category: Article
ISSN: 1383-7621


✦ Synopsis


This paper addresses the problem of tolerating both hardware and software faults in independent applications running in a distributed computing environment. Several hybrid-fault-tolerant architectures are identified and proposed. Because the characteristics of the operating environment are highly variable and dynamic, the solutions rely mainly on adaptation: redundant programs are executed adaptively so as to minimise hardware resource consumption and to shorten response time as far as possible for a required level of fault tolerance. A method is introduced for evaluating the proposed architectures with respect to reliability, resource utilisation and response time. Examples of quantitative evaluations are also given.