An adaptive approach to achieving hardware and software fault tolerance in a distributed computing environment
✍ Scribed by A. Bondavalli; S. Chiaradonna; F. Di Giandomenico; J. Xu
- Publisher
- Elsevier Science
- Year
- 2002
- Language
- English
- File size
- 356 KB
- Volume
- 47
- Category
- Article
- ISSN
- 1383-7621
✦ Synopsis
This paper addresses the problem of tolerating both hardware and software faults in independent applications running in a distributed computing environment. Several hybrid fault-tolerant architectures are identified and proposed. Because the characteristics of the operating environment are highly variable and dynamic, the solutions rely mainly on adaptation: redundant programs are executed adaptively so as to minimise hardware resource consumption and to shorten response time as much as possible for a required level of fault tolerance. A method is introduced for evaluating the proposed architectures with respect to reliability, resource utilisation and response time. Examples of quantitative evaluations are also given.
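The idea of adaptive execution of redundant programs can be illustrated with a minimal sketch. It is not one of the architectures proposed in the paper. It only assumes a generic phased scheme in which the smallest number of variants that could reach agreement is run first, and further variants are run only when the results disagree. The names `adaptive_execute`, `agreement` and `initial`, and the toy variants, are hypothetical and chosen for illustration.

```python
from collections import Counter

def adaptive_execute(variants, x, agreement=2, initial=2):
    """Run redundant variants in phases until `agreement` identical results appear.

    Illustrative assumption, not the paper's algorithm.
    Results must be hashable. Returns (result or None, variants executed).
    """
    votes = Counter()
    executed = 0
    batch = initial
    while executed < len(variants):
        phase = variants[executed:executed + batch]
        executed += len(phase)
        for variant in phase:
            try:
                votes[variant(x)] += 1
            except Exception:
                continue  # a crashed variant contributes no vote
        value, count = votes.most_common(1)[0] if votes else (None, 0)
        if count >= agreement:
            return value, executed
        # Next phase: run only as many extra variants as could still be needed.
        batch = max(1, agreement - count)
    return None, executed  # no agreement: the failure is detected, not masked

# Hypothetical variants of the same computation; `square_faulty` models a software fault.
def square_a(v): return v * v
def square_b(v): return v ** 2
def square_faulty(v): return v * v + 1

fault_free = adaptive_execute([square_a, square_b, square_faulty], 3)
one_fault = adaptive_execute([square_a, square_faulty, square_b], 3)
```

In the fault-free case only two variants run, so the third consumes no resources. When one of the first two is faulty, a second phase runs one more variant, trading a longer response time for the required agreement.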