In this article, we simulate and evaluate various Twolevel Scheduling algorithms for cluster-based NUMA (Non-Uniform Memory Access) multiprocessors. Twolevel Scheduling is a kind of space partitioning scheduling. We evaluate the following variations: (1) Cluster-free Algorithm and (2) Cluster-limite
Probabilistic system-level fault diagnostic algorithms for multiprocessors
✍ Scribed by Tamás Bartha; Endre Selényi
- Publisher
- Elsevier Science
- Year
- 1997
- Language
- English
- File size
- 914 KB
- Volume
- 22
- Category
- Article
- ISSN
- 0167-8191
✦ Synopsis
Massively parallel computers (MPCs) introduce new requirements for system-level fault diagnosis, such as handling a huge number of processing elements in a heterogeneous system. They also have specific attributes, such as regular topology and low local complexity. Traditional deterministic methods of system-level diagnosis did not consider these issues. This paper presents a new approach, called local information diagnosis, that exploits the characteristics of massively parallel systems. The paper defines the diagnostic model, which is based on generalized test invalidation to handle inhomogeneity in multiprocessors.
Five effective probabilistic diagnostic algorithms using the proposed method are also given, and their space and time complexity are estimated.
📜 SIMILAR VOLUMES
Several schemes for detecting and locating faulty processors through self-diagnosis in multiprocessor systems have been discussed in the past. These schemes attempt to start multiple copies (versions) of the tasks on available idle processors simultaneously and compare the results generated by the c
We studied adaptive system-level fault diagnosis for multiprocessor systems. Processors can test each other, and future tests can be selected on the basis of previous test results. Fault-free testers always give correct test results, while faulty testers are completely unreliable. The aim of diagnosi
System reliability is an important aspect of real-time systems, because the result of a real-time application may be valid only if the application functions correctly and its timing constraints are satisfied. There are two kinds of faults, hardware and software faults, and the paper considers hardwa