๐”– Bobbio Scriptorium
โœฆ   LIBER   โœฆ

Fully Scalable Fault-Tolerant Simulations for BSP and CGM

โœ Scribed by Sung-Ryul Kim; Kunsoo Park


Publisher
Elsevier Science
Year
2000
Tongue
English
Weight
392 KB
Volume
60
Category
Article
ISSN
0743-7315

No coin nor oath required. For personal study only.

โœฆ Synopsis


In this paper we consider general simulations of algorithms designed for fully operational BSP and CGM machines on machines with faulty processors. The BSP (or CGM) machine is a parallel multicomputer consisting of p processors for which a memory of n words is evenly distributed and each processor can send and receive at most h messages in a superstep. The faults are deterministic (i.e., worst-case distributions of faults are considered) and static (i.e., they do not change in the course of computation). We assume that a constant fraction of processors are faulty.

We present two fault-tolerant simulation techniques for BSP and CGM:

  1. A deterministic simulation that achieves O(1) slowdown for local computations and O((log h p) 2 ) slowdown for communications per superstep, provided that a preprocessing is done that requires O((log h p) 2 ) supersteps and linear (in h) computation per processor in each superstep.

  2. A randomized simulation that achieves O(1) slowdown for local computations and O(log h p) slowdown for communications per superstep with high probability, after the same (deterministic) preprocessing as above.

Our results are fully scalable over all values of p from 3(1) to 3(n). Furthermore, our results imply that if p n = for 0<=<1 and h=3((nร‚ p) $ ) for 0<$ 1 (which hold in almost all practical BSP and CGM computations), algorithms can be made resilient to a constant fraction of processor faults without any asymptotic slowdown.


๐Ÿ“œ SIMILAR VOLUMES