Application Level Fault Tolerance in Het
Adam Beguelin; Erik Seligman; Peter Stephan
Article
1997
Elsevier Science
English
425 KB
We have explored methods for checkpointing and restarting processes within the Distributed Object Migration Environment (Dome), a C++ library of data-parallel objects that are automatically distributed over heterogeneous networks of workstations (NOWs). System level checkpointing methods, although t