Processor Allocation and Checkpoint Interval Selection in Cluster Computing Systems
Scribed by James S. Plank; Michael G. Thomason
- Publisher
- Elsevier Science
- Year
- 2001
- Language
- English
- File size
- 301 KB
- Volume
- 61
- Category
- Article
- ISSN
- 0743-7315
Synopsis
Performance prediction of checkpointing systems in the presence of failures is a well-studied research area. While the literature abounds with performance models of checkpointing systems, none addresses the issue of selecting runtime parameters other than the optimal checkpointing interval. In particular, the issue of processor allocation is typically ignored. In this paper, we present a performance model for long-running parallel computations that execute with checkpointing enabled. We then discuss how it is relevant to today's parallel computing environments and software, and present case studies of using the model to select runtime parameters.