
A Comparison of Implementation Strategies for Nonuniform Data-Parallel Computations

✍ Scribed by Salvatore Orlando; Raffaele Perego


Publisher: Elsevier Science
Year: 1998
Tongue: English
Weight: 348 KB
Volume: 52
Category: Article
ISSN: 0743-7315


✦ Synopsis


Data-parallel languages allow programmers to easily express parallel computations by means of high-level constructs. To reduce overheads, the compiler partitions the computations among the processors at compile-time, on the basis of the static data distribution suggested by the programmer. When execution costs are nonuniform and unpredictable, some processors may be assigned more work than others. Workload imbalance can be mitigated by cyclically distributing data and associated computations or by employing adaptive strategies which build a more balanced schedule at run-time, on the basis of the actual execution costs. This paper discusses static and hybrid (static+dynamic) scheduling strategies which can be used to balance the workloads derived from the execution of nonuniform parallel loops. A multidimensional flame simulation kernel has been used to evaluate different implementation strategies on a Cray T3E. We fed the benchmark code with synthetic input data sets built on the basis of a load imbalance model and we report and compare the results obtained.
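The effect of a cyclic distribution on a nonuniform loop can be illustrated with a small sketch. This is not the paper's benchmark or its load imbalance model: the cost profile (a "hot" first quarter of iterations costing 10 units, the rest 1 unit) and the helper names `block_owner`, `cyclic_owner`, `loads`, and `imbalance` are assumptions made only for illustration.

```python
# Illustrative sketch only: compares block and cyclic static distributions
# of loop iterations with nonuniform costs. The cost profile below is an
# assumption, not the load imbalance model used in the paper.

def block_owner(i, n, p):
    """Processor owning iteration i when n iterations are split into p contiguous blocks."""
    chunk = -(-n // p)  # ceiling division
    return i // chunk

def cyclic_owner(i, n, p):
    """Processor owning iteration i under a round-robin (cyclic) distribution."""
    return i % p

def loads(costs, p, owner):
    """Total cost assigned to each of the p processors under the given ownership rule."""
    n = len(costs)
    totals = [0] * p
    for i, c in enumerate(costs):
        totals[owner(i, n, p)] += c
    return totals

def imbalance(totals):
    """Ratio of the most loaded processor to the average load (1.0 is perfectly balanced)."""
    return max(totals) / (sum(totals) / len(totals))

n, p = 64, 4
# Hypothetical cost profile: the first quarter of the iterations is 10x more expensive.
costs = [10 if i < n // 4 else 1 for i in range(n)]

block = loads(costs, p, block_owner)
cyclic = loads(costs, p, cyclic_owner)

print("block :", block, "imbalance", round(imbalance(block), 2))
print("cyclic:", cyclic, "imbalance", round(imbalance(cyclic), 2))
```

With this particular profile the block distribution places every expensive iteration on one processor, while the cyclic distribution spreads them evenly. A cyclic distribution does not help when costs are unpredictable in ways that correlate with the stride, which is the case the run-time (hybrid) strategies mentioned in the synopsis are meant to address.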


📜 SIMILAR VOLUMES


Runtime Support for Parallelization of D
✍ Maher Kaddoura; Sanjay Ranka 📂 Article 📅 1997 🏛 Elsevier Science 🌐 English ⚖ 96 KB

In this paper we discuss the runtime support required for the parallelization of unstructured data-parallel applications on nonuniform and adaptive environments. We describe several optimization techniques for fast remapping of data and for reducing the amount of communications between machines when

Parallel implementation of a ray tracing
✍ Lee, Tong-Yee; Raghavendra, C. S.; Nicholas, John B. 📂 Article 📅 1997 🏛 John Wiley and Sons 🌐 English ⚖ 145 KB

Ray tracing is a well known technique to generate life-like images. Unfortunately, ray tracing complex scenes can require large amounts of CPU time and memory storage. Distributed memory parallel computers with large memory capacities and high processing speeds are ideal candidates to perform ray tr

Towards a complete framework for paralle
✍ Succi, Giancarlo; Uhrik, Carl 📂 Article 📅 1996 🏛 John Wiley and Sons 🌐 English ⚖ 784 KB

Although logic languages, due to their nnn-declarative nature, are widely proclaimed to be conducive in theory to parallel implementation, in fact there appears to be insufficient practical evidence to stimulate further developments in this field. The paper puts forward various complications which a