Termination detection in data-driven parallel computations/applications

✍ Scribed by Ashfaq A. Khokhar; Susanne E. Hambrusch; Erturk Kocalar


Publisher: Elsevier Science
Year: 2003
Language: English
File size: 233 KB
Volume: 63
Category: Article
ISSN: 0743-7315

✦ Synopsis


High-performance computing applications with data-driven communication and computation characteristics require synchronization routines in the form of eureka, barrier, or termination synchronization. In this paper, we consider termination synchronization for two different execution models, the AP and the APS model. In the AP model, processors are either active or passive and a passive processor can be made active by another active processor. In the APS model, processors can also be in a server state. A passive processor entering the server state does not become active again. In addition, a server processor cannot change the status of other processors. We describe and analyze solutions for both models and present experimental work highlighting the differences between the models. We show that in almost all situations the use of an AP algorithm to detect termination in an APS environment will result in loss of performance. Our experimental work on the Cray T3E provides insight into where and why this performance loss occurs.
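The state rules stated in the synopsis can be sketched as a small state machine. This is only an illustration under stated assumptions, not the authors' algorithm: the names `State`, `Processor`, `become_passive`, `enter_server`, and `activate` are hypothetical, and the sketch assumes that a processor becomes passive when its local work is done and that the server state is entered only from the passive state. It models the transitions alone and does not attempt termination detection itself.

```python
from enum import Enum


class State(Enum):
    ACTIVE = "active"
    PASSIVE = "passive"
    SERVER = "server"  # exists only in the APS model


class Processor:
    """Hypothetical processor record; names are not taken from the paper."""

    def __init__(self, pid, state=State.ACTIVE):
        self.pid = pid
        self.state = state

    def become_passive(self):
        # Assumption: an active processor turns passive once its
        # local work is done. Other states are left unchanged.
        if self.state is State.ACTIVE:
            self.state = State.PASSIVE

    def enter_server(self):
        # APS only: a passive processor may enter the server state.
        # Returns True if the transition happened.
        if self.state is State.PASSIVE:
            self.state = State.SERVER
            return True
        return False

    def activate(self, target):
        # Only an active processor can change another processor's status,
        # and only a passive target can be made active. A server target
        # stays a server; a passive or server sender changes nothing.
        if self.state is State.ACTIVE and target.state is State.PASSIVE:
            target.state = State.ACTIVE
            return True
        return False
```

In this sketch, calling `activate` on a processor that has entered the server state returns `False` and leaves it a server, which mirrors the rule that a server never becomes active again. An AP-only run is the special case in which `enter_server` is never called.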


📜 SIMILAR VOLUMES


Data driven parallelism in experimental
✍ Martin Pohl 📂 Article 📅 1987 🏛 Elsevier Science 🌐 English ⚖ 650 KB

I present global design principles for the implementation of High Energy Physics data analysis code on sequential and parallel processors with mixed shared and local memory. Potential parallelism in the structure of High Energy Physics tasks is identified with granularity varying from a few times 10

Termination detection in parallel loop n
✍ Max Geigl; Martin Griebl; Christian Lengauer 📂 Article 📅 1999 🏛 Elsevier Science 🌐 English ⚖ 879 KB

One central problem in the execution of parallel nested loops with non-affine bounds is the precise scanning (i.e., enumeration) of the points in their iteration space and the detection of their termination. Scanning schemes have been proposed for both shared-memory and distributed-memory implementatio

Runtime Support for Parallelization of D
✍ Maher Kaddoura; Sanjay Ranka 📂 Article 📅 1997 🏛 Elsevier Science 🌐 English ⚖ 96 KB

In this paper we discuss the runtime support required for the parallelization of unstructured data-parallel applications on nonuniform and adaptive environments. We describe several optimization techniques for fast remapping of data and for reducing the amount of communications between machines when

Decentralized remapping of data parallel
✍ Xu, Chengzhong; Lau, Francis C. M.; Diekmann, Ralf 📂 Article 📅 1997 🏛 John Wiley and Sons 🌐 English ⚖ 348 KB

In this paper we present a decentralized remapping method for data parallel applications on distributed memory multiprocessors. The method uses a generalized dimension exchange (GDE) algorithm periodically during the execution of an application to balance (remap) the system's workload. We implemente