๐”– Bobbio Scriptorium
โœฆ   LIBER   โœฆ

On the Scalability of Asynchronous Parallel Computations

โœ Scribed by D.C. Marinescu; J.R. Rice


Publisher
Elsevier Science
Year
1994
Tongue
English
Weight
753 KB
Volume
22
Category
Article
ISSN
0743-7315

No coin nor oath required. For personal study only.

โœฆ Synopsis


This paper investigates the time lost in a parallel computation due to sequential and duplicated work, communication, and blocking, and proposes characterizations of parallel algorithms based upon the communication complexity and the blocking model. It discusses the impact of the processor's architecture upon the measured speedup. It shows that a large speedup may be due to an inefficient sequential computation, rather than to an efficient parallel computation. A model of parallel computation which takes into account sequential and duplicated work, communication and control, and blocking is presented. It parametrizes scalability using three functions of the number (P) of processors: (E(P)=) the number of communication events, (f(P)=) the fraction of sequential and duplicated work plus the algorithmic blocking, (I(P)=) the instruction execution rate. The characteristic function (E(P)) is the most important as in many computations (f(P)) and (I(P)) are nearly constant. The model is used to predict the asymptotic behavior, the maximum speedup, and the optimal number of processors. A 3-D FFT algorithm and a Chebyshev iterative algorithm are used to illustrate the concepts introduced. 1994 Academic Press. Inc.


๐Ÿ“œ SIMILAR VOLUMES


Scalable Parallel Matrix Multiplication
โœ Keqin Li ๐Ÿ“‚ Article ๐Ÿ“… 2001 ๐Ÿ› Elsevier Science ๐ŸŒ English โš– 392 KB

Consider any known sequential algorithm for matrix multiplication over an arbitrary ring with time complexity O(N a ), where 2 < a [ 3. We show that such an algorithm can be parallelized on a distributed memory parallel computer (DMPC) in O(log N) time by using N a /log N processors. Such a parallel

ASYNCHRONOUS PARALLEL COMPUTING OF STRUC
โœ JIAHAO LIN; XICHENG WANG; F. W. WILLIAMS ๐Ÿ“‚ Article ๐Ÿ“… 1997 ๐Ÿ› John Wiley and Sons ๐ŸŒ English โš– 483 KB

An asynchronous parallel algorithm is developed in this paper. It is based on the newly developed sequential pseudo-excitation method for structural non-stationary random responses, used with the high precision direct (HPD-S) integration method. Examples run on a Transtech Paramid MIMD parallel comp

Performance analysis of algorithms on as
โœ R.H. Barlow; D.J. Evans; J. Shanehchi ๐Ÿ“‚ Article ๐Ÿ“… 1982 ๐Ÿ› Elsevier Science ๐ŸŒ English โš– 323 KB

The paper will present a performance analysis of a 4-processor asynchronous parallel computer based on Texas Instruments 990/10 minicomputers, recently commissioned at Loughborough University. A description of the implementation of parallel computing on the system, which was originally 4 independent

Adaptive placement of parallel Java agen
โœ Keren, Arie; Barak, Amnon ๐Ÿ“‚ Article ๐Ÿ“… 1998 ๐Ÿ› John Wiley and Sons ๐ŸŒ English โš– 68 KB ๐Ÿ‘ 2 views

This paper describes a framework for parallel computing in a locally confined, scalable computing cluster (SCC) using Java agents. The framework consists of a programming model with agents and asynchronous invocations, and a scheme for adaptive placement of multiple agents in an SCC. Our scheme is g

Performance and Scalability of Finite El
โœ E. Barragy; G.F. Carey; R. Vandegeijn ๐Ÿ“‚ Article ๐Ÿ“… 1994 ๐Ÿ› Elsevier Science ๐ŸŒ English โš– 896 KB

A two-dimensional \((h, p)\) finite element scheme for distributed parallel computation is developed. The approach is based on an element-by-element domain decomposition and is implemented on the nCUBE2 system. Example problems are used to demonstrate performance of the algorithm for a range of \((h