On the Scalability of Asynchronous Parallel Computations
Authors: D.C. Marinescu; J.R. Rice
- Publisher: Elsevier Science
- Year: 1994
- Language: English
- File size: 753 KB
- Volume: 22
- Category: Article
- ISSN: 0743-7315
Synopsis
This paper investigates the time lost in a parallel computation due to sequential and duplicated work, communication, and blocking, and proposes characterizations of parallel algorithms based upon communication complexity and a blocking model. It discusses the impact of the processor architecture upon the measured speedup and shows that a large speedup may be due to an inefficient sequential computation rather than to an efficient parallel computation. A model of parallel computation that accounts for sequential and duplicated work, communication and control, and blocking is presented. It parametrizes scalability using three functions of the number \(P\) of processors: \(E(P)\), the number of communication events; \(f(P)\), the fraction of sequential and duplicated work plus the algorithmic blocking; and \(I(P)\), the instruction execution rate. The characteristic function \(E(P)\) is the most important, since in many computations \(f(P)\) and \(I(P)\) are nearly constant. The model is used to predict the asymptotic behavior, the maximum speedup, and the optimal number of processors. A 3-D FFT algorithm and a Chebyshev iterative algorithm illustrate the concepts introduced. © 1994 Academic Press, Inc.
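The synopsis does not reproduce the paper's exact equations, but the qualitative behavior it describes can be illustrated with an Amdahl-style sketch: total time combines a sequential/duplicated fraction (playing the role of \(f(P)\), assumed constant), a parallelizable part, and a per-event communication cost charged against \(E(P)\). The constants `T1`, `t_comm`, and the linear growth of communication events below are illustrative assumptions, not values from the paper.

```python
# Illustrative Amdahl-style sketch only; the paper's actual model is richer.
# f       : fraction of sequential and duplicated work (f(P), held constant)
# events  : E(P), number of communication events (assumed linear in P here)
# t_comm  : assumed cost per communication event (hypothetical constant)
def predicted_speedup(P, T1=1000.0, f=0.05, t_comm=0.1, events=lambda P: P):
    """Speedup T(1)/T(P) under the assumed model."""
    T_P = T1 * (f + (1.0 - f) / P) + t_comm * events(P)
    return T1 / T_P

# Because the communication term grows with P, the predicted speedup rises,
# peaks at some optimal processor count, and then declines -- the asymptotic
# behavior the model is used to predict.
speedups = [predicted_speedup(P) for P in (1, 8, 64, 512)]
best_P = max(range(1, 1025), key=predicted_speedup)
```

Under these assumptions the speedup is bounded above by \(1/f\), and sweeping `P` locates the optimal number of processors, mirroring the quantities the model predicts.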
Similar volumes
An asynchronous parallel algorithm is developed in this paper. It is based on the newly developed sequential pseudo-excitation method for structural non-stationary random responses, used with the high precision direct (HPD-S) integration method. Examples run on a Transtech Paramid MIMD parallel comp
The paper will present a performance analysis of a 4-processor asynchronous parallel computer based on Texas Instruments 990/10 minicomputers, recently commissioned at Loughborough University. A description of the implementation of parallel computing on the system, which was originally 4 independent
This paper describes a framework for parallel computing in a locally confined, scalable computing cluster (SCC) using Java agents. The framework consists of a programming model with agents and asynchronous invocations, and a scheme for adaptive placement of multiple agents in an SCC. Our scheme is g
A two-dimensional \((h, p)\) finite element scheme for distributed parallel computation is developed. The approach is based on an element-by-element domain decomposition and is implemented on the nCUBE2 system. Example problems are used to demonstrate performance of the algorithm for a range of \((h