𝔖 Bobbio Scriptorium
✦   LIBER   ✦

Towards Structured Parallel Computing on Architecture-Independent Parallel Algorithm Design for Distributed-Memory Architectures

✍ Scribed by Feng Gao


Publisher
Elsevier Science
Year
1996
Tongue
English
Weight
403 KB
Volume
53
Category
Article
ISSN
0022-0000

No coin nor oath required. For personal study only.

✦ Synopsis


This paper introduces an architecture-independent, hierarchical approach to algorithm design on distributed-memory architectures, in contrast to the current trend of tailoring algorithms towards specific architectures. We show that, rather surprisingly, this new approach can achieve uniformity without sacrificing efficiency. In our framework, there are three levels of algorithm design: design of a networkindependent algorithm in a network-independent programming environment, design of virtual networks (virtual architectures) for the algorithm, and design of emulations of the virtual networks on physical networks. In its organizational principle, this methodology is analogous to the abstract data structure approach to sequential algorithm design. We propose the following thesis: architecture-independent optimality can lead to portable optimality. Namely, a single network-independent algorithm, when optimized network-independently, with the support of properly chosen virtual networks, can be implemented on a wide spectrum of physical networks to achieve optimality on each of them with respect to both computation and communication. We illustrate this thesis with an analysis of the example of algorithm design for ordinary matrix multiplication. In a paper by Gao, a general theory of portable optimality of parallel algorithms is presented. Besides its implications to the methodology of parallel algorithm design, our framework also suggests new questions for theoretical research in parallel computation on interconnection networks.


πŸ“œ SIMILAR VOLUMES


General Algorithm for Improved Lattice A
✍ FrΓ©dΓ©ric D.R. Bonnet; Derek B. Leinweber; Anthony G. Williams πŸ“‚ Article πŸ“… 2001 πŸ› Elsevier Science 🌐 English βš– 275 KB

Quantum field theories underlie all of our understanding of the fundamental forces of nature. There are relatively few first-principles approaches to the study of quantum field theories (such as quantum chromodynamics [QCD] relevant to the strong interaction) apart from the perturbative (i.e., weak-

Performance Analysis of the Parallel Kar
✍ GIOVANNI CESARI; ROMAN MAEDER πŸ“‚ Article πŸ“… 1996 πŸ› Elsevier Science 🌐 English βš– 447 KB

We present three parallel implementations of the Karatsuba algorithm for long integer multiplication on a distributed memory architecture and discuss the experimental results obtained on a Paragon computer. The first two implementations have both time complexity O(n) on n log 2 3 processors, but pre

A novel parallel algorithm for large-sca
✍ Hajime Takashima; So Yamada; Shigeru Obara; Kunihiro Kitamura; Shinjiro Inabata; πŸ“‚ Article πŸ“… 2002 πŸ› John Wiley and Sons 🌐 English βš– 186 KB πŸ‘ 1 views

## Abstract We developed a novel parallel algorithm for large‐scale Fock matrix calculation with small locally distributed memory architectures, and named it the β€œ__RT__ parallel algorithm.” The __RT__ parallel algorithm actively involves the concept of integral screening, which is indispensable fo

Run-Time Techniques for Exploiting Irreg
✍ Cong Fu; Tao Yang πŸ“‚ Article πŸ“… 1997 πŸ› Elsevier Science 🌐 English βš– 372 KB

Automatic scheduling for directed acyclic graphs (DAG) and its applications for coarse-grained irregular problems such as large n-body simulation have been studied in the literature. However, solving irregular problems with mixed granularities such as sparse matrix factorization is challenging since