Consider any known sequential algorithm for matrix multiplication over an arbitrary ring with time complexity O(N a ), where 2 < a [ 3. We show that such an algorithm can be parallelized on a distributed memory parallel computer (DMPC) in O(log N) time by using N a /log N processors. Such a parallel
Four-Index transformation on distributed-memory parallel computers
β Scribed by Lawrence A. Covick; Kenneth M. Sando
- Publisher
- John Wiley and Sons
- Year
- 1990
- Tongue
- English
- Weight
- 834 KB
- Volume
- 11
- Category
- Article
- ISSN
- 0192-8651
No coin nor oath required. For personal study only.
β¦ Synopsis
Because it has 0(N5) operations, a low computation to data transfer ratio, and is a compact piece of code, the four-index transformation is a good test case for parallel algorithm development of electronic structure calculations. We present an algorithm primarily designed for distributed-memory machines. Unlike the previous algorithm of Whiteside et al., ours is not designed with a particular architecture in mind. It is a general algorithm in the sense that not only can it be used on some common architectures but it can utilize some of the advantages inherent in each. In addition, we present formulas predicting that there would be a twofold decrease in communication time if our algorithm was used instead of that of Whiteside et al., on a square array of processors and up to an N-fold decrease if the two algorithms were implemented on a hypercube.
π SIMILAR VOLUMES
The four-index transformation has a high ratio of data transfer to computation making it a potential "bottleneck" for parallel correlation energy determination. We present formulas for the communication times on different parallel architectures for an algorithm that is primarily designed for distrib
A parallel distributed implementation of the second-order Mdler-Plesset perturbation theory method, widely used in quantum chemistry, is presented. Parallelization strategy and performance for the HONDO quantum chemistry program running on a network of Unix computers are also discussed. Superlinear
This paper describes the parallel implementation of a numerical model for the simulation of problems from fluid dynamics on distributed memory multiprocessors. The basic procedure is to apply a fully explicit upwind finite difference approximation on a staggered grid. A theoretical time complexity a