𝔖 Bobbio Scriptorium
✦   LIBER   ✦

A new parallel matrix multiplication algorithm on distributed-memory concurrent computers

✍ Scribed by Choi, Jaeyoung


Publisher: John Wiley and Sons
Year: 1998
Tongue: English
Weight: 139 KB
Volume: 10
Category: Article
ISSN: 1040-3108


✦ Synopsis


We present a new fast and scalable matrix multiplication algorithm called DIMMA (distribution-independent matrix multiplication algorithm) for block cyclic data distribution on distributed-memory concurrent computers. The algorithm is based on two new ideas: it uses a modified pipelined communication scheme to overlap computation and communication effectively, and it exploits the LCM block concept to obtain the maximum performance of the sequential BLAS (basic linear algebra subprograms) routine in each processor, even when the block size is very small or very large. The algorithm is implemented and compared with SUMMA on the Intel Paragon computer.
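
The synopsis does not spell out the LCM block concept, so the following is only a minimal sketch of the periodicity that the name suggests, not the paper's algorithm. It assumes the common block cyclic convention in which global block (i, j) lives on process (i mod P, j mod Q) of a P x Q process grid, and an outer-product style multiply in which step k needs column block k of A and row block k of B. The function names are hypothetical.

```python
from math import gcd


def block_owner(i, j, P, Q):
    """Grid coordinates of the process holding global block (i, j).

    Assumes a zero-offset 2D block cyclic layout on a P x Q process grid.
    """
    return (i % P, j % Q)


def step_sources(k, P, Q):
    """Sources for step k of an outer-product style multiply (assumed scheme).

    Returns (process column holding column block k of A,
             process row holding row block k of B).
    """
    return (k % Q, k % P)


def lcm_period(P, Q):
    """Least common multiple of the grid dimensions."""
    return P * Q // gcd(P, Q)


if __name__ == "__main__":
    P, Q = 2, 3
    period = lcm_period(P, Q)
    for k in range(2 * period):
        print(k, step_sources(k, P, Q))
    # The same (column, row) pair of sources recurs every lcm(P, Q) steps.
    assert all(
        step_sources(k, P, Q) == step_sources(k + period, P, Q)
        for k in range(period)
    )
```

Under these assumptions, steps that are lcm(P, Q) apart draw on the same process column and process row, so their blocks could in principle be grouped into one larger local multiplication regardless of how small the distribution block size is.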


📜 SIMILAR VOLUMES


Parallel implementation of a ray tracing
✍ Lee, Tong-Yee; Raghavendra, C. S.; Nicholas, John B. 📂 Article 📅 1997 🏛 John Wiley and Sons 🌐 English ⚖ 145 KB 👁 3 views

Ray tracing is a well-known technique to generate life-like images. Unfortunately, ray tracing complex scenes can require large amounts of CPU time and memory storage. Distributed memory parallel computers with large memory capacities and high processing speeds are ideal candidates to perform ray tr

Numerical simulation of fluid dynamic pr
✍ Pirozzi, Maria A. 📂 Article 📅 1997 🏛 John Wiley and Sons 🌐 English ⚖ 91 KB 👁 2 views

This paper describes the parallel implementation of a numerical model for the simulation of problems from fluid dynamics on distributed memory multiprocessors. The basic procedure is to apply a fully explicit upwind finite difference approximation on a staggered grid. A theoretical time complexity a

Parallel realizations of Kanerva's spars
✍ Hämäläinen, Timo; Klapuri, Harri; Saarinen, Jukka; Kaski, Kimmo 📂 Article 📅 1997 🏛 John Wiley and Sons 🌐 English ⚖ 311 KB 👁 2 views

This paper presents two parallel realizations of sparse distributed memory (SDM) on a tree-shaped computer. The original model of SDM is introduced in terms of generalized computer memory and artificial neural networks (ANNs). For parallelization purposes, addressing, storage and retrieval operation