Bobbio Scriptorium

Fast matrix multiplication

Scribed by Carlos F. Bunge; Gerardo Cisneros


Publisher: John Wiley and Sons
Year: 1987
Tongue: English
Weight: 358 KB
Volume: 8
Category: Article
ISSN: 0192-8651


Synopsis


Several implementations of matrix multiplication (MMUL) in Fortran and VAX assembly language are discussed. On a VAX-11/780 computer, the most efficient MMUL is achieved through vector-scalar-multiply-and-add (VSMA) operations, rather than by means of dot products. We also discuss optimal MMUL algorithms for use in virtual memory machines when the data overflow the working set.
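
To make the contrast in the synopsis concrete, here is a minimal sketch of the two loop orderings it names. It is written in Python purely for illustration and is not the authors' Fortran or VAX assembly code; the function names `mmul_dot` and `mmul_vsma` are hypothetical. Both functions compute the same product and differ only in which operation forms the innermost loop.

```python
def mmul_dot(a, b):
    """C = A*B, each C[i][j] formed as one dot product (row of A, column of B)."""
    n, k, m = len(a), len(b), len(b[0])
    c = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            s = 0.0
            for p in range(k):          # innermost loop: dot product
                s += a[i][p] * b[p][j]
            c[i][j] = s
    return c


def mmul_vsma(a, b):
    """Same product, built from vector-scalar-multiply-and-add (VSMA) steps.

    Each innermost loop updates a whole column of C:
        C[:, j] += B[p][j] * A[:, p]
    """
    n, k, m = len(a), len(b), len(b[0])
    c = [[0.0] * m for _ in range(n)]
    for j in range(m):
        for p in range(k):
            scalar = b[p][j]
            for i in range(n):          # innermost loop: VSMA on column j
                c[i][j] += scalar * a[i][p]
    return c


a = [[1.0, 2.0], [3.0, 4.0]]
b = [[5.0, 6.0], [7.0, 8.0]]
print(mmul_dot(a, b))
print(mmul_vsma(a, b))
```

In Fortran, arrays are stored column by column, so the VSMA ordering walks both A and C along contiguous memory in its innermost loop. Whether that translates into a speedup depends on the machine; the VAX-11/780 result is the article's claim, and this sketch does not attempt to reproduce it.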


SIMILAR VOLUMES


SUMMA: scalable universal matrix multipl
Van De Geijn, R. A.; Watts, J. | Article | 1997 | John Wiley and Sons | English | 341 KB

In the paper we give a straightforward, highly efficient, scalable implementation of common matrix multiplication operations. The algorithms are much simpler than previously published methods, yield better performance, and require less work space. MPI implementations are given, as are performance re

A practical algorithm for faster matrix
Igor Kaporin | Article | 1999 | John Wiley and Sons | English | 69 KB

The purpose of this paper is to present an algorithm for matrix multiplication based on a formula discovered by Pan [7]. For matrices of order up to 10 000, the nearly optimum tuning of the algorithm results in a rather clear non-recursive one- or two-level structure with the operation count comparab

Approximating Matrix Multiplication for
Edith Cohen; David D Lewis | Article | 1999 | Elsevier Science | English | 241 KB

Many pattern recognition tasks, including estimation, classification, and the finding of similar objects, make use of linear models. The fundamental operation in such tasks is the computation of the dot product between a query vector and a large database of instance vectors. Often we are interested

The Combinatorics of Cache Misses during
Philip J. Hanlon; Dean Chung; Siddhartha Chatterjee; Daniela Genius; Alvin R. Le | Article | 2001 | Elsevier Science | English | 333 KB

In this paper we construct an analytic model of cache misses during matrix multiplication. The analysis in this paper applies to square matrices of size 2^m where the array layout function is given in terms of a function 3 that interleaves the bits in the binary expansions of the row and column indi

Scalable Parallel Matrix Multiplication
Keqin Li | Article | 2001 | Elsevier Science | English | 392 KB

Consider any known sequential algorithm for matrix multiplication over an arbitrary ring with time complexity O(N^α), where 2 < α ≤ 3. We show that such an algorithm can be parallelized on a distributed memory parallel computer (DMPC) in O(log N) time by using N^α/log N processors. Such a parallel