Tiling on systems with communication/computation overlap

✍ Scribed by Calland, Pierre-Yves; Dongarra, Jack; Robert, Yves


Book ID: 101219470
Publisher: John Wiley and Sons
Year: 1999
Tongue: English
Weight: 123 KB
Volume: 11
Category: Article
ISSN: 1040-3108

✦ Synopsis


In the framework of fully permutable loops, tiling is a compiler technique (also known as 'loop blocking') that has been extensively studied as a source-to-source program transformation. Little work has been devoted to the mapping and scheduling of the tiles onto physical parallel processors. We present several new results in the context of limited computational resources and assuming communication-computation overlap. In particular, under some reasonable assumptions, we derive the optimal mapping and scheduling of tiles to physical processors.


📜 SIMILAR VOLUMES


A pipelined schedule to minimize complet
✍ N. Koziris; A. Sotiropoulos; G. Goumas 📂 Article 📅 2003 🏛 Elsevier Science 🌐 English ⚖ 292 KB

This paper proposes a new method for the problem of minimizing the execution time of nested for-loops using a tiling transformation. In our approach, we are interested not only in tile size and shape according to the required communication to computation ratio, but also in overall completion time. W

On the Utility of Communication–Computat
✍ Michael J. Quinn; Philip J. Hatcher 📂 Article 📅 1996 🏛 Elsevier Science 🌐 English ⚖ 260 KB

However, the speedup achieved through parallelism is often lower in modern systems. It is no surprise, then, that developers of compilers for data-parallel languages have hypothesized the importance of optimizations that overlap communications with computations in order to reduce execution times and

A computational system for lattice QCD w
✍ Ting-Wai Chiu; Tung-Han Hsieh; Chao-Hsi Huang; Tsung-Ren Huang 📂 Article 📅 2003 🏛 Elsevier Science 🌐 English ⚖ 276 KB

We outline the essential features of a Linux PC cluster which is now being developed at National Taiwan University, and discuss how to optimize its hardware and software for lattice QCD with overlap Dirac quarks. At present, the cluster consists of 30 nodes, with each node consisting of one Penti