𝔖 Bobbio Scriptorium
✦   LIBER   ✦

Automatically Partitioning Threads for Multithreaded Architectures

✍ Scribed by Xinan Tang; Guang R. Gao


Book ID
102600531
Publisher
Elsevier Science
Year
1999
Tongue
English
Weight
329 KB
Volume
58
Category
Article
ISSN
0743-7315

No coin nor oath required. For personal study only.

✦ Synopsis


There is an enormous amount of parallelism exposed to fine-grain multithreaded architectures to cover latencies. It is a demanding task for a multithreading programmer to manage such a degree of parallelism by hand. To use multithreaded architectures efficiently it is essential to have compiler support for automatically partitioning programs into threads. This paper solves a fundamental problem in compiling for multithreaded architectures, automatically partitioning a program into threads. The focus of such partitioning is to overlap the remote communication latency and minimize the total execution time. We first formulate the partitioning problem based on a multithreaded execution cost model. Then, we prove such a formulation is NP-hard. Therefore, we propose two heuristic thread-partitioning methods to solve this problem in practice. The advanced partitioning algorithm is a novel extension of list scheduling, and it takes advantage of the cost model to generate near-optimum partitioning results. The remote-path-based partitioning algorithm is a simplified version of the advanced one but it is easy for compiler implementation. The two partitioning algorithms were implemented respectively in a thread partitioning testbed and a research EARTH-C compiler. The experimental results show that both partitioning algorithms are effective to generate efficient threaded code, and code generated by the compiler is comparable to hand-written code.


πŸ“œ SIMILAR VOLUMES


Architectural Support and Mechanisms for
✍ Vijay Karamcheti; Andrew A. Chien πŸ“‚ Article πŸ“… 1999 πŸ› Elsevier Science 🌐 English βš– 544 KB

High-level parallel programming models supporting dynamic fine-grained threads in a global object space are becoming increasingly popular for expressing irregular applications based on sophisticated adaptive algorithms and pointer-based data structures. However, implementing these multithreaded comp