✦ LIBER ✦

Evaluating the Performance of Multithreading and Prefetching in Multiprocessors

✍ Scribed by Ricardo Bianchini; Beng-Hong Lim

Publisher: Elsevier Science
Year: 1996
Tongue: English
Weight: 498 KB
Volume: 37
Category: Article
ISSN: 0743-7315
DOI: 10.1006/jpdc.1996.0109

✦ Synopsis

This paper presents new analytical models of the performance benefits of multithreading and prefetching, and experimental measurements of parallel applications on the MIT Alewife multiprocessor. For the first time, both techniques are evaluated on a real machine as opposed to simulations. The models determine the region in the parameter space where the techniques are most effective, while the measurements determine the region where the applications lie. We find that these regions do not always overlap significantly. The multithreading model shows that only 2-4 contexts are necessary to maximize this technique's potential benefit in current multiprocessors. For these multiprocessors, multithreading improves execution time by less than 10% for most of the applications that we examined. The model also shows that multithreading can significantly improve the performance of the same applications in multiprocessors with longer latencies. Reducing context-switch overhead is not crucial. The software prefetching model shows that allowing 4 outstanding prefetches is sufficient to achieve most of this technique's potential benefit on current multiprocessors. Prefetching improves performance over a wide range of parameters, and improves execution time by as much as 20-50% even on current multiprocessors. A comparison between the two models shows that prefetching has a significant advantage over multithreading for machines with low memory latencies and/or applications with high cache miss rates, because a prefetch instruction consumes less time than a context switch.

📜 SIMILAR VOLUMES

Simulation and performance evaluation of parallel software on multiprocessor systems

✍ Luigi Rizzo 📂 Article 📅 1989 🏛 Elsevier Science 🌐 English ⚖ 874 KB

The Performance Implications of Locality Information Usage in Shared-Memory Multiprocessors

✍ Frank Bellosa; Martin Steckermeier 📂 Article 📅 1996 🏛 Elsevier Science 🌐 English ⚖ 326 KB

The latest processor generations (e.g., HPPA 8000, MIPS R10000, or Ultra SPARC) include a monitoring unit. A processor monitor can count events like read/write cache misses and processor stall cycles due to load and store operations. This information is usually only used for offline profiling. However,

Implementation and evaluation of a communication intensive application on the EARTH multithreaded system

✍ Kevin B. Theobald; Rishi Kumar; Gagan Agrawal; Gerd Heber; Ruppa K. Thulasiram; 📂 Article 📅 2002 🏛 John Wiley and Sons 🌐 English ⚖ 224 KB

VLSI design and evaluation of high-performance multiprocessor for state-space digital filters using block-state realization

✍ Yoshitaka Tsunekawa; Kohji Chiba; Jun Nozaki; Mamoru Miura 📂 Article 📅 1999 🏛 John Wiley and Sons 🌐 English ⚖ 271 KB

The Effects of Precedence and Priority Constraints on the Performance of Scan Scheduling for Hypercube Multiprocessors

✍ Phillip Krueger; Davender Babbar 📂 Article 📅 1996 🏛 Elsevier Science 🌐 English ⚖ 269 KB

range of computer systems, including general-purpose systems and many real-time systems in which some of the above information is not available. Our focus in this paper is on dynamic scheduling. A dynamic scheduler for a hypercube system can be divided into two components: a job scheduler and a pro

Evaluating the performance of forced convection heat transfer enhancement

✍ Ichimiya, Koichi; Morimoto, Shunichi; Miyazawa, Toshiyoshi 📂 Article 📅 1998 🏛 John Wiley and Sons 🌐 English ⚖ 207 KB

Two methods for assessing thermal performance were evaluated for four kinds of forced convective heat transfer augmentations. One method uses the first law of thermodynamics, i.e., the heat transfer improvement at (1) constant Reynolds number, (2) constant pressure loss, and (3) constant pumping powe